Skipping rows when importing Excel to SQL using SSIS 2008

I need to import sheets that look like this:

March Orders
***Empty Row
Week   Order #   Date     Cust #
3.1    271356    3/3/10   010572
3.1    280353    3/5/10   022114
3.1    290822    3/5/10   010275
3.1    291436    3/2/10   010155
3.1    291627    3/5/10   011840

The column headings are actually on row 3. I can use an Excel Source to import them, but I don't know how to specify that the data starts on row 3.

I searched for a solution, but came up empty.

sql-server-2008 excel ssis
4 answers

Take a look at these:

The links have more detailed information, but I have included some text from the pages in case the links go dead.

http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/97144bb2-9bb9-4cb8-b069-45c29690dfeb

Q:

When loading a text file into SQL Server via SSIS, there is an option to skip any number of leading rows in the source before loading the data. Is there an option to do the same for an Excel file?

In my case, the source Excel file has some description in the first 5 rows. I want to skip them and start loading data from row 6. Please advise.

A:

The easiest way would be to give each row a number (a bit like an identity column in SQL Server), and then use a Conditional Split to filter out every row where the number <= 5.
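The row-numbering idea can be illustrated outside SSIS. The following is a minimal Python sketch of the logic only, not SSIS code; the function name `skip_leading_rows` and the sample rows are hypothetical.

```python
from typing import Iterable, Iterator, List


def skip_leading_rows(rows: Iterable[List[str]], skip: int = 5) -> Iterator[List[str]]:
    """Number each row starting at 1 and keep only rows numbered above `skip`."""
    for number, row in enumerate(rows, start=1):
        # Plays the role of a Conditional Split that discards rows where number <= skip.
        if number > skip:
            yield row


# Hypothetical sheet: five description rows, then the header and one data row.
sample = [["description"]] * 5 + [["Week", "Order #"], ["3.1", "271356"]]
kept = list(skip_leading_rows(sample, skip=5))
```

In a real package the numbering would have to be produced inside the data flow (for example by a Script Component) before the Conditional Split can test it.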

http://social.msdn.microsoft.com/Forums/en/sqlintegrationservices/thread/947fa27e-e31f-4108-a889-18acebce9217

Q:

  • Is it possible, while importing data from an Excel spreadsheet into a DB, to skip the first six rows?

  • Excel data is also divided into sections with headers. Is it possible, for example, to skip every 12th line?

A:

  • YES, YOU CAN. In fact, you can do this very easily if you know which columns will be imported from your Excel file. In your Data Flow task, set the "OpenRowset" custom property of your Excel connection (right-click your Excel connection and choose Properties; in the Properties window, find OpenRowset in the Custom Properties section). To ignore the first 5 rows in Sheet1 and import columns A-M, enter the following value for OpenRowset: Sheet1$A6:M (notice that I did not specify a row number for column M. You can enter a row number if you want, but in my case the number of rows can vary from one run to the next).

  • AGAIN, YES YOU CAN. You can filter the data using a Conditional Split. Set up the Conditional Split to look for something in each row that uniquely identifies it as a header row, and skip the rows that match this "header logic". Another option is to import all the rows and then delete the header rows with a SQL script in the database, such as a cursor that deletes every 12th row. Or you could add an identity field with a seed/increment of 1/1 and then delete all rows whose row number is evenly divisible by 12. Something like that.
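Both suggestions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `open_rowset_range` and `drop_every_nth` are hypothetical, and the range string simply follows the `Sheet1$A6:M` value quoted in the answer above.

```python
from typing import List, Optional, Sequence


def open_rowset_range(sheet: str, first_col: str, first_row: int,
                      last_col: str, last_row: Optional[int] = None) -> str:
    """Build a range string in the form quoted above, e.g. 'Sheet1$A6:M'."""
    # Omitting last_row leaves the range open-ended, as in the answer.
    end = f"{last_col}{last_row}" if last_row is not None else last_col
    return f"{sheet}${first_col}{first_row}:{end}"


def drop_every_nth(rows: Sequence, n: int = 12) -> List:
    """Drop rows whose 1-based row number is evenly divisible by n."""
    return [row for number, row in enumerate(rows, start=1) if number % n != 0]


range_value = open_rowset_range("Sheet1", "A", 6, "M")
remaining = drop_every_nth(list(range(1, 25)), n=12)
```

The second helper mirrors the identity-column approach: number every row, then discard the ones that land on a multiple of 12.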

http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/847c4b9e-b2d7-4cdf-a193-e4ce14986ee2

Q:

I have an SSIS package that imports data from an Excel file, starting from the 7th row.

Unlike the same operation with a CSV file ('Header Rows to Skip' in the Connection Manager Editor), I cannot find a way to ignore the first 6 rows of an Excel file connection.

I assume the answer may lie in one of the Data Flow Transformation objects, but I am not very familiar with them.

A:

rbhro, in fact there were 2 fields in the top 5 rows with some data, which I think prevented the importer from ignoring those rows completely.

Anyway, I found a solution to my problem.

In my Excel Source object, I used "SQL Command" as the "Data access mode" (it comes up when you double-click the Excel Source object). From there, I was able to build a query ("Build Query" button) that returned only the records I needed. Something like this: SELECT F4, F5, F6 FROM [Spreadsheet$] WHERE (F4 IS NOT NULL) AND (F4 <> 'TheHeaderFieldName')

Note: at first I tried ISNUMERIC instead of "IS NOT NULL", but for some reason that was not supported.

In my particular case, I was only interested in rows in which F4 was not NULL (and, fortunately, F4 did not contain any garbage in the first 5 rows). I could skip the entire header row (row 6) with the second WHERE condition.

That gave me a perfectly clean data source. All I had to do then was add a Data Conversion object between the source and the destination (everything needed to be converted from Unicode), and it worked.
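The effect of that WHERE clause can be tried out with an in-memory SQLite table. This is only a sketch: SQLite is not the provider the Excel Source uses, and the table name `sheet` and the sample rows are hypothetical stand-ins for the spreadsheet.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sheet (F4 TEXT, F5 TEXT, F6 TEXT)")

# Hypothetical rows: leading rows with F4 empty, the header row, then data.
rows = [
    (None, "March Orders", None),
    (None, None, None),
    ("TheHeaderFieldName", "Date", "Cust #"),
    ("271356", "3/3/10", "010572"),
    ("280353", "3/5/10", "022114"),
]
conn.executemany("INSERT INTO sheet VALUES (?, ?, ?)", rows)

# Same filter as in the answer: drop rows with empty F4 and the header row.
query = (
    "SELECT F4, F5, F6 FROM sheet "
    "WHERE (F4 IS NOT NULL) AND (F4 <> 'TheHeaderFieldName')"
)
result = conn.execute(query).fetchall()
conn.close()
```

Only the two data rows survive the filter, which is the behavior the answer relies on.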


My first suggestion is not to accept a file in this format. Excel files to be imported should always start with the column header row. Send it back to whoever provides it and tell them to fix the format. This works most of the time.

We give our customers and vendors guidelines on how files must be formatted before we can process them, and it is up to them to meet those guidelines as far as possible. People often don't know that files like this cause problems during processing (next month there may be six rows before the data starts). They need to be told that Excel files should start with the column headers, have no blank rows in the middle of the data, not repeat the headers multiple times and, most importantly, have the same columns with the same column headings in the same order every time. If they cannot provide that, then you probably do not have something that will work for automated import, since you will receive the file in a different format each time depending on the mood of the person who maintains the spreadsheet. Incidentally, we push very hard to never receive any data in Excel (this is not always possible, but if they have the data in a database, they can usually accommodate us). They should also be aware that any change they make to the spreadsheet format will require a change to the import package, and that they will be charged for that development work (assuming they are external clients, not internal). Such changes must be communicated in advance so that development time can be scheduled; otherwise a file in the wrong format will fail and be returned to them for correction.

If that does not work, I suggest you open the file, delete the first two rows, and save it as a text file. Then write a data flow that processes the text file. SSIS does a lousy job of supporting Excel, and anything you can do to get the file into a different format will make life easier in the long run.
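The "delete the leading rows and save as text" step could also be scripted. Below is a minimal Python sketch that assumes the sheet has already been saved as CSV text; the helper name `strip_leading_lines` and the sample content are hypothetical.

```python
import csv
import io


def strip_leading_lines(text: str, skip: int = 2) -> str:
    """Return the text with its first `skip` lines removed."""
    lines = text.splitlines(keepends=True)
    return "".join(lines[skip:])


# Hypothetical CSV export of the sheet from the question.
exported = (
    "March Orders,,,\n"
    ",,,\n"
    "Week,Order #,Date,Cust #\n"
    "3.1,271356,3/3/10,010572\n"
)

cleaned = strip_leading_lines(exported, skip=2)
# After stripping, the first line is the header row, so DictReader can use it.
records = list(csv.DictReader(io.StringIO(cleaned)))
```

The cleaned text starts with the column headers, which is the shape a flat-file source expects.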


You can simply use the OpenRowset property, which you can find in the Excel source properties. See here for more details:

SSIS: Read and Export Excel Data from the nth Row

Regards.


"My first suggestion is not to accept a file in this format. Excel files to be imported should always start with the column header row. Send it back to whoever provides it and tell them to fix the format. This works most of the time."

Not quite right.

SSIS forces you to use a fixed format, and quite often it does not work correctly with Excel.

If you can't change the format, consider using our advanced ETL processor.

You can skip rows or fields, and you can check the data the way you want.

http://www.dbsoftlab.com/etl-tools/advanced-etl-processor/overview.html

The sky is the limit.
