First of all, my solution is based on @DrHouseofSQL and @Bhouse answers, so you should read @DrHouseofSQL answer first, then @BHouse answer, then continue this answer.
Problem
Note: the page name will be dynamic, and the column position may change (for example: the column "ABC may exist in the first or second row or ...
This situation is a bit complicated and can be solved using the following workaround:
Solution Overview
- Add a script task before a data flow task that imports data
- You need to use a script task to open the excel file and get the Worksheet name and title bar
- Create a query and save it in a variable
- , ( ,
Delay Validation true)
- SSIS (.. @[User:: strQuery])
- , Excel (, @[User:: ExcelFilePath])
- script
@[User::strQuery] ReadWrite @[User::ExcelFilePath] ReadOnly Variable ( script) - script Language VB.Net script script:
: System.Data.OleDb
15 excel, , , 15 . , A I
m_strExcelPath = Dts.Variables.Item("ExcelFilePath").Value.ToString
Dim strSheetname As String = String.Empty
Dim intFirstRow As Integer = 0
m_strExcelConnectionString = Me.BuildConnectionString()
Try
Using OleDBCon As New OleDbConnection(m_strExcelConnectionString)
If OleDBCon.State <> ConnectionState.Open Then
OleDBCon.Open()
End If
'Get all WorkSheets
m_dtschemaTable = OleDBCon.GetOleDbSchemaTable(OleDbSchemaGuid.Tables,
New Object() {Nothing, Nothing, Nothing, "TABLE"})
'Loop over work sheet to get the first one (the excel may contains temporary sheets or deleted ones
For Each schRow As DataRow In m_dtschemaTable.Rows
strSheetname = schRow("TABLE_NAME").ToString
If Not strSheetname.EndsWith("_") AndAlso strSheetname.EndsWith("$") Then
Using cmd As New OleDbCommand("SELECT * FROM [" & strSheetname & "A1:I15]", OleDBCon)
Dim dtTable As New DataTable("Table1")
cmd.CommandType = CommandType.Text
Using daGetDataFromSheet As New OleDbDataAdapter(cmd)
daGetDataFromSheet.Fill(dtTable)
For intCount As Integer = 0 To 15
If Not String.IsNullOrEmpty(dtTable.Rows(intCount)(0).ToString) Then
'+1 because datatable is zero based indexed, +1 because we want to start from the second row
intFirstRow = intCount + 2
End If
Next
End Using
If intFirstRow = 0 Then Throw New Exception("header not found")
End Using
'when the first correct sheet is found there is no need to check others
Exit For
End If
Next
OleDBCon.Close()
End Using
Catch ex As Exception
Throw New Exception(ex.Message, ex)
End Try
Dts.Variables.Item("strQuery").Value = "SELECT * FROM [" & strSheetname & "A" & intFirstRow.ToString & ":I]"
Dts.TaskResult = ScriptResults.Success
End Sub
- Excel excel, ( , )
Select * from [Sheet1$A2:I] @[User::strQuery]- Excel, SQL
@[User::strQuery] - , @BHouse
@BHouse
- DataFlow Task
Delay Validation True - DataFlow
1:
OP: sometimes excel with empty data will come.(i.e) we have only header row not not data... in that case it fails entire task
:
excel ( ), :
. script:
m_strExcelPath = Dts.Variables.Item("ExcelFilePath").Value.ToString
Dim strSheetname As String = String.Empty
Dim intFirstRow As Integer = 0
m_strExcelConnectionString = Me.BuildConnectionString()
Try
Using OleDBCon As New OleDbConnection(m_strExcelConnectionString)
If OleDBCon.State <> ConnectionState.Open Then
OleDBCon.Open()
End If
'Get all WorkSheets
m_dtschemaTable = OleDBCon.GetOleDbSchemaTable(OleDbSchemaGuid.Tables,
New Object() {Nothing, Nothing, Nothing, "TABLE"})
'Loop over work sheet to get the first one (the excel may contains temporary sheets or deleted ones
For Each schRow As DataRow In m_dtschemaTable.Rows
strSheetname = schRow("TABLE_NAME").ToString
If Not strSheetname.EndsWith("_") AndAlso strSheetname.EndsWith("$") Then
Using cmd As New OleDbCommand("SELECT * FROM [" & strSheetname & "A1:I15]", OleDBCon)
Dim dtTable As New DataTable("Table1")
cmd.CommandType = CommandType.Text
Using daGetDataFromSheet As New OleDbDataAdapter(cmd)
daGetDataFromSheet.Fill(dtTable)
For intCount As Integer = 0 To 15
If Not String.IsNullOrEmpty(dtTable.Rows(intCount)(0).ToString) Then
'+1 because datatable is zero based indexed, +1 because we want to start from the second row
intFirstRow = intCount + 2
End If
Next
End Using
End Using
'when the first correct sheet is found there is no need to check others
Exit For
End If
Next
OleDBCon.Close()
End Using
Catch ex As Exception
Throw New Exception(ex.Message, ex)
End Try
If intFirstRow = 0 OrElse _
intFirstRow > dtTable.Rows.Count Then
Dts.Variables.Item("ImportFile").Value = False
Else
Dts.Variables.Item("ImportFile").Value = True
End If
Dts.Variables.Item("strQuery").Value = "SELECT * FROM [" & strSheetname & "A" & intFirstRow.ToString & ":I]"
Dts.TaskResult = ScriptResults.Success
End Sub
2:
OP: is there any other work around available to process the data flow task without skipping all data flow task,Actually one of the task will log the filename and data count and all, which are missing here
:
- DATA FLOW
- script
@[User::ImportFile] == False ( ) - DataFlow script
- ,
- , ,
Data Flow Task Execute SQL Task,