SSIS reading LF as a terminator when it is set as CRLF

I am using SSIS 2012. In my flat file connection manager I have a delimited file whose row delimiter is set to CRLF, but one text column in the file contains an LF. When the file is processed, SSIS reads that LF as a line terminator, which causes the load to fail. Any ideas?

+7
sql-server flat-file ssis etl ssis-2012
4 answers

Thanks for all the suggestions. It turned out that the provider had changed the encoding of the file from ASCII to Unicode. Changing the package to read the correct encoding did the trick.
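For anyone who wants to check a file's encoding before touching the package, here is a minimal sketch in Python (not SSIS code). It only looks at the byte order mark at the start of the file, so a Unicode file saved without a BOM falls through to the default. The function names `sniff_encoding` and `sniff_file` are made up for this example.

```python
import codecs

# BOM signatures, longest first, so UTF-32 LE (FF FE 00 00) is not
# mistaken for UTF-16 LE (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_encoding(head: bytes, default: str = "ascii") -> str:
    """Guess a Python codec name from the BOM in the first bytes of a file.

    Returns `default` when no BOM is present (assumption: no BOM means
    a single-byte encoding; a BOM-less UTF-16 file is NOT detected).
    """
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return default


def sniff_file(path: str) -> str:
    """Read the first four bytes of `path` and guess its encoding."""
    with open(path, "rb") as handle:
        return sniff_encoding(handle.read(4))


# Why the encoding matters for row delimiters: in UTF-16 every character
# takes two bytes, so CRLF is no longer the byte pair 0D 0A.
crlf_utf16 = "\r\n".encode("utf-16-le")  # b"\r\x00\n\x00"
```

A reader that expects single-byte text will not find the CR LF byte pair in such a file, which is consistent with the row delimiter appearing to be ignored.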

+1

Before answering: I do not think the column contains only an LF, because if the row delimiter is CRLF, a lone LF would not be treated as a delimiter. So the column probably contains a CRLF, but I will give a solution that covers both cases (CRLF or LF).

Solution

You can fix this situation with the following steps:

  • First, in the flat file connection manager, add only one column (type DT_STR, length 4000), so that each row is read as a single column.
  • In the data flow task, add a Script Component that repairs the file structure and splits each row into columns.

Simple test

Consider a flat file with the following contents:

    ID;name;DOB;Notes;ClassID{CRLF}
    1;John;2001-01-01;;1{CRLF}
    2;Moh;2002-01-01;Very cool{LF}
    Genius;2{CRLF}
    3;Ali;2000-01-01;Calm;2{CRLF}

  • First I will add a flat file connection manager with the following settings:
    • Row delimiter = {CRLF}
    • Header row delimiter = {CRLF}


  1. In the Data Flow Task, I will add a Flat File Source, two Script Components, and an OLE DB Destination.

  2. In the first Script Component, I will mark Column0 as an input column, add 5 output columns (ID, Name, DOB, Notes, ClassID), and set the output's SynchronousInputID property to None.


  3. In the first Script Component, I will write a script that keeps each line in a member variable and writes it to an output row once the line is complete and another line arrives.

     ' Holds the (possibly partial) logical row read so far
     Dim strLine As String = String.Empty
     Dim strDelimiter As String = ";"

     Public Sub EmptyMemoryVariables()
         strLine = String.Empty
     End Sub

     ' Writes the buffered line to the output column NewRow
     Public Sub AssignMemoryVariablesToOutput()
         With Output0Buffer
             .AddRow()
             .NewRow = strLine
         End With
     End Sub

     Public Function AreVariablesEmpty() As Boolean
         Return String.IsNullOrEmpty(strLine)
     End Function

     Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
         Dim strColumns As String() = Row.Column0.Split(CChar(strDelimiter))

         If strColumns.Length = 5 Then
             ' Complete line: flush anything still buffered, then emit this line
             If Not AreVariablesEmpty() Then
                 AssignMemoryVariablesToOutput()
                 EmptyMemoryVariables()
             End If
             strLine = Row.Column0
             AssignMemoryVariablesToOutput()
             EmptyMemoryVariables()
         Else
             ' Fragment: emit the buffer if it is already complete, then append
             If strLine.Split(CChar(strDelimiter)).Length = 5 Then
                 AssignMemoryVariablesToOutput()
                 EmptyMemoryVariables()
             End If
             strLine &= Row.Column0
         End If
         ' Note: a row still buffered when the input ends is not written out here
     End Sub

  4. In the second Script Component, I will split each row into columns.


     Dim strDelimiter As String = ";"

     Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
         ' Split the repaired line and map each part to its output column
         Dim strColumns As String() = Row.NewRow.Split(CChar(strDelimiter))
         Row.ID = strColumns(0)
         Row.NAME = strColumns(1)
         Row.DOB = strColumns(2)
         Row.NOTES = strColumns(3)
         Row.CLASSID = strColumns(4)
     End Sub

Important note: the code provided is not optimal. It may need more checks, and there may be a simpler and better way to write it, but I am trying to give you an idea of how you could approach this problem.
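The stitching idea can also be tried outside SSIS on the sample file from this answer. The following is a simplified Python sketch, not a port of the VB code: it merges physical lines until a record has the expected number of columns, and then splits it, so it combines both script steps. The name `stitch_rows` and the `joiner` parameter (what to put in place of the stray line break) are assumptions made for this example. Like the approach above, it only works when the multi-line field is not the last column.

```python
def stitch_rows(lines, delimiter=";", expected_columns=5, joiner=" "):
    """Yield records as lists of fields, merging broken physical lines.

    A physical line with fewer than `expected_columns` fields is treated
    as a fragment and joined to the following line(s) until the count
    is reached.
    """
    pending = None  # fragment collected so far, or None if nothing is buffered
    for line in lines:
        candidate = line if pending is None else pending + joiner + line
        if len(candidate.split(delimiter)) >= expected_columns:
            yield candidate.split(delimiter)
            pending = None
        else:
            pending = candidate
    if pending is not None:
        # Input ended in the middle of a record: emit what we have
        # so the caller can inspect or reject it.
        yield pending.split(delimiter)


# Sample file from the answer, with the stray LF inside the Notes column.
raw = (
    "ID;name;DOB;Notes;ClassID\r\n"
    "1;John;2001-01-01;;1\r\n"
    "2;Moh;2002-01-01;Very cool\nGenius;2\r\n"
    "3;Ali;2000-01-01;Calm;2\r\n"
)

# splitlines() breaks on both CRLF and LF, which mimics a reader that
# wrongly treats the lone LF as a row terminator.
rows = list(stitch_rows(raw.splitlines()))
for row in rows:
    print(row)
```

Unlike the VB script, this sketch emits a trailing fragment at the end of the input instead of dropping it.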

+3

I have no experience with SSIS, but as an ETL developer I have come across this many times. Therefore, my suggestions may not help you solve the problem, but hopefully point you in the right direction.

  • Check whether the problem field is wrapped in a text qualifier (usually a single or double quotation mark) and whether SSIS supports using it.
  • Also, if there is an option to force SSIS to use a record terminator other than LF (CRLF in this case), I would use it (hopefully there is no CRLF inside the text of the problem field).
  • If the problem field is not the last field, you can read each record as a single field with LF as the record terminator and count the delimiters, in order to identify and filter out the problem records (if there are only a few) and try to stitch them together.
  • If possible, read the file as a single record (if SSIS has such an option) and replace all LF characters, provided CR is the agreed record terminator from the source.
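Two of the suggestions above can be illustrated with a short Python sketch that pre-processes the text before it reaches the ETL tool. It assumes CRLF is the real record terminator, so only an LF that is not preceded by CR is replaced, and it assumes `"` as the text qualifier. The helper names `replace_bare_lf` and `parse_qualified` are made up for this example.

```python
import csv
import io
import re

# An LF that is NOT preceded by CR, i.e. not part of a CRLF terminator.
BARE_LF = re.compile(r"(?<!\r)\n")


def replace_bare_lf(text, replacement=" "):
    """Replace stray LF characters while leaving CRLF terminators intact."""
    return BARE_LF.sub(replacement, text)


def parse_qualified(text, delimiter=";", qualifier='"'):
    """Parse delimited text whose fields may be wrapped in a text qualifier.

    A line break inside a qualified field is kept as part of the field
    value instead of ending the record.
    """
    # newline="" leaves line endings untouched so the csv module sees them.
    reader = csv.reader(
        io.StringIO(text, newline=""),
        delimiter=delimiter,
        quotechar=qualifier,
    )
    return list(reader)


unqualified = "2;Moh;2002-01-01;Very cool\nGenius;2\r\n"
cleaned = replace_bare_lf(unqualified)

qualified = '2;Moh;2002-01-01;"Very cool\nGenius";2\r\n3;Ali;2000-01-01;Calm;2\r\n'
records = parse_qualified(qualified)
```

The first helper changes the data (the line break becomes a space), while the second keeps the line break inside the field, so which one fits depends on whether the destination column should preserve it.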
+2

In the flat file connection manager there is a property (I forget its name) where you can set the row delimiter ({CR}{LF}, {LF}, {CR}, etc.).

Try setting this property. I think it will work.

0
