How can I resume a data migration in SSIS from the row where the error occurred?

I am transferring data from an Oracle database to a SQL Server 2008 R2 database using SSIS. My problem is that at some point the package fails, say, at about row 40,000 of 100,000. What can I do so that the next time I run the package, after fixing the error, it restarts from row 40,001, that is, from the row where the error occurred?

I tried using checkpoints in SSIS, but the problem is that they only work between different tasks in the control flow. I want something that works at the level of the rows being transferred.

1 answer

There is no native magic that I am aware of that will "know" the package failed on row 40,000 and, on restart, begin streaming from row 40,001. You are right that checkpoints are not the answer, and they have plenty of problems of their own (they cannot serialize Object-type variables, cannot restart loops, etc.).

What you can do is solve the problem with good design. If your package is built with the expectation that it will fail, you should be able to handle these scenarios.

There are two approaches that I am familiar with. The first is to add a Lookup Transformation to the Data Flow between your source and your destination. Its purpose is to determine which records already exist in the target system. Only the rows for which no match is found are sent on to the destination. This is a very common pattern, and it also lets you detect changes between the source and the destination (if you need that). The downside is that you always transfer the full data set from the source system and only then filter the rows in the Data Flow. If the package failed on row 99,999 of 1,000,000, you still have to stream all 1,000,000 rows back into SSIS just to filter out the ones that were already loaded.
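The Lookup pattern can be sketched outside SSIS. The following is only an illustration in Python, with an in-memory SQLite database standing in for the SQL Server destination. It is not SSIS code, and the table MyTable, the columns SomeId and Payload, and the function load_missing_rows are hypothetical names chosen for the example.

```python
import sqlite3


def load_missing_rows(source_rows, dest):
    """Insert only the source rows whose key is absent from the destination.

    Returns (rows_streamed, rows_inserted).
    """
    # Read the destination keys once, roughly what a full-cache Lookup does.
    existing = {row[0] for row in dest.execute("SELECT SomeId FROM MyTable")}
    streamed = inserted = 0
    for some_id, payload in source_rows:
        streamed += 1  # every source row still flows through the pipeline
        if some_id not in existing:  # the Lookup's "no match" output
            dest.execute(
                "INSERT INTO MyTable (SomeId, Payload) VALUES (?, ?)",
                (some_id, payload),
            )
            inserted += 1
    dest.commit()
    return streamed, inserted


# Hypothetical destination table and source data.
dest = sqlite3.connect(":memory:")
dest.execute("CREATE TABLE MyTable (SomeId INTEGER PRIMARY KEY, Payload TEXT)")
source_rows = [(i, f"row {i}") for i in range(1, 11)]

# Simulate an earlier run that failed after loading the first 4 rows.
dest.executemany("INSERT INTO MyTable VALUES (?, ?)", source_rows[:4])
dest.commit()

streamed, inserted = load_missing_rows(source_rows, dest)
print(streamed, inserted)  # 10 6
```

All 10 source rows are read even though only 6 are inserted, which is the cost described above.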

The other approach is to use a dynamic filter in the WHERE clause of your source query. If you can assume that rows are inserted in order, you can structure your SSIS package to start with an Execute SQL Task that runs a query like SELECT COALESCE(MAX(SomeId), 0) + 1 AS startingPoint FROM dbo.MyTable against the destination database and assigns the result to an SSIS variable (@[User::StartingId]). You then build your source SELECT statement with an expression, something like "SELECT * FROM dbo.MyTable T WHERE T.SomeId >= " + (DT_WSTR, 10) @[User::StartingId]. Now, when the Data Flow begins, it picks up where the last load left off. The challenge with this approach is finding scenarios in which you know the data was not inserted out of order.
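The same logic can be tried out without SSIS. This is a rough sketch in Python, with two in-memory SQLite databases standing in for the Oracle source and the SQL Server destination. It is not SSIS code, and MyTable, SomeId, Payload and resume_load are hypothetical names. It assumes ids are positive and loaded in ascending order. Because the starting point is computed as MAX + 1, the source filter here uses >= so that the row at the starting id is included.

```python
import sqlite3


def resume_load(source, dest):
    """Copy only the rows past the destination's high-water mark.

    Returns (starting_id, rows_copied).
    """
    # Equivalent of the Execute SQL Task: find the next id to load.
    (starting_id,) = dest.execute(
        "SELECT COALESCE(MAX(SomeId), 0) + 1 FROM MyTable"
    ).fetchone()
    # Equivalent of the filtered source query. A bound parameter is used
    # here in place of the string concatenation an SSIS expression needs.
    rows = source.execute(
        "SELECT SomeId, Payload FROM MyTable WHERE SomeId >= ? ORDER BY SomeId",
        (starting_id,),
    ).fetchall()
    dest.executemany("INSERT INTO MyTable (SomeId, Payload) VALUES (?, ?)", rows)
    dest.commit()
    return starting_id, len(rows)


# Two separate in-memory databases with the same hypothetical table.
source = sqlite3.connect(":memory:")
dest = sqlite3.connect(":memory:")
for db in (source, dest):
    db.execute("CREATE TABLE MyTable (SomeId INTEGER PRIMARY KEY, Payload TEXT)")

all_rows = [(i, f"row {i}") for i in range(1, 11)]
source.executemany("INSERT INTO MyTable VALUES (?, ?)", all_rows)
source.commit()

# Simulate an earlier run that failed after loading ids 1 to 4.
dest.executemany("INSERT INTO MyTable VALUES (?, ?)", all_rows[:4])
dest.commit()

first = resume_load(source, dest)
second = resume_load(source, dest)
print(first, second)  # (5, 6) (11, 0)
```

Only the 6 missing rows leave the source on the first call, and a second call copies nothing. If a row with a lower id arrives after a higher one has been loaded, this sketch will never pick it up, which is the out-of-order risk noted above.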

Let me know if you have questions or need anything explained better, pictures, etc. Also, the code above is freehand, so there may be syntax errors, but the logic should be correct.
