I approach this a bit differently: I have found that a two-step process is easier to use and to maintain.
The first step is to load the data.
I use two staging tables.
Something like this:

staging_header
  id                 Integer  Unique  Primary Key
  run_number         Integer  Unique
  run_name           String

staging_data
  id                 Integer  Unique  Primary Key
  staging_header_id  Integer
  element1           String
  element2           String
  element3           String
  uploaded?          Boolean
So I load testimport.csv directly into these staging tables. They support multiple runs as long as you give each run a unique run_number (from a sequence, etc.).
Now the data is in SQL and available to Rails.
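As a rough sketch of this first step, assuming the CSV has three columns and no header row: the helper name build_staging_rows is made up for illustration, and the rows are plain hashes here. In a Rails app they would go to whatever model backs staging_data.

```ruby
require "csv"

# Hypothetical helper: turn CSV text into staging_data rows for one run.
# Nothing is validated here; every row is accepted as-is.
def build_staging_rows(csv_text, staging_header_id)
  CSV.parse(csv_text).map do |fields|
    {
      staging_header_id: staging_header_id,
      element1: fields[0],
      element2: fields[1],
      element3: fields[2],
      uploaded: false # stands in for the uploaded? column; set once the row is copied
    }
  end
end

# Example: two rows belonging to the staging_header with id 1.
rows = build_staging_rows("a,b,c\nd,,f\n", 1)
rows.each { |row| puts row.inspect }
```

Empty fields come through as nil, which is fine at this stage because the staging tables carry no constraints.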
The second step is to write the code that actually populates the application tables from this staging area.
This also helps you deal with the speed problem. Rails will only insert a few records per second, so you want to be able to restart, pause, etc.
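One way to get that restart and pause behaviour is to lean on the uploaded? flag. This is only a sketch with in-memory hashes standing in for the tables; populate_from_staging, batch_size and max_batches are names I made up, not Rails API.

```ruby
# Copy pending staging rows into the application store in small batches.
# Rows already marked uploaded are skipped, so the job can be stopped and rerun.
def populate_from_staging(staging_rows, app_rows, batch_size: 100, max_batches: nil)
  pending = staging_rows.reject { |row| row[:uploaded] }
  batches_done = 0
  pending.each_slice(batch_size) do |batch|
    # "Pause" by stopping after a fixed number of batches.
    break if max_batches && batches_done >= max_batches
    batch.each do |row|
      app_rows << row.slice(:element1, :element2, :element3)
      row[:uploaded] = true # remember that this row has been copied
    end
    batches_done += 1
  end
  app_rows
end

staging = (1..5).map { |i| { element1: "e#{i}", element2: "x", element3: "y", uploaded: false } }
app = []
populate_from_staging(staging, app, batch_size: 2, max_batches: 1) # copy one batch, then stop
populate_from_staging(staging, app, batch_size: 2)                 # later: pick up the rest
```

Because each row is flagged as soon as it is copied, running the job again after an interruption only touches the rows that are still pending.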
This also helps with validation. Initially you just want to load the data regardless of any constraints (not null, unique, etc.).
Once the data is in the staging tables, you can be more selective and apply whatever validations you want.
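For example, the checks can run over the staged rows before anything is copied. The two rules below (element1 must be present and unique within the run) are assumptions picked for illustration, and partition_staging_rows is a hypothetical helper; in Rails the same checks would more likely live in model validations.

```ruby
require "set"

# Split staged rows into those that pass the checks and those that do not.
# Rejected rows are paired with a reason so they can be fixed and retried.
def partition_staging_rows(rows)
  seen = Set.new
  valid = []
  rejected = []
  rows.each do |row|
    key = row[:element1].to_s.strip
    if key.empty?
      rejected << [row, "element1 is blank"]
    elsif !seen.add?(key) # add? returns nil when the key is already in the set
      rejected << [row, "element1 is a duplicate"]
    else
      valid << row
    end
  end
  [valid, rejected]
end

staged = [{ element1: "a" }, { element1: nil }, { element1: "a" }, { element1: "b" }]
valid, rejected = partition_staging_rows(staged)
rejected.each { |row, reason| puts "#{row.inspect}: #{reason}" }
```

The rejected rows stay in staging, so nothing is lost and the bad input can be inspected with ordinary SQL.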