How to transfer a CSV file to Sqlite3 (or MySQL)? - Python

I use Python to save data line by line ... but it is very slow!

The CSV contains 70 million lines, and with my script I can only store about 1,000 lines per second.


This is what my script looks like:

    import csv

    # TestResult is a Django model defined elsewhere in the project
    reader = csv.reader(open('test_results.csv', 'r'))
    for row in reader:
        TestResult(type=row[0], name=row[1], result=row[2]).save()

I figure that for testing I might have to consider MySQL or PostgreSQL.

Any ideas or tips? This is the first time I've come across such a huge amount of data. :)

+7
python django mysql sqlite csv
2 answers

To import into MySQL:

 mysqlimport [options] db_name textfile1 [textfile2 ...] 

To import into SQLite3:

see: How to import a .sql or .csv file into SQLite?
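If you would rather stay in Python than shell out to the command-line tools, here is a minimal sketch of a bulk load into SQLite with the standard sqlite3 module (the database file, table name, and column names are assumptions chosen to match the question; adjust them to your schema). Feeding the whole reader to executemany keeps every insert in a single transaction that is committed once at the end, instead of one commit per row:

    import csv
    import sqlite3

    # Assumed database file and table, mirroring the question
    conn = sqlite3.connect('test_results.db')
    conn.execute(
        'CREATE TABLE IF NOT EXISTS test_result (type TEXT, name TEXT, result TEXT)'
    )

    with open('test_results.csv', 'r') as f:
        reader = csv.reader(f)
        # All rows are inserted within one implicit transaction,
        # committed below, rather than one transaction per row
        conn.executemany(
            'INSERT INTO test_result (type, name, result) VALUES (?, ?, ?)',
            reader
        )

    conn.commit()
    conn.close()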

+4

I don't know if this will make a big enough difference, but since you are dealing with the Django ORM, I can suggest the following:

  • Make sure DEBUG is False in your Django settings file, since otherwise every query is kept in memory.
  • Put your logic in a main function and wrap it in the django.db.transaction.commit_on_success decorator. That keeps each row from needing its own transaction, which will speed things up significantly.
  • If you know that none of the rows in the file already exist in the database, add force_insert=True to your save() call. That will halve the number of calls SQLite has to make.

These suggestions are likely to make an even bigger difference if you do end up switching to a client-server database engine.
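For illustration, a minimal sketch combining the second and third points (the myapp import path is a placeholder for wherever your TestResult model lives, and this needs to run somewhere Django settings are already configured, e.g. a management command or the Django shell; commit_on_success applies to Django versions before 1.6, where transaction.atomic replaced it):

    import csv

    from django.db import transaction

    from myapp.models import TestResult  # placeholder path; adjust to your app


    @transaction.commit_on_success  # one transaction for the whole import (pre-1.6 Django)
    def import_results(path):
        reader = csv.reader(open(path, 'r'))
        for row in reader:
            # force_insert=True tells Django to INSERT directly instead of
            # first checking whether the row already exists
            TestResult(type=row[0], name=row[1], result=row[2]).save(force_insert=True)


    import_results('test_results.csv')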

+3
