How to write a pandas DataFrame to a Django model

I have been using pandas in Python, and I usually write a DataFrame to my database table as shown below. Now I am moving to Django; how can I write the same DataFrame to a table through the MyModel model? Help is really appreciated.

    # Original pandas code
    from sqlalchemy import create_engine

    engine = create_engine('postgresql://myuser:mypassword@localhost:5432/mydb', echo=False)
    mydataframe.to_sql('mytable', engine, if_exists='append', index=True)
2 answers

Use your own pandas code next to the Django model that maps to the same SQL table

I am not aware of any explicit support for writing a pandas DataFrame to a Django model. However, in a Django application you can still use your own code to read from or write to the database, in addition to using the ORM (for example, through your Django model).

And given that you most likely have data in the database that was previously written by pandas' to_sql, you can keep using the same database and the same pandas code, and simply create a Django model that can access that table.

For example, if your pandas code was writing to the SQL table mytable, simply create this model:

    from django.db.models import Model

    class MyModel(Model):
        class Meta:
            db_table = 'mytable'  # This tells Django where the SQL table is
            managed = False       # Use this if the table already exists
                                  # and doesn't need to be managed by Django

        field_1 = ...
        field_2 = ...
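
The field definitions need to line up with the columns that to_sql created. A hypothetical sketch, assuming field_1 holds text, field_2 holds floats, and the DataFrame index was written (via index=True) as a column named index:

    from django.db.models import Model, TextField, FloatField, BigIntegerField

    class MyModel(Model):
        class Meta:
            db_table = 'mytable'
            managed = False

        # to_sql(index=True) stores an unnamed DataFrame index in a column
        # called 'index'; mapping it as the primary key gives Django the
        # pk it requires for this unmanaged table
        index = BigIntegerField(primary_key=True, db_column='index')
        field_1 = TextField()
        field_2 = FloatField()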

Now you can use this model from Django alongside your existing pandas code (even within the same Django application).
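
For instance, rows appended by to_sql become visible through the ORM immediately. A minimal sketch, assuming a hypothetical app module myapp.models and a text field field_1:

    from myapp.models import MyModel  # hypothetical app path

    # Rows previously appended by mydataframe.to_sql(...) can now be
    # queried like any other Django model
    for obj in MyModel.objects.filter(field_1='some_value'):
        print(obj.pk, obj.field_1)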

Django Database Settings

To reuse the same database credentials in the pandas SQL functions, simply read the fields from the Django settings, for example:

    from django.conf import settings
    from sqlalchemy import create_engine

    user = settings.DATABASES['default']['USER']
    password = settings.DATABASES['default']['PASSWORD']
    database_name = settings.DATABASES['default']['NAME']
    # host = settings.DATABASES['default']['HOST']
    # port = settings.DATABASES['default']['PORT']

    database_url = 'postgresql://{user}:{password}@localhost:5432/{database_name}'.format(
        user=user,
        password=password,
        database_name=database_name,
    )

    engine = create_engine(database_url, echo=False)
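
If HOST and PORT are also set in your settings, you can use them instead of hardcoding localhost. A sketch, assuming the standard Django DATABASES keys (empty values fall back to the PostgreSQL defaults):

    from django.conf import settings
    from sqlalchemy import create_engine

    db = settings.DATABASES['default']
    host = db.get('HOST') or 'localhost'  # HOST may be an empty string
    port = db.get('PORT') or 5432         # PORT may be an empty string

    database_url = 'postgresql://{user}:{password}@{host}:{port}/{name}'.format(
        user=db['USER'],
        password=db['PASSWORD'],
        host=host,
        port=port,
        name=db['NAME'],
    )
    engine = create_engine(database_url, echo=False)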

The alternative is not recommended, as it is inefficient

I really don't see a reason to read the DataFrame row by row, instantiate a model for each row, and save it, which is very slow. You might get away with some batched insert operations, but why bother, since pandas' to_sql already does this for us. And reading Django querysets into a pandas DataFrame is simply inefficient when pandas can do that faster for us too.

    # Doing it like this is slow
    for index, row in df.iterrows():
        model = MyModel()
        model.field_1 = row['field_1']
        model.save()
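
Conversely, for reading, you can let pandas pull from the table directly instead of converting querysets row by row. A sketch, reusing the engine built from the Django settings above:

    import pandas as pd

    # Read the table that MyModel maps to straight into a DataFrame,
    # bypassing the ORM entirely
    df = pd.read_sql('SELECT * FROM mytable', engine)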

I am going through the same exercise right now. The approach I have taken is to create a list of new objects from the DataFrame and then bulk-create them:

    bulk_create(objs, batch_size=None)

This method inserts the provided list of objects into the database efficiently (usually only 1 query, regardless of the number of objects).

An example might look like this:

    # Not able to iterate directly over the DataFrame,
    # so convert it to a list of dicts first
    df_records = df.to_dict('records')

    model_instances = [
        MyModel(
            field_1=record['field_1'],
            field_2=record['field_2'],
        )
        for record in df_records
    ]

    MyModel.objects.bulk_create(model_instances)
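
If the DataFrame is large, the batch_size parameter quoted above can split the insert into several smaller queries, which avoids building one enormous SQL statement. A sketch with an assumed chunk size of 500:

    # Insert at most 500 rows per query instead of one giant statement
    MyModel.objects.bulk_create(model_instances, batch_size=500)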
