Export data from Google App Engine to CSV

This old answer points to a link in the Google App Engine documentation, but that link now covers backing up GAE data, not downloading it.

So how do I export all the data to CSV? The data is small, roughly 1 GB.

+7
python google-app-engine csv
2 answers

I tried several different approaches to exporting to CSV using the steps described here and here, but I could not get them to work. So here is what I did instead (my largest table was about 2 GB). It is relatively fast, even though it looks like a lot of steps, and it beats fighting with random code that Google may have changed a few hours ago:

  • Go to Cloud Storage and create two new buckets, "data_backup" and "data_export". You can skip this if you already have buckets for storing things.
  • Go to My Console > Google Datastore > Admin > Open Datastore Admin for the datastore you are trying to convert.
  • Check the entity or entities you want to back up and click Backup Entities. I did them one at a time, since I had only 5 tables to export, rather than checking all 5 at once.
  • Specify the Google Storage bucket (gs://) you want to save the backup to.
  • Now go to Google BigQuery (I had never used it before, but it was a piece of cake to pick up).
  • Click the down arrow, select "Create a new dataset", and give it a name.
  • Then click the down arrow next to the dataset you just created and select "Create New Table". Follow the import steps, choosing the Cloud Storage backup option in the "Select Data" step. Then select the backup you want to import into BigQuery so you can export it to CSV in the next step.
  • After the table is imported (which was pretty quick for mine), click the down arrow next to the table name and select "Export". You can export it directly to CSV, save it to the Google Storage bucket you created for exports, and then download it from there. A scripted version of the BigQuery steps is sketched below.
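
If you would rather script the BigQuery part of this, the bq command-line tool from the Cloud SDK can do the same import and export. This is only a rough sketch: my_dataset, my_table, and the .backup_info path are hypothetical placeholders to substitute with your own names, and it assumes the Datastore backup from the earlier steps has already finished writing to the data_backup bucket.

# Create a dataset to hold the imported table.
$ bq mk my_dataset

# Load the Datastore backup into a BigQuery table
# (the .backup_info path below is a made-up placeholder).
$ bq load --source_format=DATASTORE_BACKUP \
    my_dataset.my_table \
    gs://data_backup/your_backup.MyEntity.backup_info

# Export the table to CSV in the export bucket, then download it from there.
$ bq extract --destination_format=CSV \
    my_dataset.my_table \
    gs://data_export/my_table.csv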

Here are some suggestions:

  • If your data has nested relationships, you will have to export it as JSON rather than CSV (they also offer the Avro format, whatever that is).
  • I used json2csv to convert the exported JSON files that could not be saved as CSV. It is a little slow on large tables, but it gets the job done.
  • I had to split a 2 GB file into two files because of a Python memory error in json2csv. I used GSplit to split the file and used the option under Other Properties > Tags and Headers > "Do not add GSplit tags..." (this made sure GSplit did not add any of its own data to the split files). A command-line alternative to json2csv and GSplit is sketched below.
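
On Linux or macOS, jq and split can stand in for json2csv and GSplit. This is only a sketch under assumptions: the field names .name and .address.city are made-up examples, and it assumes the BigQuery export is newline-delimited JSON (one object per line). Because both tools stream line by line, memory use stays flat no matter how big the file is.

# Split the big export into one-million-line chunks (alternative to GSplit).
$ split -l 1000000 my_table.json chunk_

# Pull out the fields you want and emit one CSV row per JSON object.
$ jq -r '[.name, .address.city] | @csv' chunk_aa > chunk_aa.csv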

As I said, this was pretty fast, even though it takes a few steps. Hopefully it saves someone a heap of time otherwise spent converting weird backup file formats or running code that no longer works.

+13

You can use appcfg.py to download the data for a kind in CSV format.

$ appcfg.py download_data --help

Usage: appcfg.py [options] download_data <directory>

Download entities from datastore.

The 'download_data' command downloads datastore entities and writes them to
file as CSV or developer defined format.
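
For reference, a typical invocation looks something like the following. This is a hedged sketch: your-app-id, YourKind, and the file names are placeholders, it assumes remote_api is enabled for the app, and the bulkloader.yaml it relies on (which maps the kind to CSV) is generated by the first command and usually needs some hand editing before the download works.

# Generate a starter bulkloader configuration for the app's kinds.
$ appcfg.py create_bulkloader_config \
    --url=http://your-app-id.appspot.com/_ah/remote_api \
    --filename=bulkloader.yaml

# Download one kind as CSV using that configuration
# (YourKind and yourkind.csv are placeholders).
$ appcfg.py download_data \
    --url=http://your-app-id.appspot.com/_ah/remote_api \
    --config_file=bulkloader.yaml \
    --kind=YourKind \
    --filename=yourkind.csv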

+3
