Reading a file into Google Datalab

I am trying to read a file into a Google Datalab IPython notebook using pd.read_csv(), but I cannot find the file path. I have the file locally and have also uploaded it to a bucket in Google Cloud Storage.

I ran the following commands to figure out where I am:

os.getcwd()

gives '/content/myemail@gmail.com'

os.listdir('/content/myemail@gmail.com')

gives ['.git', '.gitignore', 'datalab', 'Hello World.ipynb', '.ipynb_checkpoints']

2 answers

The following reads the contents of the object into the string variable text:

%%storage read --object "gs://path/to/data.csv" --variable text

Then:

from cStringIO import StringIO
mydata = pd.read_csv(StringIO(text))
mydata.head()
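Note that cStringIO exists only in Python 2; in Python 3 the same pattern uses io.StringIO. A minimal sketch of the idea using only the standard library (csv stands in for pandas here, and the sample text is a stand-in for the string the %%storage magic would populate):

```python
import csv
import io

# Stand-in for the string that `%%storage read` would place in `text`
text = "name,score\nalice,10\nbob,7\n"

# io.StringIO wraps the string in a file-like object, just as
# cStringIO.StringIO did in Python 2; pd.read_csv accepts it the same way.
rows = list(csv.DictReader(io.StringIO(text)))
print(rows[0]["name"])  # alice
```

The same `io.StringIO(text)` object can be passed straight to pd.read_csv() if pandas is available.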

Hopefully pandas will eventually support "gs://" URLs directly (as it already does for s3://), to allow reading straight from Google Cloud Storage.

I found the following documents really useful:

https://github.com/GoogleCloudPlatform/datalab/tree/master/content/datalab/tutorials

Hope this helps (I'm just starting with Datalab too, so someone may have a cleaner method soon).


You can also run BigQuery queries directly against CSV files in Cloud Storage by creating a FederatedTable wrapper object. This is described here:

https://github.com/GoogleCloudPlatform/datalab/blob/master/content/datalab/tutorials/BigQuery/Using%20External%20Tables%20from%20BigQuery.ipynb
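Outside Datalab's wrappers, the same idea can be expressed with the google-cloud-bigquery client library. A sketch building an external-table query configuration (the gs:// path is the placeholder from above, and actually running the query requires credentials, so only the configuration step is shown):

```python
from google.cloud import bigquery

# Describe the CSV in Cloud Storage as an external (federated) table:
# BigQuery reads it in place, without loading it into a dataset.
ext = bigquery.ExternalConfig("CSV")
ext.source_uris = ["gs://path/to/data.csv"]  # placeholder URI from the answer
ext.options.skip_leading_rows = 1            # treat the first row as a header
ext.autodetect = True                        # let BigQuery infer the schema

# Attach it to a query under the alias `mydata`; no table is created.
job_config = bigquery.QueryJobConfig(table_definitions={"mydata": ext})

# With credentials configured, you would then run:
# client = bigquery.Client()
# rows = client.query("SELECT * FROM mydata LIMIT 5", job_config=job_config)
```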

