We need to export a CSV file containing data from a model in the Django admin, which runs on Heroku. To do this, we created an admin action that builds the CSV and returns it in the response. This worked fine until our client started exporting huge amounts of data and we ran into the 30-second timeout of a web worker.
To get around this problem, we thought about streaming the CSV to the client instead of first building it in memory and sending it in one piece. What prompted the idea was this piece of information:
Cedar supports long polls and streaming responses. Your application has an initial 30 second window to respond with one byte back to the client. After each byte is sent (either received from the client, or sent by your application), you reset the rolling 55 second window. If no data is sent within 55 seconds, your connection will be terminated.
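To make sure we read the rule correctly, here is a minimal sketch of how we understand it. The function name `connection_survives`, the constants, and the inclusive boundaries are our own assumptions for illustration; none of this comes from Heroku itself.

```python
# Sketch of the timeout rule as we understand it from the quoted text.
# Names and the inclusive boundaries are assumptions, not Heroku's API.
INITIAL_WINDOW = 30.0   # seconds allowed before the first byte
ROLLING_WINDOW = 55.0   # seconds allowed after each byte


def connection_survives(byte_times, finish_time):
    """Return True if no gap exceeds the window that applies to it.

    byte_times:  sorted times (seconds since the request started) at
                 which a byte was sent or received.
    finish_time: time at which the response completes.
    """
    deadline = INITIAL_WINDOW
    for t in byte_times:
        if t > deadline:
            return False  # the connection would already have been cut
        deadline = t + ROLLING_WINDOW  # each byte resets the window
    return finish_time <= deadline
```

Under this reading, a response that sends one row per second should never hit either window, no matter how long the whole export takes.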
So we tried something like this to test it:
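    import cStringIO as StringIO
    import csv
    import time

    from django.http import HttpResponse


    # Named csv_stream rather than csv so the view does not shadow the csv module
    def csv_stream(request):
        # In-memory buffer that the csv writer writes into
        csvfile = StringIO.StringIO()
        csvwriter = csv.writer(csvfile)

        def read_and_flush():
            # Return whatever is in the buffer, then empty it
            csvfile.seek(0)
            data = csvfile.read()
            csvfile.seek(0)
            csvfile.truncate()
            return data

        def data():
            # Write one row per second and yield it right away
            for i in xrange(100000):
                csvwriter.writerow([i, "a", "b", "c"])
                time.sleep(1)
                yield read_and_flush()

        response = HttpResponse(data(), mimetype="text/csv")
        response["Content-Disposition"] = "attachment; filename=test.csv"
        return response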
The HTTP response headers of the download look like this (from Firebug):
    HTTP/1.1 200 OK
    Cache-Control: max-age=0
    Content-Disposition: attachment; filename=jobentity-job2.csv
    Content-Type: text/csv
    Date: Tue, 27 Nov 2012 13:56:42 GMT
    Expires: Tue, 27 Nov 2012 13:56:41 GMT
    Last-Modified: Tue, 27 Nov 2012 13:56:41 GMT
    Server: gunicorn/0.14.6
    Vary: Cookie
    Transfer-Encoding: chunked
    Connection: keep-alive

"Transfer-Encoding: chunked" suggests that Cedar is in fact streaming the response in chunks, as we expected.
The problem is that the CSV download is still cut off after 30 seconds, with these lines in the Heroku log:
    2012-11-27T13:00:24+00:00 app[web.1]: DEBUG: exporting tasks in csv-stream for job id: 56,
    2012-11-27T13:00:54+00:00 app[web.1]: 2012-11-27 13:00:54 [2] [CRITICAL] WORKER TIMEOUT (pid:5)
    2012-11-27T13:00:54+00:00 heroku[router]: at=info method=POST path=/admin/jobentity/ host=myapp.herokuapp.com fwd= dyno=web.1 queue=0 wait=0ms connect=2ms service=29480ms status=200 bytes=51092
    2012-11-27T13:00:54+00:00 app[web.1]: 2012-11-27 13:00:54 [2] [CRITICAL] WORKER TIMEOUT (pid:5)
    2012-11-27T13:00:54+00:00 app[web.1]: 2012-11-27 13:00:54 [12] [INFO] Booting worker with pid: 12
This should work conceptually, right? Is there something we missed?
We greatly appreciate your help. Tom