Reading an HTTP Streaming Response Using the Python Requests Library

Question

I am trying to consume the event stream provided by the Kubernetes API using the requests module. I have run into what looks like a buffering problem: the requests module seems to lag one event behind.

I have some code that looks something like this:

    r = requests.get('http://localhost:8080/api/v1beta1/watch/services',
                     stream=True)

    for line in r.iter_lines():
        print 'LINE:', line

Because Kubernetes emits event notifications as a stream, this code displays an event only when the next one arrives. It is always one event behind, which makes it almost useless for code that needs to respond to add/remove events.

I worked around this by spawning curl in a subprocess instead of using the requests library:

    p = subprocess.Popen(['curl', '-sfN',
                          'http://localhost:8080/api/watch/services'],
                         stdout=subprocess.PIPE, bufsize=1)

    for line in iter(p.stdout.readline, b''):
        print 'LINE:', line

This works, but at the cost of some flexibility. Is there a way to avoid this buffering issue using the requests library?

python stream python-requests kubernetes

larsks Jan 25 '15 at 16:57

1 answer

larsks · Answer 1 · 2015-01-26T18:09:37+0000

This behavior is due to a buggy implementation of the iter_lines method in the requests library.

iter_lines iterates over the response content in chunk_size blocks of data using the iter_content iterator. If fewer than chunk_size bytes are available to read from the remote server (which will usually be the case when reading the last line of output), the read operation blocks until chunk_size bytes are available.
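The blocking behavior described above can be reproduced without a network connection. The following is a minimal sketch, not the requests implementation itself: it assumes a local pipe standing in for the server connection, and a buffered file object whose read(n) waits for n bytes or end of stream. The name blocking_read_demo is hypothetical.

```python
import os
import threading


def blocking_read_demo(chunk_size=1024, wait=0.2):
    """Show that a buffered read(chunk_size) stalls on a short line.

    A pipe stands in for the server connection (an assumption made for
    illustration; requests itself reads from a socket).
    """
    read_fd, write_fd = os.pipe()
    reader = os.fdopen(read_fd, 'rb')  # buffered file object
    result = []

    def consume():
        # Blocks until chunk_size bytes arrive or the writer closes.
        result.append(reader.read(chunk_size))

    os.write(write_fd, b'event 1\n')  # one complete line is available
    t = threading.Thread(target=consume)
    t.daemon = True
    t.start()

    t.join(wait)
    stalled = t.is_alive()  # the line is there, but read() is still waiting

    os.close(write_fd)  # end of stream releases the pending read
    t.join()
    reader.close()
    return stalled, result[0]
```

Here the reader receives the line only after the writer closes the pipe, which mirrors the symptom in the question: a complete event is sitting in the stream, but the consumer does not see it until more data (or end of stream) arrives.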

I wrote my own iter_lines procedure, which works correctly:

    import os


    def iter_lines(fd, chunk_size=1024):
        '''Iterates over the content of a file-like object line-by-line.'''
        pending = None

        while True:
            # os.read returns as soon as any data is available.
            chunk = os.read(fd.fileno(), chunk_size)
            if not chunk:
                break

            if pending is not None:
                chunk = pending + chunk
                pending = None

            lines = chunk.splitlines()

            # Hold back the last line only if the chunk ended mid-line.
            if lines and not chunk.endswith((b'\n', b'\r')):
                pending = lines.pop()

            for line in lines:
                yield line

        if pending:
            yield pending

This works because os.read returns fewer than chunk_size bytes of data when that is all that is available, rather than waiting for the buffer to fill.
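This is easy to check locally. The following is a minimal sketch, assuming a pipe in place of the HTTP connection (short_read_demo is a hypothetical name): os.read returns whatever is currently available, up to chunk_size bytes, even while the writing end is still open.

```python
import os


def short_read_demo(chunk_size=1024):
    """Return what os.read yields when fewer than chunk_size bytes are ready."""
    read_fd, write_fd = os.pipe()
    try:
        os.write(write_fd, b'event 1\n')  # only 8 bytes are available
        # Returns right away with the available bytes; the writer is
        # still open, so this is not an end-of-stream effect.
        return os.read(read_fd, chunk_size)
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

Calling short_read_demo() with the default chunk_size gives back the 8 bytes that were written, while a smaller chunk_size caps the result at that many bytes.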
