Why is it more memory-efficient to read the response as a stream rather than as a String?

We use HttpClient to implement a REST API.

We read the server response using:

    PostMethod method = new PostMethod(url);
    HttpClient client = new HttpClient();
    int statusCode = client.executeMethod(method);
    String responseBody = method.getResponseBodyAsString();

When we do this, we get this warning:

    Dec 9, 2009 7:41:11 PM org.apache.commons.httpclient.HttpMethodBase getResponseBody
    WARNING: Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.

The documentation goes on to say:

HttpClient is capable of efficient request/response body streaming. Large entities may be submitted or received without being buffered in memory. This is especially important if multiple HTTP methods may be executed concurrently. While there are convenience methods for dealing with entities as strings or byte arrays, their use is discouraged. Unless used carefully, they can easily lead to out-of-memory conditions, since they imply buffering the complete entity in memory.

So my question is: if you need the complete response as a String (for example, to store it in a database or to parse it with DOM), why is it more efficient to use a stream?

+7
java stream
4 answers

Reading the response as a stream is more efficient than fetching the entire entity as a String, because the latter means that

  • the entire contents of the response must be read into memory before anything is returned to your code, and
  • control cannot return to your code until the server has sent the entire response.

If you process the response as a stream, you handle it N bytes at a time. This means you can start processing the first segment of the response while the remote server is still sending the next one. So streaming makes sense as an access method if your use case lets you process the data as it arrives.
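To make "N bytes at a time" concrete, here is a minimal sketch of chunked processing. The class name `ChunkedReadSketch`, the `countLines` helper, and the 4096-byte buffer are illustrative assumptions, and a `ByteArrayInputStream` stands in for the stream that `getResponseBodyAsStream()` would return.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ChunkedReadSketch {

    /** Counts newline bytes while holding at most one buffer of data in memory. */
    static long countLines(InputStream in) throws IOException {
        byte[] buffer = new byte[4096]; // assumed chunk size
        long lines = 0;
        int n;
        // read() returns once some bytes are available, so work on the first
        // chunk can start before the sender has finished the rest.
        while ((n = in.read(buffer)) != -1) {
            for (int i = 0; i < n; i++) {
                if (buffer[i] == '\n') {
                    lines++;
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the response stream; a real one would come from the network.
        byte[] body = "a\nb\nc\n".getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(body)) {
            System.out.println(countLines(in)); // prints 3
        }
    }
}
```

Memory use here is bounded by the buffer, not by the size of the response body.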

If, however, you need the whole response as a String, the efficiency of the stream approach does not apply to you: even if you read the response in pieces, you still have to wait for the whole response, and hold all of it in a single String, before you can process it.

Streaming only pays off if your use case lets you start processing the response before you have the whole body.

+13

It is the process as a whole that is memory-inefficient. If you read from the stream and put the result in a string, you have merely split the process into two steps so that the HttpClient class no longer notices it.

If you really need the whole string, you can ignore the warning. You then need to make sure that each request does not use too much memory, so that the server cannot easily be brought down by a DoS attack.
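One way to keep per-request memory in check is to read the stream yourself and give up once a size limit is exceeded. The sketch below is an assumption-laden illustration: `BoundedBodyReader`, `readBounded`, and the limits used are hypothetical names and values, not part of HttpClient, and a `ByteArrayInputStream` stands in for the response stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class BoundedBodyReader {

    /** Reads the stream into a String, failing once more than maxBytes would be buffered. */
    static String readBounded(InputStream in, int maxBytes, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            // Refuse to buffer more than the caller allows.
            if (out.size() + n > maxBytes) {
                throw new IOException("Response body exceeds " + maxBytes + " bytes");
            }
            out.write(buffer, 0, n);
        }
        return new String(out.toByteArray(), charset);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the response stream; the 1024-byte limit is an arbitrary example.
        byte[] body = "{\"ok\":true}".getBytes(StandardCharsets.UTF_8);
        System.out.println(readBounded(new ByteArrayInputStream(body), 1024, StandardCharsets.UTF_8));
    }
}
```

The whole body still ends up in memory, but an oversized response is rejected early instead of being buffered in full.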

+4

Your question confuses the point.

If you absolutely need the whole response as a string, then do it that way,

but if you can avoid it at all, use streams.

When you load the entire response into a string, the entire response body is in memory at once.

With streams, only a small portion of the response is held in memory at any one time.

The documentation is saying that, especially with several large requests in flight at the same time, loading each entire response body into a string will require a large amount of memory.

+1

If you are parsing into an org.w3c.dom.Document (or, even better, an org.jdom.Document), it is very simple to use the stream directly. Example:

    org.apache.http.HttpResponse hr = httpClient.execute(httpRequest);
    org.apache.http.HttpEntity he = hr.getEntity();
    org.jdom.input.SAXBuilder builder = new org.jdom.input.SAXBuilder();
    // Parse straight from the response stream; no intermediate String is built.
    org.jdom.Document document = builder.build(he.getContent());
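The same idea works with the JDK's built-in DOM parser, which may be closer to the DOM use case in the question. This is only a sketch: `DomFromStreamSketch` and its `parse` helper are hypothetical names, and a `ByteArrayInputStream` with made-up XML stands in for the response stream.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class DomFromStreamSketch {

    /** Builds a DOM tree directly from the stream, without an intermediate String. */
    static Document parse(InputStream in) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(in);
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the response stream; the XML payload is invented for illustration.
        byte[] xml = "<reply><status>ok</status></reply>".getBytes(StandardCharsets.UTF_8);
        Document doc = parse(new ByteArrayInputStream(xml));
        System.out.println(doc.getDocumentElement().getNodeName()); // prints reply
    }
}
```

The DOM tree itself still lives in memory; what this avoids is holding a full String copy of the body alongside it.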
0