Consider the simple canonical WSGI hello world program.
def application(environ, start_response):
    status = '200 OK'
    output = 'Hello World!'

    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]

    start_response(status, response_headers)

    return [output]
From the perspective of the WSGI server, there are two key phases in how the WSGI application handles a request.
The first is the call into the WSGI application itself, which returns a result.
The second is the WSGI server consuming that result, which is required to be some form of iterable yielding strings.
Within New Relic, "WSGI/Application" tracks the time spent in the first phase. "WSGI/Response" tracks the second phase, that is, the time spent consuming the strings from the returned iterable.
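To make the two phases concrete, here is a rough sketch of the request handling inside a hypothetical WSGI server; no real server is this simple, but it shows where the two metrics split the work.

# Rough sketch of a WSGI server's request handling, for illustration only.

def handle_request(application, environ, start_response, sock):
    # Phase one: call into the WSGI application. This is the time that
    # "WSGI/Application" measures.
    result = application(environ, start_response)

    try:
        # Phase two: consume the iterable, writing each string back to the
        # HTTP client. This is the time that "WSGI/Response" measures.
        for data in result:
            sock.sendall(data)
    finally:
        # The WSGI specification requires close() to be called if present.
        if hasattr(result, 'close'):
            result.close()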
To understand why "WSGI/Response" can show a lot of time, you need to dig a little deeper into what is actually happening.
The first obvious thing is that as each string is obtained from the iterable, the WSGI server has to write that string back to the HTTP client which made the request. It does not necessarily need to wait for the client to receive it, but it must at least do enough to ensure that delivery will proceed in parallel while it moves on to fetch the next string from the iterable.
Thus, the time recorded under "WSGI/Response" covers not just the time spent consuming each item from the iterable returned by the WSGI application, but also the time taken to write the response back to the HTTP client.
Where writing back the response is concerned, several issues can come into play.
The first is that very large responses can simply take some time to write out. A slow client or network can stretch this time further if the underlying web server blocks at any point. And finally, a WSGI application can make things worse if it yields a large number of small strings, rather than a single string, or at least a smaller number of larger strings.
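As an illustration of that last point, compare a hypothetical application yielding one small string per item with one that joins the pieces up front (byte strings are used here so the sketch also suits a modern WSGI server):

# Yielding many tiny strings means a separate write back to the client for
# each one, with per item overhead every time.
def chatty_application(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    return (('%d\n' % i).encode('utf-8') for i in range(10000))

# Joining the pieces into one string (or a few larger blocks) means far
# fewer writes for the same response body.
def batched_application(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    output = ''.join('%d\n' % i for i in range(10000)).encode('utf-8')
    return [output]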
The worst-case scenario is a WSGI application with a bug which causes it to return a string object directly, rather than a list or other iterable yielding that string. This is bad because each individual character in the string will then be yielded one at a time, with a corresponding write back to the client for each. This can add an excessive amount of overhead and inflate the time.
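A minimal sketch of that mistake, following the Python 2 style strings of the hello world example above, where iterating over the returned string yields it one character at a time:

# BROKEN: the string itself is the iterable, so the server receives the
# response one character per iteration and performs a write for each.
def broken_application(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    return 'Hello World!'

# CORRECT: wrap the string in a list so the whole response is yielded as
# a single block and written back in one go.
def fixed_application(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    return ['Hello World!']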
When using Apache/mod_wsgi, warning messages will be logged to the Apache error log if a WSGI application does this. Some other WSGI servers, such as uWSGI, silently correct the mistake as an optimisation, even though technically this violates what the WSGI specification says about how the result should be handled. A WSGI server quietly fixing this is arguably bad practice, because it gives a false sense of security that everything is fine, yet when you move to a WSGI server which does conform to the WSGI specification, performance degrades.
To help determine which of these causes may be at play, the New Relic Python agent also records its own metrics for each response covering the number of bytes returned and the number of individual strings yielded. When you have a slow transaction trace, these will be shown in the trace summary as "WSGI/Output/Bytes" and "WSGI/Output/Calls/yield". There is also "WSGI/Output/Time", which records the time from the first byte being sent through to the last. If it helps to get a picture of these across the whole WSGI application, they can also be charted on a custom dashboard.
Now, as hinted at above, another issue can also come into play when the returned iterable is not simply a list, but a generator or a custom iterable.
In this case, "WSGI/Response" also captures the time the WSGI application takes to generate each string. So if a WSGI application produces the response slowly because it is computing it on demand in some way, this too can drive up the time recorded under "WSGI/Response".
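For example, here is a sketch of a generator based application where each chunk is computed on demand; the time.sleep() call simply stands in for whatever slow work a real application might do:

import time

# Generator based WSGI application. Each chunk is only produced when the
# server iterates to it, so the (simulated) work inside the loop is
# attributed to "WSGI/Response" rather than "WSGI/Application".
def application(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])

    def generate():
        for i in range(100):
            time.sleep(0.1)  # stand-in for computing the chunk on demand
            yield ('chunk %d\n' % i).encode('utf-8')

    return generate()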
So, in summary, a number of things are captured under the "WSGI/Response" metric, the main ones being:
- The time taken to consume all the strings making up the response from the iterable returned by the WSGI application.
- The time taken to write the response back to the HTTP client which made the request.
In some cases, especially with WSGI applications that stream their responses or return very large responses, having this time included in the overall response time can be a problem and skew the response time averages for the web application as a whole.
Using percentile views of response time rather than averages can help to isolate such outliers, but in other cases a little extra work may be needed.
In these special cases, what can be done is to use the Python agent API within the handlers for the affected web transactions to stop recording the transaction early, so that the excessive time taken to deliver the response is not counted. There are also options to disable monitoring entirely for specific web transactions, or a web transaction can be marked to be recorded as a background task instead, thereby removing any impact it has on the web transaction response time averages.
There are three agent API functions for these purposes.
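To the best of my knowledge, the New Relic Python agent API calls covering these three cases are newrelic.agent.end_of_transaction(), newrelic.agent.ignore_transaction() and newrelic.agent.set_background_task(). A sketch of using them from within a handler for an affected transaction, assuming those are indeed the intended calls, might look like:

import newrelic.agent

def download(environ, start_response):
    start_response('200 OK', [('Content-type', 'application/octet-stream')])

    # Stop timing the transaction now, so the potentially long time taken
    # to deliver the response is not included in the recorded duration.
    newrelic.agent.end_of_transaction()

    # Alternatively, drop this transaction from monitoring altogether:
    # newrelic.agent.ignore_transaction()

    # Or record it as a background task so it does not affect the web
    # transaction response time averages:
    # newrelic.agent.set_background_task()

    return [b'x' * (10 * 1024 * 1024)]  # placeholder for a large response body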
When using Apache/mod_wsgi, you also have the option of applying these through the Apache configuration file. For example, you can flag specific URLs to be ignored using:
<Location /some/sub/url>
SetEnv newrelic.ignore_transaction true
</Location>
As for improving visibility into what is going on: if the time is down to large responses or slow HTTP clients, there may not be much more you can do than look at the transaction metrics recorded against slow transaction samples for output bytes, the number of output calls and so on.
If the iterable is actually a generator doing real work, you can get a better picture by using key transactions and X-Ray sessions in New Relic to capture a thread profile, which will help narrow down where the time is going. For even better visibility, you can apply additional function traces to your code, so that further functions called while the WSGI application's response is being consumed are also tracked and shown in the performance breakdown.
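For instance, the newrelic.agent.function_trace() decorator can be applied to a helper called while producing the response; the render_chunk() helper below is made up purely for illustration:

import newrelic.agent

# Hypothetical helper called while generating the response. The
# function_trace decorator makes the time spent in it appear as its own
# segment in the transaction breakdown and trace.
@newrelic.agent.function_trace()
def render_chunk(row):
    return ('%s\n' % row).encode('utf-8')

def application(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    return (render_chunk(row) for row in range(1000))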