I would take:
- the worst-case latency as T8 - T1
- the processing time as T6 - T3, which also serves as the response time, since you can start processing at the first byte and may still be processing when the last byte arrives.
If the server cannot start processing the message until it has received the last byte, you should measure the latency to the last byte as well; otherwise the two measurements are inconsistent.
I would assume the server is more tuned for performance than the client: it can start processing from the first packet, whereas the client may need the entire message before it can do anything useful (this depends on the client).
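As a rough sketch of the two measurements above: it assumes the time points T1, T3, T6 and T8 from the question have already been captured as nanosecond timestamps on a common clock. The `Timestamps` class, its field names, and the sample values are hypothetical, made up purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Timestamps:
    """Hypothetical record of the question's time points, in nanoseconds.

    Only the four points used in the answer are kept; what each point
    marks is defined in the question and is not restated here.
    """
    t1: int
    t3: int
    t6: int
    t8: int

    def worst_case_latency(self) -> int:
        # Worst-case latency, taken as T8 - T1.
        return self.t8 - self.t1

    def processing_time(self) -> int:
        # Processing time, taken as T6 - T3.
        return self.t6 - self.t3


# Illustrative values only, not real measurements.
sample = Timestamps(t1=0, t3=1_200, t6=4_700, t8=6_000)
print("worst-case latency (ns):", sample.worst_case_latency())
print("processing time (ns):", sample.processing_time())
```

Whichever byte you pick as the reference (first or last), use the same choice for every point so the differences stay comparable.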
Peter Lawrey