TCP uses what is called a sliding window. Essentially, the receiver advertises the amount of buffer space, X, that it has available for reassembling out-of-order packets. The sender can then send up to X bytes beyond the last acknowledged byte, say sequence number N. This lets you fill the channel between sender and receiver with X unacknowledged bytes, on the assumption that the packets will probably arrive; if they don't, the receiver lets you know by not acknowledging the missing packets. In every response packet, the receiver sends a cumulative acknowledgment that says, "I have all the bytes up to byte N." This allows multiple packets to be in flight at once.
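To make the window arithmetic concrete, here is a minimal sketch of the sender's bookkeeping. The class and method names are invented for illustration, and it ignores sequence-number wraparound, congestion control, and retransmission timers, all of which a real TCP stack has to handle.

```python
class SlidingWindowSender:
    """Toy model of a sender limited by the receiver's advertised window."""

    def __init__(self, initial_seq, window):
        self.last_acked = initial_seq  # N: first byte not yet acknowledged
        self.next_seq = initial_seq    # next byte to put on the wire
        self.window = window           # buffer space advertised by the receiver

    def can_send(self):
        # Bytes that may still be sent without exceeding the window.
        return self.last_acked + self.window - self.next_seq

    def send(self, length):
        # Send as much of `length` as the window allows; return bytes sent.
        allowed = min(length, self.can_send())
        self.next_seq += allowed
        return allowed

    def on_ack(self, ack):
        # A cumulative ACK slides the left edge of the window forward.
        if ack > self.last_acked:
            self.last_acked = ack


sender = SlidingWindowSender(initial_seq=1000, window=4000)
sender.send(3000)    # fits entirely in the window
sender.send(3000)    # only 1000 bytes fit; the window is now full
sender.on_ack(2500)  # receiver confirms everything below byte 2500
print(sender.can_send())
```

Once the acknowledgment arrives, the sender regains exactly as much room as the acknowledgment moved forward.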
Imagine that the client sends 3 packets, X, Y and Z, starting at sequence number N (X here names a packet, not the window size above). Because of routing, Y arrives first, then Z, and then X. Y and Z are buffered in the destination's stack, and when X arrives the receiver acknowledges N + (the combined lengths of X, Y and Z). This advances the start of the sliding window, allowing the client to send more packets.
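The reordering case can be sketched from the receiver's side like this. It is a simplified model, not how any particular stack is written: the `Receiver` class is hypothetical, segments are assumed not to overlap, and the packet lengths (100, 200 and 300 bytes) and N = 1000 are made-up numbers.

```python
class Receiver:
    """Toy receiver that buffers out-of-order segments and ACKs cumulatively."""

    def __init__(self, initial_seq):
        self.next_expected = initial_seq  # first byte not yet received in order
        self.buffered = {}                # seq -> length of segments held back

    def receive(self, seq, length):
        # Keep anything at or beyond the in-order point; ignore old duplicates.
        if seq >= self.next_expected:
            self.buffered[seq] = length
        # Drain every buffered segment that is now contiguous.
        while self.next_expected in self.buffered:
            self.next_expected += self.buffered.pop(self.next_expected)
        # The return value is the cumulative ACK sent back to the sender.
        return self.next_expected


N = 1000
rx = Receiver(N)
print(rx.receive(N + 100, 200))  # Y arrives first: ACK stays at N
print(rx.receive(N + 300, 300))  # Z arrives next: ACK still N
print(rx.receive(N, 100))        # X fills the gap: ACK jumps past X, Y and Z
```

The acknowledgment stays at N until the gap is filled, then jumps by the combined length of all three packets in one step.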
With selective acknowledgment (SACK), the receiver can acknowledge parts of the sliding window and ask the sender to retransmit only the lost portions. In the classic scheme, if Y is lost, the sender would have to resend both Y and Z. With selective acknowledgment, the sender can resend just Y. See the Wikipedia page.
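As a rough illustration of the difference, the sketch below works out which packets a sender would consider unacknowledged with and without SACK information. The function name, the tuple layout, and the byte values are assumptions made for this example. It models the pessimistic "resend everything after the gap" view described above, whereas real stacks are more nuanced about when they actually retransmit.

```python
def segments_to_retransmit(sent, cumulative_ack, sack_blocks=()):
    """Return names of sent segments the sender cannot consider delivered.

    sent:           list of (name, seq, length) tuples
    cumulative_ack: first byte the receiver has NOT received in order
    sack_blocks:    (start, end) byte ranges received beyond the gap
    """
    missing = []
    for name, seq, length in sent:
        end = seq + length
        if end <= cumulative_ack:
            continue  # covered by the cumulative acknowledgment
        if any(start <= seq and end <= stop for start, stop in sack_blocks):
            continue  # covered by a selective acknowledgment block
        missing.append(name)
    return missing


sent = [("X", 1000, 100), ("Y", 1100, 200), ("Z", 1300, 300)]

# X and Z arrived, Y was lost: the cumulative ACK is stuck at the start of Y.
print(segments_to_retransmit(sent, cumulative_ack=1100))
print(segments_to_retransmit(sent, cumulative_ack=1100,
                             sack_blocks=[(1300, 1600)]))
```

Without the SACK block the sender has no evidence that Z arrived, so both Y and Z count as missing; with it, only Y does.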
As for speed, one thing that can slow you down is DNS. If the IP address isn't cached, the lookup adds an extra round trip before you can even request the image in question. If it isn't a popular site, that may well be the case.
TCP/IP Illustrated, Volume 1 by W. Richard Stevens is great if you want to learn more. The title sounds funny, but the packet diagrams and annotated arrows from one host to another really make this stuff easier to understand. It's one of those books you can learn from and then end up keeping as a reference. It's one of my 3 go-to books for networking projects.