Can anyone point out the flaws in this code? I am retrieving HTML with TcpClient. NetworkStream.Read() never seems to finish when talking to an IIS server. If I go through the Fiddler proxy instead, it works fine, but when I connect directly to the target server, the Read() loop does not exit until the connection fails with an exception such as “the remote server closed the connection”.
    internal TcpClient Client { get; set; }

    // bunch of other code here...

    try
    {
        NetworkStream ns = Client.GetStream();
        StreamWriter sw = new StreamWriter(ns);
        sw.Write(request);
        sw.Flush();

        byte[] buffer = new byte[1024];
        int read = 0;
        try
        {
            while ((read = ns.Read(buffer, 0, buffer.Length)) > 0)
            {
                response.AppendFormat("{0}", Encoding.ASCII.GetString(buffer, 0, read));
            }
        }
        catch
Update
In the debugger, I can see the whole response come through immediately and get appended to my StringBuilder (response). It just seems that the connection is not closed once the server has sent the response, or that my code does not detect it.
Conclusion
As mentioned here, it is best to use what the protocol offers (in the case of HTTP, the Content-Length header) to determine when the transaction is complete. However, I found that not all pages send a Content-Length. So now I am using a hybrid solution:
For ALL transactions, set the Connection request header to “close” so that the server is not encouraged to keep the socket open. This improves the chances that the server will close the connection once it has answered your request.
If Content-Length is set, use it to determine when the response is complete.
Otherwise, set the NetworkStream ReadTimeout property to a large but reasonable value, for example, 1 second. Then loop on NetworkStream.Read() until either a) the read times out, or b) you read fewer bytes than you requested.
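The steps above could be sketched roughly as follows. This is only an illustration, not the code I actually shipped: HttpResponseReader, BuildRequest, ReadResponse and ParseContentLength are hypothetical names, and the reader takes a plain Stream so it can be tried against a MemoryStream as well as a NetworkStream. It leaves out the "fewer bytes than requested" check, because Read() can return a partial buffer in the middle of a response, and it does not decode chunked responses, which simply fall through to the timeout path.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration; none of these names come from the original code.
public static class HttpResponseReader
{
    // Step 1: always ask the server to close the connection after responding.
    public static string BuildRequest(string host, string path)
    {
        return "GET " + path + " HTTP/1.1\r\n" +
               "Host: " + host + "\r\n" +
               "Connection: close\r\n\r\n";
    }

    public static string ReadResponse(Stream stream, int readTimeoutMs = 1000)
    {
        // Step 3: NetworkStream supports timeouts; MemoryStream does not.
        if (stream.CanTimeout) stream.ReadTimeout = readTimeoutMs;

        var raw = new MemoryStream();
        var buffer = new byte[1024];
        int headerEnd = -1;       // offset just past the blank line that ends the headers
        long contentLength = -1;  // -1 means no Content-Length header was seen

        try
        {
            while (true)
            {
                // Step 2: stop as soon as the declared body length has arrived.
                if (headerEnd >= 0 && contentLength >= 0 &&
                    raw.Length - headerEnd >= contentLength)
                    break;

                int read = stream.Read(buffer, 0, buffer.Length);
                if (read == 0) break; // the server closed the connection
                raw.Write(buffer, 0, read);

                if (headerEnd < 0)
                {
                    // ASCII decoding is one char per byte, so string offsets equal byte offsets.
                    string text = Encoding.ASCII.GetString(raw.GetBuffer(), 0, (int)raw.Length);
                    int idx = text.IndexOf("\r\n\r\n", StringComparison.Ordinal);
                    if (idx >= 0)
                    {
                        headerEnd = idx + 4;
                        contentLength = ParseContentLength(text.Substring(0, idx));
                    }
                }
            }
        }
        catch (IOException)
        {
            // NetworkStream.Read throws IOException when ReadTimeout elapses;
            // this sketch treats that as the end of the response.
        }

        return Encoding.ASCII.GetString(raw.GetBuffer(), 0, (int)raw.Length);
    }

    // Returns the Content-Length value, or -1 if the header is missing or malformed.
    private static long ParseContentLength(string headers)
    {
        foreach (string line in headers.Split(new[] { "\r\n" }, StringSplitOptions.None))
        {
            int colon = line.IndexOf(':');
            if (colon <= 0) continue;
            string name = line.Substring(0, colon).Trim();
            long value;
            if (name.Equals("Content-Length", StringComparison.OrdinalIgnoreCase) &&
                long.TryParse(line.Substring(colon + 1).Trim(), out value))
                return value;
        }
        return -1;
    }
}
```

With a TcpClient this would presumably be used by writing BuildRequest(host, path) to Client.GetStream() and then passing the same stream to ReadResponse. When a Content-Length is present the loop returns as soon as the body is complete, so a keep-alive connection no longer blocks Read() until the server gives up.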
Thanks to everyone for their excellent and detailed answers.
David Lively