Note: Let me apologize for the length of this question; I had to put a lot of information into it. I hope that does not cause too many people to simply skim it and make assumptions. Please read it in full. Thanks.
I have a stream of data coming in through a socket. This data is line-oriented.
I am using the APM (Asynchronous Programming Model) of .NET (BeginRead, etc.). This precludes the use of stream-based I/O, because async I/O is buffer-based. It is possible to repackage the data and send it to a stream, such as a MemoryStream, but there are problems there too.
The problem is that my input stream (which I do not control) gives me no information about how long the stream is. It is simply a stream of newline-terminated lines, looking like this:
COMMAND\n
...Unpredictable number of lines of data...\n
END COMMAND\n
....repeat....
So, using APM, and since I don't know how long any given data set will be, it is likely that blocks of data will cross buffer boundaries and require multiple reads, and those multiple reads will in turn span multiple blocks of data.
Example:
Byte buffer[1024] = ".................blah\nThis is another l"
[another read]
                    "ine\n.............................More Lines..."
My first thought was to use a StringBuilder and simply append the buffer contents to it. This works to some extent, but I found it difficult to extract blocks of data. I tried using a StringReader to read the newly added data, but there was no way to tell whether I was getting a complete line, because StringReader returns the partial line at the end of the last block added and then returns null. There is no way to know whether what was returned was a complete, newline-terminated line of data.
Example:
// Note: no newline at the end
StringBuilder sb = new StringBuilder("This is a line\nThis is incomp..");
// StringReader takes a string, so the builder has to be materialized first
StringReader sr = new StringReader(sb.ToString());
string s = sr.ReadLine(); // returns "This is a line"
s = sr.ReadLine();        // returns "This is incomp.."
Worse, if I just keep appending data, the buffers get bigger and bigger, and since this could run for weeks or months at a time, that is not a good solution.
My next thought was to remove blocks of data from the StringBuilder as I read them. This required writing my own ReadLine function, but then I was stuck locking the data during reads and writes. In addition, the larger blocks of data (which can consist of hundreds of reads and megabytes of data) require scanning the entire buffer looking for newlines. It is inefficient and rather ugly.
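To make that idea concrete, here is a minimal sketch of such a draining buffer. It is not the code I actually wrote. The class name LineBuffer and its members Append and TryReadLine are hypothetical names chosen for illustration. It assumes that each read has already been decoded to a string (a System.Text.Decoder would be needed if multi-byte characters can be split across reads) and that lines end in a single '\n'. It remembers how far it has already scanned, so a long partial line is not rescanned from the start on every call.

```csharp
using System;
using System.Text;

// Hypothetical helper: accumulates decoded text and hands back only complete lines.
public sealed class LineBuffer
{
    private readonly StringBuilder _pending = new StringBuilder();
    private readonly object _gate = new object();
    private int _scanned; // number of leading characters already checked for '\n'

    // Called from the read callback with the text decoded from one read.
    public void Append(string chunk)
    {
        lock (_gate) { _pending.Append(chunk); }
    }

    // Returns true and a line (without the '\n') only when a full line is buffered.
    // A trailing partial line stays in the buffer until its newline arrives.
    public bool TryReadLine(out string line)
    {
        lock (_gate)
        {
            for (int i = _scanned; i < _pending.Length; i++)
            {
                if (_pending[i] != '\n') continue;
                line = _pending.ToString(0, i);
                _pending.Remove(0, i + 1); // drop the consumed line so the buffer stays small
                _scanned = 0;
                return true;
            }
            _scanned = _pending.Length; // no newline yet; resume scanning here next time
            line = null;
            return false;
        }
    }
}
```

With the split from the earlier example, Append("blah\nThis is another l") followed by TryReadLine yields "blah", a second TryReadLine returns false, and after Append("ine\n") the next call yields "This is another line". The lock is still there, which is part of what I would like to avoid.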
I am looking for something that has the simplicity of a StreamReader/StreamWriter with the convenience of async I/O.
My next thought was to use a MemoryStream: write the blocks of data to the memory stream, then attach a StreamReader to it and use ReadLine. But again I have trouble knowing whether the last read in the buffer is a complete line or not, and it is even harder to remove the "stale" data from the stream.
I also thought about using a thread with synchronous reads. This has the advantage that, with a StreamReader, ReadLine() will always return a full line, except when the connection breaks. However, this makes it hard to cancel the connection, and certain kinds of network problems can leave blocking sockets hung for a long time. I am using async I/O because I do not want to tie up a thread for the life of the program just blocking on data reception.
The connection is long-lived, and data will continue to flow over time. During the initial connection there is a large flow of data, and once that flow is done the socket remains open, waiting for real-time updates. I do not know exactly when the initial flow has "finished", since the only way to tell is that no more data arrives right away. This means I cannot wait for the initial data load to complete before processing; I am pretty much stuck processing in real time as the data comes in.
So, can anyone suggest a good way to handle this situation that is not overly complicated? I really want it to be as simple and elegant as possible, but I keep coming up with more and more complicated solutions because of all the edge cases. I suppose what I want is some kind of FIFO to which I can easily keep appending data while at the same time popping out data that meets certain criteria (i.e., newline-terminated strings).