CsvHelper parser throws IOException: Unable to read data from the transport connection

I use CsvHelper in my C# project to read a large CSV data file (about 2000 entries) into memory.

https://github.com/JoshClose/CsvHelper

It works great when there are fewer than 500 records. Beyond that it always throws an IOException, at a different row each time depending on the network, and more often as the number of records increases. I am currently hosted on the Azure cloud platform, so reading from blob storage to the server should not be a network issue.

CsvHelper.CsvParserException: A parsing error occurred. Row: '995' (1 based) ---> System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host. ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host
   at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   --- End of inner exception stack trace ---
   at System.Net.ConnectStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   at System.IO.StreamReader.ReadBuffer(Char[] userBuffer, Int32 userOffset, Int32 desiredChars, Boolean& readToUserBuffer)
   at System.IO.StreamReader.Read(Char[] buffer, Int32 index, Int32 count)
   at CsvHelper.CsvParser.GetChar(Int32& fieldStartPosition, Int32& rawFieldStartPosition, String& field, Boolean prevCharWasDelimiter, Int32& recordPosition, Int32& fieldLength, Boolean isPeek) in c:\Projects\CsvHelper\src\CsvHelper\CsvParser.cs:line 445
   at CsvHelper.CsvParser.ReadLine() in c:\Projects\CsvHelper\src\CsvHelper\CsvParser.cs:line 247
   at CsvHelper.CsvParser.Read() in c:\Projects\CsvHelper\src\CsvHelper\CsvParser.cs:line 108
   --- End of inner exception stack trace ---
   at CsvHelper.CsvParser.Read() in c:\Projects\CsvHelper\src\CsvHelper\CsvParser.cs:line 136
   at CsvHelper.CsvReader.Read() in c:\Projects\CsvHelper\src\CsvHelper\CsvReader.cs:line 173

The exception is thrown at while (csv.Read()):

    var wc = new WebClient();
    using (var sourceStream = wc.OpenRead(fileUrl))
    {
        using (var csv = new CsvReader(new StreamReader(sourceStream)))
        {
            while (csv.Read())
            {
                try
                {
                    //some reading operation
                }
                catch (Exception ex)
                {
                    _logger.Error(ex);
                }
            }
            _logger.InfoFormat("Finished {0} reading data #{1}");
        }
    }

Where can I set the StreamReader timeout value?

1 answer

When working with cloud resources (be it Azure or any other cloud provider), you should not read the file directly over a raw connection. At the very least, you have to implement retry logic to make sure you work around any transient errors (read about transient errors here, here and there, or just search the Internet for the term "transient error").
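To make the idea of retry logic concrete, here is a minimal sketch of a generic retry helper with exponential backoff. It is not part of CsvHelper or the Azure libraries: the `Retry` class, its `Execute` method, and all parameter names are hypothetical, and which exceptions count as transient is an assumption left to the caller.

```csharp
using System;
using System.Threading;

// Hypothetical helper, shown for illustration only.
public static class Retry
{
    // Runs `operation` and returns its result. If it throws an exception that
    // `isTransient` accepts, waits and tries again, up to `maxAttempts` attempts
    // in total. The wait doubles after every failed attempt.
    public static T Execute<T>(
        Func<T> operation,
        Func<Exception, bool> isTransient,
        int maxAttempts,
        TimeSpan initialDelay)
    {
        if (operation == null) throw new ArgumentNullException(nameof(operation));
        if (isTransient == null) throw new ArgumentNullException(nameof(isTransient));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        TimeSpan delay = initialDelay;
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
            {
                // Assumed transient: back off, then loop for another attempt.
                // On the last attempt the filter is false, so the exception
                // propagates to the caller unchanged.
                Thread.Sleep(delay);
                delay = TimeSpan.FromTicks(delay.Ticks * 2);
            }
        }
    }
}
```

The whole download has to happen inside the retried delegate, for example by copying the remote stream into a MemoryStream and returning it, and only then handing the result to CsvReader. Retrying around a half-consumed network stream would not recover the rows that were already parsed. A call might look like `Retry.Execute(() => DownloadAll(fileUrl), ex => ex is System.IO.IOException, 3, TimeSpan.FromSeconds(1))`, where `DownloadAll` is a hypothetical method that performs that buffered download.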

In your case, I suggest you download the blob using the CloudBlockBlob.DownloadToStream method. That way you can still use a Stream to parse the file, but you also get the protection of the Azure Blob Storage .NET client library, which takes care of transient errors on your behalf.

Your code will look something like this:

    // get the CloudBlockBlob object (blobObject) first
    using (MemoryStream blobStream = new MemoryStream())
    {
        blobObject.DownloadToStream(blobStream);
        // DownloadToStream leaves the position at the end of the stream;
        // rewind so the reader starts from the first byte.
        blobStream.Position = 0;
        using (var csv = new CsvReader(new StreamReader(blobStream)))
        {
            while (csv.Read())
            {
                try
                {
                    //some reading operation
                }
                catch (Exception ex)
                {
                    _logger.Error(ex);
                }
            }
            _logger.InfoFormat("Finished {0} reading data #{1}");
        }
    }