Efficient way to read a large tab delimited text file?

I have a tab-delimited text file with 500K entries. I am using the code below to read the data into a DataSet. It works fine with 50K entries, but with 500K it throws "Exception of type 'System.OutOfMemoryException' was thrown."

What is a more efficient way to read a large tab-delimited file, or how else can I solve this problem? An example would be appreciated.

public DataSet DataToDataSet(string fullpath, string file)
{
    string sql = "SELECT * FROM " + file; // Read all the data
    OleDbConnection connection = new OleDbConnection // Connection
                  ("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + fullpath + ";"
                   + "Extended Properties=\"text;HDR=YES;FMT=Delimited\"");
    OleDbDataAdapter ole = new OleDbDataAdapter(sql, connection); // Load the data into the adapter
    DataSet dataset = new DataSet(); // To hold the data
    ole.Fill(dataset); // Fill the dataset with the data from the adapter
    connection.Close(); // Close the connection
    connection.Dispose(); // Dispose of the connection
    ole.Dispose(); // Get rid of the adapter
    return dataset;
}
4 answers

Use a stream with TextFieldParser; that way you won't load the entire file into memory at once.
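
A minimal sketch of what that could look like. The class name `TabFileReader` and the method `ReadRows` are made up for illustration, and it assumes the fields are not wrapped in quotes. TextFieldParser lives in the Microsoft.VisualBasic assembly, so a C# project needs a reference to it.

```csharp
using System.Collections.Generic;
using Microsoft.VisualBasic.FileIO; // requires a reference to Microsoft.VisualBasic

public static class TabFileReader
{
    // Hypothetical helper: yields one row of fields at a time,
    // so only the current row is held in memory.
    public static IEnumerable<string[]> ReadRows(string path)
    {
        using (var parser = new TextFieldParser(path))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters("\t");
            parser.HasFieldsEnclosedInQuotes = false; // assumption: plain, unquoted fields

            while (!parser.EndOfData)
            {
                // ReadFields parses the next line and splits it on the delimiter
                yield return parser.ReadFields();
            }
        }
    }
}
```

Iterating with `foreach (string[] fields in TabFileReader.ReadRows(path))` then processes the rows one by one. The first row returned is the header line, so skip it if the file has one (the question uses HDR=YES).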


Read the file lazily, one line at a time:

    public static IEnumerable<string> EnumerateLines(this FileInfo file)
    {
        using (var stream = File.Open(file.FullName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        using (var reader = new StreamReader(stream))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return line;
            }
        }
    }
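
For the method above to compile as an extension method it has to sit in a static class. Here is a rough sketch of that plus one way to consume it. The class name `FileInfoExtensions` and the helper `CountDataRows` are assumptions made for the example, and the counting stands in for whatever per-row work you actually need.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class FileInfoExtensions
{
    // Same method as in the answer, placed in a static class.
    public static IEnumerable<string> EnumerateLines(this FileInfo file)
    {
        using (var stream = File.Open(file.FullName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        using (var reader = new StreamReader(stream))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return line;
            }
        }
    }

    // Hypothetical consumer: counts the data rows without buffering the file.
    public static int CountDataRows(this FileInfo file)
    {
        int count = 0;
        // Skip(1) drops the header row; each line is split only when it is reached.
        foreach (string[] fields in file.EnumerateLines().Skip(1).Select(l => l.Split('\t')))
        {
            count++; // replace with real per-row processing of `fields`
        }
        return count;
    }
}
```

On .NET 4 and later, the built-in `File.ReadLines(path)` also enumerates lines lazily. The version above additionally opens the file with `FileShare.ReadWrite`, which helps if another process may still be writing to it.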


Have you tried TextReader?

  using (TextReader tr = File.OpenText(YourFile))
  {
      string strLine;
      while ((strLine = tr.ReadLine()) != null)
      {
          string[] arrColumns = strLine.Split('\t');
          // Fill your DataSet here, or do whatever else you need with the data
      }
      // No explicit tr.Close() is needed: the using block disposes the reader
  }
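
Note that filling a single DataSet with all 500K rows may still run out of memory. One possible way around that is to process the file in batches. The sketch below is only an illustration: `BatchLoader`, `LoadInBatches`, `batchSize` and `handleBatch` are made-up names, all columns are treated as strings, and it assumes the first line is a header with unique column names and that no row has more fields than the header.

```csharp
using System;
using System.Data;
using System.IO;

public static class BatchLoader
{
    // Hypothetical helper: fills a DataTable with up to batchSize rows,
    // hands it to handleBatch, then clears it before reading on.
    public static void LoadInBatches(string path, int batchSize, Action<DataTable> handleBatch)
    {
        using (TextReader reader = File.OpenText(path))
        {
            string header = reader.ReadLine();
            if (header == null) return; // empty file

            var table = new DataTable();
            foreach (string name in header.Split('\t'))
            {
                table.Columns.Add(name, typeof(string));
            }

            string line;
            while ((line = reader.ReadLine()) != null)
            {
                table.Rows.Add(line.Split('\t'));
                if (table.Rows.Count >= batchSize)
                {
                    handleBatch(table);
                    table.Clear(); // drop the processed rows, keep the columns
                }
            }

            // Hand over the final, partially filled batch
            if (table.Rows.Count > 0) handleBatch(table);
        }
    }
}
```

The handler sees at most `batchSize` rows at a time, so memory use stays roughly constant regardless of file size. It must finish with the table before returning, because the rows are cleared afterwards.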

I found FileHelpers

FileHelpers is a free and easy-to-use .NET library for importing and exporting data from fixed-length or delimited records in files, strings or streams.

Perhaps this may help.
