Is using a TStringList the best way to load a huge text file in Delphi?

What is the best way to load the data of a huge text file into Delphi? Is there a component that can load a text file very fast?

Let's say I have a text file that contains a database stored in a fixed-length format. It contains 150 fields of at least 50 characters each.

1. I need to load it into memory.
2. I need to parse it and probably store it in an in-memory dataset (memdataset) for processing.

My questions:

1. Is it enough to use the TStringList.LoadFromFile method?
2. Is there a better component for handling a text file?
3. Should I use low-level reading of the text file?

Thanks in advance.

+7
4 answers

TStringList is never the best way to work with lots of text, but it is the easiest. If you have small files on hand, you can use TStringList without any problems. Even if you have large files (rather than huge files), you can implement a version of your algorithm using TStringList for testing purposes, because it is simple and easy to understand.

If your files are large, as they probably are since you call them "databases", you need to look at alternative techniques that let you read only as much of the database as you need. Take a look at:

  • TFileStream
  • Memory-mapped files
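
As a rough illustration of "read only as much as you need" with TFileStream, here is a minimal sketch that fetches a single record by index. The layout is an assumption, not something stated in the question: 150 fields of exactly 50 single-byte characters, each record followed by CR+LF. ReadRecord and the constants are hypothetical names, and the unit names assume a Delphi version with unit scope names (XE2 or later).

```delphi
program ReadOneRecord;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

const
  // Assumed layout: 150 fields x 50 bytes, then a 2-byte CR+LF terminator.
  FieldCount = 150;
  FieldWidth = 50;
  DataSize   = FieldCount * FieldWidth;
  RecordSize = DataSize + 2;

// Returns the data bytes of the record at the given zero-based index.
// Only that record is read; the rest of the file is never touched.
function ReadRecord(const FileName: string; Index: Int64): TBytes;
var
  Stream: TFileStream;
begin
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    // The last record may lack a terminator, so only its data bytes must fit.
    if (Index < 0) or (Index * RecordSize + DataSize > Stream.Size) then
      raise ERangeError.CreateFmt('Record %d is out of range', [Index]);
    SetLength(Result, DataSize);
    Stream.Position := Index * RecordSize;
    Stream.ReadBuffer(Result[0], DataSize);
  finally
    Stream.Free;
  end;
end;

var
  Rec: TBytes;
begin
  // Print the first field of the first record of the file given on the command line.
  Rec := ReadRecord(ParamStr(1), 0);
  Writeln(TEncoding.ANSI.GetString(Rec, 0, FieldWidth));
end.
```

Because every record has the same size, locating record N is a single multiplication followed by a seek, which is what makes the fixed-length format convenient here.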

Do not bother with the old file-based APIs that are still available in Delphi; they are simply outdated.

I will not go into details on how to access the text using these methods, because we recently had two similar questions on SO:

How can I efficiently read the first few lines of many files in Delphi

and

Quick search to see if a string exists in large files using Delphi

+11

Since you are working with fixed-length records, you can create a TList-based access class, using TWriter and TReader, that is aware of your record layout. You won't have the overhead of a TStringList (which is not bad in itself, but why pay for it if you don't need it?), and you can build your own record access into the class. Ultimately, it depends on what you are trying to do with the data once it is loaded into memory. While TStringList is easy to use, it is not as efficient as "rolling your own".

However, the efficiency of data manipulation may not be a big concern, given that you are using text files to store the database. If you just need to read the data and make decisions based on it, a more flexible TList may be overkill.
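
A minimal sketch of what such a list-based access class could look like. Note the swaps and assumptions: it uses the generic TList<TBytes> and plain stream reads instead of TReader/TWriter, TFixedRecordList is a hypothetical class that is not part of the RTL, and the data is assumed to be single-byte (ANSI) text with every field the same width.

```delphi
unit FixedRecords;

interface

uses
  System.SysUtils, System.Classes, System.Generics.Collections;

type
  // Hypothetical helper: keeps each fixed-length record as a raw byte array
  // and decodes a field into a string only when it is asked for.
  TFixedRecordList = class
  private
    FRecords: TList<TBytes>;
    FFieldCount: Integer;
    FFieldWidth: Integer;
    function GetCount: Integer;
    function GetField(RecIndex, FieldIndex: Integer): string;
  public
    constructor Create(AFieldCount, AFieldWidth: Integer);
    destructor Destroy; override;
    // TerminatorSize is the number of bytes between records (2 for CR+LF).
    procedure LoadFromStream(Stream: TStream; TerminatorSize: Integer = 2);
    property Count: Integer read GetCount;
    property Fields[RecIndex, FieldIndex: Integer]: string read GetField; default;
  end;

implementation

constructor TFixedRecordList.Create(AFieldCount, AFieldWidth: Integer);
begin
  inherited Create;
  if (AFieldCount <= 0) or (AFieldWidth <= 0) then
    raise EArgumentException.Create('Field count and width must be positive');
  FFieldCount := AFieldCount;
  FFieldWidth := AFieldWidth;
  FRecords := TList<TBytes>.Create;
end;

destructor TFixedRecordList.Destroy;
begin
  FRecords.Free;
  inherited;
end;

function TFixedRecordList.GetCount: Integer;
begin
  Result := FRecords.Count;
end;

function TFixedRecordList.GetField(RecIndex, FieldIndex: Integer): string;
begin
  if (FieldIndex < 0) or (FieldIndex >= FFieldCount) then
    raise ERangeError.CreateFmt('Field %d is out of range', [FieldIndex]);
  // TList<T> raises its own exception for an invalid RecIndex.
  Result := TrimRight(TEncoding.ANSI.GetString(FRecords[RecIndex],
    FieldIndex * FFieldWidth, FFieldWidth));
end;

procedure TFixedRecordList.LoadFromStream(Stream: TStream; TerminatorSize: Integer);
var
  DataSize: Integer;
  Remaining, Skip: Int64;
  Rec: TBytes;
begin
  DataSize := FFieldCount * FFieldWidth;
  FRecords.Clear;
  Remaining := Stream.Size - Stream.Position;
  while Remaining >= DataSize do
  begin
    SetLength(Rec, DataSize);
    Stream.ReadBuffer(Rec[0], DataSize);
    FRecords.Add(Rec);
    Rec := nil; // drop our reference so the next SetLength allocates a new array
    Dec(Remaining, DataSize);
    // Skip the terminator; the last record may not have one.
    if Remaining >= TerminatorSize then
      Skip := TerminatorSize
    else
      Skip := Remaining;
    Stream.Position := Stream.Position + Skip;
    Dec(Remaining, Skip);
  end;
end;

end.
```

With this in place, something like `List[0, 3]` would return the fourth field of the first record with trailing padding removed. Whether this beats a TStringList in practice depends on the data and should be measured.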

+2

I recommend sticking with TStringList if you find it convenient for your problem. Optimization is a separate matter that can be dealt with later.

As for TStringList, one optimization is to declare a descendant class that overrides the TStrings.LoadFromStream method; you can make it as fast as possible by taking the structure of your files into account.

+1

From your question it is not entirely clear why you need to load the whole file into memory before you start building the in-memory dataset. Are you conflating two issues? That is, do you assume that because you need to build an in-memory dataset, you must first load the source data completely into memory? Or is there some preprocessing of the source file that is only possible with the entire file loaded? (This is unlikely, and even if it is the case, a seekable stream object such as TFileStream can probably handle it.)

But I think the answer you are looking for is right there in the question ....

If you are loading this file in order to parse it and populate another data structure (the dataset) for further processing, then going through an existing high-level data structure first is unnecessary and potentially expensive (in terms of time).

Use the lowest-level access tools that provide the capabilities you need.

In this case, TFileStream is likely to provide the best balance of capability and ease of use.
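
To make the single-pass idea concrete, here is a minimal sketch that reads the file in large chunks with TFileStream and hands each record to a callback, so that no intermediate structure such as a TStringList is ever built. The record layout (150 fields of 50 bytes plus CR+LF), the chunk size, and the names ForEachRecord and TRecordHandler are all assumptions made for illustration.

```delphi
program StreamRecords;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

const
  RecordSize      = 150 * 50 + 2; // assumed: 150 fields x 50 bytes + CR+LF
  RecordsPerChunk = 1024;         // arbitrary; tune by measurement

type
  // Called once per record; the record occupies RecordSize bytes of Buffer
  // starting at Offset. Buffer is reused, so copy anything you need to keep.
  TRecordHandler = reference to procedure(const Buffer: TBytes; Offset: Integer);

procedure ForEachRecord(const FileName: string; const Handler: TRecordHandler);
var
  Stream: TFileStream;
  Chunk: TBytes;
  BytesRead, Offset: Integer;
begin
  // The chunk is a whole number of records, so no record straddles two chunks.
  SetLength(Chunk, RecordSize * RecordsPerChunk);
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    repeat
      BytesRead := Stream.Read(Chunk[0], Length(Chunk));
      Offset := 0;
      // A trailing partial record (for example a last line with no CR+LF)
      // is ignored in this sketch.
      while Offset + RecordSize <= BytesRead do
      begin
        Handler(Chunk, Offset);
        Inc(Offset, RecordSize);
      end;
    until BytesRead < Length(Chunk);
  finally
    Stream.Free;
  end;
end;

var
  Count: Integer;
begin
  Count := 0;
  ForEachRecord(ParamStr(1),
    procedure(const Buffer: TBytes; Offset: Integer)
    begin
      // A real handler would slice the fields out of Buffer at Offset
      // and append a row to the in-memory dataset here.
      Inc(Count);
    end);
  Writeln(Count, ' records read');
end.
```

Memory use stays bounded by the chunk size regardless of how large the file is, and each byte of the file is read once.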

+1