.NET Strings vs. Streams - Memory Profile and Characteristics

I need to pull large Unicode text strings (e.g. 200 MB) from a database (nvarchar) and store them in memory for processing. That is, I need random access to all parts of the strings.

Looking at it strictly in terms of memory, what are the pros and cons of using System.IO.MemoryStream versus System.String as the in-memory representation?

Some of the factors I'm trying to research are as follows:

  • How these objects behave in a [hypothetical] highly fragmented, low-memory environment
  • Immutability
  • Actual size in memory (if the stream is UTF-8, the size is almost halved)
  • Is there another object that I did not think about?

I am looking for clarity and advice on these issues, as well as any other memory considerations that I have not thought of.

Note: there may be a better way to handle these strings, but for now I'm really only asking about the memory aspects of storing such an object.

+6
string memory-management stream memory
2 answers

Looking at it strictly in terms of memory, what are the pros and cons of using System.IO.MemoryStream versus System.String as the in-memory representation?

Some of the factors I'm trying to research are as follows:

  • How these objects behave in a [hypothetical] highly fragmented, low-memory environment

IMO, a MemoryStream is only useful when the encoding is trivial (e.g. ASCII, ISO-8859-X, etc.). If the encoding is UTF-8 and you have non-ASCII characters, processing will be more difficult. Sure, a MemoryStream will almost certainly consume less memory, but otherwise there is not much difference. Under the hood, a MemoryStream uses a byte array, which also has to be allocated as one contiguous block of memory.
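To illustrate why non-ASCII UTF-8 makes random access harder, here is a minimal sketch that maps a character index in a string to the matching byte offset in its UTF-8 encoding. The class and method names are hypothetical, chosen only for this example. With pure ASCII the two numbers coincide; as soon as multi-byte characters appear, the byte offset can only be found by scanning from the start.

```csharp
using System;
using System.Text;

public static class Utf8Offsets
{
    // Returns the byte offset, within the UTF-8 encoding of `text`, at which
    // the UTF-16 code unit at `charIndex` begins.
    // Assumption: `charIndex` does not split a surrogate pair.
    // Cost is O(charIndex), because UTF-8 characters have variable width.
    public static int ByteOffsetOfChar(string text, int charIndex)
    {
        // Count the UTF-8 bytes needed for everything before charIndex.
        return Encoding.UTF8.GetByteCount(text.Substring(0, charIndex));
    }
}
```

For "naive" the character at index 3 starts at byte 3, but for "na\u00EFve" (with an i-diaeresis) it starts at byte 4, so indexing straight into the byte array no longer lands on the intended character.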

  • Actual size in memory (if the stream is UTF-8, the size is almost halved)

Right, with pure ASCII characters, a MemoryStream will consume half of what the equivalent string consumes.
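The size claim can be checked with a short sketch like the one below. The helper names are hypothetical, and the UTF-16 figure counts only the character payload (two bytes per code unit), ignoring the object header and length field of a real string.

```csharp
using System;
using System.Text;

public static class TextSize
{
    // Character payload of a .NET string: two bytes per UTF-16 code unit.
    public static long Utf16Bytes(string text)
    {
        return (long)text.Length * sizeof(char);
    }

    // Bytes needed to hold the same text as UTF-8 (no byte order mark).
    public static long Utf8Bytes(string text)
    {
        return Encoding.UTF8.GetByteCount(text);
    }
}
```

For "hello" this gives 10 bytes as UTF-16 and 5 as UTF-8. The halving only holds for ASCII: a character such as the euro sign ("\u20AC") takes 2 bytes in UTF-16 but 3 in UTF-8, so text dominated by such characters can end up larger as UTF-8.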

  • Is there another object that I did not think about?
List<byte> // has a nicer interface for processing 

How are the strings stored in the database? varchar or nvarchar?

Regards,

Andreas

+5

Memory-wise, string versus stream is pretty inconsequential. Strings are UTF-16, so a small factor may be involved, but given the volumes involved you are probably best off writing the data to a scratch file.

To read the data from the database, use the streaming methods; i.e. use IDataReader (ExecuteReader) in sequential mode (CommandBehavior.SequentialAccess) and read chunks of bytes/characters. Do not try to read the entire column at once.

Also, in SQL Server 2008, you might want to look at the FILESTREAM type.
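A chunked read along these lines could look like the sketch below. It is only an illustration under stated assumptions: the class name, method name, and default chunk size are hypothetical, and the caller is assumed to have already positioned the reader on a row. It relies on `IDataRecord.GetChars`, which copies a slice of the column into a buffer and returns the number of characters copied.

```csharp
using System;
using System.Data;
using System.IO;

public static class ColumnStreaming
{
    // Copies one text column of the current row to `destination` in
    // fixed-size chunks, so the whole value is never held in memory at once.
    // Returns the total number of characters copied.
    public static long CopyTextColumn(IDataRecord record, int ordinal,
                                      TextWriter destination, int chunkChars = 8192)
    {
        var buffer = new char[chunkChars];
        long offset = 0;
        long read;

        // GetChars returns 0 once the end of the column value is reached.
        while ((read = record.GetChars(ordinal, offset, buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, (int)read);
            offset += read; // offsets only ever increase, as sequential access requires
        }
        return offset;
    }
}
```

A caller would typically obtain the reader with `command.ExecuteReader(CommandBehavior.SequentialAccess)`, call `Read()`, and pass a `StreamWriter` opened over a scratch file as `destination`.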

Examples:

+4