Why is .NET creating new substrings instead of pointing to existing strings?

Question


In short: looking at it with Reflector, String.Substring() appears to allocate memory for each substring. Am I reading that correctly? I thought this would not be necessary, since strings are immutable.

My main goal was to create an IEnumerable<string> Split(this String, Char) extension method that does not allocate extra memory.

+7

string c# .net memory string-interning

foson Jul 04 '09 at 15:42


5 answers

This is not possible without poking around inside .NET's String class. You would have to pass around references to a mutable array and make sure that nobody modified it.

.NET will create a new string every time you ask for one. The only exception is interned strings, which are created by the compiler (and can also be created by you). They are placed in memory once, and every use then points to that single copy, for memory and performance reasons.
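A minimal sketch of the difference, assuming a source string longer than five characters. InternDemo and its method names are made up for illustration; String.Substring, String.Intern and ReferenceEquals are the standard calls.

```csharp
using System;

// Hypothetical demo class, not part of the original answer.
public static class InternDemo
{
    // True when two identical Substring calls return equal text
    // held in two distinct string objects.
    public static bool SubstringAllocatesEachTime(string source)
    {
        string a = source.Substring(0, 5);
        string b = source.Substring(0, 5);
        return a == b && !ReferenceEquals(a, b);
    }

    // True when interning two equal but distinct strings
    // yields one shared reference.
    public static bool InternSharesOneCopy(string source)
    {
        string a = string.Intern(source.Substring(0, 5));
        string b = string.Intern(source.Substring(0, 5));
        return ReferenceEquals(a, b);
    }
}
```

Note that interning only deduplicates whole strings with equal content. It does not let a substring share the character data of a longer string.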

+2

Spence Jul 04 '09 at 15:49


With the way the String class is implemented, each string has to have its own string data.

You can create your own SubString structure that uses part of the string:

 using System;
 using System.Text;

 // A view over part of an existing string; no characters are copied.
 public struct SubString
 {
     private string _str;
     private int _offset, _len;

     public SubString(string str, int offset, int len) { _str = str; _offset = offset; _len = len; }

     public int Length { get { return _len; } }

     public char this[int index]
     {
         get
         {
             if (index < 0 || index >= _len) throw new IndexOutOfRangeException();
             return _str[_offset + index];
         }
     }

     // Appends the viewed characters without creating an intermediate string.
     public void WriteToStringBuilder(StringBuilder s) { s.Append(_str, _offset, _len); }

     // Allocates a new string only when one is actually needed.
     public override string ToString() { return _str.Substring(_offset, _len); }
 }

You can add other operations to it, such as comparison, which can also be done without extracting the string.
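The same idea can be applied to the Split method from the question. The sketch below is one possible approach, not the answer's own code: it assumes a runtime that provides ReadOnlyMemory<char> and the AsMemory extension on strings (not available in 2009), and SplitLazy is a hypothetical name. It yields views instead of strings, since an IEnumerable<string> cannot avoid allocating the strings it returns.

```csharp
using System;
using System.Collections.Generic;

public static class StringSliceExtensions
{
    // Hypothetical helper: lazily yields each separator-delimited piece as a
    // view over the original string, without copying any characters.
    public static IEnumerable<ReadOnlyMemory<char>> SplitLazy(this string source, char separator)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        return SplitIterator(source, separator);
    }

    private static IEnumerable<ReadOnlyMemory<char>> SplitIterator(string source, char separator)
    {
        int start = 0;
        while (true)
        {
            int end = source.IndexOf(separator, start);
            if (end < 0)
            {
                // Last piece: everything after the final separator (possibly empty).
                yield return source.AsMemory(start);
                yield break;
            }
            yield return source.AsMemory(start, end - start);
            start = end + 1;
        }
    }
}
```

The iterator object itself is still allocated once per enumeration, and calling ToString() on a piece copies its characters into a new string. Each piece also keeps the whole source string reachable for as long as the piece is in use.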

+1

Guffa Jul 04 '09 at 16:08


Because strings are immutable in .NET, each string operation that results in the creation of a new string object will allocate a new block of memory for the contents of the string.

In theory, it would be possible to reuse the memory when extracting a substring, but this would make garbage collection very difficult: what if the original string is garbage collected? What would happen to the substring that shares part of it?

Of course, nothing prevents the .NET BCL team from changing this behavior in future versions of .NET. Such a change would not affect existing code.

0

Philippe Leybaert Jul 04 '09 at 15:55


In addition to the fact that strings are immutable, you should understand that the following snippet will generate multiple String instances in memory.

 String s1 = "Hello", s2 = ", ", s3 = "World!";
 String res = s1 + s2 + s3;

s1 + s2 => new string instance (temp1)

temp1 + s3 => new string instance (temp2)

res is a reference to temp2.
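The exact number of temporaries for a single expression can depend on how the compiler lowers it, so the following sketch uses a loop, where each += with a non-empty piece does produce a new string. ConcatDemo and its method names are hypothetical. The StringBuilder variant is shown only for contrast and is not part of the answer above.

```csharp
using System;
using System.Text;

// Hypothetical demo class, not part of the original answer.
public static class ConcatDemo
{
    // Joins the pieces with +=; each step with a non-empty piece
    // creates a new string, and the previous one becomes garbage.
    public static string JoinWithPlus(string[] pieces)
    {
        string result = "";
        foreach (string piece in pieces)
        {
            result += piece;
        }
        return result;
    }

    // Collects the pieces in a StringBuilder buffer; the final string
    // is created once, by ToString().
    public static string JoinWithBuilder(string[] pieces)
    {
        var builder = new StringBuilder();
        foreach (string piece in pieces)
        {
            builder.Append(piece);
        }
        return builder.ToString();
    }
}
```

Both methods return the same text. They differ only in how many intermediate string objects are created along the way.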

0

Babak Naffas Jul 04 '09 at 19:41


SingleNegationElimination · Accepted Answer · 2009-07-04T16:29:39+0000

One reason most languages with immutable strings create new substrings rather than referring to existing strings is that sharing would interfere with garbage collecting those strings later.

What happens if a string is used only for its substring, and the larger string then becomes unreachable (except through the substring)? The larger string could not be collected, because collecting it would invalidate the substring. What looked like a good way to save memory in the short term would turn into a memory leak in the long term.
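The retention problem can be sketched as follows. SharedSubstring and RetentionDemo are hypothetical types written for this illustration; they are not how .NET implements strings. The sketch assumes a parent string of at least five characters.

```csharp
using System;

// Hypothetical view type: it shares the parent's characters instead of copying them.
public sealed class SharedSubstring
{
    public string Parent { get; }
    public int Offset { get; }
    public int Length { get; }

    public SharedSubstring(string parent, int offset, int length)
    {
        if (parent == null) throw new ArgumentNullException(nameof(parent));
        if (offset < 0 || length < 0 || offset > parent.Length - length)
            throw new ArgumentOutOfRangeException();
        Parent = parent;
        Offset = offset;
        Length = length;
    }

    public override string ToString() => Parent.Substring(Offset, Length);
}

public static class RetentionDemo
{
    // Characters kept reachable by a 5-character shared view: the whole parent.
    public static int CharsRetainedByView(int parentLength)
    {
        string big = new string('x', parentLength);
        var view = new SharedSubstring(big, 0, 5);
        return view.Parent.Length;
    }

    // Characters kept reachable by a 5-character copy: only the copy itself.
    public static int CharsRetainedByCopy(int parentLength)
    {
        string big = new string('x', parentLength);
        string copy = big.Substring(0, 5);
        return copy.Length;
    }
}
```

With a parent of one million characters, the view keeps all one million reachable for as long as the view lives, while the copy lets the parent be collected as soon as nothing else refers to it.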
