Unsafe string creation from char[]

I am working on high-performance code in which this construct is part of a performance-critical section.

This is what happens in that section:

  • A string is "scanned", and metadata about it is stored efficiently.
  • Based on this metadata, chunks of the main string are separated into a char[][] .
  • That char[][] has to be converted into a string[] .

Now I know that you can just call new string(char[]) , but then the characters have to be copied a second time.
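For reference, the two-copy path described above might look like the sketch below. The class and method names are hypothetical and chosen only for illustration; the thread does not show the actual code.

```csharp
using System;

static class BaselineSlicing
{
    // Hypothetical helper: copies source[start .. start + length) into a
    // temporary char[], then lets new string(char[]) copy it a second time.
    public static string Slice(string source, int start, int length)
    {
        char[] buffer = new char[length];
        source.CopyTo(start, buffer, 0, length);   // copy 1: string -> char[]
        return new string(buffer);                 // copy 2: char[] -> new string
    }
}
```

For a single contiguous range, source.Substring(start, length) already avoids the temporary array; the extra copy only matters when several ranges are combined into one result.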

To avoid this redundant copy step, I think it should be possible to write directly to the string's internal buffer, even though that would be an unsafe operation (and I know it has a lot of implications, such as buffer overflows and forward compatibility).

I have seen several ways to achieve this, but none that I am really satisfied with.

Does anyone have any real suggestions on how to achieve this?

Additional Information:
The actual process does not necessarily include a conversion to char[] ; it is practically a "multi-substring" operation, something like 3 offsets and their lengths appended together.

StringBuilder has too much overhead for such a small number of concatenations.

EDIT:
Since some aspects of what I'm asking are vague, let me rephrase it.

Here's what happens:

  • The main string is indexed.
  • Parts of the main string are copied to a char[] .
  • The char[] is converted to a string .

What I would like to do is merge steps 2 and 3, resulting in:

  • The main string is indexed.
  • Parts of the main string are copied directly into a string (and the GC can be kept away from it during the process by proper use of the fixed keyword?)

Note that I cannot change the output type from string[] , since this is an external library and other projects depend on it (backward compatibility).

+7
4 answers

What happens if you run:

    string s = GetBuffer();
    fixed (char* pch = s)
    {
        pch[0] = 'R';
        pch[1] = 'e';
        pch[2] = 's';
        pch[3] = 'u';
        pch[4] = 'l';
        pch[5] = 't';
    }

I think the world will come to an end (or at least the managed part of it in .NET), but this is very close to what StringBuilder does.
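If the target runtime offers string.Create (available from .NET Core 2.1 onward; whether the asker's platform has it is an assumption the thread does not confirm), the same "fill the buffer in place" idea can be sketched without unsafe code. The helper name Concat3 and its parameters are hypothetical and simply mirror the "three offsets and their lengths" case from the question.

```csharp
using System;

static class DirectFill
{
    // Hypothetical helper: concatenates three (start, length) ranges of
    // source into one new string, writing straight into its buffer.
    public static string Concat3(string source,
        int s1, int l1, int s2, int l2, int s3, int l3)
    {
        // State is passed explicitly so the callback captures nothing.
        var state = (source, s1, l1, s2, l2, s3, l3);
        return string.Create(l1 + l2 + l3, state, (dest, st) =>
        {
            ReadOnlySpan<char> src = st.source.AsSpan();
            src.Slice(st.s1, st.l1).CopyTo(dest);
            src.Slice(st.s2, st.l2).CopyTo(dest.Slice(st.l1));
            src.Slice(st.s3, st.l3).CopyTo(dest.Slice(st.l1 + st.l2));
        });
    }
}
```

The string is allocated once at its final length and the callback writes into it before it becomes visible to other code, so no intermediate char[] is needed.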

Do you have profiler data showing that StringBuilder is not fast enough for your purposes, or is that an assumption?

+2

I think that what you are asking to do is "carve up" an existing string in place into several smaller strings without reallocating character arrays for the smaller strings. This will not work in the managed world.

For one reason why, think about what happens when the garbage collector comes by and collects or moves the original string during compaction: all the other strings "inside" it now point to some arbitrary other memory, not to the original string they were carved out of.

EDIT: In contrast to the character poking in the other answer (which is very clever, but IMHO a bit scary), you can allocate a StringBuilder with a predefined capacity, which eliminates the need to reallocate its internal arrays. See http://msdn.microsoft.com/en-us/library/h1h0a5sy.aspx .
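A minimal sketch of the pre-sized StringBuilder suggestion follows. The helper name and its parameters are hypothetical; only the StringBuilder(int capacity) constructor and the Append(string, int, int) overload come from the framework.

```csharp
using System;
using System.Text;

static class PresizedBuilder
{
    // Hypothetical helper: appends three (start, length) ranges of source.
    public static string Concat3(string source,
        int s1, int l1, int s2, int l2, int s3, int l3)
    {
        // The total length is known up front, so Append never has to grow the buffer.
        var sb = new StringBuilder(l1 + l2 + l3);
        sb.Append(source, s1, l1);   // Append(string value, int startIndex, int count)
        sb.Append(source, s2, l2);
        sb.Append(source, s3, l3);
        return sb.ToString();
    }
}
```

On current runtimes ToString() still copies the characters into the final string, so this removes the reallocations but not the copy the question is trying to avoid.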

+2

Just create your own addressing system instead of trying to use unsafe code to map onto an internal data structure.

Mapping a string (which is also readable as a char[] ) onto an array of smaller strings is no different from building a list of address information (the index and length of each substring). So create a new List<Tuple<int,int>> instead of a string[] and use that data to return the correct string from your original, unchanged data structure. This can easily be encapsulated in something that exposes a string[] .
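One way this addressing scheme could be encapsulated is sketched below. The SegmentedString type and its members are hypothetical names, not part of any library; substrings are only materialized when they are actually requested.

```csharp
using System;
using System.Collections.Generic;

sealed class SegmentedString
{
    private readonly string _source;
    // Each entry is (start index, length) into _source.
    private readonly List<Tuple<int, int>> _segments = new List<Tuple<int, int>>();

    public SegmentedString(string source) { _source = source; }

    public int Count { get { return _segments.Count; } }

    public void Add(int start, int length)
    {
        _segments.Add(Tuple.Create(start, length));
    }

    // Builds the substring on demand; nothing is copied until it is read.
    public string this[int i]
    {
        get
        {
            Tuple<int, int> seg = _segments[i];
            return _source.Substring(seg.Item1, seg.Item2);
        }
    }

    // Materializes everything for callers that require a string[].
    public string[] ToArray()
    {
        var result = new string[_segments.Count];
        for (int i = 0; i < result.Length; i++) result[i] = this[i];
        return result;
    }
}
```

Recording a segment costs one small allocation and no character copying; the copy happens once per substring, and only for the entries a caller reads or when ToArray() is used to satisfy the existing string[] contract.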

+2

There is no way in .NET to create an instance of String that shares data with another string. Some discussion of why appears in this comment from Eric Lippert.

0