Parsing a substring as a double directly

Question

If I have a string, such as "1 2 3", and I have determined the position of a substring containing a double, how can I parse it directly from the substring without creating a temporary string?

For example, I could do System.Double.Parse(str.Substring(0, 1)), but that would create a temporary string, which is slow and unnecessary. Is it possible to parse a double directly from part of the source string?

EDIT

Eric Lippert questioned my motives here, stating that "Small strings are cheap." The motivation comes from the fact that I am doing the same thing for parsing ints and I see a significant performance improvement, because apparently small strings are not so cheap.

Here is a function that lexes a sequence of ints via temporary strings:

    let lex f (s: string) =
      let rec inside i0 (s: string, i) =
        if i = s.Length then
          f (s.Substring(i0, i-i0) |> System.Int32.Parse)
        else
          let c = s.[i]
          if '0'<=c && c<='9' then
            inside i0 (s, i+1)
          else
            f (s.Substring(i0, i-i0) |> System.Int32.Parse)
            outside (s, i)
      and outside (s: string, i) =
        if i < s.Length then
          let c = s.[i]
          if '0'<=c && c<='9' then
            inside i (s, i)
          else
            outside (s, i+1)
      outside (s, 0)

It takes 2.4s to lex 15,625,000 ints from a string.

Here is a version that avoids temporary strings:

    let lex f (s: string) =
      let rec inside n (s: string, i) =
        if i = s.Length then f n else
          let c = s.[i]
          if '0'<=c && c<='9' then
            inside (10*n + int c - int '0') (s, i+1)
          else
            f n
            outside (s, i)
      and outside (s: string, i) =
        if i < s.Length then
          let c = s.[i]
          if '0'<=c && c<='9' then
            inside 0 (s, i)
          else
            outside (s, i+1)
      outside (s, 0)

It takes 0.255s, more than 9 times faster than the solution using temporary strings.

I see no reason why lexing floats should be any different. Therefore, by not providing the ability to parse a float from a substring, .NET is leaving an order of magnitude in performance on the table. I do a lot of scientific computing and often have a lot of data to lex, especially at startup, so I really don't want to throw performance down the drain like this.


c# .net f#

Jon harrop Jan 7 '16 at 2:26


5 answers

Alex butenko · Answer 1 · 2016-01-07T02:52:39+0000

Yes, I think this is entirely doable. You can write your own parsing function, and you can even base it on the actual Double.Parse() source code. That code doesn't look big and scary, and I think you could optimize it further for your needs.
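As an alternative to a hand-written parser, newer runtimes (.NET Core 2.1 and later, which postdate this thread) provide a double.Parse overload that accepts a ReadOnlySpan<char>, so a slice of the source string can be parsed without allocating a temporary string. The sketch below assumes such a runtime; the class and method names (SpanParsing, ParseSlice) are made up for illustration.

```csharp
using System;
using System.Globalization;

static class SpanParsing
{
    // Hypothetical helper: parses input[start .. start + length) as a double.
    // AsSpan creates a view over the existing string; no substring is allocated.
    // Assumes .NET Core 2.1 or later, where this double.Parse overload exists.
    public static double ParseSlice(string input, int start, int length)
    {
        ReadOnlySpan<char> slice = input.AsSpan(start, length);
        return double.Parse(slice, NumberStyles.Float, CultureInfo.InvariantCulture);
    }
}
```

For example, SpanParsing.ParseSlice("1 2.5 3", 2, 3) parses the "2.5" in the middle of the string. Whether this is as fast as a specialised hand-written lexer would need to be measured.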

Saeb amini · Answer 2 · 2016-01-07T03:00:19+0000

You can parse the string digit by digit, something like this:

    static double CustomConvertToDouble(string input, int startIndex, int length)
    {
        double result = 0d;
        int lastDigitIndex = startIndex + length - 1;
        int power = 0;

        // Walk from the least significant digit to the most significant one.
        for (int i = lastDigitIndex; i >= startIndex; i--)
        {
            int digit = input[i] - '0'; // assumes input[i] is an ASCII digit
            result += Math.Pow(10, power++) * digit;
        }

        return result;
    }

Usage:

    string tmp = "1 2 3";
    double result = CustomConvertToDouble(tmp, 0, 1);
    Console.WriteLine(result); // 1

You can extend this to take decimal points, etc. into account.
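One way such an extension might look is sketched below. It is only an illustration under stated assumptions: the name ParseSimpleDouble is hypothetical, only an optional sign, ASCII digits and a single '.' are accepted (no exponents, cultures, NaN or Infinity), and for long inputs the result may differ from Double.Parse in the last bit.

```csharp
using System;

static class SubstringParser
{
    // Hypothetical helper: parses input[startIndex .. startIndex + length)
    // as a double without allocating a temporary string.
    public static double ParseSimpleDouble(string input, int startIndex, int length)
    {
        int i = startIndex;
        int end = startIndex + length;

        // Optional leading sign.
        bool negative = false;
        if (i < end && (input[i] == '-' || input[i] == '+'))
        {
            negative = input[i] == '-';
            i++;
        }

        double mantissa = 0d;   // all digits read so far, ignoring the decimal point
        int fractionDigits = 0; // number of digits seen after the decimal point
        bool seenPoint = false;
        bool seenDigit = false;

        for (; i < end; i++)
        {
            char c = input[i];
            if (c >= '0' && c <= '9')
            {
                mantissa = mantissa * 10d + (c - '0');
                if (seenPoint) fractionDigits++;
                seenDigit = true;
            }
            else if (c == '.' && !seenPoint)
            {
                seenPoint = true;
            }
            else
            {
                throw new FormatException("Unexpected character '" + c + "' at index " + i);
            }
        }

        if (!seenDigit) throw new FormatException("No digits found");

        // Scale once at the end instead of calling Math.Pow per digit.
        double value = mantissa / Math.Pow(10d, fractionDigits);
        return negative ? -value : value;
    }
}
```

For example, SubstringParser.ParseSimpleDouble("1 2.5 -3", 2, 3) reads the "2.5" slice and ParseSimpleDouble("1 2.5 -3", 6, 2) reads "-3". Accumulating the digits left to right and dividing once keeps the loop free of Math.Pow calls.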

But I really doubt that the usual way could be a performance bottleneck, and I'm interested to know why you want to go to this trouble. If this piece of code really is performance critical, might the best route be to write it in another language?

Mark seemann · Answer 3 · 2016-01-07T06:30:41+0000

If you are looking for only single digits, it's simple enough:

    let readDigit s i =
        let getDigit x =
            if '0' <= x && x <= '9'
            then byte x - 48uy // byte value of '0'
            else failwith "Not a digit"
        s |> Seq.item i |> getDigit |> double

This F# implementation relies on the fact that string implements char seq, and that a char value can be converted to a byte value.

I doubt this is faster than using Double.Parse(str.Substring(0, 1)).

vick · Answer 4 · 2016-01-07T03:06:12+0000

    for (int x = 0; x < input.Length; x++)
    {
        if (input[x] != ' ')
            Console.WriteLine(Double.Parse(input[x].ToString()));
    }

This does not create any additional Enumerable objects, but Double.Parse only accepts strings, so ToString is required.

jdweng · Answer 5 · 2016-01-07T02:41:13+0000

This is the best you can do.

    using System;
    using System.Linq;

    static void Main(string[] args)
    {
        string input = "1 2 3";
        double[] output = input
            .Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(x => double.Parse(x))
            .ToArray();
    }
