Why is function composition in F# 60% slower than piping?

Admittedly, I'm not sure whether I'm comparing apples with apples or apples with pears here. But I'm particularly surprised by the size of the difference, where I would expect a slight difference, if any.

A pipeline can often be expressed as a function composition and vice versa, and I would assume the compiler knows that too, so I tried a little experiment:

    open System.Text

    // simplified example of some SB helpers:
    let inline bcreate() = new StringBuilder(64)
    let inline bget (sb: StringBuilder) = sb.ToString()
    let inline appendf fmt (sb: StringBuilder) = Printf.kbprintf (fun () -> sb) sb fmt
    let inline appends (s: string) (sb: StringBuilder) = sb.Append s
    let inline appendi (i: int) (sb: StringBuilder) = sb.Append i
    let inline appendb (b: bool) (sb: StringBuilder) = sb.Append b

    // test function for composition, putting some garbage data in SB
    let compose a =
        (appends "START"
        >> appendb true
        >> appendi 10
        >> appendi a
        >> appends "0x"
        >> appendi 65535
        >> appendi 10
        >> appends "test"
        >> appends "END") (bcreate())

    // test function for piping, putting the same garbage data in SB
    let pipe a =
        bcreate()
        |> appends "START"
        |> appendb true
        |> appendi 10
        |> appendi a
        |> appends "0x"
        |> appendi 65535
        |> appendi 10
        |> appends "test"
        |> appends "END"

Testing this in FSI (64-bit, with the --optimize flag):

    > for i in 1 .. 500000 do compose 123 |> ignore;;
    Real: 00:00:00.390, CPU: 00:00:00.390, GC gen0: 62, gen1: 1, gen2: 0
    val it : unit = ()
    > for i in 1 .. 500000 do pipe 123 |> ignore;;
    Real: 00:00:00.249, CPU: 00:00:00.249, GC gen0: 27, gen1: 0, gen2: 0
    val it : unit = ()

A slight difference would be understandable, but this is a slowdown by a factor of about 1.6 (0.390 s versus 0.249 s, roughly 60%).

I would expect the bulk of the work to happen in StringBuilder, but apparently the overhead of composition has a fairly large impact.

I understand that in most practical situations this difference will be insignificant, but if you write large formatted text files (for example, log files), as in this case, it does have an impact.

I am using the latest version of F#.

2 answers

I tried your example with FSI and found no noticeable difference:

    > #time
    for i in 1 .. 500000 do compose 123 |> ignore

    --> Timing now on

    Real: 00:00:00.229, CPU: 00:00:00.234, GC gen0: 32, gen1: 32, gen2: 0
    val it : unit = ()
    > #time;;

    --> Timing now off

    > #time
    for i in 1 .. 500000 do pipe 123 |> ignore;;;;

    --> Timing now on

    Real: 00:00:00.214, CPU: 00:00:00.218, GC gen0: 30, gen1: 30, gen2: 0
    val it : unit = ()

Measuring with BenchmarkDotNet (the first table is a single compose/pipe run, the second table runs it 500,000 times), I found something similar:

     Method | Platform |       Jit |      Median |     StdDev |    Gen 0 | Gen 1 | Gen 2 | Bytes Allocated/Op |
    ------- |--------- |---------- |------------ |----------- |--------- |------ |------ |------------------- |
    compose |      X64 |    RyuJit | 319.7963 ns |  5.0299 ns | 2,848.50 |     - |     - |             182.54 |
       pipe |      X64 |    RyuJit | 308.5887 ns | 11.3793 ns | 2,453.82 |     - |     - |             155.88 |
    compose |      X86 | LegacyJit | 428.0141 ns |  3.6112 ns | 1,970.00 |     - |     - |             126.85 |
       pipe |      X86 | LegacyJit | 416.3469 ns |  8.0869 ns | 1,886.00 |     - |     - |             121.86 |

     Method | Platform |       Jit |      Median |    StdDev |    Gen 0 | Gen 1 | Gen 2 | Bytes Allocated/Op |
    ------- |--------- |---------- |------------ |---------- |--------- |------ |------ |------------------- |
    compose |      X64 |    RyuJit | 160.8059 ms | 4.6699 ms | 3,514.75 |     - |     - |      56,224,980.75 |
       pipe |      X64 |    RyuJit | 163.1026 ms | 4.9829 ms | 3,120.00 |     - |     - |      50,025,686.21 |
    compose |      X86 | LegacyJit | 215.8562 ms | 4.2769 ms | 2,292.00 |     - |     - |      36,820,936.68 |
       pipe |      X86 | LegacyJit | 209.9219 ms | 2.5605 ms | 2,220.00 |     - |     - |      35,554,575.32 |

It is possible that the differences you are measuring are related to GC. Try forcing a garbage collection before and after your timings.
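One way to keep leftover garbage from an earlier run out of the comparison is to force a full collection right before each timed loop. The following is only a minimal sketch: `timeWithCleanGc` and `sampleWork` are hypothetical names introduced here, not part of FSI or of the question's code.

```fsharp
open System
open System.Diagnostics
open System.Text

/// Hypothetical helper: forces a full GC, then runs `work` `iterations` times.
/// Returns the elapsed milliseconds and the number of gen0 collections during the loop.
let timeWithCleanGc (iterations: int) (work: unit -> unit) =
    // Collect twice around the finalizer pass so garbage from earlier
    // measurements is not charged to this run.
    GC.Collect()
    GC.WaitForPendingFinalizers()
    GC.Collect()
    let gen0Before = GC.CollectionCount(0)
    let sw = Stopwatch.StartNew()
    for _ in 1 .. iterations do
        work ()
    sw.Stop()
    let gen0During = GC.CollectionCount(0) - gen0Before
    sw.Elapsed.TotalMilliseconds, gen0During

/// Hypothetical stand-in workload; replace it with the function under test.
let sampleWork () =
    let sb = StringBuilder(64)
    sb.Append("START").Append(123).Append("END") |> ignore

let elapsedMs, gen0 = timeWithCleanGc 500000 sampleWork
printfn "%.1f ms, %d gen0 collections" elapsedMs gen0
```

With the question's definitions loaded, the same helper could be called as `timeWithCleanGc 500000 (fun () -> compose 123 |> ignore)` and again with `pipe`, so both loops start from a comparable heap state.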

However, looking at the source code for the pipe operator:

    let inline (|>) x f = f x

and comparing it with the composition operator:

    let inline (>>) f g x = g(f x)

it seems clear that the composition operator will create lambda functions, which should lead to more allocations. This can also be seen in the BenchmarkDotNet results. It may also be the cause of the performance difference you see.
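To make the difference concrete, here is a small sketch of the two shapes. The helpers `appends` and `bget` are simplified, non-inline stand-ins for the question's helpers, and `composedOnce` is a hypothetical name. Whether the compiler actually removes the intermediate function values depends on inlining and optimization settings, so treat this as an illustration of the semantics rather than a statement about generated code.

```fsharp
open System.Text

// Simplified stand-ins for the question's helpers (assumption: not inline here).
let appends (s: string) (sb: StringBuilder) = sb.Append s
let bget (sb: StringBuilder) = sb.ToString()

// `f >> g` means `fun x -> g (f x)`: a function value is built first
// and only afterwards applied to the builder.
let viaCompose () =
    (appends "a" >> appends "b" >> appends "c") (StringBuilder()) |> bget

// `x |> f` means `f x`: each step is applied directly to the builder.
let viaPipe () =
    StringBuilder() |> appends "a" |> appends "b" |> appends "c" |> bget

// If the composed chain does not depend on an argument, it can be bound once
// at module level, so the function values are created a single time.
let composedOnce : StringBuilder -> StringBuilder =
    appends "a" >> appends "b" >> appends "c"

let viaHoisted () = composedOnce (StringBuilder()) |> bget

printfn "%s %s %s" (viaCompose ()) (viaPipe ()) (viaHoisted ())
```

All three produce the same string. Note that the question's `compose a` includes `appendi a`, which depends on the argument, so that chain cannot be hoisted as a whole in the same way.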


Without any in-depth knowledge of F# internals, what I can say from the generated IL is that compose will produce lambdas (and many of them if optimizations are disabled), whereas in pipe all calls to append* will be inlined.

Generated IL for pipe function:

    Main.pipe:
    IL_0000: nop
    IL_0001: ldc.i4.s 40
    IL_0003: newobj System.Text.StringBuilder..ctor
    IL_0008: ldstr "START"
    IL_000D: callvirt System.Text.StringBuilder.Append
    IL_0012: ldc.i4.1
    IL_0013: callvirt System.Text.StringBuilder.Append
    IL_0018: ldc.i4.s 0A
    IL_001A: callvirt System.Text.StringBuilder.Append
    IL_001F: ldarg.0
    IL_0020: callvirt System.Text.StringBuilder.Append
    IL_0025: ldstr "0x"
    IL_002A: callvirt System.Text.StringBuilder.Append
    IL_002F: ldc.i4 FF FF 00 00
    IL_0034: callvirt System.Text.StringBuilder.Append
    IL_0039: ldc.i4.s 0A
    IL_003B: callvirt System.Text.StringBuilder.Append
    IL_0040: ldstr "test"
    IL_0045: callvirt System.Text.StringBuilder.Append
    IL_004A: ldstr "END"
    IL_004F: callvirt System.Text.StringBuilder.Append
    IL_0054: ret

Generated IL for compose function:

    Main.compose:
    IL_0000: nop
    IL_0001: ldarg.0
    IL_0002: newobj Main+compose@10..ctor
    IL_0007: stloc.1
    IL_0008: ldloc.1
    IL_0009: newobj Main+compose@10-1..ctor
    IL_000E: stloc.0
    IL_000F: ldc.i4.s 40
    IL_0011: newobj System.Text.StringBuilder..ctor
    IL_0016: stloc.2
    IL_0017: ldloc.0
    IL_0018: ldloc.2
    IL_0019: callvirt Microsoft.FSharp.Core.FSharpFunc<System.Text.StringBuilder,System.Text.StringBuilder>.Invoke
    IL_001E: ldstr "END"
    IL_0023: callvirt System.Text.StringBuilder.Append
    IL_0028: ret

    compose@10.Invoke:
    IL_0000: nop
    IL_0001: ldarg.0
    IL_0002: ldfld Main+compose@10.a
    IL_0007: ldarg.1
    IL_0008: call Main.f@1
    IL_000D: ldc.i4.s 0A
    IL_000F: callvirt System.Text.StringBuilder.Append
    IL_0014: ret

    compose@10..ctor:
    IL_0000: ldarg.0
    IL_0001: call Microsoft.FSharp.Core.FSharpFunc<System.Text.StringBuilder,System.Text.StringBuilder>..ctor
    IL_0006: ldarg.0
    IL_0007: ldarg.1
    IL_0008: stfld Main+compose@10.a
    IL_000D: ret

    compose@10-1.Invoke:
    IL_0000: nop
    IL_0001: ldarg.0
    IL_0002: ldfld Main+compose@10-1.f
    IL_0007: ldarg.1
    IL_0008: callvirt Microsoft.FSharp.Core.FSharpFunc<System.Text.StringBuilder,System.Text.StringBuilder>.Invoke
    IL_000D: ldstr "test"
    IL_0012: callvirt System.Text.StringBuilder.Append
    IL_0017: ret

    compose@10-1..ctor:
    IL_0000: ldarg.0
    IL_0001: call Microsoft.FSharp.Core.FSharpFunc<System.Text.StringBuilder,System.Text.StringBuilder>..ctor
    IL_0006: ldarg.0
    IL_0007: ldarg.1
    IL_0008: stfld Main+compose@10-1.f
    IL_000D: ret
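
The extra `newobj` instructions for the closure classes in the compose IL are heap allocations, and one rough way to observe them without reading IL is to compare the bytes allocated per call. This is a sketch under assumptions: `bytesPerCall`, `composed`, and `piped` are hypothetical names, `appends` is a non-inline stand-in for the question's helper, and `GC.GetAllocatedBytesForCurrentThread` is assumed to be available (it exists on recent .NET runtimes). The actual numbers depend on the runtime, the optimization settings, and whether the helpers are inline, so none are claimed here.

```fsharp
open System
open System.Text

// Hypothetical stand-in for the question's `appends` helper.
let appends (s: string) (sb: StringBuilder) = sb.Append s

/// Hypothetical helper: average bytes allocated on the current thread per call of `work`.
let bytesPerCall (iterations: int) (work: unit -> unit) =
    work () // warm-up call so one-time JIT and initialization costs are not counted
    let before = GC.GetAllocatedBytesForCurrentThread()
    for _ in 1 .. iterations do
        work ()
    (GC.GetAllocatedBytesForCurrentThread() - before) / int64 iterations

// The same two appends, once through composition and once through piping.
let composed () = (appends "a" >> appends "b") (StringBuilder(64)) |> ignore
let piped () = StringBuilder(64) |> appends "a" |> appends "b" |> ignore

printfn "compose: %d bytes/call" (bytesPerCall 100000 composed)
printfn "pipe:    %d bytes/call" (bytesPerCall 100000 piped)
```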