Why is gccgo slower than gc in this particular case?

I'm sure everyone who knows Go has seen this blog post.

After reading it again, I wondered whether using gccgo instead of go build could increase the speed a bit more. In my typical use case (scientific computing), a gccgo-generated binary is always faster than one generated by go build.

So, just take this file: havlak6.go and compile it:

    go build -o havlak6_go havlak6.go
    gccgo -o havlak6_gccgo -march=native -Ofast havlak6.go

Surprise!

    $ /usr/bin/time ./havlak6_go
    5.45user 0.06system 0:05.54elapsed 99%CPU
    $ /usr/bin/time ./havlak6_gccgo
    11.38user 0.16system 0:11.74elapsed 98%CPU

I am curious and want to know why the "optimizing" compiler produces slow code.

I tried using gprof on the gccgo-generated binary:

    gccgo -pg -march=native -Ofast havlak6.go
    ./a.out
    gprof a.out gmon.out

No luck:

    Flat profile:
    Each sample counts as 0.01 seconds.
    no time accumulated

As you can see, the code was not profiled.

Of course, I have read this, but as you can see, the program takes more than 10 seconds to execute, so the number of samples should be greater than 1000.

I also tried:

    rm a.out gmon.out
    LDFLAGS='-g -pg' gccgo -g -pg -march=native -Ofast havlak6.go
    ./a.out
    gprof

No success either.

Do you know what is going on? Do you have any idea why gccgo, with all its optimization passes, fails to be faster than gc in this case?

go version: 1.0.2
gcc version: 4.7.2

EDIT:

Oh, I completely forgot to mention: I obviously tried pprof on the gccgo-generated binary. Here is the top10 output:

    Welcome to pprof! For help, type 'help'.
    (pprof) top10
    Total: 1143 samples
        1143 100.0% 100.0%     1143 100.0% 0x00007fbfb04cf1f4
           0   0.0% 100.0%      890  77.9% 0x00007fbfaf81101e
           0   0.0% 100.0%        4   0.3% 0x00007fbfaf8deb64
           0   0.0% 100.0%        1   0.1% 0x00007fbfaf8f2faf
           0   0.0% 100.0%        3   0.3% 0x00007fbfaf8f2fc5
           0   0.0% 100.0%        1   0.1% 0x00007fbfaf8f2fc9
           0   0.0% 100.0%        1   0.1% 0x00007fbfaf8f2fd6
           0   0.0% 100.0%        1   0.1% 0x00007fbfaf8f2fdf
           0   0.0% 100.0%        2   0.2% 0x00007fbfaf8f4a2f
           0   0.0% 100.0%        1   0.1% 0x00007fbfaf8f4a33

And that’s why I’m looking for something else.

EDIT2:

Since it seems that someone wants my question to be closed: I did not try gprof out of the blue. See https://groups.google.com/d/msg/golang-nuts/1xESoT5Xcd0/bpMvxQeJguMJ

2 answers

Running the gccgo-generated binary under Valgrind seems to indicate that gccgo has an inefficient memory allocator. This may be one of the reasons why gccgo 4.7.2 is slower than go 1.0.2. It is not possible to run the binary generated by go 1.0.2 under Valgrind, so it is difficult to confirm that memory allocation is the main gccgo performance issue in this case.


Remember that go build links statically by default, so to compare apples to apples you should pass -static or -static-libgo to gccgo.
