GC strategy in GHC

I have a Scotty API server that queries Elasticsearch, extracts the results, and serves them as JSON.

Compared with other servers such as Phoenix and Gin, I get higher CPU utilization and bandwidth when serving ES responses via Bloodhound, but Gin and Phoenix beat Scotty on memory efficiency.

Statistics for Scotty:

    wrk -t30 -c100 -d30s "http://localhost:3000/filters?apid=1&hfa=true"
    Running 30s test @ http://localhost:3000/filters?apid=1&hfa=true
      30 threads and 100 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency   192.04ms  305.45ms   1.95s    83.06%
        Req/Sec   133.42    118.21     1.37k    75.54%
      68669 requests in 30.10s, 19.97MB read
    Requests/sec:   2281.51
    Transfer/sec:    679.28KB

These numbers are from my Mac with GHC 7.10.1 installed:

Processor: 2.5 GHz i5
Memory: 8 GB 1600 MHz DDR3

GHC's lightweight threads impress me, but the memory issue remains a big problem.

Memory profiling gave me the following statistics:

      39,222,354,072 bytes allocated in the heap
         277,239,312 bytes copied during GC
         522,218,848 bytes maximum residency (14 sample(s))
             761,408 bytes maximum slop
                1124 MB total memory in use (0 MB lost due to fragmentation)

                                         Tot time (elapsed)  Avg pause  Max pause
      Gen  0       373 colls,   373 par    2.802s   0.978s     0.0026s    0.0150s
      Gen  1        14 colls,    13 par    0.534s   0.166s     0.0119s    0.0253s

      Parallel GC work balance: 42.38% (serial 0%, perfect 100%)

      TASKS: 18 (1 bound, 17 peak workers (17 total), using -N4)

      SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

      INIT    time    0.001s  (  0.008s elapsed)
      MUT     time   31.425s  ( 36.161s elapsed)
      GC      time    3.337s  (  1.144s elapsed)
      EXIT    time    0.000s  (  0.001s elapsed)
      Total   time   34.765s  ( 37.314s elapsed)

      Alloc rate    1,248,117,604 bytes per MUT second

      Productivity  90.4% of total user, 84.2% of total elapsed

    gc_alloc_block_sync: 27215
    whitehole_spin: 0
    gen[0].sync: 8919
    gen[1].sync: 30902
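As an aside (my addition, not part of the original question): on GHC 8.2 and later, the same counters that `+RTS -s` prints can be read in-process via `GHC.Stats`; the question's GHC 7.10 had the older `getGCStats` API instead. A minimal sketch, which only reports stats when the binary is run with `+RTS -T`:

```haskell
import GHC.Stats (getRTSStats, getRTSStatsEnabled, max_live_bytes)
import System.Mem (performGC)

-- Allocate something sizable so the GC has work to do.
total :: Int
total = sum [1 .. 1000000]

main :: IO ()
main = do
  print total      -- force the allocation
  performGC        -- request a major collection
  enabled <- getRTSStatsEnabled   -- True only when run with +RTS -T
  if enabled
    then do
      stats <- getRTSStats
      putStrLn ("max live bytes: " ++ show (max_live_bytes stats))
    else putStrLn "run with +RTS -T to enable programmatic GC stats"
```

`max_live_bytes` corresponds roughly to the "maximum residency" line in the `-s` output above, which is the number that actually tracks the 1124 MB problem.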

Phoenix never used more than 150 MB, and Gin used significantly less memory still.

I believe GHC uses a mark-and-sweep strategy for its GC. I also think a per-process GC strategy, similar to the Erlang VM's, would improve memory efficiency.

And, going by Don Stewart's answer to a related question, there must be some way to change the GC strategy in GHC.

I also noticed that memory usage stayed stable and fairly low when concurrency was low, so I suspect memory usage only grows when concurrency is quite high.
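One thing worth ruling out (my speculation, not something established in the question): if each request accumulates its response lazily, then under high concurrency many half-built thunk chains coexist on the heap, which matches the "memory only grows under load" symptom. A toy model of that fix, where `handleRequest` is a hypothetical stand-in for the per-request work:

```haskell
import Data.List (foldl')

-- Hypothetical stand-in for a request handler: reduce a large input
-- to a small result. foldl' forces the accumulator at every step, so
-- residency per in-flight request stays constant; a lazy foldl would
-- keep the whole chain of additions alive until the result is demanded,
-- and 100 concurrent requests would keep 100 such chains alive at once.
handleRequest :: Int -> Int
handleRequest n = foldl' (+) 0 [1 .. n]

main :: IO ()
main = print (sum (map handleRequest (replicate 100 100000)))
```

This doesn't change the GC strategy at all; it just shrinks the live set the GC has to retain, which is often the real lever behind numbers like the 522 MB maximum residency above.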

Any ideas / pointers for solving this problem?

garbage-collection multithreading memory haskell ghc
1 answer

http://community.haskell.org/~simonmar/papers/local-gc.pdf

This paper by Simon Marlow describes per-thread local heaps and says the scheme was implemented in GHC. It is dated 2011, and I can't be sure whether this is what the current GHC version actually does (i.e. whether it landed in a released GHC and is still the status quo), but my reading is that the work was never fully merged.

I will also point you to the section of the GHC manual that explains the flags you can use to tune the garbage collector:

https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/runtime-control.html#rts-options-gc

In particular, by default GHC uses a two-space copying collection, but adding the -c RTS option switches to a slightly slower one-space compacting collection, which should consume less RAM. (I'm honestly not sure which generation this information applies to.)
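For concreteness, a sketch of how those flags get passed (the file and binary names here are placeholders, not from the question): the binary must be linked with -rtsopts before it will accept RTS flags on the command line, and -threaded is needed for -N.

```
ghc -O2 -threaded -rtsopts Main.hs -o server
./server +RTS -N4 -c -s -RTS
```

Here -c enables the compacting collection mentioned above, and -s prints the same GC summary shown in the question on exit.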

It seems that Simon Marlow is the one who does most of the RTS work (including the garbage collector), so if you can find him on IRC, he's the person to ask if you want the straight truth...
