I have a Scotty API server that builds an Elasticsearch query, fetches the results from ES, and renders them as JSON.
Compared to other servers such as Phoenix and Gin, Scotty with BloodHound gives me higher CPU utilization and throughput when serving ES responses, but Gin and Phoenix were better than Scotty in memory efficiency.
Statistics for Scotty
    wrk -t30 -c100 -d30s "http://localhost:3000/filters?apid=1&hfa=true"
    Running 30s test @ http://localhost:3000/filters?apid=1&hfa=true
      30 threads and 100 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency   192.04ms  305.45ms   1.95s    83.06%
        Req/Sec   133.42    118.21     1.37k    75.54%
      68669 requests in 30.10s, 19.97MB read
    Requests/sec:   2281.51
    Transfer/sec:    679.28KB
These numbers are from my Mac with GHC 7.10.1 installed:
Processor: 2.5 GHz i5
Memory: 8 GB 1600 MHz DDR3
GHC's lightweight thread model impresses me, but the memory issue remains a big problem.
Memory profiling gave me the following statistics:
      39,222,354,072 bytes allocated in the heap
         277,239,312 bytes copied during GC
         522,218,848 bytes maximum residency (14 sample(s))
             761,408 bytes maximum slop
                1124 MB total memory in use (0 MB lost due to fragmentation)

                                        Tot time (elapsed)  Avg pause  Max pause
      Gen  0       373 colls,   373 par    2.802s   0.978s     0.0026s    0.0150s
      Gen  1        14 colls,    13 par    0.534s   0.166s     0.0119s    0.0253s

      Parallel GC work balance: 42.38% (serial 0%, perfect 100%)

      TASKS: 18 (1 bound, 17 peak workers (17 total), using -N4)

      SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

      INIT    time    0.001s  (  0.008s elapsed)
      MUT     time   31.425s  ( 36.161s elapsed)
      GC      time    3.337s  (  1.144s elapsed)
      EXIT    time    0.000s  (  0.001s elapsed)
      Total   time   34.765s  ( 37.314s elapsed)

      Alloc rate    1,248,117,604 bytes per MUT second

      Productivity  90.4% of total user, 84.2% of total elapsed

    gc_alloc_block_sync: 27215
    whitehole_spin: 0
    gen[0].sync: 8919
    gen[1].sync: 30902
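For reference, a summary in this format is what the GHC runtime prints on exit when a program is run with the `-s` RTS option. A minimal sketch of such an invocation, assuming a build with `-threaded -rtsopts`; the binary name `scotty-server` and the source file `Main.hs` are hypothetical (the post does not name them), and `-N4` is taken from the TASKS line above:

```shell
# Hypothetical binary name; the post does not give one.
SERVER=./scotty-server

# -N4 matches "using -N4" in the statistics; -s prints the GC summary on exit.
RTS_FLAGS="+RTS -N4 -s -RTS"

# Assumed build step: -threaded is required for -N, -rtsopts for passing +RTS flags.
#   ghc -O2 -threaded -rtsopts Main.hs -o scotty-server

if [ -x "$SERVER" ]; then
  # Word splitting of $RTS_FLAGS is intentional here.
  "$SERVER" $RTS_FLAGS
else
  echo "would run: $SERVER $RTS_FLAGS"
fi
```

The summary is written to stderr when the server process exits, so the load test has to finish and the server has to be stopped before the numbers appear.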
Phoenix never took up more than 150 MB, and Gin used significantly less memory than that.
I believe GHC uses a mark-and-sweep strategy for GC. I also think an incremental GC strategy similar to the Erlang VM's would be better for memory efficiency.
Going by my reading of Don Stewart's answer to a related question, there must be some way to change the GC strategy in GHC.
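On changing GC behaviour: the GHC runtime does expose collector-related settings as RTS flags. What follows is a sketch only, assuming a hypothetical `scotty-server` binary built with `-threaded -rtsopts`. The choice of flags is illustrative and untested against this workload, so whether it lowers memory use here is an open question:

```shell
# Hypothetical binary name; the post does not give one.
SERVER=./scotty-server

# -c  : use a compacting algorithm for the oldest generation instead of the
#       default copying one (slower GC, but can save memory)
# -qg : turn off the parallel GC
# -s  : print the GC summary on exit, for comparison with the earlier numbers
GC_FLAGS="+RTS -N4 -c -qg -s -RTS"

if [ -x "$SERVER" ]; then
  # Word splitting of $GC_FLAGS is intentional here.
  "$SERVER" $GC_FLAGS
else
  echo "would run: $SERVER $GC_FLAGS"
fi
```

Comparing "maximum residency" and "total memory in use" between runs would show whether the collector settings, rather than the live data itself, account for the footprint.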
I also noted that memory usage remained stable and rather low when concurrency was low, so I think that memory usage only increases when concurrency is pretty high.
Any ideas or pointers for solving this problem?