You are right about the HBase memstore. In general, when something is written to HBase, it first goes into an in-memory store (the memstore). Once that memstore reaches a certain size *, it is flushed to disk as a store file. Every write is also appended to the write-ahead log for durability.
* From a global point of view, HBase by default uses 40% of the heap (see the hbase.regionserver.global.memstore.upperLimit property) for all memstores of all column families, in all regions of all tables. Once this limit is reached, it starts flushing memstores until the memory they use drops to 35% of the heap (the hbase.regionserver.global.memstore.lowerLimit property). Both limits are configurable, but work out the numbers carefully before changing them.
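To make the arithmetic concrete, here is a minimal sketch of what those two limits mean for a given heap size. The function names are made up for illustration and are not part of HBase; the percentages are the 40% and 35% defaults mentioned above, and the "flush down to the lower limit" rule is a simplification of what the region server actually does.

```python
GIB = 1024 ** 3


def memstore_watermarks(heap_bytes, upper_pct=40, lower_pct=35):
    """Return (upper, lower) global memstore limits in bytes.

    Hypothetical helper: percentages are integers so the result is exact.
    """
    if not 0 < lower_pct <= upper_pct < 100:
        raise ValueError("expected 0 < lower_pct <= upper_pct < 100")
    upper = heap_bytes * upper_pct // 100
    lower = heap_bytes * lower_pct // 100
    return upper, lower


def bytes_to_flush(total_memstore_bytes, heap_bytes, upper_pct=40, lower_pct=35):
    """Estimate how much must be flushed once the upper limit is reached.

    Simplified model: nothing is forced below the upper limit; at or above
    it, memstores are flushed until usage falls back to the lower limit.
    """
    upper, lower = memstore_watermarks(heap_bytes, upper_pct, lower_pct)
    if total_memstore_bytes < upper:
        return 0
    return total_memstore_bytes - lower


if __name__ == "__main__":
    heap = 10 * GIB
    upper, lower = memstore_watermarks(heap)
    print("upper limit:", upper, "bytes")
    print("lower limit:", lower, "bytes")
    print("to flush at 4 GiB used:", bytes_to_flush(4 * GIB, heap), "bytes")
```

With a 10 GiB heap this gives an upper limit of 4 GiB and a lower limit of 3.5 GiB, so hitting the upper limit means roughly 0.5 GiB of memstore data has to be flushed before normal writes continue.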
Yes, the GC does affect the memstore, and you can change that behavior by using MemStore-Local Allocation Buffers (MSLAB). I would advise you to read the three-part series "Avoiding Full GCs in HBase with MemStore-Local Allocation Buffers", starting with part 1: http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/
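As a rough illustration of the idea behind MSLAB (not HBase's actual code), the toy class below copies each small value into a large fixed-size chunk owned by one memstore. Because all of a memstore's data then lives in a few big chunks, a flush frees whole chunks instead of leaving many small holes scattered across the old generation. The class name, method names, and return values are hypothetical; the 2 MB chunk size and 256 KB maximum allocation are the defaults as I remember them, so treat them as assumptions.

```python
class LocalAllocationBuffer:
    """Toy model of one memstore's allocation buffer (hypothetical, not HBase's class)."""

    def __init__(self, chunk_size=2 * 1024 * 1024, max_alloc=256 * 1024):
        self.chunk_size = chunk_size  # size of each contiguous chunk (assumed default)
        self.max_alloc = max_alloc    # larger values bypass the buffer (assumed default)
        self.chunks = [bytearray(chunk_size)]
        self.offset = 0               # next free byte in the current chunk

    def copy_in(self, data):
        """Copy data into the current chunk.

        Returns (chunk_index, offset, length), or None when the value is too
        large and would be left to ordinary heap allocation.
        """
        n = len(data)
        if n > self.max_alloc:
            return None
        if self.offset + n > self.chunk_size:
            # Current chunk cannot hold the value: start a fresh chunk.
            self.chunks.append(bytearray(self.chunk_size))
            self.offset = 0
        start = self.offset
        self.chunks[-1][start:start + n] = data
        self.offset += n
        return (len(self.chunks) - 1, start, n)

    def release(self):
        """Model a flush: every chunk is dropped at once and a new one started."""
        self.chunks = [bytearray(self.chunk_size)]
        self.offset = 0


if __name__ == "__main__":
    buf = LocalAllocationBuffer(chunk_size=16, max_alloc=8)
    print(buf.copy_in(b"abcdefgh"))
    print(buf.copy_in(b"ijklmnop"))
    print(buf.copy_in(b"q"))
```

The tiny chunk size in the usage example is only there to make the chunk rollover visible; the third call lands in a second chunk because the first one is full.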