How does HBase guarantee row level atomicity?

Given that HBase stores each column family in a separate HFile, and that a row can span many column families, how does HBase ensure that a put / delete operation on a row that spans multiple column families is truly atomic?

+7
2 answers

All puts to a row, no matter how many column families that row has, go to a single region server. That region server first writes the edit to the region's WAL (HLog), then the write is synced, and then the data is added to the memstore so that it can be served. Once the memstore reaches its limit, it is flushed to disk. If anything goes wrong with the region server and it crashes / dies / has its plug pulled, the WAL can be replayed to keep everything consistent. See HBASE-2283 and the HBase Architecture 101 articles for more information.
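To make the ordering concrete, here is a minimal toy sketch of that write path in Python. It is not HBase code: `ToyRegionServer` and its methods are hypothetical names, the "WAL" is just an in-memory list standing in for a synced log, and the only point is that the whole row mutation is logged as one entry before any memstore is touched, so a replay restores all column families together.

```python
from collections import defaultdict


class ToyRegionServer:
    """Toy model (an assumption, not HBase's implementation) of the write path."""

    def __init__(self):
        # Stands in for the durable, synced write-ahead log.
        self.wal = []
        # One in-memory store per column family: family -> {(row, qualifier): value}
        self.memstores = defaultdict(dict)

    def put(self, row, cells):
        """Apply one row mutation; cells maps (family, qualifier) -> value."""
        # 1. Log the whole row mutation as ONE entry, covering every family.
        self.wal.append((row, dict(cells)))
        # 2. A real server would sync the log to disk here before continuing.
        # 3. Only then apply the edit to the per-family memstores.
        self._apply(row, cells)

    def _apply(self, row, cells):
        for (family, qualifier), value in cells.items():
            self.memstores[family][(row, qualifier)] = value

    def crash(self):
        """Simulate losing all in-memory state; the log survives."""
        self.memstores = defaultdict(dict)

    def replay_wal(self):
        """Rebuild the memstores by re-applying every logged row mutation."""
        for row, cells in self.wal:
            self._apply(row, cells)
```

For example, after `put("row1", {("cf1", "a"): "1", ("cf2", "b"): "2"})`, calling `crash()` and then `replay_wal()` brings back both the `cf1` and the `cf2` cell, because they were logged as a single entry.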

+6

HBase currently achieves row-level atomicity, despite writing multiple HFiles, by flushing all column families at the same time. The flush is triggered when the largest column family reaches the configured flush size. There is also a MemStore-level timestamp that enables multi-version concurrency control (MVCC) for reads from the MemStore, but it does not exist for the key/values that are written to HFiles. Moving to a per-family flush (a desirable feature for efficiency) would require adding a similar timestamp to the file format.
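The idea of a MemStore-level timestamp used for read visibility can be illustrated with a small toy model. This is only a sketch under assumptions: `ToyMvccMemStore`, `begin_put`, `complete_put` and `read_point` are hypothetical names, and puts are assumed to complete in order. Each cell carries a write number, and readers ignore cells whose write number is above the current read point, so a row spanning several families becomes visible all at once.

```python
class ToyMvccMemStore:
    """Toy MVCC model (an assumption, not HBase's implementation)."""

    def __init__(self):
        # Each entry: (write_number, row, family, qualifier, value)
        self.cells = []
        self.next_write = 1
        # Readers only see cells with write_number <= read_point.
        self.read_point = 0

    def begin_put(self, row, cells):
        """Insert all cells of one row mutation, still invisible to readers."""
        write_number = self.next_write
        self.next_write += 1
        for (family, qualifier), value in cells.items():
            self.cells.append((write_number, row, family, qualifier, value))
        return write_number

    def complete_put(self, write_number):
        """Publish the mutation by advancing the read point.

        Simplification: assumes puts complete in the order they began.
        """
        self.read_point = max(self.read_point, write_number)

    def get(self, row):
        """Return the visible cells of a row as {(family, qualifier): value}."""
        result = {}
        for write_number, r, family, qualifier, value in self.cells:
            if r == row and write_number <= self.read_point:
                result[(family, qualifier)] = value
        return result
```

A `get` issued between `begin_put` and `complete_put` returns none of the new cells; after `complete_put` it returns the cells of every family. Cells written to files in this model would lose the write number, which is the gap the answer describes for HFiles.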

+1
