Read/write synchronization for a lock-free MVCC implementation

Question

I have been obsessing over the correctness of some lock-free code, and I would really appreciate any input I can get. My question is how to achieve some required inter-thread synchronization using the acquire and release semantics of the C++11 memory model. Before my question, some background...

In MVCC, a writer can install a new version of an object without affecting readers of older versions of that object. However, if a writer installs a new version of an object when a reader with a higher timestamp has already obtained a reference to an older version, the writer's transaction must be rolled back and retried. This is necessary to maintain serializable snapshot isolation (as if all successful transactions were executed one after another in timestamp order). Readers never have to retry because of a write, but writers may need to be rolled back and retried if their activity would "pull the rug out" from under readers with later timestamps. To enforce this constraint, a read timestamp is used. The idea is that the reader raises the object's read timestamp to its own timestamp before acquiring the reference, and the writer checks the read timestamp to see whether it is OK to proceed with the new version of that object.

Suppose there are two transactions: T1 (writer) and T2 (reader), which operate on separate threads.

T1 (writer) does the following:

    void DataStore::update(CachedObject* oldObject, CachedObject* newObject)
    {
        . . .
        COcontainer* container = oldObject->parent();
        tid_t newTID = newObject->revision();
        container->setObject(newObject);  // install the new version
        tid_t* rrp = &container->readRevision;
        tid_t rr = __atomic_load_n(rrp, __ATOMIC_ACQUIRE);
        while (true)
        {
            // a reader "from the future" got here first: roll back
            if (rr > newTID) throw TransactionRetryEx();
            // rewrite the same value; on failure rr is refreshed from *rrp
            if (__atomic_compare_exchange_n(
                    rrp, &rr, rr, false,
                    __ATOMIC_RELEASE, __ATOMIC_RELAXED))
            {
                break;
            }
        }
    }

T2 (reader) does the following:

    CachedObject* Transaction::onRead(CachedObject* object)
    {
        tid_t tid = Transaction::mine()->tid();
        COcontainer* container = object->parent();
        tid_t* rrp = &container->readRevision;
        tid_t rr = __atomic_load_n(rrp, __ATOMIC_ACQUIRE);
        while (rr < tid)
        {
            if (__atomic_compare_exchange_n(
                    rrp, &rr, tid, false,
                    __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE))
            {
                break;
            }
        }
        // follow the chain of objects to find the newest one this transaction can use
        object = object->newest();
        // caller can use object now
        return object;
    }

This is a simple summary of the situation I'm worried about:

             A    B    C
        <----*----*----*---->  timestamp order

        A: old object timestamp
        B: new object timestamp (T1 timestamp)
        C: "future" reader timestamp (T2 timestamp)

        * If T2@C reads object@A, T1@B must be rolled back.

If T1 executes fully before T2 starts (and the effects of T1 are fully visible to T2), then there is no problem. T2 will obtain a reference to the version of the object installed by T1, which it can use because T1's timestamp is less than T2's. (A transaction can read objects "from the past," but it cannot "peer into the future.")

If T2 executes fully before T1 starts (and the effects of T2 are fully visible to T1), then there is no problem. T1 will see that a transaction "from the future" has potentially read an older version of the object. Thus, T1 will be rolled back and a new transaction will be created to retry the work.
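
The two sequential cases above can be modeled with a small sketch. This is only an illustration under stated assumptions: `Container`, `installedRevision`, `readerStep` and `writerStep` are hypothetical names that do not appear in the real code, all atomics use the default sequentially consistent ordering, and the sketch says nothing about the concurrent case that the question is actually about.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

using tid_t = std::uint64_t;  // hypothetical timestamp type

// Hypothetical, simplified stand-in for the container in the question.
struct Container {
    std::atomic<tid_t> readRevision{0};       // highest reader timestamp seen so far
    std::atomic<tid_t> installedRevision{0};  // timestamp of the installed version
};

// Reader: raise readRevision to at least tid, then look at the installed version.
tid_t readerStep(Container& c, tid_t tid) {
    tid_t rr = c.readRevision.load();
    while (rr < tid && !c.readRevision.compare_exchange_weak(rr, tid)) {
        // a failed exchange refreshed rr; the loop condition re-tests rr < tid
    }
    return c.installedRevision.load();
}

// Writer: install the new version, then check the read timestamp.
// Returns false when the writer would have to roll back and retry.
// (Undoing the install on rollback is omitted from this sketch.)
bool writerStep(Container& c, tid_t newTID) {
    c.installedRevision.store(newTID);
    return c.readRevision.load() <= newTID;
}
```

With A = 10, B = 20 and C = 30: if `writerStep(c, 20)` runs first, it succeeds and `readerStep(c, 30)` then sees version 20. If `readerStep(c, 30)` runs first against version 10, the later `writerStep(c, 20)` returns false, which corresponds to the rollback of T1.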

The problem (of course) is guaranteeing correct behavior when T1 and T2 run concurrently. It would be very simple to use mutexes to eliminate the race conditions, but I would only accept a solution with locks if I were convinced that there was no other way. I am fairly sure that this should be possible with the C++11 acquire and release memory orderings. I am fine with some complexity, as long as I can be satisfied that the code is correct. I really want readers to run as fast as possible, which is the main selling point of MVCC.

Questions:

1. Considering the above (partial) code, do you think there is a race condition such that T1 could fail to be rolled back (via throw TransactionRetryEx()) in a case where T2 goes on to use an older version of the object?

2. If the code is incorrect, please explain why and provide general guidance for fixing it.

3. Even if the code looks right, can you see how it could be made more efficient?

My reasoning in DataStore::update() is that if the __atomic_compare_exchange_n() call succeeds, it means that the "conflicting" reader thread has not yet updated the read timestamp, and therefore it has also not yet traversed the chain of object versions to find the newest accessible version, which was just installed.

I'm going to buy the book Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control and Recovery, but I thought I would also bother you :D I think I should have bought the book sooner, but I'm also fairly sure that I won't learn anything that invalidates a large amount of my work.

I hope I have given enough information to make an answer possible. I will be happy to edit my question to make it clearer if I get constructive criticism. If this question (or one like it) has already been asked and answered, that would be great.

Thanks!

+7

c++ c++11

Adam mckee 12 sept '12 at 8:07

1 answer

Seg fault · Answer 1 · 2012-09-23T11:55:24+0000

This is complicated, and I can't say anything about 1 and 2 yet. However, with respect to 3, I noticed something:

When __atomic_compare_exchange_n returns false, the current value of *rrp is written into rr, so the __atomic_load_n() calls inside the loops are redundant (in T2 just throw it away; in T1 do it once before the loop, as in T2).
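
A minimal sketch of that behavior, written with the portable `std::atomic` interface rather than the GCC builtins used in the question. The names `raiseTo` and `failedCasRefreshesExpected` are hypothetical, and the default sequentially consistent ordering is used so the sketch takes no position on which orderings are sufficient.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

using tid_t = std::uint64_t;  // hypothetical timestamp type

// Raise `revision` to at least `tid` with a single load before the loop.
// Returns the value the revision is known to have reached.
tid_t raiseTo(std::atomic<tid_t>& revision, tid_t tid) {
    tid_t seen = revision.load();  // the only explicit load
    while (seen < tid) {
        if (revision.compare_exchange_weak(seen, tid)) {
            return tid;  // our timestamp was installed
        }
        // The failed exchange stored the current value into `seen`,
        // so no reload is needed before re-testing the loop condition.
    }
    return seen;  // someone else already raised it to at least tid
}

// Shows directly that a failed exchange overwrites the expected value.
bool failedCasRefreshesExpected() {
    std::atomic<tid_t> revision{7};
    tid_t expected = 3;  // deliberately stale
    bool swapped = revision.compare_exchange_strong(expected, 9);
    return !swapped && expected == 7 && revision.load() == 7;
}
```

The strong form is used in the second function because the weak form may fail spuriously; inside a retry loop such as `raiseTo` the weak form is fine.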

As a general remark, it is probably not necessary to think about acquire/release until everything else in the algorithm is finished; then you can check how strong the memory barriers need to be everywhere.
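
One way to follow that advice, sketched under assumptions: keep every memory ordering in one place and start with the strongest setting, so the orderings can be revisited later without touching the algorithm. The `ordering` namespace and the `bumpReadRevision` function are hypothetical names, not part of the code in the question.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

using tid_t = std::uint64_t;  // hypothetical timestamp type

// All orderings live here. Start with seq_cst everywhere and weaken them
// only once the rest of the algorithm is settled and each choice is justified.
namespace ordering {
constexpr std::memory_order load        = std::memory_order_seq_cst;
constexpr std::memory_order rmw_success = std::memory_order_seq_cst;
constexpr std::memory_order rmw_failure = std::memory_order_seq_cst;
}  // namespace ordering

// Reader-side bump of the read timestamp, parameterized only through
// the constants above. Returns true if this call raised the revision.
bool bumpReadRevision(std::atomic<tid_t>& readRevision, tid_t tid) {
    tid_t rr = readRevision.load(ordering::load);
    while (rr < tid) {
        if (readRevision.compare_exchange_weak(
                rr, tid, ordering::rmw_success, ordering::rmw_failure)) {
            return true;
        }
        // rr now holds the current value; loop and re-test
    }
    return false;
}
```

Note that the failure ordering of a compare-exchange may not be a release ordering, which is one reason to keep the success and failure orderings as separate constants.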
