Issue about memory order on ARM

Started by 盏一 about 6 years ago · 3 messages
#1 盏一
w@hidva.com

The code in GetSnapshotData() that reads the `xid` field of the PGXACT struct depends on the code in GetNewTransactionId() that writes `MyPgXact->xid`: the store of xid must happen before the load of it. In C11, we could use [Release-Acquire ordering](https://en.cppreference.com/w/c/atomic/memory_order#Release-Acquire_ordering) to achieve this. But currently there is no special operation that does so (and the [volatile](https://en.cppreference.com/w/c/language/volatile) keyword should not play any role in this situation).

So when backend A returns from GetNewTransactionId(), the new value of `MyPgXact->xid` may still be sitting in the CPU store buffer or in a CPU cache line, and so may not yet be visible to backend B when it calls GetSnapshotData(). The assumption that 'all top-level XIDs <= latestCompletedXid are either present in the ProcArray, or not running anymore' may therefore not be safe.
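For illustration only, the release-acquire pairing mentioned above could look like the following C11 sketch. `DemoXact`, `demo_publish_xid` and `demo_read_xid` are hypothetical names invented for this example; they are not PostgreSQL's definitions, and PostgreSQL does not necessarily use C11 atomics for this field.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the xid field of PGXACT (not the real struct). */
typedef struct
{
    _Atomic uint32_t xid;       /* 0 means "no transaction ID assigned" */
} DemoXact;

/* Writer side: store the xid with release semantics, so that every write
 * sequenced before this store is visible to a reader that observes it. */
static void
demo_publish_xid(DemoXact *slot, uint32_t xid)
{
    atomic_store_explicit(&slot->xid, xid, memory_order_release);
}

/* Reader side: load the xid with acquire semantics.  If this load returns
 * the value written by demo_publish_xid(), the store synchronizes with it. */
static uint32_t
demo_read_xid(DemoXact *slot)
{
    return atomic_load_explicit(&slot->xid, memory_order_acquire);
}
```

Note that release-acquire only orders memory accesses around a store that the reader actually observes; by itself it does not guarantee that the reader sees the store at any particular moment.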

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: 盏一 (#1)
Re: Issue about memory order on ARM

盏一 <w@www.hidva.com> writes:

The code in GetSnapshotData() that reads the `xid` field of the PGXACT struct depends on the code in GetNewTransactionId() that writes `MyPgXact->xid`: the store of xid must happen before the load of it. In C11, we could use [Release-Acquire ordering](https://en.cppreference.com/w/c/atomic/memory_order#Release-Acquire_ordering) to achieve this. But currently there is no special operation that does so (and the [volatile](https://en.cppreference.com/w/c/language/volatile) keyword should not play any role in this situation).
So when backend A returns from GetNewTransactionId(), the new value of `MyPgXact->xid` may still be sitting in the CPU store buffer or in a CPU cache line, and so may not yet be visible to backend B when it calls GetSnapshotData(). The assumption that 'all top-level XIDs <= latestCompletedXid are either present in the ProcArray, or not running anymore' may therefore not be safe.

You're ignoring the memory barriers that are implicit in LWLock
acquisition and release; as well as the fact that it's transaction
end, not start, that needs to be interlocked. Please read the section
"Interlocking Transaction Begin, Transaction End, and Snapshots"
in src/backend/access/transam/README.

regards, tom lane

#3 盏一
w@hidva.com
In reply to: 盏一 (#1)
Re: Issue about memory order on ARM

Sorry to bother you; I now see that there is no problem here.

The model for reading and writing of PGXACT::xid and ShmemVariableCache->latestCompletedXid can be simplified as follows:

backend A               backend B                   backend C
wlock(XidGenLock);      wlock(XidGenLock);          rlock(ProcArrayLock);
write APgXact->xid;     write BPgXact->xid;         read latestCompletedXid;
unlock(XidGenLock);     unlock(XidGenLock);         read APgXact->xid;
                        ...                         read BPgXact->xid;
                        wlock(ProcArrayLock);       unlock(ProcArrayLock);
                        write latestCompletedXid;
                        unlock(ProcArrayLock);

My earlier concern was that C might not see the value of APgXact->xid written by A, because there was no obvious acquire-release operation between the two. But I now see that acquire-release operations are already present. When A takes XidGenLock before B does, the `unlock(XidGenLock)` in A synchronizes with the `wlock(XidGenLock)` in B, so by the rules described in [Inter-thread happens-before](https://en.cppreference.com/w/cpp/atomic/memory_order), the `write APgXact->xid` in A inter-thread happens before the `write BPgXact->xid` in B. The `write BPgXact->xid` is in turn sequenced before the `write latestCompletedXid` in B, according to the [Sequenced-before rules](https://en.cppreference.com/w/cpp/language/eval_order). Similarly, through ProcArrayLock, the `write latestCompletedXid` in B inter-thread happens before the `read latestCompletedXid` in C. By transitivity, the `write APgXact->xid` in A inter-thread happens before the `read APgXact->xid` in C, so C can see the value of APgXact->xid written by A.
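The chain above can be sketched in C11 under some loud assumptions: the LWLocks are replaced by a hypothetical spinlock (`DemoLock`) whose acquire and release carry the memory ordering, the reader/writer lock distinction is dropped, and every name below is invented for this example rather than taken from PostgreSQL. The xid fields are relaxed atomics only so that a concurrent unlocked read is not a data race; the ordering comes entirely from the locks.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical spinlock standing in for an LWLock. */
typedef struct { atomic_flag held; } DemoLock;

static void
demo_lock(DemoLock *l)
{
    /* Acquire: synchronizes with the release in the previous demo_unlock(). */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;                       /* spin until the flag was clear */
}

static void
demo_unlock(DemoLock *l)
{
    /* Release: publishes every write made while the lock was held. */
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

static DemoLock xid_gen_lock = { ATOMIC_FLAG_INIT };
static DemoLock proc_array_lock = { ATOMIC_FLAG_INIT };

static uint32_t next_xid = 100;         /* protected by xid_gen_lock */
static uint32_t latest_completed_xid;   /* protected by proc_array_lock */
static _Atomic uint32_t a_xid, b_xid;   /* relaxed accesses; ordered by the locks */

/* Backend A: assign an xid under XidGenLock. */
static void
backend_a_begin(void)
{
    demo_lock(&xid_gen_lock);
    atomic_store_explicit(&a_xid, next_xid++, memory_order_relaxed);
    demo_unlock(&xid_gen_lock);
}

/* Backend B: assign an xid, then advance latestCompletedXid at commit. */
static void
backend_b_begin_and_commit(void)
{
    demo_lock(&xid_gen_lock);
    atomic_store_explicit(&b_xid, next_xid++, memory_order_relaxed);
    demo_unlock(&xid_gen_lock);

    demo_lock(&proc_array_lock);
    latest_completed_xid = atomic_load_explicit(&b_xid, memory_order_relaxed);
    demo_unlock(&proc_array_lock);
}

typedef struct { uint32_t latest_completed, a, b; } DemoSnapshot;

/* Backend C: read latestCompletedXid and both xids under ProcArrayLock. */
static DemoSnapshot
backend_c_snapshot(void)
{
    DemoSnapshot s;

    demo_lock(&proc_array_lock);
    s.latest_completed = latest_completed_xid;
    s.a = atomic_load_explicit(&a_xid, memory_order_relaxed);
    s.b = atomic_load_explicit(&b_xid, memory_order_relaxed);
    demo_unlock(&proc_array_lock);
    return s;
}
```

If C's snapshot observes B's value of `latest_completed_xid`, the two lock hand-offs (A to B on `xid_gen_lock`, B to C on `proc_array_lock`) guarantee that it also observes A's store to `a_xid`, even though C never touches `xid_gen_lock`.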