commit timestamps and replication
Hi,
we're discussing a system design and it boils down to a question
regarding commit timestamps (in the sense of [1]https://wiki.postgresql.org/wiki/What%27s_new_in_PostgreSQL_9.5#Commit_timestamp_tracking,
track_commit_timestamp='on'):
We have a insert-only (append-only) table. Do commit timestamps on
this table constitute the same order in which records become visible
on a secondary (streaming replication)? Is there any reason why this
might not be the case?
To put this differently: If a client reads from a secondary and reads
the "latest" record by commit timestamp (commit time T). Is it safe to
assume that there won't ever be another record with a lower commit
timestamp < T, that only shows up on the secondary after that read?
I'm aware of concerns regarding physical time, time adjustments and so
on, so the question here assumes those things never happen.
Thank you!
Best regards,
Andreas
[1]: https://wiki.postgresql.org/wiki/What%27s_new_in_PostgreSQL_9.5#Commit_timestamp_tracking
Hi,
On 2018-09-14 21:01:54 +0200, Andreas Brandl wrote:
we're discussing a system design and it boils down to a question
regarding commit timestamps (in the sense of [1],
track_commit_timestamp='on'):We have a insert-only (append-only) table. Do commit timestamps on
this table constitute the same order in which records become visible
on a secondary (streaming replication)? Is there any reason why this
might not be the case?To put this differently: If a client reads from a secondary and reads
the "latest" record by commit timestamp (commit time T). Is it safe to
assume that there won't ever be another record with a lower commit
timestamp < T, that only shows up on the secondary after that read?
No, that's not safe to assume. The order of visibility on the standby
is determined by the point the WAL record is inserted into the log. The
commit timestamp has to *previously* be determined to be part of the WAL
log:
SetCurrentTransactionStopTimestamp();
XactLogCommitRecord(xactStopTimestamp,
nchildren, children, nrels, rels,
nmsgs, invalMessages,
RelcacheInitFileInval, forceSyncCommit,
MyXactFlags,
InvalidTransactionId, NULL /* plain commit */ );
Those two things happen sequentially, *WITHOUT* a lock preventing
concurrent transactions to do the same.
I'm aware of concerns regarding physical time, time adjustments and so
on, so the question here assumes those things never happen.
Note that in addition to that you cannot be certain that concurrent
transactions *in the same postgres cluster* have perfectly coherent
timestamp - clock drift between CPU sockets does still sometimes occur,
and was extremely common.
Greetings,
Andres Freund