Detailed questions about pg_xact_commit_timestamp

Started by Morris de Oryxover 6 years ago5 messages
#1Morris de Oryx
morrisdeoryx@gmail.com

I have some specific questions about pg_xact_commit_timestamp, and am
hoping that this is the right place to ask. I read a lot of the commentary
about the original patch, and the contributors seem to be here. If I'm
asking in the wrong place, just let me know.

I'm working on a design for a concurrency-safe incremental aggregate rollup
system,and pg_xact_commit_timestamp sounds perfect. But I've found very
little commentary on it generally, and couldn't figure out how it works in
detail from the source code.

Hopefully, someone knows the answers to a few questions:

* Is it possible for pg_xact_commit_timestamp to produce times out of
order? What I'm after is a way to identify records that have been chagned
since a specific time so that I can get any later changes for processing. I
don't need them in commit order, so overlapping timestamps aren't a
problem.

* How many bytes are added to each row in the final implementation? The
discussions I saw seemed to be ranging from 12-24 bytes. There was
discussion of adding in extra bytes for "just in case." This is pre 9.5, so
a world ago.

* Are the timestamps indexed internally? With a B-tree? I ask for
capacity-planning reasons.

* I've seen on StackOverflow and the design discussions that the timestamps
are not kept indefinitely, but can't find the details on exactly how long
they are stored.

* Any rules of thumb on the performance impact of enabling
pg_xact_commit_timestamp? I don't need the data on all tables but, where I
do, it sounds like it might work perfectly.

Many thanks for any assistance!

#2Adrien Nayrat
adrien.nayrat@anayrat.info
In reply to: Morris de Oryx (#1)
Re: Detailed questions about pg_xact_commit_timestamp

Hi,

On 7/9/19 12:22 AM, Morris de Oryx wrote:

I have some specific questions about pg_xact_commit_timestamp, and am hoping
that this is the right place to ask. I read a lot of the commentary about the
original patch, and the contributors seem to be here. If I'm asking in the wrong
place, just let me know.

I'm working on a design for a concurrency-safe incremental aggregate rollup
system,and pg_xact_commit_timestamp sounds perfect. But I've found very little
commentary on it generally, and couldn't figure out how it works in detail from
the source code.

Hopefully, someone knows the answers to a few questions:

* Is it possible for pg_xact_commit_timestamp to produce times out of order?
What I'm after is a way to identify records that have been chagned since a
specific time so that I can get any later changes for processing. I don't need
them in commit order, so overlapping timestamps aren't a problem.

I think yes. For example, you can have a session "A" xid 34386826 that commit
after session "B" xid 34386827:
postgres=# select pg_xact_commit_timestamp('34386827'::xid);
pg_xact_commit_timestamp
-------------------------------
2019-07-11 09:32:29.806183+00
(1 row)

postgres=# select pg_xact_commit_timestamp('34386826'::xid);
pg_xact_commit_timestamp
------------------------------
2019-07-11 09:32:38.99444+00
(1 row)

* How many bytes are added to each row in the final implementation? The
discussions I saw seemed to be ranging from 12-24 bytes. There was discussion of
adding in extra bytes for "just in case." This is pre 9.5, so a world ago.

src/backend/access/transam/commit_ts.c says 8+4 bytes per xact.

Note it is not per row but per xact: We only have to store the timestamp for one
xid.

* Are the timestamps indexed internally? With a B-tree? I ask for
capacity-planning reasons.

I think no.

* I've seen on StackOverflow and the design discussions that the timestamps are
not kept indefinitely, but can't find the details on exactly how long they are
stored.

Yes timestamp are stored in pg_commit_ts directory. Old timestamp are removed
after freeze has explained in
https://www.postgresql.org/docs/current/routine-vacuuming.html :

The sole disadvantage of increasing autovacuum_freeze_max_age (and

vacuum_freeze_table_age along with it) is that the pg_xact and pg_commit_ts
subdirectories of the database cluster will take more space, because it must
store the commit status and (if track_commit_timestamp is enabled) timestamp of
all transactions back to the autovacuum_freeze_max_age horizon. The commit
status uses two bits per transaction, so if autovacuum_freeze_max_age is set to
its maximum allowed value of two billion, pg_xact can be expected to grow to
about half a gigabyte and pg_commit_ts to about 20GB. If this is trivial
compared to your total database size, setting autovacuum_freeze_max_age to its
maximum allowed value is recommended. Otherwise, set it depending on what you
are willing to allow for pg_xact and pg_commit_ts storage. (The default, 200
million transactions, translates to about 50MB of pg_xact storage and about 2GB
of pg_commit_ts storage.)

* Any rules of thumb on the performance impact of enabling
pg_xact_commit_timestamp? I don't need the data on all tables but, where I do,
it sounds like it might work perfectly.

Many thanks for any assistance!

I didn't notice any performance impact, but I didn't do any extensive testing.

Regards,

#3Morris de Oryx
morrisdeoryx@gmail.com
In reply to: Adrien Nayrat (#2)
Re: Detailed questions about pg_xact_commit_timestamp

Adrien, thanks very much for answering my question. Just a couple of
follow-up points, if you don't mind.

In our answer, you offer an example of pg_xact_commit_timestamp showing
out-of-sequence commit times:

Session xid pg_xact_commit_timestamp
A 34386826 2019-07-11 09:32:38.994440+00 Started earlier,
committed later
B 34386827 2019-07-11 09:32:29.806183+00

I may not have asked my question clearly, or I may not understand the
answer properly. Or both ;-) If I understand it correctly, an xid is
assigned when a transaction starts. One transaction might take a second,
another might take ten minutes. So, the xid sequence doesn't imply anything
at all about commit sequence. What I'm trying to figure out is if it is
possible for the commit timestamps to somehow be out of order. What I'm
looking for is a way of finding changes committed since a specific moment.
When the transaction started doesn't matter in my case.

Is pg_xact_commit_timestamp suitable for this? I'm getting the impression
that it isn't. But I don't understand quite how. And if it isn't suited to
this purpose, does anyone know what pg_xact_commit_timestamp is for? What
I'm after is something like a "xcommitserial" that increases reliably, and
monotonically on transaction commit. That's how I'm hoping that
pg_xact_commit_timestamp functions.

Thanks also for making me understand that pg_xact_commit_timestamp applies
to a *transaction*, not to each row. That makes it a lot lighter in the
database. I was thinking 12 bytes+ per row, which is completely off for my
case. (I tend to insert thousands of rows in a transaction.)

Yes timestamp are stored in pg_commit_ts directory. Old timestamp are

removed after freeze has explained in

https://www.postgresql.org/docs/current/routine-vacuuming.html

Thanks for the answer, and for kindly pointing me to the right section of
the documentation. It's easy to get impatient with new(er) users. I'm _not_
lazy about reading manuls and researching but, well, the Postgres
documentation is over 3,000 pages long when you download it. So, I may have
missed a detail or two.... If I read that correctly, the ~4 billion number
range is made into an endless circle by keeping ~2 billions numbers in the
past, and 2 billion in the future. If that's right, I'm never going to be
so out of data that the ~2 billion number window is too small.

#4Adrien Nayrat
adrien.nayrat@anayrat.info
In reply to: Morris de Oryx (#3)
Re: Detailed questions about pg_xact_commit_timestamp

On 7/12/19 2:50 PM, Morris de Oryx wrote:

Adrien, thanks very much for answering my question. Just a couple of follow-up
points, if you don't mind.

In our answer, you offer an example of pg_xact_commit_timestamp showing
out-of-sequence commit times:

Session     xid          pg_xact_commit_timestamp
A           34386826     2019-07-11 09:32:38.994440+00  Started earlier,
committed later
B           34386827     2019-07-11 09:32:29.806183+00

I may not have asked my question clearly, or I may not understand the answer
properly. Or both ;-) If I understand it correctly, an xid is assigned when a
transaction starts.

It is a little bit more complicated :) When a transaction start, a *virtual* xid
is assigned. It is when the transaction change the state of the database, an xid
is assigned:

Throughout running a transaction, a server process holds an exclusive lock on the transaction's virtual transaction ID. If a permanent ID is assigned to the transaction (which normally happens only if the transaction changes the state of the database), it also holds an exclusive lock on the transaction's permanent transaction ID until it ends.

https://www.postgresql.org/docs/current/view-pg-locks.html

(It shouldn't change anything for you)

One transaction might take a second, another might take ten
minutes. So, the xid sequence doesn't imply anything at all about commit
sequence. What I'm trying to figure out is if it is possible for the commit
timestamps to somehow be out of order.

I am sorry but I don't understand what you mean by "commit timestamps to somehow
be out of order"?

What I'm looking for is a way of finding
changes committed since a specific moment. When the transaction started doesn't
matter in my case.

Yes, the commit timestamp is the time when the transaction is committed :
postgres=# begin;
BEGIN
postgres=# select now();
now
------------------------------
2019-07-16 08:46:59.64712+00
(1 row)

postgres=# select txid_current();
txid_current
--------------
34386830
(1 row)

postgres=# commit;
COMMIT
postgres=# select pg_xact_commit_timestamp('34386830'::xid);
pg_xact_commit_timestamp
-------------------------------
2019-07-16 08:47:30.238746+00
(1 row)

Is pg_xact_commit_timestamp suitable for this? I'm getting the impression that
it isn't. But I don't understand quite how. And if it isn't suited to this
purpose, does anyone know what pg_xact_commit_timestamp is for? What I'm after
is something like a "xcommitserial" that increases reliably, and monotonically
on transaction commit. That's how I'm hoping that pg_xact_commit_timestamp
functions.

I don't think so. pg_xact_commit_timestamp returns the timestamp. If you want
some kind of ordering you have to fetch all commit timestamps (with their
respective xid) and order them.

You also can implement this tracking by yourself with triggers which insert a
row containing xid and timestamp in a tracking table. You can add an index on
timestamp column. With this approach you don't have to worry about vacuum freeze
which remove old timestamps. As you add more write, it could be more expensive
than track_commit_timestamp.

Thanks also for making me understand that pg_xact_commit_timestamp applies to a
*transaction*, not to each row. That makes it a lot lighter in the database. I
was thinking 12 bytes+ per row, which is completely off for my case. (I tend to
insert thousands of rows in a transaction.)

Yes timestamp are stored in pg_commit_ts directory. Old timestamp are removed

after freeze has explained in

https://www.postgresql.org/docs/current/routine-vacuuming.html

Thanks for the answer, and for kindly pointing me to the right section of the
documentation. It's easy to get impatient with new(er) users. I'm _not_ lazy
about reading manuls and researching but, well, the Postgres documentation is
over 3,000 pages long when you download it. So, I may have missed a detail or
two.... If I read that correctly, the ~4 billion number range is made into an
endless circle by keeping ~2 billions numbers in the past, and 2 billion in the
future. If that's right, I'm never going to be so out of data that the ~2
billion number window is too small.

Yes it is a circular counter because xid are stored on 32 bits. However you have
to keep in mind that vacuum freeze old visible rows (default is 200 millions
transactions) and you lose timestamp information.

Sawada-san made a good presentation on freezing:
https://www.slideshare.net/masahikosawada98/introduction-vauum-freezing-xid-wraparound

You can also look at this website:
http://www.interdb.jp/pg/pgsql05.html#_5.1.
http://www.interdb.jp/pg/pgsql06.html#_6.3.

Regards,

--
Adrien

#5Morris de Oryx
morrisdeoryx@gmail.com
In reply to: Adrien Nayrat (#4)
Re: Detailed questions about pg_xact_commit_timestamp

Adrien, thanks a lot for taking the time to try and explain all of these
details to me. I'm looking at incremental rollups, and thinking through
various alternative designs. It sounds like pg_xact_commit_timestamp just
isn't the right tool for my purposes, so I'll go in another direction.

All the same, I've learned a _lot_ of important points about Postgres from
trying to sort all of this out. Your messages have been a real help.

On Tue, Jul 16, 2019 at 7:03 PM Adrien Nayrat <adrien.nayrat@anayrat.info>
wrote:

Show quoted text

On 7/12/19 2:50 PM, Morris de Oryx wrote:

Adrien, thanks very much for answering my question. Just a couple of

follow-up

points, if you don't mind.

In our answer, you offer an example of pg_xact_commit_timestamp showing
out-of-sequence commit times:

Session xid pg_xact_commit_timestamp
A 34386826 2019-07-11 09:32:38.994440+00 Started earlier,
committed later
B 34386827 2019-07-11 09:32:29.806183+00

I may not have asked my question clearly, or I may not understand the

answer

properly. Or both ;-) If I understand it correctly, an xid is assigned

when a

transaction starts.

It is a little bit more complicated :) When a transaction start, a
*virtual* xid
is assigned. It is when the transaction change the state of the database,
an xid
is assigned:

Throughout running a transaction, a server process holds an exclusive

lock on the transaction's virtual transaction ID. If a permanent ID is
assigned to the transaction (which normally happens only if the transaction
changes the state of the database), it also holds an exclusive lock on the
transaction's permanent transaction ID until it ends.

https://www.postgresql.org/docs/current/view-pg-locks.html

(It shouldn't change anything for you)

One transaction might take a second, another might take ten
minutes. So, the xid sequence doesn't imply anything at all about commit
sequence. What I'm trying to figure out is if it is possible for the

commit

timestamps to somehow be out of order.

I am sorry but I don't understand what you mean by "commit timestamps to
somehow
be out of order"?

What I'm looking for is a way of finding
changes committed since a specific moment. When the transaction started

doesn't

matter in my case.

Yes, the commit timestamp is the time when the transaction is committed :
postgres=# begin;
BEGIN
postgres=# select now();
now
------------------------------
2019-07-16 08:46:59.64712+00
(1 row)

postgres=# select txid_current();
txid_current
--------------
34386830
(1 row)

postgres=# commit;
COMMIT
postgres=# select pg_xact_commit_timestamp('34386830'::xid);
pg_xact_commit_timestamp
-------------------------------
2019-07-16 08:47:30.238746+00
(1 row)

Is pg_xact_commit_timestamp suitable for this? I'm getting the

impression that

it isn't. But I don't understand quite how. And if it isn't suited to

this

purpose, does anyone know what pg_xact_commit_timestamp is for? What I'm

after

is something like a "xcommitserial" that increases reliably, and

monotonically

on transaction commit. That's how I'm hoping that

pg_xact_commit_timestamp

functions.

I don't think so. pg_xact_commit_timestamp returns the timestamp. If you
want
some kind of ordering you have to fetch all commit timestamps (with their
respective xid) and order them.

You also can implement this tracking by yourself with triggers which
insert a
row containing xid and timestamp in a tracking table. You can add an index
on
timestamp column. With this approach you don't have to worry about vacuum
freeze
which remove old timestamps. As you add more write, it could be more
expensive
than track_commit_timestamp.

Thanks also for making me understand that pg_xact_commit_timestamp

applies to a

*transaction*, not to each row. That makes it a lot lighter in the

database. I

was thinking 12 bytes+ per row, which is completely off for my case. (I

tend to

insert thousands of rows in a transaction.)

Yes timestamp are stored in pg_commit_ts directory. Old timestamp are

removed

after freeze has explained in

https://www.postgresql.org/docs/current/routine-vacuuming.html

Thanks for the answer, and for kindly pointing me to the right section

of the

documentation. It's easy to get impatient with new(er) users. I'm _not_

lazy

about reading manuls and researching but, well, the Postgres

documentation is

over 3,000 pages long when you download it. So, I may have missed a

detail or

two.... If I read that correctly, the ~4 billion number range is made

into an

endless circle by keeping ~2 billions numbers in the past, and 2 billion

in the

future. If that's right, I'm never going to be so out of data that the ~2
billion number window is too small.

Yes it is a circular counter because xid are stored on 32 bits. However
you have
to keep in mind that vacuum freeze old visible rows (default is 200
millions
transactions) and you lose timestamp information.

Sawada-san made a good presentation on freezing:

https://www.slideshare.net/masahikosawada98/introduction-vauum-freezing-xid-wraparound

You can also look at this website:
http://www.interdb.jp/pg/pgsql05.html#_5.1.
http://www.interdb.jp/pg/pgsql06.html#_6.3.

Regards,

--
Adrien