32/64-bit transaction IDs?

Started by Ed L.about 23 years ago68 messagesgeneral
Jump to latest
#1Ed L.
pgsql@bluepolka.net

I need a guaranteed ordering of transaction IDs as they are queued for
asyncronous replication. I'd use the 32-bit internal transaction ID from
GetTransactionId(), but 32-bits will not suffice for the logic I'd like to
use. A 64-bit ID would work well enough.

Further, I am uncertain those 32-bit transaction IDs represent a correctly
ordered sequence of transactions.

Do the internal transaction IDs provide the correct transaction order?

Are 64-bit transaction IDs available?

If not, what other tricks are there to get guaranteed ordering?

Ed

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ed L. (#1)
Re: 32/64-bit transaction IDs?

"Ed L." <pgsql@bluepolka.net> writes:

I need a guaranteed ordering of transaction IDs as they are queued for
asyncronous replication. I'd use the 32-bit internal transaction ID from
GetTransactionId(), but 32-bits will not suffice for the logic I'd like to
use. A 64-bit ID would work well enough.

Uh ... your replication logic cannot ensure to reach closure within two
billion transactions? Something's pretty out of joint there.

Further, I am uncertain those 32-bit transaction IDs represent a correctly
ordered sequence of transactions.

XIDs are issued in transaction start order, not transaction commit
order. Dunno if that's what's concerning you.

regards, tom lane

#3Ed L.
pgsql@bluepolka.net
In reply to: Tom Lane (#2)
Re: 32/64-bit transaction IDs?

On Thursday March 20 2003 11:12, Tom Lane wrote:

"Ed L." <pgsql@bluepolka.net> writes:

I need a guaranteed ordering of transaction IDs as they are queued for
asyncronous replication. I'd use the 32-bit internal transaction ID
from GetTransactionId(), but 32-bits will not suffice for the logic I'd
like to use. A 64-bit ID would work well enough.

Uh ... your replication logic cannot ensure to reach closure within two
billion transactions? Something's pretty out of joint there.

Uh...er...well... It just simplifies and robustifies a little logic if I can
do a clean-sweep delete of replicated transactions based on the xid being
less than some marker value. Something like

DELETE
FROM replication_queue
WHERE xid < (SELECT MIN(last_mirrored_xid) FROM slave);

where slave is a relation of slaves to keep track of where they are in the
replication process. When transaction IDs wrap around, that logic breaks
down for the new xids of 1, 2, 3, ... , and more accounting is required.
In our case, under considerable load, it probably wouldn't wrap for a year
or three, but it'd be nice to avoid that little timebomb it if it's easy.

Further, I am uncertain those 32-bit transaction IDs represent a
correctly ordered sequence of transactions.

XIDs are issued in transaction start order, not transaction commit
order. Dunno if that's what's concerning you.

Yes, I'm interested in any reliable ordering of transaction commits for the
purpose of serial asyncronous replication. Is that possible?

Ed

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ed L. (#3)
Re: 32/64-bit transaction IDs?

"Ed L." <pgsql@bluepolka.net> writes:

XIDs are issued in transaction start order, not transaction commit
order. Dunno if that's what's concerning you.

Yes, I'm interested in any reliable ordering of transaction commits for the
purpose of serial asyncronous replication. Is that possible?

The order in which commit records are inserted into the WAL log would
work (and if you just need an ID number, the commit records' WAL-log
addresses are 64-bits and would serve nicely). But you'll have to add
code to the backend to get at that.

regards, tom lane

#5Ed Loehr
ed@LoehrTech.com
In reply to: Tom Lane (#4)
Re: 32/64-bit transaction IDs?

On Friday March 21 2003 11:53, Tom Lane wrote:

"Ed L." <pgsql@bluepolka.net> writes:

XIDs are issued in transaction start order, not transaction commit
order. Dunno if that's what's concerning you.

Yes, I'm interested in any reliable ordering of transaction commits for
the purpose of serial asyncronous replication. Is that possible?

The order in which commit records are inserted into the WAL log would
work (and if you just need an ID number, the commit records' WAL-log
addresses are 64-bits and would serve nicely). But you'll have to add
code to the backend to get at that.

A 64-bit serial ID number would suffice. I could spend a little time on
that. Any starter hints on files/functions/structure of such an addition?
Something akin to GetTransactionId()?

Ed

#6Ed L.
pgsql@bluepolka.net
In reply to: Tom Lane (#4)
Re: 32/64-bit transaction IDs?

On Friday March 21 2003 11:53, Tom Lane wrote:

"Ed L." <pgsql@bluepolka.net> writes:

XIDs are issued in transaction start order, not transaction commit
order. Dunno if that's what's concerning you.

Yes, I'm interested in any reliable ordering of transaction commits for
the purpose of serial asyncronous replication. Is that possible?

The order in which commit records are inserted into the WAL log would
work (and if you just need an ID number, the commit records' WAL-log
addresses are 64-bits and would serve nicely). But you'll have to add
code to the backend to get at that.

A 64-bit serial ID number would suffice. I could spend a little time on
that. Any starter hints on files/functions/structure of such an addition?
Something akin to GetTransactionId()?

Ed

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ed Loehr (#5)
Re: 32/64-bit transaction IDs?

Ed Loehr <ed@LoehrTech.com> writes:

A 64-bit serial ID number would suffice. I could spend a little time on
that. Any starter hints on files/functions/structure of such an addition?
Something akin to GetTransactionId()?

You'd have to capture the WAL "recptr" returned to
RecordTransactionCommit by XLogInsert. What you do with it afterwards,
I have no idea.

regards, tom lane

#8Ed L.
pgsql@bluepolka.net
In reply to: Tom Lane (#7)
Re: 32/64-bit transaction IDs?

On Friday March 21 2003 2:51, you wrote:

Ed Loehr <ed@LoehrTech.com> writes:

A 64-bit serial ID number would suffice. I could spend a little time
on that. Any starter hints on files/functions/structure of such an
addition? Something akin to GetTransactionId()?

You'd have to capture the WAL "recptr" returned to
RecordTransactionCommit by XLogInsert. What you do with it afterwards,
I have no idea.

Well, I'm trying to capture a 64-bit psuedo-transaction ID, just like the
WAL record number, but to do it within a C trigger so I can queue it into
another table and have all-or-none semantics. Am I correct in assuming the
XLogInsert() call is made after the transaction is guaranteed to completed?
If so, wouldn't this be beyond the triggered function's reach?

Ed

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ed L. (#8)
Re: 32/64-bit transaction IDs?

"Ed L." <pgsql@bluepolka.net> writes:

On Friday March 21 2003 2:51, you wrote:

You'd have to capture the WAL "recptr" returned to
RecordTransactionCommit by XLogInsert. What you do with it afterwards,
I have no idea.

Well, I'm trying to capture a 64-bit psuedo-transaction ID, just like the
WAL record number, but to do it within a C trigger so I can queue it into
another table and have all-or-none semantics. Am I correct in assuming the
XLogInsert() call is made after the transaction is guaranteed to completed?
If so, wouldn't this be beyond the triggered function's reach?

It's certainly out of reach of anything executed within the transaction,
since by definition the commit record is only written after the
transaction is done. It seems to me to be a contradiction in terms to
expect within-transaction actions to have any information about commit
order of their transaction.

regards, tom lane

#10Ed L.
pgsql@bluepolka.net
In reply to: Tom Lane (#9)
Re: 32/64-bit transaction IDs?

On Friday March 21 2003 10:07, Tom Lane wrote:

Well, I'm trying to capture a 64-bit psuedo-transaction ID, just like
the WAL record number, but to do it within a C trigger so I can queue
it into another table and have all-or-none semantics. Am I correct in
assuming the XLogInsert() call is made after the transaction is
guaranteed to completed? If so, wouldn't this be beyond the triggered
function's reach?

It's certainly out of reach of anything executed within the transaction,
since by definition the commit record is only written after the
transaction is done. It seems to me to be a contradiction in terms to
expect within-transaction actions to have any information about commit
order of their transaction.

I see your point. Maybe it's not possible to get perfect ordering from any
information available within a transaction?

Using the transaction ID for ordering seems problematic given the
variability of transaction lifetimes, not to mention the 32-bit issue. I
wonder if it'd be advisable to make WAL data available in a (system?)
table, maybe mapping the transaction ID to the WAL record number? Just
looking for some way to make the true commit order visible to a SQL query
in order to leverage existing replication code ...

Ed

#11Ed L.
pgsql@bluepolka.net
In reply to: Ed L. (#10)
Re: 32/64-bit transaction IDs?

On Friday March 21 2003 11:12, Ed L. wrote:

Using the transaction ID for ordering seems problematic given the
variability of transaction lifetimes, not to mention the 32-bit issue. I
wonder if it'd be advisable to make WAL data available in a (system?)
table, maybe mapping the transaction ID to the WAL record number? Just
looking for some way to make the true commit order visible to a SQL query
in order to leverage existing replication code ...

This is admittedly a half-baked idea from someone with little knowledge of
the pg code, but more specifically, I'm imagining a system table that looks
like this (I know this is not how system tables are specified):

create table pg_xlog (
xid int4 unique not null,
recno int8 unique not null,
);
-- recno = (XLogRecPtr.xlogid) << 32) + XLogRecPtr.xrecoff

This would map transaction IDs to WAL log record numbers. It seems
straight-forward to get the right data into this table. But of course,
this table would get a gazillion inserts, and the appropriate logic for
clearing/truncating and the potential performance and memory impacts are
unclear to me. Still, it does have the appeal of allowing a combination of
a trigger and SQL to reliably determine the transaction order, which would
seem to allow asyncronous trigger-based replication schemes to get the
right replay order.

Any other ideas as to how to get the guaranteed correct transaction order
via triggers?

Ed

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ed L. (#11)
Re: 32/64-bit transaction IDs?

"Ed L." <pgsql@bluepolka.net> writes:

create table pg_xlog (
xid int4 unique not null,
recno int8 unique not null,
);
-- recno = (XLogRecPtr.xlogid) << 32) + XLogRecPtr.xrecoff

This would map transaction IDs to WAL log record numbers. It seems
straight-forward to get the right data into this table.

... except that once a transaction has committed, it can't go around
making more database entries. You can only modify the DB within a
transaction.

regards, tom lane

#13Ed L.
pgsql@bluepolka.net
In reply to: Tom Lane (#12)
Re: 32/64-bit transaction IDs?

On Saturday March 22 2003 8:44, Tom Lane wrote:

"Ed L." <pgsql@bluepolka.net> writes:

create table pg_xlog (
xid int4 unique not null,
recno int8 unique not null,
);
-- recno = (XLogRecPtr.xlogid) << 32) + XLogRecPtr.xrecoff

This would map transaction IDs to WAL log record numbers. It seems
straight-forward to get the right data into this table.

... except that once a transaction has committed, it can't go around
making more database entries. You can only modify the DB within a
transaction.

There may be other good reasons not to do it, but it appears it does not all
need to be done within the trigger:

1. Within the transaction, a user trigger records the transaction ID into a
table along with tuple data via GetCurrentTransactionId() just as dbmirror
and rserv do now, essentially like

insert into replication_queue (xid, data, ...)

2. In xact.c, RecordTransactionCommit() inserts the (xid,recno) pair into
pg_xlog just after XLogInsert() for the commit. It doesn't matter if it's
within reach of the user's transaction-constrained trigger code or not
because it's done by the system.

3. An asyncronous slave replicator comes along later and match the xid to
to the replication tuples, as in

select
from pg_xlog l, replication_queue q
where l.xid = q.xid
order by l.recno;

...and gets a guaranteed ordering.

I'm also wondering if my concerns are unfounded in regard to the impact of
variable transaction lifetimes on the xid. Maybe MVCC maybe takes care of
any concerns from variable transaction lifetimes so that the xid, even
though it comes from the beginning of the transaction, is a reliable
ordering?

Ed

#14Ed L.
pgsql@bluepolka.net
In reply to: Ed L. (#13)
Re: 32/64-bit transaction IDs?

On Saturday March 22 2003 9:15, Ed L. wrote:

On Saturday March 22 2003 8:44, Tom Lane wrote:

"Ed L." <pgsql@bluepolka.net> writes:

create table pg_xlog (
xid int4 unique not null,
recno int8 unique not null,
);
-- recno = (XLogRecPtr.xlogid) << 32) + XLogRecPtr.xrecoff

This would map transaction IDs to WAL log record numbers. It seems
straight-forward to get the right data into this table.

... except that once a transaction has committed, it can't go around
making more database entries. You can only modify the DB within a
transaction.

...

2. In xact.c, RecordTransactionCommit() inserts the (xid,recno) pair
into pg_xlog just after XLogInsert() for the commit. It doesn't matter
if it's within reach of the user's transaction-constrained trigger code
or not because it's done by the system.

Am I correct in assuming the system can modify system tables outside a
transaction?

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ed L. (#14)
Re: 32/64-bit transaction IDs?

"Ed L." <pgsql@bluepolka.net> writes:

Am I correct in assuming the system can modify system tables outside a
transaction?

No, it can't. *Any* DB change has to be done within a transaction.

regards, tom lane

#16Ed L.
pgsql@bluepolka.net
In reply to: Tom Lane (#15)
Re: 32/64-bit transaction IDs?

On Saturday March 22 2003 9:26, Tom Lane wrote:

"Ed L." <pgsql@bluepolka.net> writes:

Am I correct in assuming the system can modify system tables outside a
transaction?

No, it can't. *Any* DB change has to be done within a transaction.

D'oh. Horse...dead.

You didn't say, but do you think the xid is a reliable replication replay
ordering even though it comes at the start of the transaction?

Ed

#17Hervé Piedvache
herve@elma.fr
In reply to: Tom Lane (#15)
Thousands INSERT going slow ...

Hi,

I'm just testing insertion of about 600 000 records inside 3 tables.

Just making a big text file with 3 inserts each time (for my 3 tables) like
insert into xx (yy) values ('data'); so I have 3 x 600 000 inserts inside the
file.

Table N°2 have a reference on the Table N°1 with the primary key ...
It's not a transaction ... I have only a primary key on each 3 tables ...
The tables where empty at the beginning and all the database vacuum full
analysed before the test.
Same result on PostgreSQL 7.2.3 and 7.3.2 ...

So I get about 1000 inserts done in 10 seconds at the beginning of the file,
then after one hour I get 25 seconds for 1000 inserts ... and it's going
slower and slower ...

Before making a text file ... I have done the same thing with a Perl script
with DBI to make the 3 inserts in a transaction (of the 600 000 records x3),
and it was really slower (I stopped the script after 10 hours and 819 000
inserts done.) And same thing it was going slower and slower ... it was about
2000 inserts for 10 seconds the 10 first minutes... then after 10 hours I was
at 40 inserts for 10 seconds ...

I'm using for my tests as server with Bi-Pentium III 1.8 Ghz - 2Mb of RAM and
Linux 2.4.18 ... the memory do not move ... the swap was never used ...

Why PostgreSQL's going slower ?
Do you have any idea to have a better and quicker insert method ?

Thanks per advance for your help :o)

regards,
--
Hervé

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ed L. (#16)
Re: 32/64-bit transaction IDs?

"Ed L." <pgsql@bluepolka.net> writes:

You didn't say, but do you think the xid is a reliable replication replay
ordering even though it comes at the start of the transaction?

Dunno. What requirements have you really got?

regards, tom lane

#19Ed L.
pgsql@bluepolka.net
In reply to: Tom Lane (#18)
Re: 32/64-bit transaction IDs?

On Saturday March 22 2003 9:55, Tom Lane wrote:

"Ed L." <pgsql@bluepolka.net> writes:

You didn't say, but do you think the xid is a reliable replication
replay ordering even though it comes at the start of the transaction?

Dunno. What requirements have you really got?

Again, the context is asyncronous trigger-based master-slave replication ala
contrib/dbmirror. In this context, AFAICS, the primary requirement is to
queue updates, inserts, and deletes on a master db for later SQL retrieval
and subsequent serial "replay" into a slave in the "right" order. The
master-queued tuple changes should be groupable as transactions so that any
replication can enforce all-or-none semantics on the slave, though normally
it should never hit a snag since the master and slave are initialized as
identical copies. The queue should be ordered the "right" way for serial
replay into a slave, whatever that "right" way is in order to maintain
consistency with the master. I assume triggers will have to be disabled
during replay on the slave to avoid time-sensitive side-effects.

DBMirror basically already does all of this, except for disabling triggers
and my uncertainty about the ordering issue.

Ed

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hervé Piedvache (#17)
Re: Thousands INSERT going slow ...

=?iso-8859-15?q?Herv=E9=20Piedvache?= <herve@elma.fr> writes:

Table N�2 have a reference on the Table N�1 with the primary key ...
It's not a transaction ... I have only a primary key on each 3 tables ...
The tables where empty at the beginning and all the database vacuum full
analysed before the test.
Same result on PostgreSQL 7.2.3 and 7.3.2 ...

So I get about 1000 inserts done in 10 seconds at the beginning of the file,
then after one hour I get 25 seconds for 1000 inserts ... and it's going
slower and slower ...

You probably should *not* have vacuum analyzed while the tables were
empty. The planner generated plans for the foreign-key checks based on
the assumption that the tables are tiny --- so it's probably using
seqscans to make the checks. The actual inserts won't slow down
materially as the tables fill, but seqscan foreign-key checks will.

You could vacuum now to update the stats, but that won't affect
the foreign-key plans that the inserting process has already cached.
You'll have to stop the inserting process and start a new one in any
case.

Another possibility is to start again with freshly created tables and no
initial vacuum. If the planner has no stats it should default to
indexscans.

regards, tom lane

#21Doug McNaught
doug@mcnaught.org
In reply to: Ed L. (#1)
#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ed L. (#19)
#23Ed L.
pgsql@bluepolka.net
In reply to: Tom Lane (#22)
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ed L. (#23)
#25Ed L.
pgsql@bluepolka.net
In reply to: Tom Lane (#24)
#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ed L. (#25)
#27Ed L.
pgsql@bluepolka.net
In reply to: Tom Lane (#26)
#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ed L. (#27)
#29Andrew Sullivan
andrew@libertyrms.info
In reply to: Tom Lane (#26)
#30Ed L.
pgsql@bluepolka.net
In reply to: Tom Lane (#26)
#31Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ed L. (#30)
#32Ed L.
pgsql@bluepolka.net
In reply to: Tom Lane (#31)
#33Ed L.
pgsql@bluepolka.net
In reply to: Ed L. (#32)
#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ed L. (#32)
#35Ed L.
pgsql@bluepolka.net
In reply to: Tom Lane (#34)
#36Richard Huxton
dev@archonet.com
In reply to: Ed L. (#35)
#37Jan Wieck
JanWieck@Yahoo.com
In reply to: Ed L. (#1)
#38Dennis Gearon
gearond@cvc.net
In reply to: Jan Wieck (#37)
#39Ed L.
pgsql@bluepolka.net
In reply to: Jan Wieck (#37)
#40Ed L.
pgsql@bluepolka.net
In reply to: Richard Huxton (#36)
#41Jan Wieck
JanWieck@Yahoo.com
In reply to: Ed L. (#1)
#42Ed L.
pgsql@bluepolka.net
In reply to: Jan Wieck (#41)
#43Jan Wieck
JanWieck@Yahoo.com
In reply to: Ed L. (#1)
#44Ed L.
pgsql@bluepolka.net
In reply to: Jan Wieck (#43)
#45Jan Wieck
JanWieck@Yahoo.com
In reply to: Ed L. (#1)
#46elein
elein@sbcglobal.net
In reply to: Ed L. (#35)
#47Ed L.
pgsql@bluepolka.net
In reply to: Jan Wieck (#45)
#48Ed L.
pgsql@bluepolka.net
In reply to: elein (#46)
#49Ed L.
pgsql@bluepolka.net
In reply to: Ed L. (#47)
#50Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Ed L. (#44)
#51Ed L.
pgsql@bluepolka.net
In reply to: Stephan Szabo (#50)
#52Jan Wieck
JanWieck@Yahoo.com
In reply to: Ed L. (#1)
#53Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Ed L. (#51)
#54Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Ed L. (#44)
#55Steven Singer
ssinger@navtechinc.com
In reply to: Stephan Szabo (#50)
#56Steven Singer
ssinger@navtechinc.com
In reply to: Ed L. (#44)
#57Ed L.
pgsql@bluepolka.net
In reply to: Jim Nasby (#53)
#58Ed L.
pgsql@bluepolka.net
In reply to: Jim Nasby (#54)
#59Ed L.
pgsql@bluepolka.net
In reply to: Steven Singer (#56)
#60Martijn van Oosterhout
kleptog@svana.org
In reply to: Ed L. (#48)
#61Steven Singer
ssinger@navtechinc.com
In reply to: Ed L. (#59)
#62Ed L.
pgsql@bluepolka.net
In reply to: Steven Singer (#61)
#63Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Steven Singer (#55)
#64Andrew Sullivan
andrew@libertyrms.info
In reply to: Jan Wieck (#45)
#65Ed L.
pgsql@bluepolka.net
In reply to: Andrew Sullivan (#64)
#66Dennis Gearon
gearond@cvc.net
In reply to: Ed L. (#1)
#67Andrew Sullivan
andrew@libertyrms.info
In reply to: Ed L. (#65)
#68Ed L.
pgsql@bluepolka.net
In reply to: Dennis Gearon (#66)