32/64-bit transaction IDs?
I need a guaranteed ordering of transaction IDs as they are queued for
asynchronous replication. I'd use the 32-bit internal transaction ID from
GetTransactionId(), but 32 bits will not suffice for the logic I'd like to
use. A 64-bit ID would work well enough.
Further, I am uncertain those 32-bit transaction IDs represent a correctly
ordered sequence of transactions.
Do the internal transaction IDs provide the correct transaction order?
Are 64-bit transaction IDs available?
If not, what other tricks are there to get guaranteed ordering?
Ed
"Ed L." <pgsql@bluepolka.net> writes:
I need a guaranteed ordering of transaction IDs as they are queued for
asynchronous replication. I'd use the 32-bit internal transaction ID from
GetTransactionId(), but 32 bits will not suffice for the logic I'd like to
use. A 64-bit ID would work well enough.
Uh ... your replication logic cannot ensure to reach closure within two
billion transactions? Something's pretty out of joint there.
Further, I am uncertain those 32-bit transaction IDs represent a correctly
ordered sequence of transactions.
XIDs are issued in transaction start order, not transaction commit
order. Dunno if that's what's concerning you.
regards, tom lane
On Thursday March 20 2003 11:12, Tom Lane wrote:
"Ed L." <pgsql@bluepolka.net> writes:
I need a guaranteed ordering of transaction IDs as they are queued for
asynchronous replication. I'd use the 32-bit internal transaction ID
from GetTransactionId(), but 32 bits will not suffice for the logic I'd
like to use. A 64-bit ID would work well enough.
Uh ... your replication logic cannot ensure to reach closure within two
billion transactions? Something's pretty out of joint there.
Uh...er...well... It just simplifies and robustifies a little logic if I can
do a clean-sweep delete of replicated transactions based on the xid being
less than some marker value. Something like
DELETE
FROM replication_queue
WHERE xid < (SELECT MIN(last_mirrored_xid) FROM slave);
where slave is a relation of slaves to keep track of where they are in the
replication process. When transaction IDs wrap around, that logic breaks
down for the new xids of 1, 2, 3, ... , and more accounting is required.
In our case, under considerable load, it probably wouldn't wrap for a year
or three, but it'd be nice to avoid that little time bomb if it's easy.
Further, I am uncertain those 32-bit transaction IDs represent a
correctly ordered sequence of transactions.
XIDs are issued in transaction start order, not transaction commit
order. Dunno if that's what's concerning you.
Yes, I'm interested in any reliable ordering of transaction commits for the
purpose of serial asynchronous replication. Is that possible?
Ed
"Ed L." <pgsql@bluepolka.net> writes:
XIDs are issued in transaction start order, not transaction commit
order. Dunno if that's what's concerning you.
Yes, I'm interested in any reliable ordering of transaction commits for the
purpose of serial asynchronous replication. Is that possible?
The order in which commit records are inserted into the WAL log would
work (and if you just need an ID number, the commit records' WAL-log
addresses are 64-bits and would serve nicely). But you'll have to add
code to the backend to get at that.
regards, tom lane
On Friday March 21 2003 11:53, Tom Lane wrote:
"Ed L." <pgsql@bluepolka.net> writes:
XIDs are issued in transaction start order, not transaction commit
order. Dunno if that's what's concerning you.
Yes, I'm interested in any reliable ordering of transaction commits for
the purpose of serial asynchronous replication. Is that possible?
The order in which commit records are inserted into the WAL log would
work (and if you just need an ID number, the commit records' WAL-log
addresses are 64-bits and would serve nicely). But you'll have to add
code to the backend to get at that.
A 64-bit serial ID number would suffice. I could spend a little time on
that. Any starter hints on files/functions/structure of such an addition?
Something akin to GetTransactionId()?
Ed
Ed Loehr <ed@LoehrTech.com> writes:
A 64-bit serial ID number would suffice. I could spend a little time on
that. Any starter hints on files/functions/structure of such an addition?
Something akin to GetTransactionId()?
You'd have to capture the WAL "recptr" returned to
RecordTransactionCommit by XLogInsert. What you do with it afterwards,
I have no idea.
regards, tom lane
On Friday March 21 2003 2:51, you wrote:
Ed Loehr <ed@LoehrTech.com> writes:
A 64-bit serial ID number would suffice. I could spend a little time
on that. Any starter hints on files/functions/structure of such an
addition? Something akin to GetTransactionId()?
You'd have to capture the WAL "recptr" returned to
RecordTransactionCommit by XLogInsert. What you do with it afterwards,
I have no idea.
Well, I'm trying to capture a 64-bit pseudo-transaction ID, just like the
WAL record number, but to do it within a C trigger so I can queue it into
another table and have all-or-none semantics. Am I correct in assuming the
XLogInsert() call is made after the transaction is guaranteed to complete?
If so, wouldn't this be beyond the triggered function's reach?
Ed
"Ed L." <pgsql@bluepolka.net> writes:
On Friday March 21 2003 2:51, you wrote:
You'd have to capture the WAL "recptr" returned to
RecordTransactionCommit by XLogInsert. What you do with it afterwards,
I have no idea.
Well, I'm trying to capture a 64-bit pseudo-transaction ID, just like the
WAL record number, but to do it within a C trigger so I can queue it into
another table and have all-or-none semantics. Am I correct in assuming the
XLogInsert() call is made after the transaction is guaranteed to complete?
If so, wouldn't this be beyond the triggered function's reach?
It's certainly out of reach of anything executed within the transaction,
since by definition the commit record is only written after the
transaction is done. It seems to me to be a contradiction in terms to
expect within-transaction actions to have any information about commit
order of their transaction.
regards, tom lane
On Friday March 21 2003 10:07, Tom Lane wrote:
Well, I'm trying to capture a 64-bit pseudo-transaction ID, just like
the WAL record number, but to do it within a C trigger so I can queue
it into another table and have all-or-none semantics. Am I correct in
assuming the XLogInsert() call is made after the transaction is
guaranteed to complete? If so, wouldn't this be beyond the triggered
function's reach?
It's certainly out of reach of anything executed within the transaction,
since by definition the commit record is only written after the
transaction is done. It seems to me to be a contradiction in terms to
expect within-transaction actions to have any information about commit
order of their transaction.
I see your point. Maybe it's not possible to get perfect ordering from any
information available within a transaction?
Using the transaction ID for ordering seems problematic given the
variability of transaction lifetimes, not to mention the 32-bit issue. I
wonder if it'd be advisable to make WAL data available in a (system?)
table, maybe mapping the transaction ID to the WAL record number? Just
looking for some way to make the true commit order visible to a SQL query
in order to leverage existing replication code ...
Ed
On Friday March 21 2003 11:12, Ed L. wrote:
Using the transaction ID for ordering seems problematic given the
variability of transaction lifetimes, not to mention the 32-bit issue. I
wonder if it'd be advisable to make WAL data available in a (system?)
table, maybe mapping the transaction ID to the WAL record number? Just
looking for some way to make the true commit order visible to a SQL query
in order to leverage existing replication code ...
This is admittedly a half-baked idea from someone with little knowledge of
the pg code, but more specifically, I'm imagining a system table that looks
like this (I know this is not how system tables are specified):
create table pg_xlog (
  xid int4 unique not null,
  recno int8 unique not null
);
-- recno = (XLogRecPtr.xlogid << 32) + XLogRecPtr.xrecoff
This would map transaction IDs to WAL log record numbers. It seems
straightforward to get the right data into this table. But of course,
this table would get a gazillion inserts, and the appropriate logic for
clearing/truncating and the potential performance and memory impacts are
unclear to me. Still, it does have the appeal of allowing a combination of
a trigger and SQL to reliably determine the transaction order, which would
seem to allow asynchronous trigger-based replication schemes to get the
right replay order.
Any other ideas as to how to get the guaranteed correct transaction order
via triggers?
Ed
"Ed L." <pgsql@bluepolka.net> writes:
create table pg_xlog (
  xid int4 unique not null,
  recno int8 unique not null
);
-- recno = (XLogRecPtr.xlogid << 32) + XLogRecPtr.xrecoff
This would map transaction IDs to WAL log record numbers. It seems
straightforward to get the right data into this table.
... except that once a transaction has committed, it can't go around
making more database entries. You can only modify the DB within a
transaction.
regards, tom lane
On Saturday March 22 2003 8:44, Tom Lane wrote:
"Ed L." <pgsql@bluepolka.net> writes:
create table pg_xlog (
  xid int4 unique not null,
  recno int8 unique not null
);
-- recno = (XLogRecPtr.xlogid << 32) + XLogRecPtr.xrecoff
This would map transaction IDs to WAL log record numbers. It seems
straightforward to get the right data into this table.
... except that once a transaction has committed, it can't go around
making more database entries. You can only modify the DB within a
transaction.
There may be other good reasons not to do it, but it appears it does not all
need to be done within the trigger:
1. Within the transaction, a user trigger records the transaction ID into a
table along with tuple data via GetCurrentTransactionId() just as dbmirror
and rserv do now, essentially like
insert into replication_queue (xid, data, ...)
2. In xact.c, RecordTransactionCommit() inserts the (xid,recno) pair into
pg_xlog just after XLogInsert() for the commit. It doesn't matter if it's
within reach of the user's transaction-constrained trigger code or not
because it's done by the system.
3. An asynchronous slave replicator comes along later and matches the xid
to the replication tuples, as in
select q.*
from pg_xlog l, replication_queue q
where l.xid = q.xid
order by l.recno;
...and gets a guaranteed ordering.
I'm also wondering if my concerns are unfounded in regard to the impact of
variable transaction lifetimes on the xid. Maybe MVCC takes care of
any concerns from variable transaction lifetimes so that the xid, even
though it comes from the beginning of the transaction, is a reliable
ordering?
Ed
On Saturday March 22 2003 9:15, Ed L. wrote:
On Saturday March 22 2003 8:44, Tom Lane wrote:
"Ed L." <pgsql@bluepolka.net> writes:
create table pg_xlog (
  xid int4 unique not null,
  recno int8 unique not null
);
-- recno = (XLogRecPtr.xlogid << 32) + XLogRecPtr.xrecoff
This would map transaction IDs to WAL log record numbers. It seems
straightforward to get the right data into this table.
... except that once a transaction has committed, it can't go around
making more database entries. You can only modify the DB within a
transaction.
...
2. In xact.c, RecordTransactionCommit() inserts the (xid,recno) pair
into pg_xlog just after XLogInsert() for the commit. It doesn't matter
if it's within reach of the user's transaction-constrained trigger code
or not because it's done by the system.
Am I correct in assuming the system can modify system tables outside a
transaction?
"Ed L." <pgsql@bluepolka.net> writes:
Am I correct in assuming the system can modify system tables outside a
transaction?
No, it can't. *Any* DB change has to be done within a transaction.
regards, tom lane
On Saturday March 22 2003 9:26, Tom Lane wrote:
"Ed L." <pgsql@bluepolka.net> writes:
Am I correct in assuming the system can modify system tables outside a
transaction?
No, it can't. *Any* DB change has to be done within a transaction.
D'oh. Horse...dead.
You didn't say, but do you think the xid is a reliable replication replay
ordering even though it comes at the start of the transaction?
Ed
Hi,
I'm just testing insertion of about 600 000 records into 3 tables.
I made a big text file with 3 inserts each time (one per table), like
insert into xx (yy) values ('data'); so I have 3 x 600 000 inserts inside the
file.
Table N°2 has a reference to Table N°1 via the primary key ...
It's not a transaction ... I have only a primary key on each of the 3 tables ...
The tables were empty at the beginning, and the whole database was vacuum
full analyzed before the test.
Same result on PostgreSQL 7.2.3 and 7.3.2 ...
So I get about 1000 inserts done in 10 seconds at the beginning of the file,
then after one hour it takes 25 seconds for 1000 inserts ... and it keeps
getting slower and slower ...
Before making the text file ... I did the same thing with a Perl script
using DBI, making the 3 inserts in a transaction (600 000 records x 3),
and it was really slower (I stopped the script after 10 hours and 819 000
inserts done). Same thing: it kept getting slower and slower ... it was about
2000 inserts per 10 seconds for the first 10 minutes ... then after 10 hours
I was down to 40 inserts per 10 seconds ...
For my tests I'm using a server with a Bi-Pentium III 1.8 GHz - 2Mb of RAM and
Linux 2.4.18 ... the memory usage does not move ... the swap was never used ...
Why is PostgreSQL going slower and slower?
Do you have any idea for a better and quicker insert method?
Thanks in advance for your help :o)
regards,
--
Hervé
"Ed L." <pgsql@bluepolka.net> writes:
You didn't say, but do you think the xid is a reliable replication replay
ordering even though it comes at the start of the transaction?
Dunno. What requirements have you really got?
regards, tom lane
On Saturday March 22 2003 9:55, Tom Lane wrote:
"Ed L." <pgsql@bluepolka.net> writes:
You didn't say, but do you think the xid is a reliable replication
replay ordering even though it comes at the start of the transaction?
Dunno. What requirements have you really got?
Again, the context is asynchronous trigger-based master-slave replication a la
contrib/dbmirror. In this context, AFAICS, the primary requirement is to
queue updates, inserts, and deletes on a master db for later SQL retrieval
and subsequent serial "replay" into a slave in the "right" order. The
master-queued tuple changes should be groupable as transactions so that any
replication can enforce all-or-none semantics on the slave, though normally
it should never hit a snag since the master and slave are initialized as
identical copies. The queue should be ordered the "right" way for serial
replay into a slave, whatever that "right" way is in order to maintain
consistency with the master. I assume triggers will have to be disabled
during replay on the slave to avoid time-sensitive side-effects.
DBMirror basically already does all of this, except for disabling triggers
and my uncertainty about the ordering issue.
Ed
Hervé Piedvache <herve@elma.fr> writes:
Table N°2 has a reference to Table N°1 via the primary key ...
It's not a transaction ... I have only a primary key on each of the 3 tables ...
The tables were empty at the beginning, and the whole database was vacuum
full analyzed before the test.
Same result on PostgreSQL 7.2.3 and 7.3.2 ...
So I get about 1000 inserts done in 10 seconds at the beginning of the file,
then after one hour I get 25 seconds for 1000 inserts ... and it's going
slower and slower ...
You probably should *not* have vacuum analyzed while the tables were
empty. The planner generated plans for the foreign-key checks based on
the assumption that the tables are tiny --- so it's probably using
seqscans to make the checks. The actual inserts won't slow down
materially as the tables fill, but seqscan foreign-key checks will.
You could vacuum now to update the stats, but that won't affect
the foreign-key plans that the inserting process has already cached.
You'll have to stop the inserting process and start a new one in any
case.
Another possibility is to start again with freshly created tables and no
initial vacuum. If the planner has no stats it should default to
indexscans.
regards, tom lane