xlog switch & last record before the switch

Started by Florian G. Pflugover 18 years ago3 messages
#1Florian G. Pflug
fgp@phlo.org

Hi

To test my PITR-slave readonly-query patch, I continously do
insert into test ...
pg_switch_xlog()
sleep 1
on the master, and let the slave process the generated xlogs

The log output on the slave looks the following (unnecessary lines remove)

LOG: restored log file "00000001000000000000016E" from archive
LOG: REDO @ 0/16E00020; LSN 0/16E0004C: prev 0/16D00098; xid 1346: Transaction
- commit: 2007-07-12 15:43:00.686056+02
LOG: REDO @ 0/16E0004C; LSN 0/16E0008C: prev 0/16E00020; xid 0: XLOG -
checkpoint: redo 0/16E00020; tli 1; xid 0/1347;
oid 24576; multi 1; offset 0; online
LOG: REDO @ 0/16E0008C; LSN 0/16E00140: prev 0/16E0004C; xid 1350: Sequence -
log: rel 1663/1/16384
LOG: REDO @ 0/16E00140; LSN 0/16E00B88: prev 0/16E0008C; xid 1350; bkpb1: Heap
- insert: rel 1663/1/16386; tid 2/49
LOG: REDO @ 0/16E00B88; LSN 0/16F00000: prev 0/16E00140; xid 1350: XLOG - xlog
switch

LOG: restored log file "00000001000000000000016F" from archive
LOG: REDO @ 0/16F00020; LSN 0/16F0004C: prev 0/16E00B88; xid 1350: Transaction
- commit: 2007-07-12 15:43:02.159717+02
LOG: REDO @ 0/16F0004C; LSN 0/16F00098: prev 0/16F00020; xid 1352: Heap -
insert: rel 1663/1/16386; tid 2/50
LOG: REDO @ 0/16F00098; LSN 0/17000000: prev 0/16F0004C; xid 1352: XLOG - xlog
switch

LOG: restored log file "000000010000000000000170" from archive
LOG: REDO @ 0/17000020; LSN 0/1700004C: prev 0/16F00098; xid 1352: Transaction
- commit: 2007-07-12 15:43:02.26456+02
LOG: REDO @ 0/1700004C; LSN 0/17000098: prev 0/17000020; xid 1356: Heap -
insert: rel 1663/1/16386; tid 2/51
LOG: REDO @ 0/17000098; LSN 0/17100000: prev 0/1700004C; xid 1356: XLOG - xlog
switch

As you can see, the COMMIT records seems to end up being logged *after*
the xlog switch. I would have expected the order
"heap-insert, commit, switch, heap-insert, commit, switch, ...", not
"heap-insert, switch, commit, heap-insert, switch, commit, ...".

Is this the expected behaviour, or just an artefact of the implementation
of xlog switches?

greetings, Florian Pflug

#2Simon Riggs
simon@2ndquadrant.com
In reply to: Florian G. Pflug (#1)
Re: xlog switch & last record before the switch

On Thu, 2007-07-12 at 16:17 +0200, Florian G. Pflug wrote:

To test my PITR-slave readonly-query patch, I continously do
insert into test ...
pg_switch_xlog()
sleep 1
on the master, and let the slave process the generated xlogs

The log output on the slave looks the following (unnecessary lines remove)

LOG: restored log file "00000001000000000000016E" from archive
LOG: REDO @ 0/16E00020; LSN 0/16E0004C: prev 0/16D00098; xid 1346: Transaction
- commit: 2007-07-12 15:43:00.686056+02
LOG: REDO @ 0/16E0004C; LSN 0/16E0008C: prev 0/16E00020; xid 0: XLOG -
checkpoint: redo 0/16E00020; tli 1; xid 0/1347;
oid 24576; multi 1; offset 0; online
LOG: REDO @ 0/16E0008C; LSN 0/16E00140: prev 0/16E0004C; xid 1350: Sequence -
log: rel 1663/1/16384
LOG: REDO @ 0/16E00140; LSN 0/16E00B88: prev 0/16E0008C; xid 1350; bkpb1: Heap
- insert: rel 1663/1/16386; tid 2/49
LOG: REDO @ 0/16E00B88; LSN 0/16F00000: prev 0/16E00140; xid 1350: XLOG - xlog
switch

LOG: restored log file "00000001000000000000016F" from archive
LOG: REDO @ 0/16F00020; LSN 0/16F0004C: prev 0/16E00B88; xid 1350: Transaction
- commit: 2007-07-12 15:43:02.159717+02
LOG: REDO @ 0/16F0004C; LSN 0/16F00098: prev 0/16F00020; xid 1352: Heap -
insert: rel 1663/1/16386; tid 2/50
LOG: REDO @ 0/16F00098; LSN 0/17000000: prev 0/16F0004C; xid 1352: XLOG - xlog
switch

LOG: restored log file "000000010000000000000170" from archive
LOG: REDO @ 0/17000020; LSN 0/1700004C: prev 0/16F00098; xid 1352: Transaction
- commit: 2007-07-12 15:43:02.26456+02
LOG: REDO @ 0/1700004C; LSN 0/17000098: prev 0/17000020; xid 1356: Heap -
insert: rel 1663/1/16386; tid 2/51
LOG: REDO @ 0/17000098; LSN 0/17100000: prev 0/1700004C; xid 1356: XLOG - xlog
switch

As you can see, the COMMIT records seems to end up being logged *after*
the xlog switch. I would have expected the order
"heap-insert, commit, switch, heap-insert, commit, switch, ...", not
"heap-insert, switch, commit, heap-insert, switch, commit, ...".

Is this the expected behaviour, or just an artefact of the implementation
of xlog switches?

Can you show the exact SQL executed?

If you do INSERT ... ;select pg_switch_xlog() then the COMMIT will occur
after the switch. If you do:
INSERT ...;
select pg_switch_xlog();
then the COMMIT will occur before the switch.

If it were otherwise this would imply statements were executed prior to
the previous commit, which I hope and pray is never the case.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

#3Florian G. Pflug
fgp@phlo.org
In reply to: Simon Riggs (#2)
Re: xlog switch & last record before the switch

Simon Riggs wrote:

On Thu, 2007-07-12 at 16:17 +0200, Florian G. Pflug wrote:

To test my PITR-slave readonly-query patch, I continously do
insert into test ...
pg_switch_xlog()
sleep 1
on the master, and let the slave process the generated xlogs

The log output on the slave looks the following (unnecessary lines remove)

<snipped log output>>

As you can see, the COMMIT records seems to end up being logged *after*
the xlog switch. I would have expected the order
"heap-insert, commit, switch, heap-insert, commit, switch, ...", not
"heap-insert, switch, commit, heap-insert, switch, commit, ...".

Is this the expected behaviour, or just an artefact of the implementation
of xlog switches?

Can you show the exact SQL executed?

If you do INSERT ... ;select pg_switch_xlog() then the COMMIT will occur
after the switch. If you do:
INSERT ...;
select pg_switch_xlog();
then the COMMIT will occur before the switch.

Ah, you caught me - I had forgotten that pgsql -c "<statement>"
executes the statement in one transaction.

I was just going to suggest that pg_start_backup, pg_stop_backup
and pg_switch_xlog emit a warning or even an error if called
from within a larger transaction, because that's quite certainly
not what the user wants. But since those are just plain functions,
I guess checking for that might not be trivial...

If it were otherwise this would imply statements were executed prior to
the previous commit, which I hope and pray is never the case.

You can relax, the bug was located in front of the screen :-)

greetings, Florian Pflug