Bug in two-phase transaction recovery

Started by Stas Kelvichover 9 years ago5 messages
#1Stas Kelvich
s.kelvich@postgrespro.ru
2 attachment(s)

Hello.

Some time ago two-phase state file format was changed to have variable size GID,
but several places that read that files were not updated to use new offsets. Problem
exists in master and 9.6 and can be reproduced on prepared transactions with
savepoints. For example:

create table t(id int);
begin;
insert into t values (42);
savepoint s1;
insert into t values (43);
prepare transaction 'x';
begin;
insert into t values (142);
savepoint s1;
insert into t values (143);
prepare transaction 'y’;

and restart the server. Startup process will hang. Fix attached.

Also while looking at StandbyRecoverPreparedTransactions() i’ve noticed that buffer
for 2pc file is allocated in TopMemoryContext but never freed. That probably exists
for a long time.

Attachments:

gidlen_fixes.diffapplication/octet-stream; name=gidlen_fixes.diffDownload
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 9f55adc..a4ba65f 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1758,8 +1758,9 @@ PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
 			 * need to hold a lock while examining it.  We still acquire the
 			 * lock to modify it, though.
 			 */
-			subxids = (TransactionId *)
-				(buf + MAXALIGN(sizeof(TwoPhaseFileHeader)));
+			subxids = (TransactionId *) (buf +
+								MAXALIGN(sizeof(TwoPhaseFileHeader)) +
+								MAXALIGN(hdr->gidlen));
 			for (i = 0; i < hdr->nsubxacts; i++)
 			{
 				TransactionId subxid = subxids[i];
@@ -1877,8 +1878,9 @@ StandbyRecoverPreparedTransactions(bool overwriteOK)
 			 * Examine subtransaction XIDs ... they should all follow main
 			 * XID.
 			 */
-			subxids = (TransactionId *)
-				(buf + MAXALIGN(sizeof(TwoPhaseFileHeader)));
+			subxids = (TransactionId *) (buf +
+								MAXALIGN(sizeof(TwoPhaseFileHeader)) +
+								MAXALIGN(hdr->gidlen));
 			for (i = 0; i < hdr->nsubxacts; i++)
 			{
 				TransactionId subxid = subxids[i];
standby_recover_pfree.diffapplication/octet-stream; name=standby_recover_pfree.diffDownload
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 9f55adc..5415604 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1886,6 +1888,8 @@ StandbyRecoverPreparedTransactions(bool overwriteOK)
 				Assert(TransactionIdFollows(subxid, xid));
 				SubTransSetParent(xid, subxid, overwriteOK);
 			}
+
+			pfree(buf);
 		}
 	}
 	FreeDir(cldir);
#2Michael Paquier
michael.paquier@gmail.com
In reply to: Stas Kelvich (#1)
Re: Bug in two-phase transaction recovery

On Wed, Sep 7, 2016 at 10:48 PM, Stas Kelvich <s.kelvich@postgrespro.ru> wrote:

Some time ago two-phase state file format was changed to have variable size GID,
but several places that read that files were not updated to use new offsets. Problem
exists in master and 9.6 and can be reproduced on prepared transactions with
savepoints.

Oops and meh. This meritates an open item, and has better be fixed by
9.6.0. I am glad you noticed that honestly. And we had better take
care of this issue as soon as possible.

For example:

create table t(id int);
begin;
insert into t values (42);
savepoint s1;
insert into t values (43);
prepare transaction 'x';
begin;
insert into t values (142);
savepoint s1;
insert into t values (143);
prepare transaction 'y’;

and restart the server. Startup process will hang. Fix attached.

In crash recovery this is very likely to fail with an assertion if
those are enabled:
TRAP: FailedAssertion("!(TransactionIdFollows(subxid, xid))", File:
"twophase.c", Line: 1767)
And the culprit is e0694cf9, which makes this open item owned by Simon.

I also had a look at the patch you are proposing, and I think that's a
correct fix. I also looked at the other code paths scanning the 2PC
state file and did not notice extra problems. The good news is that
the 2PC file generation is not impacted, only its scan at recovery is,
so the impact of the bug is limited for existing 9.6 deployments where
both 2PC and subtransactions are involved.

Also while looking at StandbyRecoverPreparedTransactions() i’ve noticed that buffer
for 2pc file is allocated in TopMemoryContext but never freed. That probably exists
for a long time.

@@ -1886,6 +1886,8 @@ StandbyRecoverPreparedTransactions(bool overwriteOK)
Assert(TransactionIdFollows(subxid, xid));
SubTransSetParent(xid, subxid, overwriteOK);
}
+
+ pfree(buf);
}
This one is a good catch. I have checked also the other callers of
ReadTwoPhaseFile but I am not seeing any other leak. That's a leak,
not critical though so applying it only on HEAD would be enough IMO.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Michael Paquier (#2)
Re: Bug in two-phase transaction recovery

On Thu, Sep 8, 2016 at 12:13 PM, Michael Paquier <michael.paquier@gmail.com>
wrote:

On Wed, Sep 7, 2016 at 10:48 PM, Stas Kelvich <s.kelvich@postgrespro.ru>
wrote:

Some time ago two-phase state file format was changed to have variable

size GID,

but several places that read that files were not updated to use new

offsets. Problem

exists in master and 9.6 and can be reproduced on prepared transactions

with

savepoints.

Oops and meh. This meritates an open item, and has better be fixed by
9.6.0. I am glad you noticed that honestly. And we had better take
care of this issue as soon as possible.

Indeed, it's a bug. Thanks Stas for tracking it down and Michael for the
review and checking other places. I also looked at the patch and it seems
fine to me. FWIW I looked at all other places where TwoPhaseFileHeader is
referred and they look safe to me.

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#4Simon Riggs
simon@2ndquadrant.com
In reply to: Michael Paquier (#2)
Re: Bug in two-phase transaction recovery

On 8 September 2016 at 07:43, Michael Paquier <michael.paquier@gmail.com> wrote:

On Wed, Sep 7, 2016 at 10:48 PM, Stas Kelvich <s.kelvich@postgrespro.ru> wrote:

Some time ago two-phase state file format was changed to have variable size GID,
but several places that read that files were not updated to use new offsets. Problem
exists in master and 9.6 and can be reproduced on prepared transactions with
savepoints.

Oops and meh. This meritates an open item, and has better be fixed by
9.6.0. I am glad you noticed that honestly. And we had better take
care of this issue as soon as possible.

Looking now.

Also while looking at StandbyRecoverPreparedTransactions() i’ve noticed that buffer
for 2pc file is allocated in TopMemoryContext but never freed. That probably exists
for a long time.

@@ -1886,6 +1886,8 @@ StandbyRecoverPreparedTransactions(bool overwriteOK)
Assert(TransactionIdFollows(subxid, xid));
SubTransSetParent(xid, subxid, overwriteOK);
}
+
+ pfree(buf);
}
This one is a good catch. I have checked also the other callers of
ReadTwoPhaseFile but I am not seeing any other leak. That's a leak,
not critical though so applying it only on HEAD would be enough IMO.

Far from critical, but backpatched to 9.6 because it isn't just
executed once at startup, it is executed every shutdown checkpoint.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Simon Riggs
simon@2ndquadrant.com
In reply to: Simon Riggs (#4)
Re: Bug in two-phase transaction recovery

On 8 September 2016 at 11:18, Simon Riggs <simon@2ndquadrant.com> wrote:

On 8 September 2016 at 07:43, Michael Paquier <michael.paquier@gmail.com> wrote:

On Wed, Sep 7, 2016 at 10:48 PM, Stas Kelvich <s.kelvich@postgrespro.ru> wrote:

Some time ago two-phase state file format was changed to have variable size GID,
but several places that read that files were not updated to use new offsets. Problem
exists in master and 9.6 and can be reproduced on prepared transactions with
savepoints.

Oops and meh. This meritates an open item, and has better be fixed by
9.6.0. I am glad you noticed that honestly. And we had better take
care of this issue as soon as possible.

Looking now.

Committed, thanks.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers