2PC support for pglogical
Hi.
Here is proof-of-concept version of two phase commit support for logical replication. There are some changes in core postgres, pglogical_output and pglogical extensions. I’ve used version from 2nd Quadrant repo, pglogical branch and rebased it to current head. Some notes about this patch:
* LWLockAssign() was deleted, so I changed that to new locks tranche api.
* Seems that only reliable way to get GID during replay of commit/rollback prepared is to force postgres to write GID in corresponding records, otherwise we can lose correspondence between xid and gid if we are replaying data from wal sender while transaction was commited some time ago. So i’ve changed postgres to write gid’s not only on prepare, but also on commit/rollback prepared. That should be done only in logical level, but now I just want to here some other opinions on that.
* Abort prepared xlog record also lack database information. Normally logical decoding just cleans reorder buffer when facing abort, but in case of 2PC we should send it to callbacks anyway. So I’ve added that info to abort records.
* Prepare emits xlog record with TwoPhaseFileHeader in it and that structure is the same as xl_xact_parsed_commit, but with some fields renamed. Probably that is just due to historical reasons. It is possible to change PREPARE to write ordinary commit records with some flag and then use the same infrastructure to parse it. So DecodePrepare/DecodeCommit can be done with the same function.
Attachments:
On 10 March 2016 at 22:50, Stas Kelvich <s.kelvich@postgrespro.ru> wrote:
Hi.
Here is proof-of-concept version of two phase commit support for logical
replication.
I missed this when you posted it, so sorry for the late response.
I've read through this but not tested it yet. I really appreciate you doing
it, it's been on my wishlist/backburner for ages.
On reading through the patch I noticed that there doesn't seem to be any
consideration of locking. The prepared xact can still hold strong locks on
catalogs. How's that handled? I think Robert's group locking stuff is what
we'll want here - for a prepared xact we can join the prepared xact as a
group lock member so we inherit its locks. Right now if you try DDL in a
prepared xact I suspect it'll break.
On a more minor note, I think the pglogical_output and pglogical changes
should be a separate patch to the patch to core PostgreSQL. Keep them
separate. "git format-patch" on a tree is good for this.
A few misc comments on my first read-through:
Shouldn't PGLOGICAL_COMMIT etc be an enum { } ?
+/*
+ * ParsePrepareRecord
+ */
isn't a super-helpful comment.
Not sure what this means:
+ /* Ignoring abortrels? */
+ bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileNode));
typo:
+ ... shortcut for coomiting bare xact ...
DecodePrepare seems to be mostly a copy of DecodeCommit. Can that be done
with less duplication?
Not sure I understand the rationale of "bare xact" as terminology/naming,
e.g. ReorderBufferCommitBareXact . I guess you're getting at the fact that
it's just a commit or rollback, not data. Phase2 ? 2PC?
* Seems that only reliable way to get GID during replay of commit/rollback
prepared is to force postgres to write GID in corresponding records,
otherwise we can lose correspondence between xid and gid if we are
replaying data from wal sender while transaction was commited some time
ago. So i’ve changed postgres to write gid’s not only on prepare, but also
on commit/rollback prepared. That should be done only in logical level, but
now I just want to here some other opinions on that.
That sounds sensible.
* Abort prepared xlog record also lack database information. Normally
logical decoding just cleans reorder buffer when facing abort, but in case
of 2PC we should send it to callbacks anyway. So I’ve added that info to
abort records.
I was wondering what that was. Again, seems sensible, though I'm not
totally convinced of the naming.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Mar 23, 2016 at 1:44 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
On 10 March 2016 at 22:50, Stas Kelvich <s.kelvich@postgrespro.ru> wrote:
Hi.
Here is proof-of-concept version of two phase commit support for logical
replication.I missed this when you posted it, so sorry for the late response.
I've read through this but not tested it yet. I really appreciate you doing
it, it's been on my wishlist/backburner for ages.On reading through the patch I noticed that there doesn't seem to be any
consideration of locking. The prepared xact can still hold strong locks on
catalogs. How's that handled? I think Robert's group locking stuff is what
we'll want here - for a prepared xact we can join the prepared xact as a
group lock member so we inherit its locks. Right now if you try DDL in a
prepared xact I suspect it'll break.
I have grave doubts about that approach. It's not impossible it could
be right, but I wouldn't bet the farm on it.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 24 Mar 2016, at 17:03, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Mar 23, 2016 at 1:44 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
On 10 March 2016 at 22:50, Stas Kelvich <s.kelvich@postgrespro.ru> wrote:
Hi.
Here is proof-of-concept version of two phase commit support for logical
replication.I missed this when you posted it, so sorry for the late response.
I've read through this but not tested it yet. I really appreciate you doing
it, it's been on my wishlist/backburner for ages.On reading through the patch I noticed that there doesn't seem to be any
consideration of locking. The prepared xact can still hold strong locks on
catalogs. How's that handled? I think Robert's group locking stuff is what
we'll want here - for a prepared xact we can join the prepared xact as a
group lock member so we inherit its locks. Right now if you try DDL in a
prepared xact I suspect it'll break.I have grave doubts about that approach. It's not impossible it could
be right, but I wouldn't bet the farm on it.
I think all the locking already handled properly by creating dummy backend in PGPROC, as it done in usual postgres 2pc implementation.
Just checked DDL with following scenario:
* prepare tx that inserts a row in table on master
* execute DROP TABLE on pglogical slave
* commit prepared on master
Now it behaves as expected — slave blocks DROP TABLE until commit prepared on master.
---
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 24 March 2016 at 22:20, Stas Kelvich <s.kelvich@postgrespro.ru> wrote:
I think all the locking already handled properly by creating dummy backend
in PGPROC, as it done in usual postgres 2pc implementation.
On the downstream, yes. But what about the decoding, reorder buffer and
output plugin? They access the relcache etc. If you changed the structure
of a table in a prepared xact you have to make sure you get that right.
Just checked DDL with following scenario:
* prepare tx that inserts a row in table on master
* execute DROP TABLE on pglogical slave
* commit prepared on masterNow it behaves as expected — slave blocks DROP TABLE until commit prepared
on master.
Try doing DDL in the prepared xact on the upstream side, especially things
that make structural changes to tables. Test both things that do full-table
rewrites (alter table ... add column ... default not null) and things that
don't (alter table ... drop column ; alter table ... add column ...;
without not null or default).
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services