pgsql: Add support for temporary replication slots
Add support for temporary replication slots
This allows creating temporary replication slots that are removed
automatically at the end of the session or on error.
From: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Branch
------
master
Details
-------
http://git.postgresql.org/pg/commitdiff/a924c327e2793d2025b19e18de7917110dc8afd8
Modified Files
--------------
contrib/test_decoding/Makefile | 2 +-
contrib/test_decoding/expected/ddl.out | 4 +-
contrib/test_decoding/expected/slot.out | 58 +++++++++++++++++++++++++++
contrib/test_decoding/sql/slot.sql | 20 ++++++++++
doc/src/sgml/func.sgml | 16 ++++++--
doc/src/sgml/protocol.sgml | 13 ++++++-
src/backend/catalog/system_views.sql | 11 ++++++
src/backend/replication/repl_gram.y | 22 +++++++----
src/backend/replication/repl_scanner.l | 1 +
src/backend/replication/slot.c | 69 ++++++++++++++++++++++++++-------
src/backend/replication/slotfuncs.c | 24 ++++++++----
src/backend/replication/walsender.c | 28 +++++++------
src/backend/storage/lmgr/proc.c | 3 ++
src/backend/tcop/postgres.c | 3 ++
src/include/catalog/pg_proc.h | 6 +--
src/include/nodes/replnodes.h | 1 +
src/include/replication/slot.h | 4 +-
src/test/regress/expected/rules.out | 3 +-
18 files changed, 237 insertions(+), 51 deletions(-)
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
On Mon, Dec 12, 2016 at 11:16 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
Add support for temporary replication slots
This allows creating temporary replication slots that are removed
automatically at the end of the session or on error.From: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Branch
------
masterDetails
-------
http://git.postgresql.org/pg/commitdiff/a924c327e2793d2025b19e18de7917110dc8afd8Modified Files
--------------
contrib/test_decoding/Makefile | 2 +-
contrib/test_decoding/expected/ddl.out | 4 +-
contrib/test_decoding/expected/slot.out | 58 +++++++++++++++++++++++++++
contrib/test_decoding/sql/slot.sql | 20 ++++++++++
doc/src/sgml/func.sgml | 16 ++++++--
doc/src/sgml/protocol.sgml | 13 ++++++-
src/backend/catalog/system_views.sql | 11 ++++++
src/backend/replication/repl_gram.y | 22 +++++++----
src/backend/replication/repl_scanner.l | 1 +
src/backend/replication/slot.c | 69 ++++++++++++++++++++++++++-------
src/backend/replication/slotfuncs.c | 24 ++++++++----
src/backend/replication/walsender.c | 28 +++++++------
src/backend/storage/lmgr/proc.c | 3 ++
src/backend/tcop/postgres.c | 3 ++
src/include/catalog/pg_proc.h | 6 +--
src/include/nodes/replnodes.h | 1 +
src/include/replication/slot.h | 4 +-
src/test/regress/expected/rules.out | 3 +-
18 files changed, 237 insertions(+), 51 deletions(-)
Doesn't this need catversion bump?
Regards,
--
Fujii Masao
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Peter Eisentraut <peter_e@gmx.net> writes:
Add support for temporary replication slots
Some of the slower buildfarm members are failing test-decoding-check since
this went in. At least on prairiedog, it looks like a race condition in
the test:
** /Users/buildfarm/bf-data/HEAD/pgsql.build/contrib/test_decoding/expected/slot.out Mon Dec 12 09:24:32 2016
--- /Users/buildfarm/bf-data/HEAD/pgsql.build/contrib/test_decoding/./regression_output/results/slot.out Mon Dec 12 10:01:31 2016
***************
*** 32,38 ****
-- should fail because the temporary slot was dropped automatically
SELECT pg_drop_replication_slot('regression_slot_t');
! ERROR: replication slot "regression_slot_t" does not exist
-- test switching between slots in a session
SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot1', 'test_decoding', true);
?column?
--- 32,38 ----
-- should fail because the temporary slot was dropped automatically
SELECT pg_drop_replication_slot('regression_slot_t');
! ERROR: replication slot "regression_slot_t" is active for PID 17615
-- test switching between slots in a session
SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot1', 'test_decoding', true);
?column?
regards, tom lane
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
I wrote:
Peter Eisentraut <peter_e@gmx.net> writes:
Add support for temporary replication slots
Some of the slower buildfarm members are failing test-decoding-check since
this went in. At least on prairiedog, it looks like a race condition in
the test:
Having now looked a bit more closely, that's almost certainly what it is:
the test is assuming that the old backend exits instantaneously, which
it doesn't. You might want to borrow the wait-for-previous-session-to-die
loop I put into src/test/modules/test_extensions/sql/test_extensions.sql
recently. (Make sure you get the fixed version ;-))
regards, tom lane
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
On 12/12/16 16:55, Tom Lane wrote:
I wrote:
Peter Eisentraut <peter_e@gmx.net> writes:
Add support for temporary replication slots
Some of the slower buildfarm members are failing test-decoding-check since
this went in. At least on prairiedog, it looks like a race condition in
the test:Having now looked a bit more closely, that's almost certainly what it is:
the test is assuming that the old backend exits instantaneously, which
it doesn't. You might want to borrow the wait-for-previous-session-to-die
loop I put into src/test/modules/test_extensions/sql/test_extensions.sql
recently. (Make sure you get the fixed version ;-))
Yes you are correct, we tried naively to first drop another slot in
hopes that it will give the other backend enough time to close, but
apparently that wasn't robust enough for slower machines.
Attached is the fixed test using the loop from test_extensions (and yes
I would have missed the pg_stat_clear_snapshot() call without the hint,
thanks :) ).
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
0001-Fix-the-test_decoding-slot-test-race-condition.patchapplication/x-patch; name=0001-Fix-the-test_decoding-slot-test-race-condition.patchDownload+26-7
Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
Yes you are correct, we tried naively to first drop another slot in
hopes that it will give the other backend enough time to close, but
apparently that wasn't robust enough for slower machines.
Attached is the fixed test using the loop from test_extensions (and yes
I would have missed the pg_stat_clear_snapshot() call without the hint,
thanks :) ).
Pushed, thanks.
regards, tom lane
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Fujii Masao <masao.fujii@gmail.com> writes:
On Mon, Dec 12, 2016 at 11:16 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
Add support for temporary replication slots
Doesn't this need catversion bump?
Yes, absolutely, because of the ABI break for the affected functions.
Pushed.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
I wrote:
Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
Attached is the fixed test using the loop from test_extensions (and yes
I would have missed the pg_stat_clear_snapshot() call without the hint,
thanks :) ).
Pushed, thanks.
Hm, buildfarm says this didn't fix it. Where exactly does the dropping
of the slot happen ... is it not complete by the time the backend exits?
regards, tom lane
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
On 13/12/16 00:39, Tom Lane wrote:
I wrote:
Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
Attached is the fixed test using the loop from test_extensions (and yes
I would have missed the pg_stat_clear_snapshot() call without the hint,
thanks :) ).Pushed, thanks.
Hm, buildfarm says this didn't fix it. Where exactly does the dropping
of the slot happen ... is it not complete by the time the backend exits?
Yes, I've seen. The cleanup of slots is done in ProcKill(), I wonder,
since it's on_shmem_exit hook, and the pgstat_beshutdown_hook is as
well, maybe the pgstat_beshutdown_hook is called before ProcKill and the
query is lucky to hit right in between.
I am not quite sure what would be the best way to work around that. We
could wait for the slot to disappear instead of trying to drop it, but
then the test would hang on failure.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
On 13/12/16 00:39, Tom Lane wrote:
Hm, buildfarm says this didn't fix it. Where exactly does the dropping
of the slot happen ... is it not complete by the time the backend exits?
Yes, I've seen. The cleanup of slots is done in ProcKill(), I wonder,
since it's on_shmem_exit hook, and the pgstat_beshutdown_hook is as
well, maybe the pgstat_beshutdown_hook is called before ProcKill and the
query is lucky to hit right in between.
Hm. That seems like a pretty bogus place to do it. An awful lot of the
backend infrastructure is already gone by then, if I recall the ordering
correctly. Maybe ShutdownPostgres would be a saner place; but it really
depends on what you think the module layering is for this facility.
I would definitely not think it is proc.c's responsibility, though.
regards, tom lane
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
On 13/12/16 01:40, Tom Lane wrote:
Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
On 13/12/16 00:39, Tom Lane wrote:
Hm, buildfarm says this didn't fix it. Where exactly does the dropping
of the slot happen ... is it not complete by the time the backend exits?Yes, I've seen. The cleanup of slots is done in ProcKill(), I wonder,
since it's on_shmem_exit hook, and the pgstat_beshutdown_hook is as
well, maybe the pgstat_beshutdown_hook is called before ProcKill and the
query is lucky to hit right in between.Hm. That seems like a pretty bogus place to do it. An awful lot of the
backend infrastructure is already gone by then, if I recall the ordering
correctly. Maybe ShutdownPostgres would be a saner place; but it really
depends on what you think the module layering is for this facility.
I would definitely not think it is proc.c's responsibility, though.
Well, the problem is that that's the place where the currently active
slot is released (if there was an active one). So we'd need to move that
part somewhere else as well. I am not sure what's the reasoning for
releasing it at that specific spot so CCing Andres.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
On 2016-12-12 19:40:37 -0500, Tom Lane wrote:
Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
Yes, I've seen. The cleanup of slots is done in ProcKill(), I wonder,
since it's on_shmem_exit hook, and the pgstat_beshutdown_hook is as
well, maybe the pgstat_beshutdown_hook is called before ProcKill and the
query is lucky to hit right in between.Hm. That seems like a pretty bogus place to do it.
Well, that'd not be Pet[e]r's fault. Robert and I had some preexisting
slot cleanpucode there (which I'd like to consolidate btw).
backend infrastructure is already gone by then, if I recall the ordering
correctly. Maybe ShutdownPostgres would be a saner place; but it really
depends on what you think the module layering is for this facility.
I would definitely not think it is proc.c's responsibility, though.
Slots quite possibly can used by bgworkers, so I don't think
ShutdownPostgres would be right. I don't think there's a perfect place
for it atm, so it's ProcKill()...
- Andres
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
On 2016-12-13 01:42:48 +0100, Petr Jelinek wrote:
On 13/12/16 01:40, Tom Lane wrote:
Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
On 13/12/16 00:39, Tom Lane wrote:
Hm, buildfarm says this didn't fix it. Where exactly does the dropping
of the slot happen ... is it not complete by the time the backend exits?Yes, I've seen. The cleanup of slots is done in ProcKill(), I wonder,
since it's on_shmem_exit hook, and the pgstat_beshutdown_hook is as
well, maybe the pgstat_beshutdown_hook is called before ProcKill and the
query is lucky to hit right in between.Hm. That seems like a pretty bogus place to do it. An awful lot of the
backend infrastructure is already gone by then, if I recall the ordering
correctly. Maybe ShutdownPostgres would be a saner place; but it really
depends on what you think the module layering is for this facility.
I would definitely not think it is proc.c's responsibility, though.Well, the problem is that that's the place where the currently active
slot is released (if there was an active one). So we'd need to move that
part somewhere else as well. I am not sure what's the reasoning for
releasing it at that specific spot so CCing Andres.
Why don't we just instead make the loop over pg_replication_slots?
Andres
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
On 13/12/16 01:49, Andres Freund wrote:
On 2016-12-13 01:42:48 +0100, Petr Jelinek wrote:
On 13/12/16 01:40, Tom Lane wrote:
Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
On 13/12/16 00:39, Tom Lane wrote:
Hm, buildfarm says this didn't fix it. Where exactly does the dropping
of the slot happen ... is it not complete by the time the backend exits?Yes, I've seen. The cleanup of slots is done in ProcKill(), I wonder,
since it's on_shmem_exit hook, and the pgstat_beshutdown_hook is as
well, maybe the pgstat_beshutdown_hook is called before ProcKill and the
query is lucky to hit right in between.Hm. That seems like a pretty bogus place to do it. An awful lot of the
backend infrastructure is already gone by then, if I recall the ordering
correctly. Maybe ShutdownPostgres would be a saner place; but it really
depends on what you think the module layering is for this facility.
I would definitely not think it is proc.c's responsibility, though.Well, the problem is that that's the place where the currently active
slot is released (if there was an active one). So we'd need to move that
part somewhere else as well. I am not sure what's the reasoning for
releasing it at that specific spot so CCing Andres.Why don't we just instead make the loop over pg_replication_slots?
I mentioned that as possible solution upthread, I am only worried that
the failure scenario is basically infinite loop.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
On 2016-12-13 01:57:01 +0100, Petr Jelinek wrote:
I mentioned that as possible solution upthread, I am only worried that
the failure scenario is basically infinite loop.
I don't see the problem with that. If you're really concerned you can
set a statement timeout.
Andres
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Andres Freund <andres@anarazel.de> writes:
On 2016-12-13 01:57:01 +0100, Petr Jelinek wrote:
I mentioned that as possible solution upthread, I am only worried that
the failure scenario is basically infinite loop.
I don't see the problem with that. If you're really concerned you can
set a statement timeout.
I don't think "change the test" is the right response here.
I think the problem is that we're disconnecting from the slot at the
wrong step of backend shutdown, and that we need to fix that before
it bites us on some more painful parts of our anatomies. You can't
just throw darts at the code when deciding where to do things, and
proc.c is NOT the place that should be concerned with replication
slots.
regards, tom lane
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
On 2016-12-12 20:08:01 -0500, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
On 2016-12-13 01:57:01 +0100, Petr Jelinek wrote:
I mentioned that as possible solution upthread, I am only worried that
the failure scenario is basically infinite loop.I don't see the problem with that. If you're really concerned you can
set a statement timeout.I don't think "change the test" is the right response here.
I think the problem is that we're disconnecting from the slot at the
wrong step of backend shutdown, and that we need to fix that before
it bites us on some more painful parts of our anatomies. You can't
just throw darts at the code when deciding where to do things, and
proc.c is NOT the place that should be concerned with replication
slots.
And why is that / where would be more appropriate? It's not
ShutdownPostgres() (usable from bgworkers). And it has to be
after LWLockReleaseAll(), because otherwise we'll potentially just
deadlock against ourselves.
Greetings,
Andres Freund
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
On 13/12/16 02:08, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
On 2016-12-13 01:57:01 +0100, Petr Jelinek wrote:
I mentioned that as possible solution upthread, I am only worried that
the failure scenario is basically infinite loop.I don't see the problem with that. If you're really concerned you can
set a statement timeout.I don't think "change the test" is the right response here.
I think the problem is that we're disconnecting from the slot at the
wrong step of backend shutdown, and that we need to fix that before
it bites us on some more painful parts of our anatomies. You can't
just throw darts at the code when deciding where to do things, and
proc.c is NOT the place that should be concerned with replication
slots.
It has been concerned with replication slots since 9.4 so it's not like
it's newly invented concern. As Andres says, we don't have better place
to put it to at the moment.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
On 13/12/16 02:00, Andres Freund wrote:
On 2016-12-13 01:57:01 +0100, Petr Jelinek wrote:
I mentioned that as possible solution upthread, I am only worried that
the failure scenario is basically infinite loop.I don't see the problem with that. If you're really concerned you can
set a statement timeout.
Okay in case we decide it's the right way to go attached patch does
that. I also added some more tests based on your feedback while I am at it.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
0001-Various-temporary-slots-test-improvements.patchtext/x-diff; name=0001-Various-temporary-slots-test-improvements.patchDownload+71-13
On 12/12/16 8:28 PM, Petr Jelinek wrote:
On 13/12/16 02:00, Andres Freund wrote:
On 2016-12-13 01:57:01 +0100, Petr Jelinek wrote:
I mentioned that as possible solution upthread, I am only worried that
the failure scenario is basically infinite loop.I don't see the problem with that. If you're really concerned you can
set a statement timeout.Okay in case we decide it's the right way to go attached patch does
that. I also added some more tests based on your feedback while I am at it.
This looks mostly reasonable to me, but the location and xid output from
pg_logical_slot_get_changes() is not portable/repeatable, so it should
be omitted from the output.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers