Logical decoding without slots: decoding in lockstep with recovery

Started by Craig Ringer about 5 years ago, 5 messages
#1 Craig Ringer
craig.ringer@enterprisedb.com

Hi all

I want to share an idea I've looked at a few times where I've run into
situations where logical slots were inadvertently dropped, or where it
became necessary to decode changes in the past on a slot.

As most of you will know you can't just create a logical slot in the past.
Even if it was permitted, it'd be unsafe due to catalog_xmin retention
requirements and missing WAL.

But if we can arrange a physical replica to replay the WAL of interest and
decode each commit as soon as it's replayed by the startup process, we know
the needed catalog rows must all exist, so it's safe to decode the change.

So it should be feasible to run logical decoding on a standby, even without a
replication slot, so long as we:

* pause the startup process after each xl_xact_commit
* wake the walsender running logical decoding
* decode and process until ReorderBufferCommit for the just-committed xact
  returns
* wake the startup process to replay up to the next commit
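
As a rough illustration of the handoff in the list above, here is a toy Python sketch using two threads and a condition variable. It is not PostgreSQL code: the `Lockstep` class, its method names, and the `(kind, xid)` record tuples are all made up for the example, and "decoding" is reduced to appending an entry to a log. It only shows the pause/wake ordering.

```python
import threading


class Lockstep:
    """Toy handoff between a replay ("startup") thread and a decode ("walsender") thread."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pending_commit = None  # xid of a replayed commit not yet decoded
        self._done = False           # set once replay has no more records

    def commit_replayed(self, xid):
        """Replay side: announce a commit, then pause until it has been decoded."""
        with self._cond:
            self._pending_commit = xid
            self._cond.notify_all()
            self._cond.wait_for(lambda: self._pending_commit is None)

    def finish(self):
        """Replay side: signal that there is nothing more to replay."""
        with self._cond:
            self._done = True
            self._cond.notify_all()

    def next_commit(self):
        """Decode side: block until a commit is pending; None means replay finished."""
        with self._cond:
            self._cond.wait_for(lambda: self._pending_commit is not None or self._done)
            return self._pending_commit

    def commit_decoded(self):
        """Decode side: report completion, which wakes the replay side."""
        with self._cond:
            self._pending_commit = None
            self._cond.notify_all()


def run(wal):
    """Replay `wal` (a list of (kind, xid) tuples) in lockstep; return the event log."""
    log = []
    ls = Lockstep()

    def startup():
        for kind, xid in wal:
            log.append(("replay", kind, xid))
            if kind == "commit":
                ls.commit_replayed(xid)  # paused here until the decoder is done
        ls.finish()

    def walsender():
        while True:
            xid = ls.next_commit()
            if xid is None:
                return
            log.append(("decode", "commit", xid))  # stand-in for real decoding
            ls.commit_decoded()

    threads = [threading.Thread(target=startup), threading.Thread(target=walsender)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Calling `run([("insert", 1), ("commit", 1), ("insert", 2), ("commit", 2)])` yields a log in which each commit is decoded before the next record is replayed.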

Can anyone see any obvious problem with this?

I don't think the potential issues with WAL commit visibility order vs
shmem commit visibility order should be a concern.

I see this as potentially useful in data recovery, where you might want to
be able to extract a change stream for a subset of tables from PITR
recovery, for example. Also for audit use.

#2 Amit Kapila
amit.kapila16@gmail.com
In reply to: Craig Ringer (#1)
Re: Logical decoding without slots: decoding in lockstep with recovery

On Wed, Dec 23, 2020 at 12:26 PM Craig Ringer
<craig.ringer@enterprisedb.com> wrote:

> Hi all
>
> I want to share an idea I've looked at a few times where I've run into
> situations where logical slots were inadvertently dropped, or where it
> became necessary to decode changes in the past on a slot.
>
> As most of you will know you can't just create a logical slot in the past.
> Even if it was permitted, it'd be unsafe due to catalog_xmin retention
> requirements and missing WAL.
>
> But if we can arrange a physical replica to replay the WAL of interest and
> decode each commit as soon as it's replayed by the startup process, we know
> the needed catalog rows must all exist, so it's safe to decode the change.
>
> So it should be feasible to run logical decoding in standby, even without a
> replication slot, so long as we:
>
> * pause startup process after each xl_xact_commit
> * wake the walsender running logical decoding
> * decode and process until ReorderBufferCommit for the just-committed xact
> returns
> * wake the startup process to decode the up to the next commit

How will you deal with subscriber restart? I think you need some way
to remember confirmed_flush_lsn and restart_lsn and then need to teach
WAL machinery to restart from some previous point.
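
To make the restart concern concrete, here is a minimal Python sketch of the kind of state a slot normally keeps and that a slot-less scheme would have to persist somewhere else. The `DecodeProgress` class, the JSON file, and the integer LSNs are assumptions made for illustration; only the two field names come from the message above. `confirmed_flush_lsn` is the position up to which the subscriber has confirmed receipt, and `restart_lsn` is the oldest WAL position decoding would need to start reading from again.

```python
import json
import os


class DecodeProgress:
    """Hypothetical on-disk progress record standing in for a slot's LSN fields."""

    def __init__(self, path):
        self.path = path
        self.restart_lsn = 0
        self.confirmed_flush_lsn = 0
        if os.path.exists(path):
            with open(path) as f:
                state = json.load(f)
            self.restart_lsn = state["restart_lsn"]
            self.confirmed_flush_lsn = state["confirmed_flush_lsn"]

    def confirm(self, flush_lsn, restart_lsn):
        """Record subscriber feedback; positions never move backwards."""
        self.confirmed_flush_lsn = max(self.confirmed_flush_lsn, flush_lsn)
        self.restart_lsn = max(self.restart_lsn, restart_lsn)
        self._save()

    def should_send(self, commit_lsn):
        """After a restart, skip transactions the subscriber already confirmed."""
        return commit_lsn > self.confirmed_flush_lsn

    def _save(self):
        # Write to a temporary file, fsync it, then rename over the old file so
        # that a crash leaves either the old or the new state, never a torn one.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(
                {
                    "restart_lsn": self.restart_lsn,
                    "confirmed_flush_lsn": self.confirmed_flush_lsn,
                },
                f,
            )
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)
```

On restart, a new `DecodeProgress(path)` reloads both positions. The harder part raised above, getting recovery to begin again from `restart_lsn`, is not modelled here.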

--
With Regards,
Amit Kapila.

#3 Craig Ringer
craig.ringer@enterprisedb.com
In reply to: Amit Kapila (#2)
Re: Logical decoding without slots: decoding in lockstep with recovery

On Wed, 23 Dec 2020, 18:57 Amit Kapila, <amit.kapila16@gmail.com> wrote:

> On Wed, Dec 23, 2020 at 12:26 PM Craig Ringer
> <craig.ringer@enterprisedb.com> wrote:
>> Hi all
>>
>> I want to share an idea I've looked at a few times where I've run into
>> situations where logical slots were inadvertently dropped, or where it
>> became necessary to decode changes in the past on a slot.
>>
>> As most of you will know you can't just create a logical slot in the
>> past. Even if it was permitted, it'd be unsafe due to catalog_xmin
>> retention requirements and missing WAL.
>>
>> But if we can arrange a physical replica to replay the WAL of interest
>> and decode each commit as soon as it's replayed by the startup process, we
>> know the needed catalog rows must all exist, so it's safe to decode the
>> change.
>>
>> So it should be feasible to run logical decoding in standby, even
>> without a replication slot, so long as we:
>>
>> * pause startup process after each xl_xact_commit
>> * wake the walsender running logical decoding
>> * decode and process until ReorderBufferCommit for the just-committed
>> xact returns
>> * wake the startup process to decode the up to the next commit
>
> How will you deal with subscriber restart? I think you need some way
> to remember confirmed_flush_lsn and restart_lsn and then need to teach
> WAL machinery to restart from some previous point.

The simplest option, albeit slow, would be to require the subscriber to
confirm flush before allowing more WAL redo. That's what I'd initially
assumed. This is a bit of a corner-case situation after all; it's never
going to be fast with the switching back and forth.

More efficient would be to decode and output to a local spool file and
fsync it, then separately send that to the subscriber, as has been
discussed in other work on more efficient logical decoding. So the output
plugin's output would go to a local spool.

That can be implemented with pg_recvlogical if needed.
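
A minimal sketch of the spool idea in Python, under the assumption that each decoded transaction is appended as one length-prefixed entry and fsynced before replay is allowed to continue. The function names and the entry layout (a hex commit LSN and a byte count on a header line, then the payload) are invented for the example. A real spool would also need to detect a torn final entry after a crash, which this does not.

```python
import os


def spool_commit(spool_path, commit_lsn, payload):
    """Append one decoded transaction and fsync it before returning."""
    header = f"{commit_lsn:016X} {len(payload)}\n".encode("ascii")
    with open(spool_path, "ab") as f:
        f.write(header)
        f.write(payload)
        f.write(b"\n")
        f.flush()
        os.fsync(f.fileno())  # durable locally; replay may now continue


def read_spool(spool_path, after_lsn=0):
    """Yield (commit_lsn, payload) for every entry with commit_lsn > after_lsn."""
    with open(spool_path, "rb") as f:
        while True:
            header = f.readline()
            if not header:
                return
            lsn_hex, length = header.split()
            payload = f.read(int(length))
            f.read(1)  # consume the newline that terminates the entry
            lsn = int(lsn_hex, 16)
            if lsn > after_lsn:
                yield lsn, payload
```

A separate sender can then stream `read_spool(path, after_lsn=last_confirmed)` to the subscriber at its own pace, so a slow or restarting subscriber no longer holds up WAL redo.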


#4 Andres Freund
andres@anarazel.de
In reply to: Craig Ringer (#1)
Re: Logical decoding without slots: decoding in lockstep with recovery

Hi,

On 2020-12-23 14:56:07 +0800, Craig Ringer wrote:

> I want to share an idea I've looked at a few times where I've run into
> situations where logical slots were inadvertently dropped, or where it
> became necessary to decode changes in the past on a slot.
>
> As most of you will know you can't just create a logical slot in the past.
> Even if it was permitted, it'd be unsafe due to catalog_xmin retention
> requirements and missing WAL.
>
> But if we can arrange a physical replica to replay the WAL of interest and
> decode each commit as soon as it's replayed by the startup process, we know
> the needed catalog rows must all exist, so it's safe to decode the change.
>
> So it should be feasible to run logical decoding in standby, even without a
> replication slot, so long as we:
>
> * pause startup process after each xl_xact_commit
> * wake the walsender running logical decoding
> * decode and process until ReorderBufferCommit for the just-committed xact
> returns
> * wake the startup process to decode the up to the next commit

I don't think it's safe to just do this for each xl_xact_commit - we can
remove needed rows at quite a few places, not just around transaction
commit. Rows needed to correctly decode rows earlier in the transaction
might not be available by the time the commit record was logged.

I think you'd basically have to run logical decoding in lockstep with
WAL replay, i.e. replay one record, then call logical decoding for that
record, replay the next record, ...
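
The difference between the two granularities can be shown with a deliberately simplified Python model. Everything in it is an assumption for illustration: the "catalog" is a dict from table OID to column names, WAL records are tuples, a `catalog_remove` record stands in for whatever cleanup removes a catalog row, and "decoding" a change means looking up its column names at that moment. The real reorder buffer does not work this way; the sketch only illustrates why waiting until the commit record can be too late when nothing holds back catalog cleanup.

```python
def replay(catalog, rec):
    """Apply one toy WAL record to the toy catalog (table OID -> column names)."""
    kind = rec[0]
    if kind == "catalog_set":
        _, oid, cols = rec
        catalog[oid] = cols
    elif kind == "catalog_remove":
        catalog.pop(rec[1], None)
    # "change" and "commit" records do not touch the catalog in this model


def decode_change(catalog, rec):
    """Interpret a change using the catalog as it is right now."""
    _, xid, oid, values = rec
    cols = catalog[oid]  # raises KeyError if the needed catalog row is gone
    return xid, dict(zip(cols, values))


def decode_at_commit(wal):
    """Replay freely; decode a transaction's changes only once its commit is replayed."""
    catalog, pending, out = {}, {}, []
    for rec in wal:
        replay(catalog, rec)
        if rec[0] == "change":
            pending.setdefault(rec[1], []).append(rec)
        elif rec[0] == "commit":
            for change in pending.pop(rec[1], []):
                out.append(decode_change(catalog, change))
    return out


def decode_per_record(wal):
    """Replay one record, decode that record, then move on to the next."""
    catalog, pending, out = {}, {}, []
    for rec in wal:
        replay(catalog, rec)
        if rec[0] == "change":
            pending.setdefault(rec[1], []).append(decode_change(catalog, rec))
        elif rec[0] == "commit":
            out.extend(pending.pop(rec[1], []))
    return out


# The catalog row for table 16384 is removed after the change but before the commit.
EXAMPLE_WAL = [
    ("catalog_set", 16384, ["id", "name"]),
    ("change", 700, 16384, [1, "a"]),
    ("catalog_remove", 16384),
    ("commit", 700),
]
```

With `EXAMPLE_WAL`, `decode_per_record` returns the change for xid 700, while `decode_at_commit` raises `KeyError` because the row it needs has already been removed by the time the commit is replayed.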

> Can anyone see any obvious problem with this?

The patch for logical decoding on the standby
/messages/by-id/20181212204154.nsxf3gzqv3gesl32@alap3.anarazel.de
should provide some of the infrastructure to do this properly. Should
really commit it. /me adds an entry to the top of the todo list.

Greetings,

Andres Freund

#5 Craig Ringer
craig.ringer@enterprisedb.com
In reply to: Andres Freund (#4)
Re: Logical decoding without slots: decoding in lockstep with recovery

On Sat, 26 Dec 2020 at 06:51, Andres Freund <andres@anarazel.de> wrote:

> Hi,
>
> On 2020-12-23 14:56:07 +0800, Craig Ringer wrote:
>> I want to share an idea I've looked at a few times where I've run into
>> situations where logical slots were inadvertently dropped, or where it
>> became necessary to decode changes in the past on a slot.
>>
>> As most of you will know you can't just create a logical slot in the
>> past. Even if it was permitted, it'd be unsafe due to catalog_xmin
>> retention requirements and missing WAL.
>>
>> But if we can arrange a physical replica to replay the WAL of interest
>> and decode each commit as soon as it's replayed by the startup process,
>> we know the needed catalog rows must all exist, so it's safe to decode
>> the change.
>>
>> So it should be feasible to run logical decoding in standby, even
>> without a replication slot, so long as we:
>>
>> * pause startup process after each xl_xact_commit
>> * wake the walsender running logical decoding
>> * decode and process until ReorderBufferCommit for the just-committed
>> xact returns
>> * wake the startup process to decode the up to the next commit
>
> I don't think it's safe to just do this for each xl_xact_commit - we can
> remove needed rows at quite a few places, not just around transaction
> commit.

Good point.

I vaguely recall spotting a possible decoding-on-standby issue with eager
removal of rows that are still ahead of the global xmin, if the primary
"knows" they can't be needed based on info about currently running backends.
But when looking over code related to HOT, visibility, and vacuum now I can't
for the life of me remember exactly what it was or find it. Hopefully I
just misunderstood at the time or was getting confused between decoding on
standby and xact streaming.

> Rows needed to correctly decode rows earlier in the transaction
> might not be available by the time the commit record was logged.

When can that happen?

> I think you'd basically have to run logical decoding in lockstep with
> WAL replay, i.e. replay one record, then call logical decoding for that
> record, replay the next record, ...

That sounds likely to be unusably slow. The only way I can see it having
any hope of moving at a reasonable rate would be to run a decoding session
inside the startup process itself so we don't have to switch back/forth for
each record. But I imagine that'd probably cause a whole other set of
problems.

>> Can anyone see any obvious problem with this?

> The patch for logical decoding on the standby
> /messages/by-id/20181212204154.nsxf3gzqv3gesl32@alap3.anarazel.de
> should provide some of the infrastructure to do this properly. Should
> really commit it. /me adds an entry to the top of the todo list.

That would certainly be helpful for quite a number of things.