Hot standby, running xacts, subtransactions

Started by Heikki Linnakangas, almost 17 years ago, 9 messages
#1 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com

When we take the snapshot of running transactions in the master, in
GetRunningTransactionData(), it only includes top-level xids and those
subxids that are in the subxid caches. Overflowed subxids are not
included. Isn't that a problem? When the standby initializes the
recovery procs using the running xacts information, pg_subtrans
isn't set for the overflowed xids, because that information is not
included in the WAL record. If you're lucky, the information is there
already, but we don't generally guarantee pg_subtrans to survive crash
or restart.
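
To make the concern concrete, here is a toy Python model of the situation. It is only a sketch, not PostgreSQL's actual code: it assumes a per-backend subxid cache of 64 entries (the limit mentioned later in this thread), and the names Backend and running_xacts_snapshot are hypothetical.

```python
from dataclasses import dataclass, field

MAX_CACHED_SUBXIDS = 64  # assumed per-backend subxid cache size


@dataclass
class Backend:
    """Hypothetical stand-in for one backend's transaction state."""
    top_xid: int
    cached_subxids: list = field(default_factory=list)
    overflowed: bool = False

    def assign_subxid(self, xid: int) -> None:
        # Once the cache is full, later subxids are only recorded in
        # pg_subtrans, which this toy model does not carry in the snapshot.
        if len(self.cached_subxids) < MAX_CACHED_SUBXIDS:
            self.cached_subxids.append(xid)
        else:
            self.overflowed = True


def running_xacts_snapshot(backends):
    """Collect what the toy 'running xacts' record would carry."""
    xids = []
    for b in backends:
        xids.append(b.top_xid)
        xids.extend(b.cached_subxids)
    return {"xids": xids, "overflowed": any(b.overflowed for b in backends)}


backend = Backend(top_xid=1000)
for sub in range(1001, 1101):  # 100 subtransactions, 36 more than the cache holds
    backend.assign_subxid(sub)

snap = running_xacts_snapshot([backend])
# Subxids that are running but absent from the snapshot.
missing = set(range(1001, 1101)) - set(snap["xids"])
print(len(snap["xids"]), snap["overflowed"], len(missing))
```

In this model the snapshot carries 65 xids and 36 running subxids are absent from it, which is the gap the standby cannot fill from the WAL record alone.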

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#2 Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#1)
Re: Hot standby, running xacts, subtransactions

On Wed, 2009-02-25 at 22:39 +0200, Heikki Linnakangas wrote:

When we take the snapshot of running transactions in the master, in
GetRunningTransactionData(), it only includes top-level xids and those
subxids that are in the subxid caches. Overflowed subxids are not
included. Isn't that a problem? When the standby initializes the
recovery procs using the running xacts information, pg_subtrans
isn't set for the overflowed xids, because that information is not
included in the WAL record. If you're lucky, the information is there
already, but we don't generally guarantee pg_subtrans to survive crash
or restart.

That is exactly the reason why we don't treat an overflowed snapshot as
a valid starting point.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#3 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Simon Riggs (#2)
Re: Hot standby, running xacts, subtransactions

Simon Riggs wrote:

On Wed, 2009-02-25 at 22:39 +0200, Heikki Linnakangas wrote:

When we take the snapshot of running transactions in the master, in
GetRunningTransactionData(), it only includes top-level xids and those
subxids that are in the subxid caches. Overflowed subxids are not
included. Isn't that a problem? When the standby initializes the
recovery procs using the running xacts information, pg_subtrans
isn't set for the overflowed xids, because that information is not
included in the WAL record. If you're lucky, the information is there
already, but we don't generally guarantee pg_subtrans to survive crash
or restart.

That is exactly the reason why we don't treat an overflowed snapshot as
a valid starting point.

We don't? I don't see anything stopping it.


#4 Josh Berkus
josh@agliodbs.com
In reply to: Heikki Linnakangas (#1)
Re: Hot standby, running xacts, subtransactions

You raised that as an annoyance previously because it means that
connection in hot standby mode may be delayed in cases of heavy,
repeated use of significant numbers of subtransactions.

While most users still don't use explicit subtransactions at all,
wouldn't this also affect users who use large numbers of stored procedures?

--Josh Berkus

#5 Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#3)
Re: Hot standby, running xacts, subtransactions

On Wed, 2009-02-25 at 23:08 +0200, Heikki Linnakangas wrote:

That is exactly the reason why we don't treat an overflowed snapshot as
a valid starting point.

We don't? I don't see anything stopping it.

In GetRunningTransactionData() we explicitly set latestRunningXid to
InvalidTransactionId if the snapshot is overflowed.

That prevents the snapshot from being used to initialise the recovery
procs. I'll document that better.
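
The guard described above can be sketched roughly as follows. This is an illustrative Python model rather than the patch itself: make_running_xacts and can_initialise_recovery_procs are hypothetical names, and only the idea of marking an overflowed snapshot with InvalidTransactionId comes from this message.

```python
INVALID_TRANSACTION_ID = 0  # value of PostgreSQL's InvalidTransactionId


def make_running_xacts(xids, overflowed):
    """Build a toy running-xacts record on the master side."""
    if overflowed:
        # An overflowed snapshot is marked as unusable for initialisation.
        latest = INVALID_TRANSACTION_ID
    else:
        # Plain max() ignores xid wraparound; good enough for a sketch.
        latest = max(xids, default=INVALID_TRANSACTION_ID)
    return {"xids": list(xids), "latest_running_xid": latest}


def can_initialise_recovery_procs(record):
    """Standby side: refuse to start from a record marked invalid."""
    return record["latest_running_xid"] != INVALID_TRANSACTION_ID


complete = make_running_xacts([1000, 1001, 1002], overflowed=False)
partial = make_running_xacts([1000, 1001, 1002], overflowed=True)
print(can_initialise_recovery_procs(complete), can_initialise_recovery_procs(partial))
```

The standby in this sketch simply skips the overflowed record and waits for a later one, which is where the connection delay discussed below comes from.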

You raised that as an annoyance previously because it means that
connection in hot standby mode may be delayed in cases of heavy,
repeated use of significant numbers of subtransactions. My answer was
that there is a way to avoid that but it complicates things and I'm
trying my best to avoid complexity in the first release, yet still have
it work (this decade :-))


#6 Simon Riggs
simon@2ndQuadrant.com
In reply to: Josh Berkus (#4)
Re: Hot standby, running xacts, subtransactions

On Wed, 2009-02-25 at 13:33 -0800, Josh Berkus wrote:

You raised that as an annoyance previously because it means that
connection in hot standby mode may be delayed in cases of heavy,
repeated use of significant numbers of subtransactions.

While most users still don't use explicit subtransactions at all,
wouldn't this also affect users who use large numbers of stored procedures?

If they regularly use more than 64 levels of nested EXCEPTION clauses
*and* they start their base backups during heavy usage of those stored
procedures, then yes.


#7 Robert Treat
xzilla@users.sourceforge.net
In reply to: Simon Riggs (#6)
Re: Hot standby, running xacts, subtransactions

On Wednesday 25 February 2009 16:43:54 Simon Riggs wrote:

On Wed, 2009-02-25 at 13:33 -0800, Josh Berkus wrote:

You raised that as an annoyance previously because it means that
connection in hot standby mode may be delayed in cases of heavy,
repeated use of significant numbers of subtransactions.

While most users still don't use explicit subtransactions at all,
wouldn't this also affect users who use large numbers of stored
procedures?

If they regularly use more than 64 levels of nested EXCEPTION clauses
*and* they start their base backups during heavy usage of those stored
procedures, then yes.

We have stored procedures that loop over thousands of records, with
begin...exception blocks in that loop, so I think we do that. AFAICT there's
no way to tell if you have it wrong until you fire up the standby (i.e. you
can't tell at the time you make your base backup), right?
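
A rough way to see why such a loop could matter is the count below. It is a sketch under stated assumptions, not a statement about any particular workload: it assumes every iteration's begin...exception block writes and so gets its own subxid, that those subxids stay cached until the top-level transaction ends, and that the cache holds 64 entries. The function name overflows is hypothetical.

```python
MAX_CACHED_SUBXIDS = 64  # assumed per-backend subxid cache size


def overflows(rows_written: int) -> bool:
    """True if one subtransaction per row would exceed the assumed cache.

    Assumption: each loop iteration is assigned its own subxid and that
    subxid remains cached until the top-level transaction finishes.
    """
    return rows_written > MAX_CACHED_SUBXIDS


for rows in (10, 64, 65, 5000):
    print(rows, overflows(rows))
```

Under those assumptions, nesting depth is not what counts; a flat loop over a few thousand rows inside one transaction is already past the limit.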

--
Robert Treat
Conjecture: http://www.xzilla.net
Consulting: http://www.omniti.com

#8 Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Treat (#7)
Re: Hot standby, running xacts, subtransactions

On Mon, 2009-03-02 at 21:11 -0500, Robert Treat wrote:

On Wednesday 25 February 2009 16:43:54 Simon Riggs wrote:

On Wed, 2009-02-25 at 13:33 -0800, Josh Berkus wrote:

You raised that as an annoyance previously because it means that
connection in hot standby mode may be delayed in cases of heavy,
repeated use of significant numbers of subtransactions.

While most users still don't use explicit subtransactions at all,
wouldn't this also affect users who use large numbers of stored
procedures?

If they regularly use more than 64 levels of nested EXCEPTION clauses
*and* they start their base backups during heavy usage of those stored
procedures, then yes.

We have stored procedures that loop over thousands of records, with
begin...exception blocks in that loop, so I think we do that. AFAICT there's
no way to tell if you have it wrong until you fire up the standby (i.e. you
can't tell at the time you make your base backup), right?

That was supposed to be a simplification for phase one, not a barrier
for all time.

I'm changing that now, though the effect will be that in some cases we
take longer before we accept connections. The initialisation
requirements are that we have full knowledge of transactions in progress
before we allow snapshots to be taken.
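
One possible way to meet that requirement is sketched below. It is a hypothetical Python model and not necessarily what the patch does: the standby remembers the next xid from the first overflowed record and becomes ready once a later record shows that everything older has finished. SnapshotTracker, StandbyState and the record fields are invented names.

```python
from enum import Enum


class StandbyState(Enum):
    INITIALISING = "initialising"
    PENDING = "pending"  # saw an overflowed record, knowledge incomplete
    READY = "ready"      # snapshots may be taken


class SnapshotTracker:
    """Hypothetical standby-side tracker of running-xacts records."""

    def __init__(self):
        self.state = StandbyState.INITIALISING
        self.pending_next_xid = None

    def on_running_xacts(self, oldest_running_xid, next_xid, overflowed):
        if self.state is StandbyState.READY:
            return self.state
        if not overflowed:
            # The record lists every running xid, so knowledge is complete.
            self.state = StandbyState.READY
        elif (self.pending_next_xid is not None
              and oldest_running_xid >= self.pending_next_xid):
            # Every xact that was running at the first overflowed record
            # has ended (plain comparison, ignoring xid wraparound).
            self.state = StandbyState.READY
        else:
            if self.pending_next_xid is None:
                self.pending_next_xid = next_xid
            self.state = StandbyState.PENDING
        return self.state


tracker = SnapshotTracker()
first = tracker.on_running_xacts(900, 1100, overflowed=True)
second = tracker.on_running_xacts(1000, 1200, overflowed=True)
third = tracker.on_running_xacts(1100, 1300, overflowed=True)
print(first.value, second.value, third.value)
```

The trade-off is the one described above: with long-running transactions that keep overflowing, the tracker stays in the pending state longer before connections can be accepted.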


#9 Robert Treat
xzilla@users.sourceforge.net
In reply to: Simon Riggs (#8)
Re: Hot standby, running xacts, subtransactions

On Tuesday 03 March 2009 03:22:30 Simon Riggs wrote:

On Mon, 2009-03-02 at 21:11 -0500, Robert Treat wrote:

On Wednesday 25 February 2009 16:43:54 Simon Riggs wrote:

On Wed, 2009-02-25 at 13:33 -0800, Josh Berkus wrote:

You raised that as an annoyance previously because it means that
connection in hot standby mode may be delayed in cases of heavy,
repeated use of significant numbers of subtransactions.

While most users still don't use explicit subtransactions at all,
wouldn't this also affect users who use large numbers of stored
procedures?

If they regularly use more than 64 levels of nested EXCEPTION clauses
*and* they start their base backups during heavy usage of those stored
procedures, then yes.

We have stored procedures that loop over thousands of records, with
begin...exception blocks in that loop, so I think we do that. AFAICT
there's no way to tell if you have it wrong until you fire up the standby
(i.e. you can't tell at the time you make your base backup), right?

That was supposed to be a simplification for phase one, not a barrier
for all time.

Understood; I only mention it because it's usually good to know how quickly we
run into some of these cases that we don't think will be common.

I'm changing that now, though the effect will be that in some cases we
take longer before we accept connections. The initialisation
requirements are that we have full knowledge of transactions in progress
before we allow snapshots to be taken.

That seems pretty reasonable; hopefully people aren't setting up hot standby
machines as an emergency scaling technique :-)
