pgsql: Restrict the use of temporary namespace in two-phase transaction

Started by Michael Paquieralmost 7 years ago15 messages
#1Michael Paquier
michael@paquier.xyz

Restrict the use of temporary namespace in two-phase transactions

Attempting to use a temporary table within a two-phase transaction is
forbidden for ages. However, there have been uncovered grounds for
a couple of other object types and commands which work on temporary
objects with two-phase commit. In short, trying to create, lock or drop
an object on a temporary schema should not be authorized within a
two-phase transaction, as it would cause its state to create
dependencies with other sessions, causing all sorts of side effects with
the existing session or other sessions spawned later on trying to use
the same temporary schema name.

Regression tests are added to cover all the grounds found, the original
report mentioned function creation, but monitoring closer there are many
other patterns with LOCK, DROP or CREATE EXTENSION which are involved.
One of the symptoms resulting in combining both is that the session
which used the temporary schema is not able to shut down completely,
waiting for being able to drop the temporary schema, something that it
cannot complete because of the two-phase transaction involved with
temporary objects. In this case the client is able to disconnect but
the session remains alive on the backend-side, potentially blocking
connection backend slots from being used. Other problems reported could
also involve server crashes.

This is back-patched down to v10, which is where 9b013dc has introduced
MyXactFlags, something that this patch relies on.

Reported-by: Alexey Bashtanov
Author: Michael Paquier
Reviewed-by: Masahiko Sawada
Discussion: /messages/by-id/5d910e2e-0db8-ec06-dd5f-baec420513c3@imap.cc
Backpatch-through: 10

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/c5660e0aa52d5df27accd8e5e97295cf0e64f7d4

Modified Files
--------------
doc/src/sgml/ref/prepare_transaction.sgml | 6 +-
src/backend/access/transam/xact.c | 12 ++++
src/backend/catalog/namespace.c | 59 +++++++++++++-----
src/backend/commands/dropcmds.c | 8 +++
src/backend/commands/extension.c | 7 +++
src/backend/commands/lockcmds.c | 10 +++
src/include/access/xact.h | 5 ++
.../test_extensions/expected/test_extensions.out | 33 ++++++++++
.../test_extensions/sql/test_extensions.sql | 29 +++++++++
src/test/regress/expected/temp.out | 71 ++++++++++++++++++++++
src/test/regress/sql/temp.sql | 56 +++++++++++++++++
11 files changed, 278 insertions(+), 18 deletions(-)

#2Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#1)
Re: pgsql: Restrict the use of temporary namespace in two-phase transaction

On Fri, Jan 18, 2019 at 12:22:52AM +0000, Michael Paquier wrote:

Restrict the use of temporary namespace in two-phase transactions

Attempting to use a temporary table within a two-phase transaction is
forbidden for ages. However, there have been uncovered grounds for
a couple of other object types and commands which work on temporary
objects with two-phase commit. In short, trying to create, lock or drop
an object on a temporary schema should not be authorized within a
two-phase transaction, as it would cause its state to create
dependencies with other sessions, causing all sorts of side effects with
the existing session or other sessions spawned later on trying to use
the same temporary schema name.

I have been monitoring the buildfarm and crake is complaining:
https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=crake&br=HEAD

Here is the problem:
SET search_path TO 'pg_temp';
BEGIN;
SELECT current_schema() ~ 'pg_temp' AS is_temp_schema;
- is_temp_schema
-----------------
- t
-(1 row)
-
+ERROR:  cannot create temporary tables during a parallel operation
PREPARE TRANSACTION 'twophase_search';
-ERROR:  cannot PREPARE a transaction that has operated on temporary namespace

I am actually amazed to see the planner choose a parallel plan for
that, and the test can be fixed by enforcing those parameters I think:
SET max_parallel_workers = 0;
SET max_parallel_workers_per_gather = 0;
Could somebody confirm my assumption here by the way? This enforces a
non-parallel plan, right?

Anyway, it seems to me that this is pointing out to another issue:
current_schema() can trigger a namespace creation, hence shouldn't we
mark it as PARALLEL UNSAFE and make sure that we never run into this
problem?
--
Michael

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#2)
Re: pgsql: Restrict the use of temporary namespace in two-phase transaction

Michael Paquier <michael@paquier.xyz> writes:

I have been monitoring the buildfarm and crake is complaining:
https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=crake&amp;br=HEAD

I am actually amazed to see the planner choose a parallel plan for
that,

That's due to force_parallel_mode = regress, I imagine.

Anyway, it seems to me that this is pointing out to another issue:
current_schema() can trigger a namespace creation, hence shouldn't we
mark it as PARALLEL UNSAFE and make sure that we never run into this
problem?

That seems a bit annoying, but maybe we have little choice?

regards, tom lane

#4Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#3)
Re: pgsql: Restrict the use of temporary namespace in two-phase transaction

On Thu, Jan 17, 2019 at 8:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Anyway, it seems to me that this is pointing out to another issue:
current_schema() can trigger a namespace creation, hence shouldn't we
mark it as PARALLEL UNSAFE and make sure that we never run into this
problem?

That seems a bit annoying, but maybe we have little choice?

The only other option I see is to make current_schema() not trigger a
namespace creation.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#4)
Re: pgsql: Restrict the use of temporary namespace in two-phase transaction

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, Jan 17, 2019 at 8:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Anyway, it seems to me that this is pointing out to another issue:
current_schema() can trigger a namespace creation, hence shouldn't we
mark it as PARALLEL UNSAFE and make sure that we never run into this
problem?

That seems a bit annoying, but maybe we have little choice?

The only other option I see is to make current_schema() not trigger a
namespace creation.

Seems hard to avoid. We could conceivably make it return "pg_temp"
for the temp schema instead of the schema's actual name, but it's
not very hard to think of ways whereby that would make use of the
result fail in contexts where it previously worked.

Another idea is to force creation of the temp namespace as soon as
we see that search_path references it. I'm not very sure exactly
where would be a convenient place to make that happen, though.
There are paths whereby the GUC's value will change outside a
transaction, so we couldn't tie it directly to the GUC update.

regards, tom lane

#6Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#5)
Re: pgsql: Restrict the use of temporary namespace in two-phase transaction

On Fri, Jan 18, 2019 at 03:34:30PM -0500, Tom Lane wrote:

Seems hard to avoid. We could conceivably make it return "pg_temp"
for the temp schema instead of the schema's actual name, but it's
not very hard to think of ways whereby that would make use of the
result fail in contexts where it previously worked.

CREATE EXTENSION is one such case. It would not work if referring to
the synonym pg_temp, but it can work if using directly the temporary
namespace of the session. So I feel that changing such things is
prone to break more things than to actually fix things.

Another idea is to force creation of the temp namespace as soon as
we see that search_path references it. I'm not very sure exactly
where would be a convenient place to make that happen, though.
There are paths whereby the GUC's value will change outside a
transaction, so we couldn't tie it directly to the GUC update.

This is documented at the top of namespace.c: "initial GUC processing
of search_path happens outside a transaction".
--
Michael

#7Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#6)
Re: pgsql: Restrict the use of temporary namespace in two-phase transaction

On Sat, Jan 19, 2019 at 09:08:27AM +0900, Michael Paquier wrote:

On Fri, Jan 18, 2019 at 03:34:30PM -0500, Tom Lane wrote:

Seems hard to avoid. We could conceivably make it return "pg_temp"
for the temp schema instead of the schema's actual name, but it's
not very hard to think of ways whereby that would make use of the
result fail in contexts where it previously worked.

CREATE EXTENSION is one such case. It would not work if referring to
the synonym pg_temp, but it can work if using directly the temporary
namespace of the session. So I feel that changing such things is
prone to break more things than to actually fix things.

As long as I don't forget about it.. current_schema() is classified
as stable, so it's not like we can make it return pg_temp and then the
real temporary schema name within the same transaction...
--
Michael

#8Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#4)
Re: pgsql: Restrict the use of temporary namespace in two-phase transaction

On Sat, Jan 19, 2019 at 5:05 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Jan 17, 2019 at 8:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Anyway, it seems to me that this is pointing out to another issue:
current_schema() can trigger a namespace creation, hence shouldn't we
mark it as PARALLEL UNSAFE and make sure that we never run into this
problem?

That seems a bit annoying, but maybe we have little choice?

The only other option I see is to make current_schema() not trigger a
namespace creation.

Or can we make the test script set force_parallel_mode = off? Since
the failure case is a very rare in real world I think that it might be
better to change the test scripts rather than changing properly of
current_schema().

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#9Michael Paquier
michael@paquier.xyz
In reply to: Masahiko Sawada (#8)
Re: pgsql: Restrict the use of temporary namespace in two-phase transaction

On Tue, Jan 22, 2019 at 01:47:05PM +0900, Masahiko Sawada wrote:

Or can we make the test script set force_parallel_mode = off? Since
the failure case is a very rare in real world I think that it might be
better to change the test scripts rather than changing properly of
current_schema().

Please see 396676b, which is in my opinion a quick workaround to the
problem. Even if that's a rare case, it would be confusing to the
user to see it :(
--
Michael

#10Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Michael Paquier (#9)
Re: pgsql: Restrict the use of temporary namespace in two-phase transaction

On Tue, Jan 22, 2019 at 2:17 PM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Jan 22, 2019 at 01:47:05PM +0900, Masahiko Sawada wrote:

Or can we make the test script set force_parallel_mode = off? Since
the failure case is a very rare in real world I think that it might be
better to change the test scripts rather than changing properly of
current_schema().

Please see 396676b, which is in my opinion a quick workaround to the
problem.

Oops, sorry for the too late response. Thank you.

Even if that's a rare case, it would be confusing to the
user to see it :(

Indeed.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#11Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Michael Paquier (#1)
Re: pgsql: Restrict the use of temporary namespace in two-phase transaction

On 18/01/2019 01:22, Michael Paquier wrote:

Restrict the use of temporary namespace in two-phase transactions

We usually don't use "namespace" in user-facing error messages. Can you
change it to say "temporary schema"?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#12Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#11)
Re: pgsql: Restrict the use of temporary namespace in two-phase transaction

On Fri, Feb 08, 2019 at 10:41:59AM +0100, Peter Eisentraut wrote:

We usually don't use "namespace" in user-facing error messages. Can you
change it to say "temporary schema"?

Or just switch to "temporary objects" like it's done on HEAD for the
second message?

Please note that I have kept the error message for temporary tables
for compatibility reasons on stable branches, and I would rather not
touch that. The second one is new though.
--
Michael

#13Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Michael Paquier (#12)
Re: pgsql: Restrict the use of temporary namespace in two-phase transaction

On 08/02/2019 11:00, Michael Paquier wrote:

On Fri, Feb 08, 2019 at 10:41:59AM +0100, Peter Eisentraut wrote:

We usually don't use "namespace" in user-facing error messages. Can you
change it to say "temporary schema"?

Or just switch to "temporary objects" like it's done on HEAD for the
second message?

Yeah, even better.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#14Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Peter Eisentraut (#13)
Re: pgsql: Restrict the use of temporary namespace in two-phase transaction

On 09/02/2019 16:06, Peter Eisentraut wrote:

On 08/02/2019 11:00, Michael Paquier wrote:

On Fri, Feb 08, 2019 at 10:41:59AM +0100, Peter Eisentraut wrote:

We usually don't use "namespace" in user-facing error messages. Can you
change it to say "temporary schema"?

Or just switch to "temporary objects" like it's done on HEAD for the
second message?

Yeah, even better.

Committed that way.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#15Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#14)
Re: pgsql: Restrict the use of temporary namespace in two-phase transaction

On Mon, Feb 11, 2019 at 10:59:20AM +0100, Peter Eisentraut wrote:

Committed that way.

Thanks Peter for adjusting the message, I was just going to do it.
There was a long weekend here and I had zero access to my laptop,
explaining the delay in replying.
--
Michael