Broken defenses against dropping a partitioning column

Started by Tom Laneabout 7 years ago15 messageshackers

tgl@sss.pgh.pa.us

about 7 years ago

(Moved from pgsql-bugs thread at [1]/messages/by-id/CA+u7OA4JKCPFrdrAbOs7XBiCyD61XJxeNav4LefkSmBLQ-Vobg@mail.gmail.com)

Consider

regression=# create domain d1 as int;
CREATE DOMAIN
regression=# create table t1 (f1 d1) partition by range(f1);
CREATE TABLE
regression=# alter table t1 drop column f1;
ERROR: cannot drop column named in partition key

So far so good, but that defense has more holes than a hunk of
Swiss cheese:

regression=# drop domain d1 cascade;
psql: NOTICE: drop cascades to column f1 of table t1
DROP DOMAIN

Of course, the table is now utterly broken, e.g.

regression=# \d t1
psql: ERROR: cache lookup failed for type 0

(More-likely variants of this include dropping an extension that
defines the type of a partitioning column, or dropping the schema
containing such a type.)

The fix I was speculating about in the pgsql-bugs thread was to add
explicit pg_depend entries making the table's partitioning columns
internally dependent on the whole table (or maybe the other way around;
haven't experimented). That fix has a couple of problems though:

1. In the example, "drop domain d1 cascade" would automatically
cascade to the whole partitioned table, including child partitions
of course. This might leave a user sad, if a few terabytes of
valuable data went away; though one could argue that they'd better
have paid more attention to what the cascade cascaded to.

2. It doesn't fix anything for pre-existing tables in pre-v12 branches.

I thought of a different possible approach, which is to move the
"cannot drop column named in partition key" error check from
ATExecDropColumn(), where it is now, to RemoveAttributeById().
That would be back-patchable, but the implication would be that
dropping anything that a partitioning column depends on would be
impossible, even with CASCADE; you'd have to manually drop the
partitioned table first. Good for data safety, but a horrible
violation of expectations, and likely of the SQL spec as well.
I'm not sure we could avoid order-of-traversal problems, either.

Ideally, perhaps, a DROP CASCADE like this would not cascade to
the whole table but only to the table's partitioned-ness property,
leaving you with a non-partitioned table with most of its data
intact. It would take a lot of work to make that happen though,
and it certainly wouldn't be back-patchable, and I'm not really
sure it's worth it.

Thoughts?

regards, tom lane

[1]: /messages/by-id/CA+u7OA4JKCPFrdrAbOs7XBiCyD61XJxeNav4LefkSmBLQ-Vobg@mail.gmail.com

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

about 7 years ago

In reply to: Tom Lane (#1)

Re: Broken defenses against dropping a partitioning column

On Mon, Jul 8, 2019 at 4:11 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

(Moved from pgsql-bugs thread at [1])

Thanks.

Consider

regression=# create domain d1 as int;
CREATE DOMAIN
regression=# create table t1 (f1 d1) partition by range(f1);
CREATE TABLE
regression=# alter table t1 drop column f1;
ERROR: cannot drop column named in partition key

So far so good, but that defense has more holes than a hunk of
Swiss cheese:

Indeed.

regression=# drop domain d1 cascade;
psql: NOTICE: drop cascades to column f1 of table t1
DROP DOMAIN

Of course, the table is now utterly broken, e.g.

regression=# \d t1
psql: ERROR: cache lookup failed for type 0

Oops.

(More-likely variants of this include dropping an extension that
defines the type of a partitioning column, or dropping the schema
containing such a type.)

Yeah. Actually, it's embarrassingly easy to fall through the holes.

create type mytype as (a int);
create table mytyptab (a mytype) partition by list (a);
drop type mytype cascade;
NOTICE: drop cascades to column a of table mytyptab
DROP TYPE
select * from mytyptab;
ERROR: cache lookup failed for type 0
LINE 1: select * from mytyptab;
^
drop table mytyptab;
ERROR: cache lookup failed for type 0

The fix I was speculating about in the pgsql-bugs thread was to add
explicit pg_depend entries making the table's partitioning columns
internally dependent on the whole table (or maybe the other way around;
haven't experimented). That fix has a couple of problems though:

1. In the example, "drop domain d1 cascade" would automatically
cascade to the whole partitioned table, including child partitions
of course. This might leave a user sad, if a few terabytes of
valuable data went away; though one could argue that they'd better
have paid more attention to what the cascade cascaded to.

2. It doesn't fix anything for pre-existing tables in pre-v12 branches.

I thought of a different possible approach, which is to move the
"cannot drop column named in partition key" error check from
ATExecDropColumn(), where it is now, to RemoveAttributeById().
That would be back-patchable, but the implication would be that
dropping anything that a partitioning column depends on would be
impossible, even with CASCADE; you'd have to manually drop the
partitioned table first. Good for data safety, but a horrible
violation of expectations, and likely of the SQL spec as well.

I prefer this second solution as it works for both preexisting and new
tables, although I also agree that it is not user-friendly. Would it
help to document that one would be unable to drop anything that a
partitioning column directly and indirectly depends on (type, domain,
schema, extension, etc.)?

I'm not sure we could avoid order-of-traversal problems, either.

Ideally, perhaps, a DROP CASCADE like this would not cascade to
the whole table but only to the table's partitioned-ness property,
leaving you with a non-partitioned table with most of its data
intact.

Yeah, it would've been nice if the partitioned-ness property of table
could be deleted independently of the table.

It would take a lot of work to make that happen though,
and it certainly wouldn't be back-patchable, and I'm not really
sure it's worth it.

Agreed that this sounds maybe more like a new feature.

Thanks,
Amit

Alvaro Herrera

alvherre@2ndquadrant.com

about 7 years ago

In reply to: Tom Lane (#1)

Re: Broken defenses against dropping a partitioning column

On 2019-Jul-07, Tom Lane wrote:

Ideally, perhaps, a DROP CASCADE like this would not cascade to
the whole table but only to the table's partitioned-ness property,
leaving you with a non-partitioned table with most of its data
intact. It would take a lot of work to make that happen though,
and it certainly wouldn't be back-patchable, and I'm not really
sure it's worth it.

Maybe we can add dependencies to rows of the pg_partitioned_table
relation, with the semantics of "depends on the partitioned-ness of the
table"?

That said, I'm not sure I see the use case for an ALTER TABLE .. DROP
COLUMN command that turns a partitioned table (with existing partitions
containing data) into one non-partitioned table with all data minus the
partitioning column(s).

This seems vaguely related to the issue of dropping foreign keys; see
/messages/by-id/20190329152239.GA29258@alvherre.pgsql wherein I
settled with a non-ideal solution to the problem of being unable to
depend on something that did not cause the entire table to be dropped
in certain cases.

--
ï¿½lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Tom Lane

tgl@sss.pgh.pa.us

about 7 years ago

In reply to: Alvaro Herrera (#3)

Re: Broken defenses against dropping a partitioning column

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

That said, I'm not sure I see the use case for an ALTER TABLE .. DROP
COLUMN command that turns a partitioned table (with existing partitions
containing data) into one non-partitioned table with all data minus the
partitioning column(s).

Yeah, it'd be a lot of work for a dubious goal.

This seems vaguely related to the issue of dropping foreign keys; see
/messages/by-id/20190329152239.GA29258@alvherre.pgsql wherein I
settled with a non-ideal solution to the problem of being unable to
depend on something that did not cause the entire table to be dropped
in certain cases.

That's an interesting analogy. Re-reading that thread, what I said
in <29497.1554217629@sss.pgh.pa.us> seems pretty apropos to the
current problem:

FWIW, I think that the dependency mechanism is designed around the idea
that whether it's okay to drop a *database object* depends only on what
other *database objects* rely on it, and that you can always make a DROP
valid by dropping all the dependent objects. That's not an unreasonable
design assumption, considering that the SQL standard embeds the same
assumption in its RESTRICT/CASCADE syntax.

I think that is probably a fatal objection to my idea of putting an error
check into RemoveAttributeById(). As an example, consider the possibility
that somebody makes a temporary type and then makes a permanent table with
a partitioning column of that type. What shall we do at session exit?
Failing to remove the temp type is not an acceptable choice, because that
leaves us with a permanently broken temp schema (compare bug #15631 [1]/messages/by-id/15631-188663b383e1e697@postgresql.org).

Also, I don't believe we can make that work without order-of-operations
problems in cases comparable to the original bug in this thread [2]/messages/by-id/CA+u7OA4JKCPFrdrAbOs7XBiCyD61XJxeNav4LefkSmBLQ-Vobg@mail.gmail.com.
One or the other order of the object OIDs is going to lead to the column
being visited for deletion before the whole table is, and again rejecting
the column deletion is not going to be an acceptable behavior.

So I think we're probably stuck with the approach of adding new internal
dependencies. If we go that route, then our options for the released
branches are (1) do nothing, or (2) back-patch the code that adds such
dependencies, but without a catversion bump. That would mean that only
tables created after the next minor releases would have protection against
this problem. That's not ideal but maybe it's okay, considering that we
haven't seen actual field reports of trouble of this kind.

regards, tom lane

[1]: /messages/by-id/15631-188663b383e1e697@postgresql.org

[2]: /messages/by-id/CA+u7OA4JKCPFrdrAbOs7XBiCyD61XJxeNav4LefkSmBLQ-Vobg@mail.gmail.com

Robert Haas

robertmhaas@gmail.com

about 7 years ago

In reply to: Alvaro Herrera (#3)

Re: Broken defenses against dropping a partitioning column

On Mon, Jul 8, 2019 at 10:32 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

That said, I'm not sure I see the use case for an ALTER TABLE .. DROP
COLUMN command that turns a partitioned table (with existing partitions
containing data) into one non-partitioned table with all data minus the
partitioning column(s).

I think it would be useful to have "ALTER TABLE blah NOT PARTITIONED" but I

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Robert Haas

robertmhaas@gmail.com

about 7 years ago

In reply to: Robert Haas (#5)

Re: Broken defenses against dropping a partitioning column

On Mon, Jul 8, 2019 at 11:02 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Jul 8, 2019 at 10:32 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

That said, I'm not sure I see the use case for an ALTER TABLE .. DROP
COLUMN command that turns a partitioned table (with existing partitions
containing data) into one non-partitioned table with all data minus the
partitioning column(s).

I think it would be useful to have "ALTER TABLE blah NOT PARTITIONED" but I

...hit send too soon, and also, I don't think anyone will be very
happy if they get that behavior as a side effect of a DROP statement,
mostly because it could take an extremely long time to execute.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Tom Lane

tgl@sss.pgh.pa.us

about 7 years ago

In reply to: Robert Haas (#6)

Re: Broken defenses against dropping a partitioning column

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Jul 8, 2019 at 11:02 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Jul 8, 2019 at 10:32 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

That said, I'm not sure I see the use case for an ALTER TABLE .. DROP
COLUMN command that turns a partitioned table (with existing partitions
containing data) into one non-partitioned table with all data minus the
partitioning column(s).

I think it would be useful to have "ALTER TABLE blah NOT PARTITIONED" but I

...hit send too soon, and also, I don't think anyone will be very
happy if they get that behavior as a side effect of a DROP statement,
mostly because it could take an extremely long time to execute.

FWIW, I was imagining the action as being (1) detach all the child
partitions, (2) make parent into a non-partitioned table, (3)
drop the target column in each of these now-independent tables.
No data movement. Other than the need to acquire locks on all
the tables, it shouldn't be particularly slow.

But I'm still not volunteering to write it, because I'm not sure
anyone would want such a behavior.

regards, tom lane

Robert Haas

robertmhaas@gmail.com

about 7 years ago

In reply to: Tom Lane (#7)

Re: Broken defenses against dropping a partitioning column

On Mon, Jul 8, 2019 at 11:08 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

FWIW, I was imagining the action as being (1) detach all the child
partitions, (2) make parent into a non-partitioned table, (3)
drop the target column in each of these now-independent tables.
No data movement. Other than the need to acquire locks on all
the tables, it shouldn't be particularly slow.

I see. I think that would be reasonable, but like you say, it's not
clear that it's really what users would prefer. You can think of a
partitioned table as a first-class object and the partitions as
subordinate implementation details; or you can think of the partitions
as the first-class objects and the partitioned table as the
second-rate glue that holds them together. It seems like users prefer
the former view.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Tomas Vondra

tomas.vondra@2ndquadrant.com

about 7 years ago

In reply to: Tom Lane (#4)

Re: Broken defenses against dropping a partitioning column

On Mon, Jul 08, 2019 at 10:58:56AM -0400, Tom Lane wrote:

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

That said, I'm not sure I see the use case for an ALTER TABLE .. DROP
COLUMN command that turns a partitioned table (with existing partitions
containing data) into one non-partitioned table with all data minus the
partitioning column(s).

Yeah, it'd be a lot of work for a dubious goal.

This seems vaguely related to the issue of dropping foreign keys; see
/messages/by-id/20190329152239.GA29258@alvherre.pgsql wherein I
settled with a non-ideal solution to the problem of being unable to
depend on something that did not cause the entire table to be dropped
in certain cases.

That's an interesting analogy. Re-reading that thread, what I said
in <29497.1554217629@sss.pgh.pa.us> seems pretty apropos to the
current problem:

FWIW, I think that the dependency mechanism is designed around the idea
that whether it's okay to drop a *database object* depends only on what
other *database objects* rely on it, and that you can always make a DROP
valid by dropping all the dependent objects. That's not an unreasonable
design assumption, considering that the SQL standard embeds the same
assumption in its RESTRICT/CASCADE syntax.

I think that is probably a fatal objection to my idea of putting an error
check into RemoveAttributeById(). As an example, consider the possibility
that somebody makes a temporary type and then makes a permanent table with
a partitioning column of that type. What shall we do at session exit?
Failing to remove the temp type is not an acceptable choice, because that
leaves us with a permanently broken temp schema (compare bug #15631 [1]).

Also, I don't believe we can make that work without order-of-operations
problems in cases comparable to the original bug in this thread [2].
One or the other order of the object OIDs is going to lead to the column
being visited for deletion before the whole table is, and again rejecting
the column deletion is not going to be an acceptable behavior.

So I think we're probably stuck with the approach of adding new internal
dependencies. If we go that route, then our options for the released
branches are (1) do nothing, or (2) back-patch the code that adds such
dependencies, but without a catversion bump. That would mean that only
tables created after the next minor releases would have protection against
this problem. That's not ideal but maybe it's okay, considering that we
haven't seen actual field reports of trouble of this kind.

Couldn't we also write a function that adds those dependencies for
existing objects, and request users to run it after the update?

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#10

Tom Lane

tgl@sss.pgh.pa.us

about 7 years ago

In reply to: Tomas Vondra (#9)

Re: Broken defenses against dropping a partitioning column

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

On Mon, Jul 08, 2019 at 10:58:56AM -0400, Tom Lane wrote:

So I think we're probably stuck with the approach of adding new internal
dependencies. If we go that route, then our options for the released
branches are (1) do nothing, or (2) back-patch the code that adds such
dependencies, but without a catversion bump. That would mean that only
tables created after the next minor releases would have protection against
this problem. That's not ideal but maybe it's okay, considering that we
haven't seen actual field reports of trouble of this kind.

Couldn't we also write a function that adds those dependencies for
existing objects, and request users to run it after the update?

Maybe. I'm not volunteering to write such a thing.

BTW, it looks like somebody actually did think about this problem with
respect to external dependencies of partition expressions:

regression=# create function myabs(int) returns int language internal as 'int4abs' immutable strict parallel safe;
CREATE FUNCTION
regression=# create table foo (f1 int) partition by range (myabs(f1));
CREATE TABLE
regression=# drop function myabs(int);
ERROR: cannot drop function myabs(integer) because other objects depend on it
DETAIL: table foo depends on function myabs(integer)
HINT: Use DROP ... CASCADE to drop the dependent objects too.

Unfortunately, there's still no dependency on the column f1 in this
scenario. That means any function that wants to reconstruct the
correct dependencies would need a way to scan the partition expressions
for Vars. Not much fun from plpgsql, for sure.

regards, tom lane

#11

Tom Lane

tgl@sss.pgh.pa.us

about 7 years ago

In reply to: Tom Lane (#4)

Re: Broken defenses against dropping a partitioning column

I wrote:

So I think we're probably stuck with the approach of adding new internal
dependencies. If we go that route, then our options for the released
branches are (1) do nothing, or (2) back-patch the code that adds such
dependencies, but without a catversion bump. That would mean that only
tables created after the next minor releases would have protection against
this problem. That's not ideal but maybe it's okay, considering that we
haven't seen actual field reports of trouble of this kind.

Here's a proposed patch for that. It's mostly pretty straightforward,
except I had to add some recursion defenses in findDependentObjects that
weren't there before. But those seem like a good idea anyway to prevent
infinite recursion in case of bogus entries in pg_depend.

I also took the liberty of improving some related error messages that
I thought were unnecessarily vague and not up to project standards.

Per above, I'm envisioning applying this to HEAD and v12 with a catversion
bump, and to v11 and v10 with no catversion bump.

regards, tom lane

#12

Tom Lane

tgl@sss.pgh.pa.us

about 7 years ago

In reply to: Tom Lane (#11)

Re: Broken defenses against dropping a partitioning column

I wrote:

Here's a proposed patch for that. It's mostly pretty straightforward,
except I had to add some recursion defenses in findDependentObjects that
weren't there before. But those seem like a good idea anyway to prevent
infinite recursion in case of bogus entries in pg_depend.
Per above, I'm envisioning applying this to HEAD and v12 with a catversion
bump, and to v11 and v10 with no catversion bump.

Pushed. Back-patching turned up one thing I hadn't expected: pre-v12
pg_dump bleated about circular dependencies. It turned out that Peter
had already installed a hack in pg_dump to suppress that complaint in
connection with generated columns, so I improved the comment and
back-patched that too.

I nearly missed the need for that because of all the noise that
check-world emits in pre-v12 branches. We'd discussed back-patching
eb9812f27 at the time, and I think now it's tested enough that doing
so is low risk (or at least, lower risk than the risk of not seeing
a failure). So I think I'll go do that now.

regards, tom lane

#13

Alvaro Herrera

alvherre@2ndquadrant.com

about 7 years ago

In reply to: Tom Lane (#12)

Re: Broken defenses against dropping a partitioning column

On 2019-Jul-22, Tom Lane wrote:

I nearly missed the need for that because of all the noise that
check-world emits in pre-v12 branches. We'd discussed back-patching
eb9812f27 at the time, and I think now it's tested enough that doing
so is low risk (or at least, lower risk than the risk of not seeing
a failure). So I think I'll go do that now.

I'd like that, as it bites me too, thanks.

--
ï¿½lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#14

Manuel Rigger

rigger.manuel@gmail.com

about 7 years ago

In reply to: Alvaro Herrera (#13)

Re: Broken defenses against dropping a partitioning column

Thanks a lot for the fix!

Best,
Manuel

Show quoted text

On Mon, Jul 22, 2019 at 9:35 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

On 2019-Jul-22, Tom Lane wrote:

I nearly missed the need for that because of all the noise that
check-world emits in pre-v12 branches. We'd discussed back-patching
eb9812f27 at the time, and I think now it's tested enough that doing
so is low risk (or at least, lower risk than the risk of not seeing
a failure). So I think I'll go do that now.

I'd like that, as it bites me too, thanks.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#15

Tom Lane

tgl@sss.pgh.pa.us

about 7 years ago

In reply to: Alvaro Herrera (#13)

Re: Broken defenses against dropping a partitioning column

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

On 2019-Jul-22, Tom Lane wrote:

I nearly missed the need for that because of all the noise that
check-world emits in pre-v12 branches. We'd discussed back-patching
eb9812f27 at the time, and I think now it's tested enough that doing
so is low risk (or at least, lower risk than the risk of not seeing
a failure). So I think I'll go do that now.

I'd like that, as it bites me too, thanks.

Done. The approach "make check-world >/dev/null" now emits the
same amount of noise on all branches, ie just

NOTICE: database "regression" does not exist, skipping

The amount of parallelism you can apply is still pretty
branch-dependent, unfortunately.

regards, tom lane

Broken defenses against dropping a partitioning column

Attachments: