unsupportable composite type partition keys

Started by Amit Langoteover 6 years ago31 messageshackers

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

Hi,

It seems to me that we currently allow expressions that are anonymous
and self-referencing composite type records as partition key, but
shouldn't. Allowing them leads to this:

create table foo (a int) partition by list ((row(a, b)));
create table foo1 partition of foo for values in ('(1)'::foo);
create table foo2 partition of foo for values in ('(2)'::foo);
explain select * from foo where row(a) = '(1)'::foo;
ERROR: stack depth limit exceeded

Stack trace is this:

#0 errfinish (dummy=0) at elog.c:442
#1 0x0000000000911a51 in check_stack_depth () at postgres.c:3288
#2 0x00000000007970e6 in expression_tree_mutator (node=0x31890a0,
mutator=0x82095f <eval_const_expressions_mutator>,
context=0x7fff0578ef60) at nodeFuncs.c:2526
#3 0x000000000082340b in eval_const_expressions_mutator
(node=0x31890a0, context=0x7fff0578ef60) at clauses.c:3605
#4 0x000000000079875c in expression_tree_mutator (node=0x31890f8,
mutator=0x82095f <eval_const_expressions_mutator>,
context=0x7fff0578ef60) at nodeFuncs.c:2996
#5 0x000000000082340b in eval_const_expressions_mutator
(node=0x31890f8, context=0x7fff0578ef60) at clauses.c:3605
#6 0x000000000079810c in expression_tree_mutator (node=0x3188cc8,
mutator=0x82095f <eval_const_expressions_mutator>,
context=0x7fff0578ef60) at nodeFuncs.c:2863
#7 0x000000000082225d in eval_const_expressions_mutator
(node=0x3188cc8, context=0x7fff0578ef60) at clauses.c:3154
#8 0x000000000079875c in expression_tree_mutator (node=0x3189240,
mutator=0x82095f <eval_const_expressions_mutator>,
context=0x7fff0578ef60) at nodeFuncs.c:2996
#9 0x000000000082340b in eval_const_expressions_mutator
(node=0x3189240, context=0x7fff0578ef60) at clauses.c:3605
#10 0x000000000082090c in eval_const_expressions (root=0x0,
node=0x3189240) at clauses.c:2265
#11 0x0000000000a75169 in RelationBuildPartitionKey
(relation=0x7f5ca3e479a8) at partcache.c:139
#12 0x0000000000a7aa5e in RelationBuildDesc (targetRelId=17178,
insertIt=true) at relcache.c:1171
#13 0x0000000000a7c975 in RelationIdGetRelation (relationId=17178) at
relcache.c:2035
#14 0x000000000048e0c0 in relation_open (relationId=17178, lockmode=1)
at relation.c:59
#15 0x0000000000a8a4f7 in load_typcache_tupdesc (typentry=0x1c16bc0)
at typcache.c:793
#16 0x0000000000a8a3bb in lookup_type_cache (type_id=17180, flags=256)
at typcache.c:748
#17 0x0000000000a8bba4 in lookup_rowtype_tupdesc_internal
(type_id=17180, typmod=-1, noError=false) at typcache.c:1570
#18 0x0000000000a8be43 in lookup_rowtype_tupdesc (type_id=17180,
typmod=-1) at typcache.c:1656
#19 0x0000000000a0713f in record_cmp (fcinfo=0x7fff0578f4d0) at rowtypes.c:815
#20 0x0000000000a083e2 in btrecordcmp (fcinfo=0x7fff0578f4d0) at rowtypes.c:1276
#21 0x0000000000a97bd9 in FunctionCall2Coll (flinfo=0x2bb4a98,
collation=0, arg1=51939144, arg2=51940000) at fmgr.c:1162
#22 0x00000000008443f6 in qsort_partition_list_value_cmp (a=0x3188c50,
b=0x3188c58, arg=0x2bb46c0) at partbounds.c:1769
#23 0x0000000000af9dc6 in qsort_arg (a=0x3188c50, n=2, es=8,
cmp=0x84439a <qsort_partition_list_value_cmp>, arg=0x2bb46c0) at
qsort_arg.c:132
#24 0x000000000084186a in create_list_bounds (boundspecs=0x3188650,
nparts=2, key=0x2bb46c0, mapping=0x7fff0578f7d8) at partbounds.c:396
#25 0x00000000008410ec in partition_bounds_create
(boundspecs=0x3188650, nparts=2, key=0x2bb46c0,
mapping=0x7fff0578f7d8) at partbounds.c:206
#26 0x0000000000847622 in RelationBuildPartitionDesc
(rel=0x7f5ca3e47560) at partdesc.c:205
#27 0x0000000000a7aa6a in RelationBuildDesc (targetRelId=17178,
insertIt=true) at relcache.c:1172

Also:

create table foo (a int) partition by list ((row(a)));
create table foo1 partition of foo for values in (row(1));
create table foo2 partition of foo for values in (row(2));

explain select * from foo where row(a) = '(1)'::foo;
QUERY PLAN
----------------------------------------------------------
Seq Scan on foo1 foo (cost=0.00..41.88 rows=13 width=4)
Filter: (ROW(a) = '(1)'::foo)
(2 rows)

explain select * from foo where row(a) = '(2)'::foo;
QUERY PLAN
----------------------------------------------------------
Seq Scan on foo2 foo (cost=0.00..41.88 rows=13 width=4)
Filter: (ROW(a) = '(2)'::foo)
(2 rows)

-- another session
explain select * from foo where row(a) = '(1)'::foo;
ERROR: record type has not been registered
LINE 1: explain select * from foo where row(a) = '(1)'::foo;

Attached a patch to fix that.

Thanks,
Amit

Tom Lane

tgl@sss.pgh.pa.us

over 6 years ago

In reply to: Amit Langote (#1)

Re: unsupportable composite type partition keys

Amit Langote <amitlangote09@gmail.com> writes:

It seems to me that we currently allow expressions that are anonymous
and self-referencing composite type records as partition key, but
shouldn't. Allowing them leads to this:

Hm. Seems like the restrictions here ought to be just about the same
as on index columns, no? That is, it should be roughly a test like
"no pseudo-types". The check you're proposing seems awfully specific,
and I doubt that the equivalent check in CREATE INDEX looks the same.
(But I didn't go look ... I might be wrong.)

regards, tom lane

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Tom Lane (#2)

Re: unsupportable composite type partition keys

On Wed, Dec 18, 2019 at 2:12 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Amit Langote <amitlangote09@gmail.com> writes:

It seems to me that we currently allow expressions that are anonymous
and self-referencing composite type records as partition key, but
shouldn't. Allowing them leads to this:

Hm. Seems like the restrictions here ought to be just about the same
as on index columns, no? That is, it should be roughly a test like
"no pseudo-types". The check you're proposing seems awfully specific,
and I doubt that the equivalent check in CREATE INDEX looks the same.
(But I didn't go look ... I might be wrong.)

We also need to disallow self-referencing composite type in the case
of partitioning, because otherwise it leads to infinite recursion
shown in my first email.

The timing of building PartitionDesc is what causes it, because the
construction of PartitionBoundInfo in turn requires opening the parent
relation if the partition partition key is of self-referencing
composite type, because we need the TupleDesc when sorting the
partition bounds. Maybe we'll need to rearrange that someday so that
PartitionDesc is built outside RelationBuildDesc path, so this
infinite recursion doesn't occur, but maybe allowing this case isn't
that useful to begin with?

Thanks,
Amit

Tom Lane

tgl@sss.pgh.pa.us

over 6 years ago

In reply to: Amit Langote (#3)

Re: unsupportable composite type partition keys

Amit Langote <amitlangote09@gmail.com> writes:

On Wed, Dec 18, 2019 at 2:12 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hm. Seems like the restrictions here ought to be just about the same
as on index columns, no?

We also need to disallow self-referencing composite type in the case
of partitioning, because otherwise it leads to infinite recursion
shown in my first email.

My point is basically that CheckAttributeType already covers that
issue, as well as a lot of others. So why isn't the partitioning
code using it?

regards, tom lane

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Tom Lane (#4)

Re: unsupportable composite type partition keys

On Wed, Dec 18, 2019 at 10:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Amit Langote <amitlangote09@gmail.com> writes:

On Wed, Dec 18, 2019 at 2:12 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hm. Seems like the restrictions here ought to be just about the same
as on index columns, no?

We also need to disallow self-referencing composite type in the case
of partitioning, because otherwise it leads to infinite recursion
shown in my first email.

My point is basically that CheckAttributeType already covers that
issue, as well as a lot of others. So why isn't the partitioning
code using it?

My reason to not use it was that the error message that are produced
are not quite helpful in this case; compare what my patch produces vs.
what one gets with CheckAttributeType("expr", ...):

    a int,
    b int
 ) PARTITION BY RANGE (((a, b)));
-ERROR:  partition key cannot be of anonymous or self-referencing composite type
-LINE 4: ) PARTITION BY RANGE (((a, b)));
-                              ^
+ERROR:  column "expr" has pseudo-type record

 CREATE TABLE partitioned (
    a int,
    b int
 ) PARTITION BY RANGE ((row(a, b)));
-ERROR:  partition key cannot be of anonymous or self-referencing composite type
-LINE 4: ) PARTITION BY RANGE ((row(a, b)));
-                              ^
+ERROR:  column "expr" has pseudo-type record

 CREATE TABLE partitioned (
    a int,
    b int
 ) PARTITION BY RANGE ((row(a, b)::partitioned));
-ERROR:  partition key cannot be of anonymous or self-referencing composite type
-LINE 4: ) PARTITION BY RANGE ((row(a, b)::partitioned));
-                              ^
+ERROR:  composite type partitioned cannot be made a member of itself

Thanks,
Amit

Tom Lane

tgl@sss.pgh.pa.us

over 6 years ago

In reply to: Amit Langote (#5)

Re: unsupportable composite type partition keys

Amit Langote <amitlangote09@gmail.com> writes:

On Wed, Dec 18, 2019 at 10:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

My point is basically that CheckAttributeType already covers that
issue, as well as a lot of others. So why isn't the partitioning
code using it?

My reason to not use it was that the error message that are produced
are not quite helpful in this case;

I can't get terribly excited about that; but in any case, if we think
the errors aren't nice enough, the answer is to improve them, not
re-implement the function badly.

After further thought, it seems to me that we are dealing with two
nearly independent issues:

1. We must not accept partition bounds values that are of underdetermined
types, else (a) we are likely to get failures like "record type has not
been registered" while loading them back from disk, and (b) polymorphic
btree support functions are likely to complain that they can't identify
the type they're supposed to work on. This is exactly the same issue that
expression indexes face, so we should be applying the same checks, that
is CheckAttributeType(). I do not believe that checking for RECORD is
adequate to close this hole. At the very least, RECORD[] is equally
dangerous, and in general I think any pseudotype would be risky.

2. If the partitioning expression contains a reference to the partitioned
table's rowtype, we get infinite recursion while trying to load the
relcache entry. The patch proposes to prevent that by checking whether
the expression's final result type is that type, but that's not nearly
adequate because a reference anywhere inside the expression is just as
bad. In general, considering possibly-inlined SQL functions, I'm doubtful
that any precheck is going to be able to prevent this scenario.

Now as far as point 1 goes, I think it's not really that awful to use
CheckAttributeType() with a dummy attribute name. The attached
incomplete patch uses "partition key" which causes it to emit errors
like

regression=# create table fool (a int, b int) partition by list ((row(a, b)));
ERROR: column "partition key" has pseudo-type record

I don't think that that's unacceptable. But if we wanted to improve it,
we could imagine adding another flag, say CHKATYPE_IS_PARTITION_KEY,
that doesn't affect CheckAttributeType's semantics, just the wording of
the error messages it throws.

As far as point 2 goes, I think this is another outgrowth of the
fundamental design error that we load a partitioned rel's partitioning
info immediately when the relcache entry is created, rather than later
on-demand. If we weren't doing that then it wouldn't be problematic
to inspect the rel's rowtype while constructing the partitioning info.
I've bitched about this before, if memory serves, but couldn't light
a fire under anyone about fixing it. Now I think we have no choice.
It was never a great idea that minimal construction of a relcache
entry could result in running arbitrary user-defined code.

Note that the end result of this would be to allow, not prohibit,
cases like your example. I wonder whether we couldn't also lift
the restriction against whole-row Vars in partition expressions.
Doesn't seem like there is much difference between such a Var and
a row(...)::table_rowtype expression.

regards, tom lane

Tom Lane

tgl@sss.pgh.pa.us

over 6 years ago

In reply to: Tom Lane (#6)

Re: unsupportable composite type partition keys

I wrote:

As far as point 2 goes, I think this is another outgrowth of the
fundamental design error that we load a partitioned rel's partitioning
info immediately when the relcache entry is created, rather than later
on-demand. If we weren't doing that then it wouldn't be problematic
to inspect the rel's rowtype while constructing the partitioning info.
I've bitched about this before, if memory serves, but couldn't light
a fire under anyone about fixing it. Now I think we have no choice.
It was never a great idea that minimal construction of a relcache
entry could result in running arbitrary user-defined code.

Here's a draft patch for that. There are a couple of secondary issues
I didn't do anything about yet:

* When rebuilding an open relcache entry for a partitioned table, this
coding now always quasi-leaks the old rd_pdcxt, where before that happened
only if the partdesc actually changed. (Even if I'd kept the
equalPartitionDescs call, it would always fail.) I complained about the
quasi-leak behavior before, but this probably pushes it up to the level of
"must fix". What I'm inclined to do is to hack
RelationDecrementReferenceCount so that, when the refcount goes to zero,
we delete any child contexts of rd_pdcxt. That's pretty annoying but in
the big scheme of things it's unlikely to matter.

* It'd be better to declare RelationGetPartitionKey and
RelationGetPartitionDesc in relcache.h and get their callers out of the
business of including rel.h, where possible.

* equalPartitionDescs is now dead code, should we remove it?

Note that the end result of this would be to allow, not prohibit,
cases like your example. I wonder whether we couldn't also lift
the restriction against whole-row Vars in partition expressions.
Doesn't seem like there is much difference between such a Var and
a row(...)::table_rowtype expression.

I didn't look into that either. I wouldn't propose back-patching that,
but it'd be interesting to try to fix it in HEAD.

regards, tom lane

Tom Lane

tgl@sss.pgh.pa.us

over 6 years ago

In reply to: Tom Lane (#6)

Re: unsupportable composite type partition keys

I wrote:

Now as far as point 1 goes, I think it's not really that awful to use
CheckAttributeType() with a dummy attribute name. The attached
incomplete patch uses "partition key" which causes it to emit errors
like
regression=# create table fool (a int, b int) partition by list ((row(a, b)));
ERROR: column "partition key" has pseudo-type record
I don't think that that's unacceptable. But if we wanted to improve it,
we could imagine adding another flag, say CHKATYPE_IS_PARTITION_KEY,
that doesn't affect CheckAttributeType's semantics, just the wording of
the error messages it throws.

Here's a fleshed-out patch that does it like that.

While poking at this, I also started to wonder why CheckAttributeType
wasn't recursing into ranges, since those are our other kind of
container type. And the answer is that it must, because we allow
creation of ranges over composite types:

regression=# create table foo (f1 int, f2 int);
CREATE TABLE
regression=# create type foorange as range (subtype = foo);
CREATE TYPE
regression=# alter table foo add column r foorange;
ALTER TABLE

Simple things still work on table foo, but surely this is exactly
what CheckAttributeType is supposed to be preventing. With the
second attached patch you get

regression=# alter table foo add column r foorange;
ERROR: composite type foo cannot be made a member of itself

The second patch needs to go back all the way, the first one
only as far as we have partitions.

regards, tom lane

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Tom Lane (#7)

Re: unsupportable composite type partition keys

On Sun, Dec 22, 2019 at 6:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

As far as point 2 goes, I think this is another outgrowth of the
fundamental design error that we load a partitioned rel's partitioning
info immediately when the relcache entry is created, rather than later
on-demand. If we weren't doing that then it wouldn't be problematic
to inspect the rel's rowtype while constructing the partitioning info.
I've bitched about this before, if memory serves, but couldn't light
a fire under anyone about fixing it. Now I think we have no choice.
It was never a great idea that minimal construction of a relcache
entry could result in running arbitrary user-defined code.

Here's a draft patch for that.

Thanks for writing the patch. This also came up recently on another thread [1]/messages/by-id/CA+HiwqFucUh7hYkfZ6x1MVcs_R24eUfNVuRwdE_FwuwK8XpSZg@mail.gmail.com.

There are a couple of secondary issues
I didn't do anything about yet:

* When rebuilding an open relcache entry for a partitioned table, this
coding now always quasi-leaks the old rd_pdcxt, where before that happened
only if the partdesc actually changed. (Even if I'd kept the
equalPartitionDescs call, it would always fail.) I complained about the
quasi-leak behavior before, but this probably pushes it up to the level of
"must fix". What I'm inclined to do is to hack
RelationDecrementReferenceCount so that, when the refcount goes to zero,
we delete any child contexts of rd_pdcxt. That's pretty annoying but in
the big scheme of things it's unlikely to matter.

Hacking RelationDecrementReferenceCount() like that sounds OK.

-        else if (rebuild && newrel->rd_pdcxt != NULL)
+        if (rebuild && newrel->rd_pdcxt != NULL)

Checking rebuild seems unnecessary in this block, although that's true
even without the patch.

+             * To ensure that it's not leaked completely, re-attach it to the
+             * new reldesc, or make it a child of the new reldesc's rd_pdcxt
+             * in the unlikely event that there is one already.  (See hack in
+             * RelationBuildPartitionDesc.)
...
+            if (relation->rd_pdcxt != NULL) /* probably never happens */
+                MemoryContextSetParent(newrel->rd_pdcxt, relation->rd_pdcxt);
+            else
+                relation->rd_pdcxt = newrel->rd_pdcxt;

While I can imagine that RelationBuildPartitionDesc() might encounter
an old partition descriptor making the re-parenting hack necessary
there, I don't see why it's needed here, because a freshly built
relation descriptor would not contain the partition descriptor after
this patch.

* It'd be better to declare RelationGetPartitionKey and
RelationGetPartitionDesc in relcache.h and get their callers out of the
business of including rel.h, where possible.

Although I agree to declare them in relcache.h, that doesn't reduce
the need to include rel.h in their callers much, because anyplace that
uses RelationGetPartitionDesc() is also very likely to use
RelationGetRelid() which is in rel.h.

* equalPartitionDescs is now dead code, should we remove it?

Don't see any problem with doing so.

Note that the end result of this would be to allow, not prohibit,
cases like your example. I wonder whether we couldn't also lift
the restriction against whole-row Vars in partition expressions.
Doesn't seem like there is much difference between such a Var and
a row(...)::table_rowtype expression.

I didn't look into that either. I wouldn't propose back-patching that,
but it'd be interesting to try to fix it in HEAD.

Agreed.

Thanks,
Amit

[1]: /messages/by-id/CA+HiwqFucUh7hYkfZ6x1MVcs_R24eUfNVuRwdE_FwuwK8XpSZg@mail.gmail.com

#10

Tom Lane

tgl@sss.pgh.pa.us

over 6 years ago

In reply to: Amit Langote (#9)

Re: unsupportable composite type partition keys

Amit Langote <amitlangote09@gmail.com> writes:

On Sun, Dec 22, 2019 at 6:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
+             * To ensure that it's not leaked completely, re-attach it to the
+             * new reldesc, or make it a child of the new reldesc's rd_pdcxt
+             * in the unlikely event that there is one already.  (See hack in
+             * RelationBuildPartitionDesc.)

While I can imagine that RelationBuildPartitionDesc() might encounter
an old partition descriptor making the re-parenting hack necessary
there, I don't see why it's needed here, because a freshly built
relation descriptor would not contain the partition descriptor after
this patch.

Well, as the comment says, that's probably unreachable today. But
I could see it happening in the future, particularly if we ever allow
partitioned system catalogs. There are a lot of paths through this
code that are not obvious to the naked eye, and some of them can cause
relcache entries to get populated behind-your-back. Most of relcache.c
is careful about this; I do not see an excuse for the partition-data
code to be less so, even if we think it can't happen today.

(I notice that RelationBuildPartitionKey is making a similar assumption
that the partkey couldn't magically appear while it's working, and I
don't like it much there either.)

* It'd be better to declare RelationGetPartitionKey and
RelationGetPartitionDesc in relcache.h and get their callers out of the
business of including rel.h, where possible.

Although I agree to declare them in relcache.h, that doesn't reduce
the need to include rel.h in their callers much, because anyplace that
uses RelationGetPartitionDesc() is also very likely to use
RelationGetRelid() which is in rel.h.

Hm. Well, we can try anyway.

[1] /messages/by-id/CA+HiwqFucUh7hYkfZ6x1MVcs_R24eUfNVuRwdE_FwuwK8XpSZg@mail.gmail.com

Oh, interesting --- I hadn't been paying much attention to that thread.
I'll compare your PoC there to what I did.

regards, tom lane

#11

Tom Lane

tgl@sss.pgh.pa.us

over 6 years ago

In reply to: Tom Lane (#10)

Re: unsupportable composite type partition keys

BTW, I forgot to mention: while I think the patch to forbid pseudotypes
by using CheckAttributeType() can be back-patched, I'm leaning towards
not back-patching the other patch. The situation where we get into
infinite recursion seems not very likely in practice, and it's not
going to cause any crash or data loss, so I think we can just say
"sorry that's not supported before v13". The patch as I'm proposing
it seems rather invasive for a back-branch fix. Also, changing
RelationGetPartitionKey/Desc from macros to functions is at least a
weak ABI break. If there are extensions calling either, they might
still work without a recompile --- but if they have code paths that
are the first thing to touch either field since a relcache flush,
they'd crash.

regards, tom lane

#12

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Tom Lane (#10)

Re: unsupportable composite type partition keys

On Mon, Dec 23, 2019 at 23:49 Tom Lane <tgl@sss.pgh.pa.us> wrote:

Amit Langote <amitlangote09@gmail.com> writes:

[1]

/messages/by-id/CA+HiwqFucUh7hYkfZ6x1MVcs_R24eUfNVuRwdE_FwuwK8XpSZg@mail.gmail.com

Oh, interesting --- I hadn't been paying much attention to that thread.
I'll compare your PoC there to what I did.

Actually, I should’ve said that your patch is much better attempt at
getting this in order, so there’s not much to see in my patch really. :)

Thanks,
Amit

Show quoted text

#13

Tom Lane

tgl@sss.pgh.pa.us

over 6 years ago

In reply to: Amit Langote (#12)

Re: unsupportable composite type partition keys

Amit Langote <amitlangote09@gmail.com> writes:

On Mon, Dec 23, 2019 at 23:49 Tom Lane <tgl@sss.pgh.pa.us> wrote:

Oh, interesting --- I hadn't been paying much attention to that thread.
I'll compare your PoC there to what I did.

Actually, I should’ve said that your patch is much better attempt at
getting this in order, so there’s not much to see in my patch really. :)

One thing I see is that you chose to relocate RelationGetPartitionDesc's
declaration to partdesc.h, whereupon RelationBuildPartitionDesc doesn't
have to be exported at all anymore. Perhaps that's a better factorization
than what I did. It supposes that any caller of RelationGetPartitionDesc
is going to need partdesc.h, but that seems reasonable. We could likewise
move RelationGetPartitionKey to partcache.h.

regards, tom lane

#14

Tom Lane

tgl@sss.pgh.pa.us

over 6 years ago

In reply to: Tom Lane (#13)

Re: unsupportable composite type partition keys

I wrote:

One thing I see is that you chose to relocate RelationGetPartitionDesc's
declaration to partdesc.h, whereupon RelationBuildPartitionDesc doesn't
have to be exported at all anymore. Perhaps that's a better factorization
than what I did. It supposes that any caller of RelationGetPartitionDesc
is going to need partdesc.h, but that seems reasonable. We could likewise
move RelationGetPartitionKey to partcache.h.

I concluded that that is indeed a better solution; it does allow removing
some rel.h inclusions (though possibly those were just duplicative?), and
it also means that relcache.c itself doesn't need any partitioning
inclusions at all.

Here's a cleaned-up patch that does it like that and also fixes the
memory leakage issue.

I noticed along the way that with partkeys only being loaded on demand,
we no longer need the incredibly-unsafe hack in RelationBuildPartitionKey
whereby it just silently ignores failure to read the pg_partitioned_table
entry. I also rearranged RelationBuildPartitionDesc so that it uses the
same context-reparenting trick as RelationBuildPartitionKey. That doesn't
save us anything, but it makes the code considerably more robust, I think;
we don't need to assume as much about what partition_bounds_copy does.

One other thing worth noting is that I used unlikely() to try to
discourage the compiler from inlining RelationBuildPartitionDesc
into RelationGetPartitionDesc (and likewise for the Key functions).
Not sure how effective that is, but it can't hurt.

I haven't removed equalPartitionDescs here; that seems like material
for a separate patch (to make it easier to put it back if we need it).

regards, tom lane

#15

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Tom Lane (#11)

Re: unsupportable composite type partition keys

On Tue, Dec 24, 2019 at 12:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

BTW, I forgot to mention: while I think the patch to forbid pseudotypes
by using CheckAttributeType() can be back-patched, I'm leaning towards
not back-patching the other patch. The situation where we get into
infinite recursion seems not very likely in practice, and it's not
going to cause any crash or data loss, so I think we can just say
"sorry that's not supported before v13". The patch as I'm proposing
it seems rather invasive for a back-branch fix.

It is indeed.

Just to be sure, by going with "unsupported before v13", which one do you mean:

* documenting it as so
* giving an error in such cases, like the patch in the first email on
this thread did
* doing nothing really

Thanks,
Amit

#16

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Tom Lane (#14)

Re: unsupportable composite type partition keys

On Tue, Dec 24, 2019 at 6:33 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

One thing I see is that you chose to relocate RelationGetPartitionDesc's
declaration to partdesc.h, whereupon RelationBuildPartitionDesc doesn't
have to be exported at all anymore. Perhaps that's a better factorization
than what I did. It supposes that any caller of RelationGetPartitionDesc
is going to need partdesc.h, but that seems reasonable. We could likewise
move RelationGetPartitionKey to partcache.h.

I concluded that that is indeed a better solution; it does allow removing
some rel.h inclusions (though possibly those were just duplicative?), and
it also means that relcache.c itself doesn't need any partitioning
inclusions at all.

Here's a cleaned-up patch that does it like that and also fixes the
memory leakage issue.

Thanks for the updated patch. I didn't find anything to complain about.

I haven't removed equalPartitionDescs here; that seems like material
for a separate patch (to make it easier to put it back if we need it).

Seems like a good idea.

Btw, does the memory leakage fix in this patch address any of the
pending concerns that were discussed on the "hyrax vs.
RelationBuildPartitionDesc" thread earlier this year[1]/messages/by-id/3800.1560366716@sss.pgh.pa.us?

Thanks,
Amit

[1]: /messages/by-id/3800.1560366716@sss.pgh.pa.us

#17

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Amit Langote (#16)

Re: unsupportable composite type partition keys

On Tue, Dec 24, 2019 at 10:59 AM Amit Langote <amitlangote09@gmail.com> wrote:

Btw, does the memory leakage fix in this patch address any of the
pending concerns that were discussed on the "hyrax vs.
RelationBuildPartitionDesc" thread earlier this year[1]?

[1] /messages/by-id/3800.1560366716@sss.pgh.pa.us

I thought about this a little and I think it *does* address the main
complaint in the above thread.

The main complaint as I understand is that receiving repeated
invalidations due to partitions being concurrently added while a
PartitionDirectory is holding a pointer to PartitionDesc causes many
copies of PartitionDesc to pile up due to the parent table being
rebuilt upon processing of each invalidation.

Now because we don't build the PartitionDesc in the
RelationClearRelation path, that can't happen. Although it still
seems possible for the piling up to occur if it's
RelationBuildPartitionDesc that is run repeatedly via
RelationGetParttionDesc while partitions are being concurrently added,
but I couldn't find anything in the partitioning code that does that.

Thanks,
Amit

#18

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Amit Langote (#9)

Re: unsupportable composite type partition keys

On Mon, Dec 23, 2019 at 6:42 PM Amit Langote <amitlangote09@gmail.com> wrote:

On Sun, Dec 22, 2019 at 6:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wonder whether we couldn't also lift
the restriction against whole-row Vars in partition expressions.
Doesn't seem like there is much difference between such a Var and
a row(...)::table_rowtype expression.

I didn't look into that either. I wouldn't propose back-patching that,
but it'd be interesting to try to fix it in HEAD.

Agreed.

I gave that a try and ended up with attached that applies on top of
your delay-loading-relcache-partition-data-2.patch.

Thanks,
Amit

#19

Tom Lane

tgl@sss.pgh.pa.us

over 6 years ago

In reply to: Amit Langote (#15)

Re: unsupportable composite type partition keys

Amit Langote <amitlangote09@gmail.com> writes:

On Tue, Dec 24, 2019 at 12:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

BTW, I forgot to mention: while I think the patch to forbid pseudotypes
by using CheckAttributeType() can be back-patched, I'm leaning towards
not back-patching the other patch. The situation where we get into
infinite recursion seems not very likely in practice, and it's not
going to cause any crash or data loss, so I think we can just say
"sorry that's not supported before v13". The patch as I'm proposing
it seems rather invasive for a back-branch fix.

It is indeed.

Just to be sure, by going with "unsupported before v13", which one do you mean:

* documenting it as so
* giving an error in such cases, like the patch in the first email on
this thread did
* doing nothing really

I was thinking "do nothing in the back branches". I don't believe we
can detect such cases reliably (at least not without complicated logic,
which'd defeat the point), so I don't think giving an error is actually
feasible, and I doubt that documenting it would be useful. If we get
some field complaints about this, it'd be time enough to reconsider.

regards, tom lane

#20

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Tom Lane (#19)

Re: unsupportable composite type partition keys

On Wed, Dec 25, 2019 at 2:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Amit Langote <amitlangote09@gmail.com> writes:

On Tue, Dec 24, 2019 at 12:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

BTW, I forgot to mention: while I think the patch to forbid pseudotypes
by using CheckAttributeType() can be back-patched, I'm leaning towards
not back-patching the other patch. The situation where we get into
infinite recursion seems not very likely in practice, and it's not
going to cause any crash or data loss, so I think we can just say
"sorry that's not supported before v13". The patch as I'm proposing
it seems rather invasive for a back-branch fix.

It is indeed.

Just to be sure, by going with "unsupported before v13", which one do you mean:

* documenting it as so
* giving an error in such cases, like the patch in the first email on
this thread did
* doing nothing really

I was thinking "do nothing in the back branches". I don't believe we
can detect such cases reliably (at least not without complicated logic,
which'd defeat the point), so I don't think giving an error is actually
feasible, and I doubt that documenting it would be useful. If we get
some field complaints about this, it'd be time enough to reconsider.