A bug in mapping attributes in ATExecAttachPartition()
In ATExecAttachPartition() there's following code
13715 partnatts = get_partition_natts(key);
13716 for (i = 0; i < partnatts; i++)
13717 {
13718 AttrNumber partattno;
13719
13720 partattno = get_partition_col_attnum(key, i);
13721
13722 /* If partition key is an expression, must not skip
validation */
13723 if (!partition_accepts_null &&
13724 (partattno == 0 ||
13725 !bms_is_member(partattno, not_null_attrs)))
13726 skip_validate = false;
13727 }
partattno obtained from the partition key is the attribute number of
the partitioned table but not_null_attrs contains the attribute
numbers of attributes of the table being attached which have NOT NULL
constraint on them. But the attribute numbers of partitioned table and
the table being attached may not agree i.e. partition key attribute in
partitioned table may have a different position in the table being
attached. So, this code looks buggy. Probably we don't have a test
which tests this code with different attribute order between
partitioned table and the table being attached. I didn't get time to
actually construct a testcase and test it.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Jun 7, 2017 at 7:47 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
In ATExecAttachPartition() there's following code
13715 partnatts = get_partition_natts(key);
13716 for (i = 0; i < partnatts; i++)
13717 {
13718 AttrNumber partattno;
13719
13720 partattno = get_partition_col_attnum(key, i);
13721
13722 /* If partition key is an expression, must not skip
validation */
13723 if (!partition_accepts_null &&
13724 (partattno == 0 ||
13725 !bms_is_member(partattno, not_null_attrs)))
13726 skip_validate = false;
13727 }partattno obtained from the partition key is the attribute number of
the partitioned table but not_null_attrs contains the attribute
numbers of attributes of the table being attached which have NOT NULL
constraint on them. But the attribute numbers of partitioned table and
the table being attached may not agree i.e. partition key attribute in
partitioned table may have a different position in the table being
attached. So, this code looks buggy. Probably we don't have a test
which tests this code with different attribute order between
partitioned table and the table being attached. I didn't get time to
actually construct a testcase and test it.
I think this code could be removed entirely in light of commit
3ec76ff1f2cf52e9b900349957b42d28128b7bc7.
Amit?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017/06/08 1:44, Robert Haas wrote:
On Wed, Jun 7, 2017 at 7:47 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:In ATExecAttachPartition() there's following code
13715 partnatts = get_partition_natts(key);
13716 for (i = 0; i < partnatts; i++)
13717 {
13718 AttrNumber partattno;
13719
13720 partattno = get_partition_col_attnum(key, i);
13721
13722 /* If partition key is an expression, must not skip
validation */
13723 if (!partition_accepts_null &&
13724 (partattno == 0 ||
13725 !bms_is_member(partattno, not_null_attrs)))
13726 skip_validate = false;
13727 }partattno obtained from the partition key is the attribute number of
the partitioned table but not_null_attrs contains the attribute
numbers of attributes of the table being attached which have NOT NULL
constraint on them. But the attribute numbers of partitioned table and
the table being attached may not agree i.e. partition key attribute in
partitioned table may have a different position in the table being
attached. So, this code looks buggy. Probably we don't have a test
which tests this code with different attribute order between
partitioned table and the table being attached. I didn't get time to
actually construct a testcase and test it.
There seem to be couple of bugs here:
1. When partition's key attributes differ in ordering from the parent,
predicate_implied_by() will give up due to structural inequality of
Vars in the expressions. By fixing this, we can get it to return
'true' when it's really so.
2. As you said, we store partition's attribute numbers in the
not_null_attrs bitmap, but then check partattno (which is the parent's
attribute number which might differ) against the bitmap, which seems
like it might produce incorrect result. If, for example,
predicate_implied_by() set skip_validate to true, and the above code
failed to set skip_validate to false where it should have, then we
would wrongly end up skipping the scan. That is, rows in the partition
will contain null values whereas the partition constraint does not
allow it. It's hard to reproduce this currently, because with
different ordering of attributes, predicate_refute_by() never returns
true (as mentioned in 1 above), so skip_validate does not need to be
set to false again.
Consider this example:
create table p (a int, b char) partition by list (a);
create table p1 (b char not null, a int check (a in (1)));
insert into p1 values ('b', null);
Note that not_null_attrs for p1 will contain 1 corresponding to column b,
which matches key attribute of the parent, that is 1, corresponding to
column a. Hence we end up wrongly concluding that p1's partition key
column does not allow nulls.
I think this code could be removed entirely in light of commit
3ec76ff1f2cf52e9b900349957b42d28128b7bc7.
I am assuming you think that because now we emit IS NOT NULL constraint
internally for any partition keys that do not allow null values (including
all the keys of range partitions as of commit
3ec76ff1f2cf52e9b900349957b42d28128b7bc7). But those IS NOT NULL
constraint expressions are inconclusive as far as the application of
predicate_implied_by() to determine if we can skip the scan is concerned.
So even if predicate_implied_by() returned 'true', we cannot conclude,
just based on that result, that there are not any null values in the
partition keys.
The code in question is there to check if there are explicit NOT NULL
constraints on the partition keys. It cannot be true for expression keys,
so we give up on skip_validate in that case anyway. But if 1) there are
no expression keys, 2) all the partition key columns of the table being
attached have NOT NULL constraint, and 3) predicate_implied_by() returned
'true', we can skip the scan.
Thoughts?
I am working on a patch to fix the above mentioned issues and will post
the same no later than Friday.
Thanks,
Amit
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Jun 8, 2017 at 3:13 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
On 2017/06/08 1:44, Robert Haas wrote:
On Wed, Jun 7, 2017 at 7:47 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:In ATExecAttachPartition() there's following code
13715 partnatts = get_partition_natts(key);
13716 for (i = 0; i < partnatts; i++)
13717 {
13718 AttrNumber partattno;
13719
13720 partattno = get_partition_col_attnum(key, i);
13721
13722 /* If partition key is an expression, must not skip
validation */
13723 if (!partition_accepts_null &&
13724 (partattno == 0 ||
13725 !bms_is_member(partattno, not_null_attrs)))
13726 skip_validate = false;
13727 }partattno obtained from the partition key is the attribute number of
the partitioned table but not_null_attrs contains the attribute
numbers of attributes of the table being attached which have NOT NULL
constraint on them. But the attribute numbers of partitioned table and
the table being attached may not agree i.e. partition key attribute in
partitioned table may have a different position in the table being
attached. So, this code looks buggy. Probably we don't have a test
which tests this code with different attribute order between
partitioned table and the table being attached. I didn't get time to
actually construct a testcase and test it.There seem to be couple of bugs here:
1. When partition's key attributes differ in ordering from the parent,
predicate_implied_by() will give up due to structural inequality of
Vars in the expressions. By fixing this, we can get it to return
'true' when it's really so.2. As you said, we store partition's attribute numbers in the
not_null_attrs bitmap, but then check partattno (which is the parent's
attribute number which might differ) against the bitmap, which seems
like it might produce incorrect result. If, for example,
predicate_implied_by() set skip_validate to true, and the above code
failed to set skip_validate to false where it should have, then we
would wrongly end up skipping the scan. That is, rows in the partition
will contain null values whereas the partition constraint does not
allow it. It's hard to reproduce this currently, because with
different ordering of attributes, predicate_refute_by() never returns
true (as mentioned in 1 above), so skip_validate does not need to be
set to false again.Consider this example:
create table p (a int, b char) partition by list (a);
create table p1 (b char not null, a int check (a in (1)));
insert into p1 values ('b', null);Note that not_null_attrs for p1 will contain 1 corresponding to column b,
which matches key attribute of the parent, that is 1, corresponding to
column a. Hence we end up wrongly concluding that p1's partition key
column does not allow nulls.I think this code could be removed entirely in light of commit
3ec76ff1f2cf52e9b900349957b42d28128b7bc7.I am assuming you think that because now we emit IS NOT NULL constraint
internally for any partition keys that do not allow null values (including
all the keys of range partitions as of commit
3ec76ff1f2cf52e9b900349957b42d28128b7bc7). But those IS NOT NULL
constraint expressions are inconclusive as far as the application of
predicate_implied_by() to determine if we can skip the scan is concerned.
So even if predicate_implied_by() returned 'true', we cannot conclude,
just based on that result, that there are not any null values in the
partition keys.
I am not able to understand this. Are you saying that
predicate_implied_by() returns true even when it's not implied when
NOT NULL constraints are involved? That sounds like a bug in
predicate_implied_by(), which should be fixed instead of adding code
to pepper over it?
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017/06/08 19:25, Ashutosh Bapat wrote:
On Thu, Jun 8, 2017 at 3:13 PM, Amit Langote
I think this code could be removed entirely in light of commit
3ec76ff1f2cf52e9b900349957b42d28128b7bc7.I am assuming you think that because now we emit IS NOT NULL constraint
internally for any partition keys that do not allow null values (including
all the keys of range partitions as of commit
3ec76ff1f2cf52e9b900349957b42d28128b7bc7). But those IS NOT NULL
constraint expressions are inconclusive as far as the application of
predicate_implied_by() to determine if we can skip the scan is concerned.
So even if predicate_implied_by() returned 'true', we cannot conclude,
just based on that result, that there are not any null values in the
partition keys.I am not able to understand this. Are you saying that
predicate_implied_by() returns true even when it's not implied when
NOT NULL constraints are involved? That sounds like a bug in
predicate_implied_by(), which should be fixed instead of adding code
to pepper over it?
No, it's not a bug of predicate_implied_by(). I meant to say
predicate_implied_by() isn't exactly designed for ATExecAttachPartition's
purpose, especially its treatment of IS NOT NULL constraints is not
suitable for this application. To prove that the table cannot contain
NULLs when it shouldn't because of the partition constraint, we must look
for explicit NOT NULL constraints on the partition key columns, instead of
relying on the predicate_implied_by()'s proof. See the following
explanation for why that is so (or at least I think is so):
There is this text in the header comment of
predicate_implied_by_simple_clause(), which is where the individual pairs
of expressions are compared and/or passed to operator_predicate_proof(),
which mentions that the treatment of IS NOT NULL predicate is based on the
assumption that 'restrictions' that are passed to predicate_implied_by()
are a query's WHERE clause restrictions, *not* CHECK constraints that are
checked when inserting data into a table.
* When the predicate is of the form "foo IS NOT NULL", we can conclude that
* the predicate is implied if the clause is a strict operator or function
* that has "foo" as an input. In this case the clause must yield NULL when
* "foo" is NULL, which we can take as equivalent to FALSE because we know
* we are within an AND/OR subtree of a WHERE clause. (Again, "foo" is
* already known immutable, so the clause will certainly always fail.)
* Also, if the clause is just "foo" (meaning it's a boolean variable),
* the predicate is implied since the clause can't be true if "foo" is NULL.
As mentioned above, note the following part: which we can take as
equivalent to FALSE because we know we are within an AND/OR subtree of a
WHERE clause.
Which is not what we are passing to predicate_implied_by() in
ATExecAttachPartition(). We are passing it the table's CHECK constraint
clauses which behave differently for the NULL result on NULL input - they
*allow* the row to be inserted. Which means that there will be rows with
NULLs in the partition key, even if predicate_refuted_by() said that there
cannot be. We will end up *wrongly* skipping the validation scan if we
relied on just the predicate_refuted_by()'s result. That's why there is
code to check for explicit NOT NULL constraints on the partition key
columns. If there are, it's OK then to assume that all the partition
constraints are satisfied by existing constraints. One more thing: if any
partition key happens to be an expression, which there cannot be NOT NULL
constraints for, we just give up on skipping the scan, because we don't
have any declared knowledge about whether those keys are also non-null,
which we want for partitions that do not accept null values.
Does that make sense?
Thanks,
Amit
PS: Also interesting to note is the difference in behavior between
ExecQual() and ExecCheck() on NULL result.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017/06/08 18:43, Amit Langote wrote:
On 2017/06/08 1:44, Robert Haas wrote:
On Wed, Jun 7, 2017 at 7:47 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:In ATExecAttachPartition() there's following code
13715 partnatts = get_partition_natts(key);
13716 for (i = 0; i < partnatts; i++)
13717 {
13718 AttrNumber partattno;
13719
13720 partattno = get_partition_col_attnum(key, i);
13721
13722 /* If partition key is an expression, must not skip
validation */
13723 if (!partition_accepts_null &&
13724 (partattno == 0 ||
13725 !bms_is_member(partattno, not_null_attrs)))
13726 skip_validate = false;
13727 }partattno obtained from the partition key is the attribute number of
the partitioned table but not_null_attrs contains the attribute
numbers of attributes of the table being attached which have NOT NULL
constraint on them. But the attribute numbers of partitioned table and
the table being attached may not agree i.e. partition key attribute in
partitioned table may have a different position in the table being
attached. So, this code looks buggy. Probably we don't have a test
which tests this code with different attribute order between
partitioned table and the table being attached. I didn't get time to
actually construct a testcase and test it.There seem to be couple of bugs here:
1. When partition's key attributes differ in ordering from the parent,
predicate_implied_by() will give up due to structural inequality of
Vars in the expressions. By fixing this, we can get it to return
'true' when it's really so.2. As you said, we store partition's attribute numbers in the
not_null_attrs bitmap, but then check partattno (which is the parent's
attribute number which might differ) against the bitmap, which seems
like it might produce incorrect result. If, for example,
predicate_implied_by() set skip_validate to true, and the above code
failed to set skip_validate to false where it should have, then we
would wrongly end up skipping the scan. That is, rows in the partition
will contain null values whereas the partition constraint does not
allow it. It's hard to reproduce this currently, because with
different ordering of attributes, predicate_refute_by() never returns
true (as mentioned in 1 above), so skip_validate does not need to be
set to false again.Consider this example:
create table p (a int, b char) partition by list (a);
create table p1 (b char not null, a int check (a in (1)));
insert into p1 values ('b', null);Note that not_null_attrs for p1 will contain 1 corresponding to column b,
which matches key attribute of the parent, that is 1, corresponding to
column a. Hence we end up wrongly concluding that p1's partition key
column does not allow nulls.
[ ... ]
I am working on a patch to fix the above mentioned issues and will post
the same no later than Friday.
And here is the patch.
Thanks,
Amit
Attachments:
0001-Fixes-around-partition-constraint-handling-when-atta.patchtext/plain; charset=UTF-8; name=0001-Fixes-around-partition-constraint-handling-when-atta.patchDownload+113-60
On Fri, Jun 9, 2017 at 10:31 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
On 2017/06/08 19:25, Ashutosh Bapat wrote:
On Thu, Jun 8, 2017 at 3:13 PM, Amit Langote
I think this code could be removed entirely in light of commit
3ec76ff1f2cf52e9b900349957b42d28128b7bc7.I am assuming you think that because now we emit IS NOT NULL constraint
internally for any partition keys that do not allow null values (including
all the keys of range partitions as of commit
3ec76ff1f2cf52e9b900349957b42d28128b7bc7). But those IS NOT NULL
constraint expressions are inconclusive as far as the application of
predicate_implied_by() to determine if we can skip the scan is concerned.
So even if predicate_implied_by() returned 'true', we cannot conclude,
just based on that result, that there are not any null values in the
partition keys.I am not able to understand this. Are you saying that
predicate_implied_by() returns true even when it's not implied when
NOT NULL constraints are involved? That sounds like a bug in
predicate_implied_by(), which should be fixed instead of adding code
to pepper over it?No, it's not a bug of predicate_implied_by(). I meant to say
predicate_implied_by() isn't exactly designed for ATExecAttachPartition's
purpose, especially its treatment of IS NOT NULL constraints is not
suitable for this application. To prove that the table cannot contain
NULLs when it shouldn't because of the partition constraint, we must look
for explicit NOT NULL constraints on the partition key columns, instead of
relying on the predicate_implied_by()'s proof. See the following
explanation for why that is so (or at least I think is so):There is this text in the header comment of
predicate_implied_by_simple_clause(), which is where the individual pairs
of expressions are compared and/or passed to operator_predicate_proof(),
which mentions that the treatment of IS NOT NULL predicate is based on the
assumption that 'restrictions' that are passed to predicate_implied_by()
are a query's WHERE clause restrictions, *not* CHECK constraints that are
checked when inserting data into a table.* When the predicate is of the form "foo IS NOT NULL", we can conclude that
* the predicate is implied if the clause is a strict operator or function
* that has "foo" as an input. In this case the clause must yield NULL when
* "foo" is NULL, which we can take as equivalent to FALSE because we know
* we are within an AND/OR subtree of a WHERE clause. (Again, "foo" is
* already known immutable, so the clause will certainly always fail.)
* Also, if the clause is just "foo" (meaning it's a boolean variable),
* the predicate is implied since the clause can't be true if "foo" is NULL.As mentioned above, note the following part: which we can take as
equivalent to FALSE because we know we are within an AND/OR subtree of a
WHERE clause.Which is not what we are passing to predicate_implied_by() in
ATExecAttachPartition(). We are passing it the table's CHECK constraint
clauses which behave differently for the NULL result on NULL input - they
*allow* the row to be inserted. Which means that there will be rows with
NULLs in the partition key, even if predicate_refuted_by() said that there
cannot be. We will end up *wrongly* skipping the validation scan if we
relied on just the predicate_refuted_by()'s result. That's why there is
code to check for explicit NOT NULL constraints on the partition key
columns. If there are, it's OK then to assume that all the partition
constraints are satisfied by existing constraints. One more thing: if any
partition key happens to be an expression, which there cannot be NOT NULL
constraints for, we just give up on skipping the scan, because we don't
have any declared knowledge about whether those keys are also non-null,
which we want for partitions that do not accept null values.Does that make sense?
Thanks for the long explanation. I guess, this should be written in
comments somewhere in the code there. I see following comment in
ATExecAttachPartition()
--
*
* There is a case in which we cannot rely on just the result of the
* proof.
*/
--
I guess, this comment should be expanded to explain what you wrote above.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
May be we should pass a flag to predicate_implied_by() to handle NULL
behaviour for CHECK constraints. Partitioning has shown that it needs
to use predicate_implied_by() for comparing constraints and there may
be other cases that can come up in future. Instead of handling it
outside predicate_implied_by() we may want to change it under a flag.
On Fri, Jun 9, 2017 at 11:43 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
On 2017/06/08 18:43, Amit Langote wrote:
On 2017/06/08 1:44, Robert Haas wrote:
On Wed, Jun 7, 2017 at 7:47 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:In ATExecAttachPartition() there's following code
13715 partnatts = get_partition_natts(key);
13716 for (i = 0; i < partnatts; i++)
13717 {
13718 AttrNumber partattno;
13719
13720 partattno = get_partition_col_attnum(key, i);
13721
13722 /* If partition key is an expression, must not skip
validation */
13723 if (!partition_accepts_null &&
13724 (partattno == 0 ||
13725 !bms_is_member(partattno, not_null_attrs)))
13726 skip_validate = false;
13727 }partattno obtained from the partition key is the attribute number of
the partitioned table but not_null_attrs contains the attribute
numbers of attributes of the table being attached which have NOT NULL
constraint on them. But the attribute numbers of partitioned table and
the table being attached may not agree i.e. partition key attribute in
partitioned table may have a different position in the table being
attached. So, this code looks buggy. Probably we don't have a test
which tests this code with different attribute order between
partitioned table and the table being attached. I didn't get time to
actually construct a testcase and test it.There seem to be couple of bugs here:
1. When partition's key attributes differ in ordering from the parent,
predicate_implied_by() will give up due to structural inequality of
Vars in the expressions. By fixing this, we can get it to return
'true' when it's really so.2. As you said, we store partition's attribute numbers in the
not_null_attrs bitmap, but then check partattno (which is the parent's
attribute number which might differ) against the bitmap, which seems
like it might produce incorrect result. If, for example,
predicate_implied_by() set skip_validate to true, and the above code
failed to set skip_validate to false where it should have, then we
would wrongly end up skipping the scan. That is, rows in the partition
will contain null values whereas the partition constraint does not
allow it. It's hard to reproduce this currently, because with
different ordering of attributes, predicate_refute_by() never returns
true (as mentioned in 1 above), so skip_validate does not need to be
set to false again.Consider this example:
create table p (a int, b char) partition by list (a);
create table p1 (b char not null, a int check (a in (1)));
insert into p1 values ('b', null);Note that not_null_attrs for p1 will contain 1 corresponding to column b,
which matches key attribute of the parent, that is 1, corresponding to
column a. Hence we end up wrongly concluding that p1's partition key
column does not allow nulls.[ ... ]
I am working on a patch to fix the above mentioned issues and will post
the same no later than Friday.And here is the patch.
Thanks,
Amit
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017/06/09 20:47, Ashutosh Bapat wrote:
Thanks for the long explanation. I guess, this should be written in
comments somewhere in the code there. I see following comment in
ATExecAttachPartition()
--
*
* There is a case in which we cannot rely on just the result of the
* proof.
*/--
I guess, this comment should be expanded to explain what you wrote above.
Tried in the attached updated patch.
Thanks,
Amit
Attachments:
0001-Fixes-around-partition-constraint-handling-when-atta.patchtext/plain; charset=UTF-8; name=0001-Fixes-around-partition-constraint-handling-when-atta.patchDownload+126-60
On 2017/06/09 20:49, Ashutosh Bapat wrote:
May be we should pass a flag to predicate_implied_by() to handle NULL
behaviour for CHECK constraints. Partitioning has shown that it needs
to use predicate_implied_by() for comparing constraints and there may
be other cases that can come up in future. Instead of handling it
outside predicate_implied_by() we may want to change it under a flag.
IMHO, it may not be a good idea to modify predtest.c to suit the
partitioning code's needs. The workaround of checking that NOT NULL
constraints on partitioning columns exist seems to me to be simpler than
hacking predtest.c to teach it about the new behavior.
Thanks,
Amit
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Jun 12, 2017 at 4:09 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
On 2017/06/09 20:49, Ashutosh Bapat wrote:
May be we should pass a flag to predicate_implied_by() to handle NULL
behaviour for CHECK constraints. Partitioning has shown that it needs
to use predicate_implied_by() for comparing constraints and there may
be other cases that can come up in future. Instead of handling it
outside predicate_implied_by() we may want to change it under a flag.IMHO, it may not be a good idea to modify predtest.c to suit the
partitioning code's needs. The workaround of checking that NOT NULL
constraints on partitioning columns exist seems to me to be simpler than
hacking predtest.c to teach it about the new behavior.
On the plus side, it might also work correctly. I mean, the problem
with what you've done here is that (a) you're completely giving up on
expressions as partition keys and (b) even if no expressions are used
for partitioning, you're still giving up unless there are NOT NULL
constraints on the partitions. Now, maybe that doesn't sound so bad,
but what it means is that if you copy-and-paste the partition
constraint into a CHECK constraint on a new table, you can't skip the
validation scan when attaching it:
rhaas=# create table foo (a int, b text) partition by range (a);
CREATE TABLE
rhaas=# create table foo1 partition of foo for values from (0) to (10);
CREATE TABLE
rhaas=# \d+ foo1
Table "public.foo1"
Column | Type | Collation | Nullable | Default | Storage | Stats
target | Description
--------+---------+-----------+----------+---------+----------+--------------+-------------
a | integer | | | | plain | |
b | text | | | | extended | |
Partition of: foo FOR VALUES FROM (0) TO (10)
Partition constraint: ((a IS NOT NULL) AND (a >= 0) AND (a < 10))
rhaas=# drop table foo1;
DROP TABLE
rhaas=# create table foo1 (like foo, check ((a IS NOT NULL) AND (a >=
0) AND (a < 10)));
CREATE TABLE
rhaas=# alter table foo attach partition foo1 for values from (0) to (10);
ALTER TABLE
I think that's going to come as an unpleasant surprise to more than
one user. I'm not sure exactly how we need to restructure things here
so that this works properly, and maybe modifying
predicate_implied_by() isn't the right thing at all; for instance,
there's also predicate_refuted_by(), which maybe could be used in some
way (inject NOT?). But I don't much like the idea that you copy and
paste the partitioning constraint into a CHECK constraint and that
doesn't work. That's not cool.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Jun 13, 2017 at 10:24 AM, Robert Haas <robertmhaas@gmail.com> wrote:
I think that's going to come as an unpleasant surprise to more than
one user. I'm not sure exactly how we need to restructure things here
so that this works properly, and maybe modifying
predicate_implied_by() isn't the right thing at all; for instance,
there's also predicate_refuted_by(), which maybe could be used in some
way (inject NOT?). But I don't much like the idea that you copy and
paste the partitioning constraint into a CHECK constraint and that
doesn't work. That's not cool.
OK, I think I see the problem here. predicate_implied_by() and
predicate_refuted_by() differ in what they assume about the predicate
evaluating to NULL, but both of them assume that if the clause
evaluates to NULL, that's equivalent to false. So there's actually no
option to get the behavior we want here, which is to treat both
operands using CHECK-semantics (null is true) rather than
WHERE-semantics (null is false).
Given that, Ashutosh's proposal of passing an additional flag to
predicate_implied_by() seems like the best option. Here's a patch
implementing that.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
extend-predicate-implied-by-v1.patchapplication/octet-stream; name=extend-predicate-implied-by-v1.patchDownload+75-98
Robert Haas <robertmhaas@gmail.com> writes:
OK, I think I see the problem here. predicate_implied_by() and
predicate_refuted_by() differ in what they assume about the predicate
evaluating to NULL, but both of them assume that if the clause
evaluates to NULL, that's equivalent to false. So there's actually no
option to get the behavior we want here, which is to treat both
operands using CHECK-semantics (null is true) rather than
WHERE-semantics (null is false).
Given that, Ashutosh's proposal of passing an additional flag to
predicate_implied_by() seems like the best option. Here's a patch
implementing that.
I've not reviewed the logic changes in predtest.c in detail, but
I think this is a reasonable direction to go in. Two suggestions:
1. predicate_refuted_by() should grow the extra argument at the
same time. There's no good reason to be asymmetric.
2. It might be clearer, and would certainly be shorter, to call the
extra argument something like "null_is_true".
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Jun 13, 2017 at 5:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
OK, I think I see the problem here. predicate_implied_by() and
predicate_refuted_by() differ in what they assume about the predicate
evaluating to NULL, but both of them assume that if the clause
evaluates to NULL, that's equivalent to false. So there's actually no
option to get the behavior we want here, which is to treat both
operands using CHECK-semantics (null is true) rather than
WHERE-semantics (null is false).Given that, Ashutosh's proposal of passing an additional flag to
predicate_implied_by() seems like the best option. Here's a patch
implementing that.I've not reviewed the logic changes in predtest.c in detail, but
I think this is a reasonable direction to go in. Two suggestions:1. predicate_refuted_by() should grow the extra argument at the
same time. There's no good reason to be asymmetric.
OK.
2. It might be clearer, and would certainly be shorter, to call the
extra argument something like "null_is_true".
I think it's pretty darn important to make it clear that the argument
only applies to the clauses supplied as axioms, and not to the
predicate to be proven; if you want to control how the *predicate* is
handled with respect to nulls, change your selection as among
predicate_implied_by() and predicate_refuted_by(). For that reason, I
disesteem null_is_true, but I'm open to other suggestions.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017/06/13 23:24, Robert Haas wrote:
On Mon, Jun 12, 2017 at 4:09 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:On 2017/06/09 20:49, Ashutosh Bapat wrote:
May be we should pass a flag to predicate_implied_by() to handle NULL
behaviour for CHECK constraints. Partitioning has shown that it needs
to use predicate_implied_by() for comparing constraints and there may
be other cases that can come up in future. Instead of handling it
outside predicate_implied_by() we may want to change it under a flag.IMHO, it may not be a good idea to modify predtest.c to suit the
partitioning code's needs. The workaround of checking that NOT NULL
constraints on partitioning columns exist seems to me to be simpler than
hacking predtest.c to teach it about the new behavior.On the plus side, it might also work correctly. I mean, the problem
with what you've done here is that (a) you're completely giving up on
expressions as partition keys and (b) even if no expressions are used
for partitioning, you're still giving up unless there are NOT NULL
constraints on the partitions. Now, maybe that doesn't sound so bad,
but what it means is that if you copy-and-paste the partition
constraint into a CHECK constraint on a new table, you can't skip the
validation scan when attaching it:rhaas=# create table foo (a int, b text) partition by range (a);
CREATE TABLE
rhaas=# create table foo1 partition of foo for values from (0) to (10);
CREATE TABLE
rhaas=# \d+ foo1
Table "public.foo1"
Column | Type | Collation | Nullable | Default | Storage | Stats
target | Description
--------+---------+-----------+----------+---------+----------+--------------+-------------
a | integer | | | | plain | |
b | text | | | | extended | |
Partition of: foo FOR VALUES FROM (0) TO (10)
Partition constraint: ((a IS NOT NULL) AND (a >= 0) AND (a < 10))rhaas=# drop table foo1;
DROP TABLE
rhaas=# create table foo1 (like foo, check ((a IS NOT NULL) AND (a >=
0) AND (a < 10)));
CREATE TABLE
rhaas=# alter table foo attach partition foo1 for values from (0) to (10);
ALTER TABLEI think that's going to come as an unpleasant surprise to more than
one user. I'm not sure exactly how we need to restructure things here
so that this works properly, and maybe modifying
predicate_implied_by() isn't the right thing at all; for instance,
there's also predicate_refuted_by(), which maybe could be used in some
way (inject NOT?). But I don't much like the idea that you copy and
paste the partitioning constraint into a CHECK constraint and that
doesn't work. That's not cool.
I agree with this argument. I just tried the patch you posted in the
other email and I like how easy it makes the life for users in that they
can just look at the partition constraint of an existing partition (thanks
to 1848b73d45!) and frame the check constraint of the new partition to
attach accordingly.
IOW, +1 from me to the Ashutosh's idea.
Thanks,
Amit
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017/06/14 5:36, Robert Haas wrote:
On Tue, Jun 13, 2017 at 10:24 AM, Robert Haas <robertmhaas@gmail.com> wrote:
I think that's going to come as an unpleasant surprise to more than
one user. I'm not sure exactly how we need to restructure things here
so that this works properly, and maybe modifying
predicate_implied_by() isn't the right thing at all; for instance,
there's also predicate_refuted_by(), which maybe could be used in some
way (inject NOT?). But I don't much like the idea that you copy and
paste the partitioning constraint into a CHECK constraint and that
doesn't work. That's not cool.OK, I think I see the problem here. predicate_implied_by() and
predicate_refuted_by() differ in what they assume about the predicate
evaluating to NULL, but both of them assume that if the clause
evaluates to NULL, that's equivalent to false. So there's actually no
option to get the behavior we want here, which is to treat both
operands using CHECK-semantics (null is true) rather than
WHERE-semantics (null is false).Given that, Ashutosh's proposal of passing an additional flag to
predicate_implied_by() seems like the best option. Here's a patch
implementing that.
I tried this patch and it seems to work correctly.
Some comments on the patch itself:
The following was perhaps included for debugging?
+#include "nodes/print.h"
I think the following sentence in a comment near the affected code in
ATExecAttachPartition() needs to be removed.
*
* There is a case in which we cannot rely on just the result of the
* proof.
We no longer need the Bitmapset not_null_attrs. So, the line declaring it
and the following line can be removed:
not_null_attrs = bms_add_member(not_null_attrs, i);
I thought I would make these changes myself and send the v2, but realized
that you might be updating it yourself based on Tom's comments, so didn't.
By the way, I mentioned an existing problem in one of the earlier emails
on this thread about differing attribute numbers in the table being
attached causing predicate_implied_by() to give up due to structural
inequality of Vars. To clarify: table's check constraints will bear the
table's attribute numbers whereas the partition constraint generated using
get_qual_for_partbound() (the predicate) bears the parent's attribute
numbers. That results in Var arguments of the expressions passed to
predicate_implied_by() not matching and causing the latter to return
failure prematurely. Attached find a patch to fix that that applies on
top of your patch (added a test too).
Thanks,
Amit
Attachments:
ATExecAttachPartition-correct-varattnos-partconstraint-v1.patchtext/plain; charset=UTF-8; name=ATExecAttachPartition-correct-varattnos-partconstraint-v1.patchDownload+31-1
PFA patch set addressing comments by Tom and Amit.
0001 is same as Robert's patch.
On Wed, Jun 14, 2017 at 7:20 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Jun 13, 2017 at 5:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
OK, I think I see the problem here. predicate_implied_by() and
predicate_refuted_by() differ in what they assume about the predicate
evaluating to NULL, but both of them assume that if the clause
evaluates to NULL, that's equivalent to false. So there's actually no
option to get the behavior we want here, which is to treat both
operands using CHECK-semantics (null is true) rather than
WHERE-semantics (null is false).Given that, Ashutosh's proposal of passing an additional flag to
predicate_implied_by() seems like the best option. Here's a patch
implementing that.I've not reviewed the logic changes in predtest.c in detail, but
I think this is a reasonable direction to go in. Two suggestions:1. predicate_refuted_by() should grow the extra argument at the
same time. There's no good reason to be asymmetric.OK.
0002 has these changes.
2. It might be clearer, and would certainly be shorter, to call the
extra argument something like "null_is_true".I think it's pretty darn important to make it clear that the argument
only applies to the clauses supplied as axioms, and not to the
predicate to be proven; if you want to control how the *predicate* is
handled with respect to nulls, change your selection as among
predicate_implied_by() and predicate_refuted_by(). For that reason, I
disesteem null_is_true, but I'm open to other suggestions.
The extern functions viz. predicate_refuted_by() and
predicate_implied_by() both accept restrictinfo_list and so the new
argument gets name restrictinfo_is_check, which is fine. But the
static minions have the corresponding argument named clause but the
new argument isn't named clause_is_check. I think it would be better
to be consistent everywhere and use either clause or restrictinfo.
0004 patch does that, it renames restrictinfo_list as clause_list and
the boolean argument as clause_is_check.
0003 addresses comments by Amit Langote.
In your original patch, if restrictinfo_is_check is true, we will call
operator_predicate_proof(), which does not handle anything other than
an operator expression. So, it's going to return false from that
function. Looking at the way argisrow is handled in that function, it
looks like we don't want to pass NullTest expression to
operator_predicate_proof(). 0005 handles the boolean flag in the same
way as argisrow is handled.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Attachments:
On Wed, Jun 14, 2017 at 9:20 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
By the way, I mentioned an existing problem in one of the earlier emails
on this thread about differing attribute numbers in the table being
attached causing predicate_implied_by() to give up due to structural
inequality of Vars. To clarify: table's check constraints will bear the
table's attribute numbers whereas the partition constraint generated using
get_qual_for_partbound() (the predicate) bears the parent's attribute
numbers. That results in Var arguments of the expressions passed to
predicate_implied_by() not matching and causing the latter to return
failure prematurely. Attached find a patch to fix that that applies on
top of your patch (added a test too).
+ * Adjust the generated constraint to match this partition's attribute
+ * numbers. Save the original to be used later if we decide to proceed
+ * with the validation scan after all.
+ */
+ partConstraintOrig = copyObject(partConstraint);
+ partConstraint = map_partition_varattnos(partConstraint, 1, attachRel,
+ rel);
+
If the partition has different column order than the parent, its heap will also
have different column order. I am not able to understand the purpose of using
original constraints for validation using scan. Shouldn't we just use the
mapped constraint expressions?
BTW I liked the idea; this way we can keep part_6 in sync with list_parted2
even when the later changes and still manage to have different order of
attributes. Although the CHECK still assumes that there is a column "a" but
that's fine I guess.
+CREATE TABLE part_6 (
+ c int,
+ LIKE list_parted2,
+ CONSTRAINT check_a CHECK (a IS NOT NULL AND a = 6)
+);
+ALTER TABLE part_6 DROP c;
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Jun 14, 2017 at 6:15 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
PFA patch set addressing comments by Tom and Amit.
LGTM. Committed.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Thanks for taking a look.
On 2017/06/14 20:06, Ashutosh Bapat wrote:
On Wed, Jun 14, 2017 at 9:20 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:By the way, I mentioned an existing problem in one of the earlier emails
on this thread about differing attribute numbers in the table being
attached causing predicate_implied_by() to give up due to structural
inequality of Vars. To clarify: table's check constraints will bear the
table's attribute numbers whereas the partition constraint generated using
get_qual_for_partbound() (the predicate) bears the parent's attribute
numbers. That results in Var arguments of the expressions passed to
predicate_implied_by() not matching and causing the latter to return
failure prematurely. Attached find a patch to fix that that applies on
top of your patch (added a test too).+ * Adjust the generated constraint to match this partition's attribute + * numbers. Save the original to be used later if we decide to proceed + * with the validation scan after all. + */ + partConstraintOrig = copyObject(partConstraint); + partConstraint = map_partition_varattnos(partConstraint, 1, attachRel, + rel); + If the partition has different column order than the parent, its heap will also have different column order. I am not able to understand the purpose of using original constraints for validation using scan. Shouldn't we just use the mapped constraint expressions?
Actually, I dropped the approach of using partConstraintOrig altogether
from the latest updated patch. I will explain the problem I was trying to
solve with that approach, which is now replaced in the new patch by, I
think, a more correct solution.
If we end up having to perform the validation scan and the table being
attached is a partitioned table, we will scan its leaf partitions. Each
of those leaf partitions may have different attribute numbers for the
partitioning columns, so we will need to do the mapping, which actually we
do even today. With this patch however, we apply mapping to the generated
partition constraint so that it no longer bears the original parent's
attribute numbers but those of the table being attached. Down below where
we map to the leaf partition's attribute numbers, we still pass the root
partitioned table as the parent. But it may so happen that the attnos
appearing in the Vars can no longer be matched with any of the root
table's attribute numbers, resulting in the following code in
map_variable_attnos_mutator() to trigger an error:
if (attno > context->map_length || context->attno_map[attno - 1] == 0)
elog(ERROR, "unexpected varattno %d in expression to be mapped",
attno);
Consider this example:
root: (a, b, c) partition by list (a)
intermediate: (b, c, ..dropped.., a) partition by list (b)
leaf: (b, c, a) partition of intermediate
When attaching intermediate to root, we will generate the partition
constraint and after mapping, its Vars will have attno = 4. When trying
to map the same for leaf, we currently do map_partition_varattnos(expr, 1,
leaf, root). So, the innards of map_variable_attnos will try to look for
an attribute with attno = 4 in root which there isn't, so the above error
will occur. We should really be passing intermediate as parent to the
mapping routine. With the previous patch's approach, we would pass root
as the parent along with partConstraintOrig which would bear the root
parent's attnos.
Please find attached the updated patch. In addition to the already
described fixes, the patch also rearranges code so that a redundant AT
work queue entry is avoided. (Currently, we end up creating one for
attachRel even if it's partitioned, although it's harmless because
ATRewriteTables() knows to skip partitioned tables.)
Thanks,
Amit