using expression syntax for partition bounds (was: Re: Boolean partitions syntax)

Started by Amit Langotealmost 8 years ago26 messageshackers
Jump to latest
#1Amit Langote
Langote_Amit_f8@lab.ntt.co.jp

(patch and discussion for PG 12)

On 2018/04/22 1:28, Tom Lane wrote:

Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:

[ v8-0001-Allow-generalized-expression-syntax-for-partition.patch ]

I find what you did to a_expr here to be pretty horrid.

Thanks for the review.

I think what you should do is lose the partbound_datum and
PartitionRangeDatum productions altogether, replacing those with just
a_expr, as in the attached grammar-only patch. This would result in
needing to identify MINVALUE and MAXVALUE during parse analysis, since
the grammar would just treat them as ColId expressions. But since we're
not intending to ever allow any actual column references in partition
expressions, I don't see any harm in allowing parse analysis to treat
ColumnRefs containing those names as meaning the special items.

I have to agree this is better.

This is a little bit grotty, in that both MINVALUE and "minvalue" would
be recognized as the keyword, but it's sure a lot less messy than what's
there now. And IIRC there are some other places where we're a bit
squishy about the difference between identifiers and keywords, too.

Hmm, yes.

I tried to update the patch to do things that way. I'm going to create a
new entry in the next CF titled "generalized expression syntax for
partition bounds" and add the patch there.

Thanks,
Amit

Attachments:

v1-0001-Allow-generalized-expression-syntax-for-partition.patchtext/plain; charset=UTF-8; name=v1-0001-Allow-generalized-expression-syntax-for-partition.patchDownload+142-115
#2Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#1)
Re: using expression syntax for partition bounds (was: Re: Boolean partitions syntax)

On 2018/04/23 11:37, Amit Langote wrote:

I tried to update the patch to do things that way. I'm going to create a
new entry in the next CF titled "generalized expression syntax for
partition bounds" and add the patch there.

Tweaked the commit message to credit all the authors.

Thanks,
Amit

Attachments:

v2-0001-Allow-generalized-expression-syntax-for-partition.patchtext/plain; charset=UTF-8; name=v2-0001-Allow-generalized-expression-syntax-for-partition.patchDownload+142-115
#3Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Amit Langote (#2)
Re: using expression syntax for partition bounds

Thanks. I have almost missed this.

At Mon, 23 Apr 2018 11:44:14 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in <e8c770eb-a850-6645-8310-f3eaea3a72fd@lab.ntt.co.jp>

On 2018/04/23 11:37, Amit Langote wrote:

I tried to update the patch to do things that way. I'm going to create a
new entry in the next CF titled "generalized expression syntax for
partition bounds" and add the patch there.

Tweaked the commit message to credit all the authors.

+      any variable-free expression (subqueries, window functions, aggregate
+      functions, and set-returning functions are not allowed).  The data type
+      of the partition bound expression must match the data type of the
+      corresponding partition key column.

Parititioning value is slimiar to column default expression in
the sense that it appeas within a DDL. The description for
DEFAULT is:

| The value is any variable-free expression (subqueries and
| cross-references to other columns in the current table are not
| allowed)

It actually refuses aggregates, window functions and SRFs but it
is not mentioned. Whichever we choose, they ought to have the
similar description.

/*
* Strip any top-level COLLATE clause, as they're inconsequential.
* XXX - Should we add a WARNING here?
*/
while (IsA(value, CollateExpr))
value = (Node *) ((CollateExpr *) value)->arg;

Ok, I'll follow Tom's suggestion but collate is necessarilly
appears in this shape. For example ('a' collate "de_DE") || 'b')
has the collate node under the top ExprOp and we get a complaint
like following with it.

ERROR: unrecognized node type: 123

123 is T_CollateExpr. The expression "value" (mmm..) reuires
eval_const_expressions to eliminate CollateExprs. It requires
assign_expr_collations to retrieve value's collation but we don't
do that since we ignore collation this in this case.

The following results in a strange-looking error.

=# create table pt1 partition of p for values in (a);
ERROR: column "a" does not exist
LINE 1: create table pt1 partition of p for values in (a);

The p/pt1 actually have a column a.

The following results in similar error and it is wrong, too.

=# create table pr1 partition of pr for values from (a + 1) to (a + 2);
ERROR: column "a" does not exist
LINE 1: create table pr1 partition of pr for values from (a + 1) to ...

Being similar but a bit different, the following command leads to
a SEGV since it creates PARTITION_RANGE_DATUM_VALUE with value =
NULL. Even if it leaves the original node, validateInfiniteBounds
rejects it and aborts.

=# create table pr1 partition of pr for values from (hoge) to (hige);

(crash)

I fixed this using pre-columnref hook in the attached v3. This
behavles for columnrefs differently than DEFAULT.

The v3-2 does the same thing with DEFAULT. The behevior differs
whether the column exists or not.

ERROR: cannot use column referencees in bound value
ERROR: column "b" does not exist

I personally think that such similarity between DEFALUT and
partition bound (v3-2) is not required. But inserting the hook
(v3) doesn't look good for me.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

v3-0001-Allow-generalized-expression-syntax-for-partition.patchtext/x-patch; charset=us-asciiDownload+157-108
v3-2-0001-Allow-generalized-expression-syntax-for-partition.patchtext/x-patch; charset=us-asciiDownload+182-108
#4Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Kyotaro Horiguchi (#3)
Re: using expression syntax for partition bounds

Horiguchi-san,

Thanks a lot for the review and sorry it took me a while to reply.
Thought I'd refresh the patch as it's in the July CF.

On 2018/04/24 18:08, Kyotaro HORIGUCHI wrote:

Thanks. I have almost missed this.

At Mon, 23 Apr 2018 11:44:14 +0900, Amit Langote wrote:

On 2018/04/23 11:37, Amit Langote wrote:

I tried to update the patch to do things that way. I'm going to create a
new entry in the next CF titled "generalized expression syntax for
partition bounds" and add the patch there.

Tweaked the commit message to credit all the authors.

+      any variable-free expression (subqueries, window functions, aggregate
+      functions, and set-returning functions are not allowed).  The data type
+      of the partition bound expression must match the data type of the
+      corresponding partition key column.

Parititioning value is slimiar to column default expression in
the sense that it appeas within a DDL. The description for
DEFAULT is:

| The value is any variable-free expression (subqueries and
| cross-references to other columns in the current table are not
| allowed)

It actually refuses aggregates, window functions and SRFs but it
is not mentioned. Whichever we choose, they ought to have the
similar description.

Actually, I referenced the default value expression syntax a lot when
working on this issue, so agree that there's a close resemblance.

Maybe, we should fix the description for default expression to say that it
can't contain subqueries, cross-references to other columns in the table,
aggregate expressions, window functions, and set-returning functions. I
also sent a patch separately:

/messages/by-id/2059e8f2-e6e6-7a9f-0de8-11eed291e641@lab.ntt.co.jp

/*
* Strip any top-level COLLATE clause, as they're inconsequential.
* XXX - Should we add a WARNING here?
*/
while (IsA(value, CollateExpr))
value = (Node *) ((CollateExpr *) value)->arg;

Ok, I'll follow Tom's suggestion but collate is necessarilly
appears in this shape. For example ('a' collate "de_DE") || 'b')
has the collate node under the top ExprOp and we get a complaint
like following with it.

ERROR: unrecognized node type: 123

123 is T_CollateExpr. The expression "value" (mmm..) reuires
eval_const_expressions to eliminate CollateExprs. It requires
assign_expr_collations to retrieve value's collation but we don't
do that since we ignore collation this in this case.

Ah yes, it seems better to call eval_const_expressions as a first step to
get rid of CollateExpr's, followed by evaluate_expr if the first step
didn't return a Const node.

The following results in a strange-looking error.

=# create table pt1 partition of p for values in (a);
ERROR: column "a" does not exist
LINE 1: create table pt1 partition of p for values in (a);

The p/pt1 actually have a column a.

The following results in similar error and it is wrong, too.

=# create table pr1 partition of pr for values from (a + 1) to (a + 2);
ERROR: column "a" does not exist
LINE 1: create table pr1 partition of pr for values from (a + 1) to ...

This one is better fixed by initializing ParseState properly as
demonstrated in your v3-2 patch.

Being similar but a bit different, the following command leads to
a SEGV since it creates PARTITION_RANGE_DATUM_VALUE with value =
NULL. Even if it leaves the original node, validateInfiniteBounds
rejects it and aborts.

=# create table pr1 partition of pr for values from (hoge) to (hige);

(crash)

Oops.

I fixed this using pre-columnref hook in the attached v3. This
behavles for columnrefs differently than DEFAULT.

The v3-2 does the same thing with DEFAULT. The behevior differs
whether the column exists or not.

ERROR: cannot use column referencees in bound value
ERROR: column "b" does not exist

I personally think that such similarity between DEFALUT and
partition bound (v3-2) is not required. But inserting the hook
(v3) doesn't look good for me.

Actually, I too like the solution of v3-2 better, instead of using the
hook, so I adopted it in the updated patch.

I also changed how range bounds are processed in transformPartitionBound,
moving some code into newly added transformPartitionRangeBounds to reduce
code duplication.

Updated patch attached.

Thanks,
Amit

Attachments:

v4-0001-Allow-generalized-expression-syntax-for-partition.patchtext/plain; charset=UTF-8; name=v4-0001-Allow-generalized-expression-syntax-for-partition.patchDownload+221-163
#5Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Amit Langote (#4)
Re: using expression syntax for partition bounds

Hello.

cf-bot compained on this but I wondered why it got so many
errors. At the first look I found a spare semicolon before a bare
then calause:p

-	if (!IsA(value, Const));
+	if (!IsA(value, Const))
elog(ERROR, "could not evaluate partition bound expression");

The attached is the fixed and rebsaed version.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

v5-0001-Allow-generalized-expression-syntax-for-partition-bo.patchtext/x-patch; charset=us-asciiDownload+221-163
#6Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Kyotaro Horiguchi (#5)
Re: using expression syntax for partition bounds

Horiguchi-san,

On 2018/07/06 14:26, Kyotaro HORIGUCHI wrote:

Hello.

cf-bot compained on this but I wondered why it got so many
errors. At the first look I found a spare semicolon before a bare
then calause:p

-	if (!IsA(value, Const));
+	if (!IsA(value, Const))
elog(ERROR, "could not evaluate partition bound expression");

The attached is the fixed and rebsaed version.

Thanks for taking care of that.

Regards,
Amit

#7Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#6)
Re: using expression syntax for partition bounds

Rebased due to change in addRangeTableEntryForRelation's API.

Thanks,
Amit

Attachments:

v6-0001-Allow-generalized-expression-syntax-for-partition.patchtext/plain; charset=UTF-8; name=v6-0001-Allow-generalized-expression-syntax-for-partition.patchDownload+222-163
#8Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#7)
Re: using expression syntax for partition bounds

On 2018/11/09 14:38, Amit Langote wrote:

Rebased due to change in addRangeTableEntryForRelation's API.

Rebased again due to changes in psql's describe output for partitioned tables.

Thanks,
Amit

Attachments:

v7-0001-Allow-generalized-expression-syntax-for-partition.patchtext/plain; charset=UTF-8; name=v7-0001-Allow-generalized-expression-syntax-for-partition.patchDownload+222-163
#9Peter Eisentraut
peter_e@gmx.net
In reply to: Amit Langote (#8)
Re: using expression syntax for partition bounds

On 26/11/2018 05:58, Amit Langote wrote:

On 2018/11/09 14:38, Amit Langote wrote:

Rebased due to change in addRangeTableEntryForRelation's API.

Rebased again due to changes in psql's describe output for partitioned tables.

Review:

Is "partition bound" the right term? For list partitioning, it's not
really a bound. Maybe it's OK.

Keep the ordering of EXPR_KIND_PARTITION_BOUND in the various switch
statements consistent.

I don't see any treatment of non-immutable expressions. There is a test
case with a non-immutable cast that was removed, but that's all. There
was considerable discussion earlier in the thread on whether
non-immutable expressions should be allowed. I'm not sure what the
resolution was, but in any case there should be documentation and tests
of the outcome.

The collation handling might need another look. The following is
allowed without any diagnostic:

CREATE TABLE t2 (
a text COLLATE "POSIX"
) PARTITION BY RANGE (a);
CREATE TABLE t2a PARTITION OF t2 FOR VALUES FROM ('foo' COLLATE "C") TO
('xyz');

I think the correct behavior would be that an explicit collation in the
partition bound expression is an error.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#10Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Peter Eisentraut (#9)
Re: using expression syntax for partition bounds

Thanks for the review and sorry it took me a while to reply.

On 2019/01/02 22:58, Peter Eisentraut wrote:

On 26/11/2018 05:58, Amit Langote wrote:

On 2018/11/09 14:38, Amit Langote wrote:

Rebased due to change in addRangeTableEntryForRelation's API.

Rebased again due to changes in psql's describe output for partitioned tables.

Review:

Is "partition bound" the right term? For list partitioning, it's not
really a bound. Maybe it's OK.

Hmm, maybe "partition bound value"? Just want to stress that the
expression specifies "bounding" value of a partition.

Keep the ordering of EXPR_KIND_PARTITION_BOUND in the various switch
statements consistent.

OK, fixed.

I don't see any treatment of non-immutable expressions. There is a test
case with a non-immutable cast that was removed, but that's all. There
was considerable discussion earlier in the thread on whether
non-immutable expressions should be allowed. I'm not sure what the
resolution was, but in any case there should be documentation and tests
of the outcome.

The partition bound expression is evaluated only once, that is, when the
partition creation command is executed, and what gets stored in the
catalog is a Const expression that contains the value that was computed,
not the original non-immutable expression. So, we don't need to do
anything special in terms of code to handle users possibly specifying a
non-immutable expression as partition bound. Tom said something in that
thread which seems to support such a position:

/messages/by-id/22534.1523374457@sss.pgh.pa.us

Part of the test case that was removed is the error that used to be emitted:

CREATE TABLE moneyp (
a money
) PARTITION BY LIST (a);
CREATE TABLE moneyp_10 PARTITION OF moneyp FOR VALUES IN (10);
-ERROR: specified value cannot be cast to type money for column "a"
-LINE 1: ...EATE TABLE moneyp_10 PARTITION OF moneyp FOR VALUES IN (10);
- ^
-DETAIL: The cast requires a non-immutable conversion.
-HINT: Try putting the literal value in single quotes.
-CREATE TABLE moneyp_10 PARTITION OF moneyp FOR VALUES IN ('10');

which is no longer emitted, because the patched
transformPartitionBoundValue evaluates the expression, instead of emitting
error because the expression hasn't become a Const even after applying
eval_const_expressions().

Do we need to mention any aspect of how this now works in the user-facing
documentation?

The collation handling might need another look. The following is
allowed without any diagnostic:

CREATE TABLE t2 (
a text COLLATE "POSIX"
) PARTITION BY RANGE (a);
CREATE TABLE t2a PARTITION OF t2 FOR VALUES FROM ('foo' COLLATE "C") TO
('xyz');

I think the correct behavior would be that an explicit collation in the
partition bound expression is an error.

I tend to agree with that. What happens is the partition bound expression
is assigned the collation of the parent's partition key:

+ partcollation = get_partition_col_collation(key, 0);

+            value = transformPartitionBoundValue(pstate, expr,
+                                                 colname, coltype, coltypmod,
+                                                 partcollation);

But that's essentially ignoring the collation specified by the user for
the partition bound value without providing any information about that to
the user. I've fixed that by explicitly checking if the collate clause in
the partition bound value expression contains the same collation as the
parent partition key collation and give an error otherwise.

Updated patch attached.

Thanks,
Amit

Attachments:

v8-0001-Allow-generalized-expression-syntax-for-partition.patchtext/plain; charset=UTF-8; name=v8-0001-Allow-generalized-expression-syntax-for-partition.patchDownload+265-161
#11Peter Eisentraut
peter_e@gmx.net
In reply to: Amit Langote (#10)
Re: using expression syntax for partition bounds

On 15/01/2019 07:31, Amit Langote wrote:

Is "partition bound" the right term? For list partitioning, it's not
really a bound. Maybe it's OK.

Hmm, maybe "partition bound value"? Just want to stress that the
expression specifies "bounding" value of a partition.

I was more concerned about the term "bound", which it is not range
partitioning. But I can't think of a better term.

I wouldn't change expr to value as you have done between v7 and v8,
since the point of this patch is to make expressions valid where
previously only values were.

I don't see any treatment of non-immutable expressions. There is a test
case with a non-immutable cast that was removed, but that's all. There
was considerable discussion earlier in the thread on whether
non-immutable expressions should be allowed. I'm not sure what the
resolution was, but in any case there should be documentation and tests
of the outcome.

The partition bound expression is evaluated only once, that is, when the
partition creation command is executed, and what gets stored in the
catalog is a Const expression that contains the value that was computed,
not the original non-immutable expression. So, we don't need to do
anything special in terms of code to handle users possibly specifying a
non-immutable expression as partition bound. Tom said something in that
thread which seems to support such a position:

/messages/by-id/22534.1523374457@sss.pgh.pa.us

OK, if we are going with that approach, then it needs to be documented
and there should be test cases.

The collation handling might need another look. The following is
allowed without any diagnostic:

CREATE TABLE t2 (
a text COLLATE "POSIX"
) PARTITION BY RANGE (a);
CREATE TABLE t2a PARTITION OF t2 FOR VALUES FROM ('foo' COLLATE "C") TO
('xyz');

I think the correct behavior would be that an explicit collation in the
partition bound expression is an error.

But that's essentially ignoring the collation specified by the user for
the partition bound value without providing any information about that to
the user. I've fixed that by explicitly checking if the collate clause in
the partition bound value expression contains the same collation as the
parent partition key collation and give an error otherwise.

I think that needs more refinement. In v8, the following errors

CREATE TABLE t2 ( a name COLLATE "POSIX" ) PARTITION BY RANGE (a);
CREATE TABLE t2a PARTITION OF t2 FOR VALUES FROM (name 'foo') TO (name
'xyz');
ERROR: collation of partition bound value for column "a" does not match
partition key collation "POSIX"

The problem here is that the "name" type has a collation that is neither
the one of the column nor the default collation. We can allow that.
What we don't want is someone writing an explicit COLLATE clause. I
think we just need to check that there is no top-level COLLATE clause.
This would then sort of match the logic parse_collate.c for combining
collations.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#12Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Peter Eisentraut (#11)
Re: using expression syntax for partition bounds

Thanks for the review.

On 2019/01/15 22:24, Peter Eisentraut wrote:

On 15/01/2019 07:31, Amit Langote wrote:

Is "partition bound" the right term? For list partitioning, it's not
really a bound. Maybe it's OK.

Hmm, maybe "partition bound value"? Just want to stress that the
expression specifies "bounding" value of a partition.

I was more concerned about the term "bound", which it is not range
partitioning. But I can't think of a better term.

I wouldn't change expr to value as you have done between v7 and v8,
since the point of this patch is to make expressions valid where
previously only values were.

OK, will change it back to partition_bound_expr. Removing "bound" from it
makes the term ambiguous?

I don't see any treatment of non-immutable expressions. There is a test
case with a non-immutable cast that was removed, but that's all. There
was considerable discussion earlier in the thread on whether
non-immutable expressions should be allowed. I'm not sure what the
resolution was, but in any case there should be documentation and tests
of the outcome.

The partition bound expression is evaluated only once, that is, when the
partition creation command is executed, and what gets stored in the
catalog is a Const expression that contains the value that was computed,
not the original non-immutable expression. So, we don't need to do
anything special in terms of code to handle users possibly specifying a
non-immutable expression as partition bound. Tom said something in that
thread which seems to support such a position:

/messages/by-id/22534.1523374457@sss.pgh.pa.us

OK, if we are going with that approach, then it needs to be documented
and there should be test cases.

How about the following note in the documentation:

+      Although volatile expressions such as
+      <literal><function>CURRENT_TIMESTAMP</function></literal> can be used
+      for this, be careful when using them, because
+      <productname>PostgreSQL</productname> will evaluate them only once
+      when adding the partition.

Sorry but I'm not sure how or what I would test about this. Maybe, just
add a test in create_table.sql/alter_table.sql that shows that using
volatile expression doesn't cause an error?

The collation handling might need another look. The following is
allowed without any diagnostic:

CREATE TABLE t2 (
a text COLLATE "POSIX"
) PARTITION BY RANGE (a);
CREATE TABLE t2a PARTITION OF t2 FOR VALUES FROM ('foo' COLLATE "C") TO
('xyz');

I think the correct behavior would be that an explicit collation in the
partition bound expression is an error.

But that's essentially ignoring the collation specified by the user for
the partition bound value without providing any information about that to
the user. I've fixed that by explicitly checking if the collate clause in
the partition bound value expression contains the same collation as the
parent partition key collation and give an error otherwise.

I think that needs more refinement. In v8, the following errors

CREATE TABLE t2 ( a name COLLATE "POSIX" ) PARTITION BY RANGE (a);
CREATE TABLE t2a PARTITION OF t2 FOR VALUES FROM (name 'foo') TO (name
'xyz');
ERROR: collation of partition bound value for column "a" does not match
partition key collation "POSIX"

The problem here is that the "name" type has a collation that is neither
the one of the column nor the default collation. We can allow that.

So, should the "name" type's collation should simply be discarded in favor
of "POSIX" that's being used for partitioning?

What we don't want is someone writing an explicit COLLATE clause. I
think we just need to check that there is no top-level COLLATE clause.
This would then sort of match the logic parse_collate.c for combining
collations.

Maybe I'm missing something, but isn't it OK to allow the COLLATE clause
as long as it specifies the matching collation as the parent? So:

CREATE TABLE t2b PARTITION OF t2 FOR VALUES FROM (name 'bar' collate "C")
TO (name 'foo');
ERROR: collation of partition bound value for column "a" does not match
partition key collation "POSIX"

-- either of the following is ok

CREATE TABLE t2b PARTITION OF t2 FOR VALUES FROM (name 'bar' collate
"POSIX") TO (name 'foo');

CREATE TABLE t2b PARTITION OF t2 FOR VALUES FROM (name 'bar') TO (name 'foo');

Thanks,
Amit

#13Peter Eisentraut
peter_e@gmx.net
In reply to: Amit Langote (#12)
Re: using expression syntax for partition bounds

On 16/01/2019 08:41, Amit Langote wrote:

OK, will change it back to partition_bound_expr. Removing "bound" from it
makes the term ambiguous?

Yeah, let's leave it in.

How about the following note in the documentation:

+      Although volatile expressions such as
+      <literal><function>CURRENT_TIMESTAMP</function></literal> can be used
+      for this, be careful when using them, because
+      <productname>PostgreSQL</productname> will evaluate them only once
+      when adding the partition.

I don't think we have to phrase it in this warning way. Just say that
volatile expressions are evaluated at the time of the DDL statement.

Sorry but I'm not sure how or what I would test about this. Maybe, just
add a test in create_table.sql/alter_table.sql that shows that using
volatile expression doesn't cause an error?

Possibilities: Create a range partition with current_timestamp as the
upper bound and then in a separate transaction insert current_timestamp
and have it appear in the default partition. Or create list partition
with session_user as one partition's value and then insert session_user
and have it appear in that table. Or something like those.

I think that needs more refinement. In v8, the following errors

CREATE TABLE t2 ( a name COLLATE "POSIX" ) PARTITION BY RANGE (a);
CREATE TABLE t2a PARTITION OF t2 FOR VALUES FROM (name 'foo') TO (name
'xyz');
ERROR: collation of partition bound value for column "a" does not match
partition key collation "POSIX"

The problem here is that the "name" type has a collation that is neither
the one of the column nor the default collation. We can allow that.

So, should the "name" type's collation should simply be discarded in favor
of "POSIX" that's being used for partitioning?

In that specific case, yes, I think so.

What we don't want is someone writing an explicit COLLATE clause. I
think we just need to check that there is no top-level COLLATE clause.
This would then sort of match the logic parse_collate.c for combining
collations.

Maybe I'm missing something, but isn't it OK to allow the COLLATE clause
as long as it specifies the matching collation as the parent?

Yes, that should be OK.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#14Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Peter Eisentraut (#13)
Re: using expression syntax for partition bounds

Thanks for the comments.

On 2019/01/18 16:48, Peter Eisentraut wrote:

How about the following note in the documentation:

+      Although volatile expressions such as
+      <literal><function>CURRENT_TIMESTAMP</function></literal> can be used
+      for this, be careful when using them, because
+      <productname>PostgreSQL</productname> will evaluate them only once
+      when adding the partition.

I don't think we have to phrase it in this warning way. Just say that
volatile expressions are evaluated at the time of the DDL statement.

OK, then just this:

+      Even volatile expressions such as
+      <literal><function>CURRENT_TIMESTAMP</function></literal> can be used.

Sorry but I'm not sure how or what I would test about this. Maybe, just
add a test in create_table.sql/alter_table.sql that shows that using
volatile expression doesn't cause an error?

Possibilities: Create a range partition with current_timestamp as the
upper bound and then in a separate transaction insert current_timestamp
and have it appear in the default partition. Or create list partition
with session_user as one partition's value and then insert session_user
and have it appear in that table. Or something like those.

OK, added a test that uses current_timestamp.

So, should the "name" type's collation should simply be discarded in favor
of "POSIX" that's being used for partitioning?

In that specific case, yes, I think so.

What we don't want is someone writing an explicit COLLATE clause. I
think we just need to check that there is no top-level COLLATE clause.
This would then sort of match the logic parse_collate.c for combining
collations.

Maybe I'm missing something, but isn't it OK to allow the COLLATE clause
as long as it specifies the matching collation as the parent?

Yes, that should be OK.

Alright, I've included a test that uses cast expression in partition bound.

Updated patch attached.

Thanks,
Amit

Attachments:

v9-0001-Allow-generalized-expression-syntax-for-partition.patchtext/plain; charset=UTF-8; name=v9-0001-Allow-generalized-expression-syntax-for-partition.patchDownload+310-161
#15Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Amit Langote (#14)
Re: using expression syntax for partition bounds

Hello.

At Fri, 18 Jan 2019 17:50:56 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in <cbe3823f-f61e-1f37-c9ee-a3de9d8d565e@lab.ntt.co.jp>

Thanks for the comments.

On 2019/01/18 16:48, Peter Eisentraut wrote:

How about the following note in the documentation:

+      Although volatile expressions such as
+      <literal><function>CURRENT_TIMESTAMP</function></literal> can be used
+      for this, be careful when using them, because
+      <productname>PostgreSQL</productname> will evaluate them only once
+      when adding the partition.

I don't think we have to phrase it in this warning way. Just say that
volatile expressions are evaluated at the time of the DDL statement.

OK, then just this:

+      Even volatile expressions such as
+      <literal><function>CURRENT_TIMESTAMP</function></literal> can be used.

I agree to not to phrase in a warning way, but it seems
too-simplified. I think the reason is still required, like this?:

===
The expression is evaluated once at the table creation time so it
can involve even volatile expressions such as
<literal><function>CURRENT_TIMESTAMP</function></literal>.
===

Sorry but I'm not sure how or what I would test about this. Maybe, just
add a test in create_table.sql/alter_table.sql that shows that using
volatile expression doesn't cause an error?

Possibilities: Create a range partition with current_timestamp as the
upper bound and then in a separate transaction insert current_timestamp
and have it appear in the default partition. Or create list partition
with session_user as one partition's value and then insert session_user
and have it appear in that table. Or something like those.

OK, added a test that uses current_timestamp.

So, should the "name" type's collation should simply be discarded in favor
of "POSIX" that's being used for partitioning?

In that specific case, yes, I think so.

What we don't want is someone writing an explicit COLLATE clause. I
think we just need to check that there is no top-level COLLATE clause.
This would then sort of match the logic parse_collate.c for combining
collations.

Maybe I'm missing something, but isn't it OK to allow the COLLATE clause
as long as it specifies the matching collation as the parent?

Yes, that should be OK.

Alright, I've included a test that uses cast expression in partition bound.

Updated patch attached.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#16Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Kyotaro Horiguchi (#15)
Re: using expression syntax for partition bounds

Horiguchi-san,

Thanks for checking.

On 2019/01/24 19:03, Kyotaro HORIGUCHI wrote:

At Fri, 18 Jan 2019 17:50:56 +0900, Amit Langote wrote:

On 2019/01/18 16:48, Peter Eisentraut wrote:

How about the following note in the documentation:

+      Although volatile expressions such as
+      <literal><function>CURRENT_TIMESTAMP</function></literal> can be used
+      for this, be careful when using them, because
+      <productname>PostgreSQL</productname> will evaluate them only once
+      when adding the partition.

I don't think we have to phrase it in this warning way. Just say that
volatile expressions are evaluated at the time of the DDL statement.

OK, then just this:

+      Even volatile expressions such as
+      <literal><function>CURRENT_TIMESTAMP</function></literal> can be used.

I agree to not to phrase in a warning way, but it seems
too-simplified. I think the reason is still required, like this?:

===
The expression is evaluated once at the table creation time so it
can involve even volatile expressions such as
<literal><function>CURRENT_TIMESTAMP</function></literal>.

Ah, that's perhaps a better way of describing this feature.

Attached rebased patch containing above change.

Thanks,
Amit

Attachments:

v10-0001-Allow-generalized-expression-syntax-for-partitio.patchtext/plain; charset=UTF-8; name=v10-0001-Allow-generalized-expression-syntax-for-partitio.patchDownload+312-162
#17Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Amit Langote (#16)
Re: using expression syntax for partition bounds

Why did you lose the parser_errposition in parse_utilcmd.c line 3854?

-	/* Fail if we don't have a constant (i.e., non-immutable coercion) */
-	if (!IsA(value, Const))
+	/* Make sure the expression does not refer to any vars. */
+	if (contain_var_clause(value))
ereport(ERROR,
-				(errcode(ERRCODE_DATATYPE_MISMATCH),
-				 errmsg("specified value cannot be cast to type %s for column \"%s\"",
-						format_type_be(colType), colName),
-				 errdetail("The cast requires a non-immutable conversion."),
-				 errhint("Try putting the literal value in single quotes."),
-				 parser_errposition(pstate, con->location)));
+				(errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+				 errmsg("cannot use column references in partition bound expression")));

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#18Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Alvaro Herrera (#17)
Re: using expression syntax for partition bounds

Hi,

Thanks for looking.

On 2019/01/24 21:00, Alvaro Herrera wrote:

Why did you lose the parser_errposition in parse_utilcmd.c line 3854?

-	/* Fail if we don't have a constant (i.e., non-immutable coercion) */
-	if (!IsA(value, Const))
+	/* Make sure the expression does not refer to any vars. */
+	if (contain_var_clause(value))
ereport(ERROR,
-				(errcode(ERRCODE_DATATYPE_MISMATCH),
-				 errmsg("specified value cannot be cast to type %s for column \"%s\"",
-						format_type_be(colType), colName),
-				 errdetail("The cast requires a non-immutable conversion."),
-				 errhint("Try putting the literal value in single quotes."),
-				 parser_errposition(pstate, con->location)));
+				(errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+				 errmsg("cannot use column references in partition bound expression")));

The if (contain_var_clause(value)) block is new code, but I agree its
ereport should have parser_errposition just like other ereports in that
function. Fixed that in the attached.

Thanks,
Amit

Attachments:

v11-0001-Allow-generalized-expression-syntax-for-partitio.patchtext/plain; charset=UTF-8; name=v11-0001-Allow-generalized-expression-syntax-for-partitio.patchDownload+315-162
#19Peter Eisentraut
peter_e@gmx.net
In reply to: Amit Langote (#18)
Re: using expression syntax for partition bounds

On 24/01/2019 13:57, Amit Langote wrote:

The if (contain_var_clause(value)) block is new code, but I agree its
ereport should have parser_errposition just like other ereports in that
function. Fixed that in the attached.

committed

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#19)
Re: using expression syntax for partition bounds

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

committed

Some of the buildfarm members are having sort-ordering problems
with this. Looks like you could work around it with different
partition names (don't assume the relative sort order of
letters and digits).

regards, tom lane

#21Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Tom Lane (#20)
#22Michael Paquier
michael@paquier.xyz
In reply to: Amit Langote (#21)
#23Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Michael Paquier (#22)
#24Michael Paquier
michael@paquier.xyz
In reply to: Amit Langote (#23)
#25Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#20)
#26Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Peter Eisentraut (#25)