bug with expression index on partition

Started by Amit Langoteover 7 years ago7 messages
#1Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
1 attachment(s)

Hi.

I noticed a bug with creating an expression index on a partitioned table.
The fact that a partition may have differing attribute numbers is not
accounted for when recursively *creating* the index on partition. The
partition index gets created but is unusable.

create table p (a int) partition by hash (a);
create table p1 partition of p for values with (modulus 3, remainder 0);
create table p2 partition of p for values with (modulus 3, remainder 1);
create table p3 partition of p for values with (modulus 3, remainder 2);

-- make p3 have different attribute number for column a
alter table p detach partition p3;
alter table p3 drop a, add a int;
alter table p attach partition p3 for values with (modulus 3, remainder 2);

create index on p ((a+1));
\d p3
ERROR: invalid attnum 1 for relation "p3"

That's because the IndexStmt that's passed for creating the index on p3
contains expressions whose Vars are exact copies of the parent's Vars,
which in this case is wrong. We must have converted them to p3's
attribute numbers before calling DefineIndex on p3.

Note that if the same index already exists in a partition before creating
the parent's index, then that index already contains the right Vars, no
conversion is needed in that case.

alter table p detach partition p3;
alter table p3 drop a, add a int;
create index on p3 ((a+1));
alter table p attach partition p3 for values with (modulus 3, remainder 2);
\d p3
Table "public.p3"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
a | integer | | |
Partition of: p FOR VALUES WITH (modulus 3, remainder 2)
Indexes:
"p3_expr_idx" btree ((a + 1))

create index on p ((a+1));
\d p3
Table "public.p3"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
a | integer | | |
Partition of: p FOR VALUES WITH (modulus 3, remainder 2)
Indexes:
"p3_expr_idx" btree ((a + 1))

In other words, the code that checks whether an index with matching
definition exists in the partition correctly converts Vars before the
actual matching, so is able to conclude that, despite different Vars in
p3's index, it is same index as the one being created in the parent. So,
the index is not created again, which actually if it were, it would be hit
by the very same bug I'm reporting.

Also is right the case where the partition being attached doesn't have the
index but the parent has and it is correctly *cloned* to the attached
partition.

alter table p detach partition p3;
alter table p3 drop a, add a int;
create index on p ((a+1));
alter table p attach partition p3 for values with (modulus 3, remainder 2);
\d p3
Table "public.p3"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
a | integer | | |
Partition of: p FOR VALUES WITH (modulus 3, remainder 2)
Indexes:
"p3_expr_idx" btree ((a + 1))

So, CompareIndexInfo and generateClonedIndexStmt are both doing the right
thing, but DefineIndex is not. Attached is a patch to fix DefineIndex so
that it converts indexParams before recursing to create the index on a
partition.

By the way, do we want to do anything about the fact that we don't check
if a matching inherited index exists when creating an index directly on
the partition. I wondered this because we do check the other way around.

\d p3
Table "public.p3"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
a | integer | | |
Partition of: p FOR VALUES WITH (modulus 3, remainder 2)
Indexes:
"p3_expr_idx" btree ((a + 1))

create index on p3 ((a+1));
\d p3
Table "public.p3"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
a | integer | | |
Partition of: p FOR VALUES WITH (modulus 3, remainder 2)
Indexes:
"p3_expr_idx" btree ((a + 1))
"p3_expr_idx1" btree ((a + 1))

Thanks,
Amit

Attachments:

v1-0001-Convert-indexParams-to-partition-s-attnos-before-.patchtext/plain; charset=UTF-8; name=v1-0001-Convert-indexParams-to-partition-s-attnos-before-.patchDownload
From d7c090f1c14a5d5ebec04b5a58d626cbfe057e2c Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 21 Jun 2018 14:50:20 +0900
Subject: [PATCH v1] Convert indexParams to partition's attnos before recursing

---
 src/backend/commands/indexcmds.c       | 18 +++++++++++++-
 src/test/regress/expected/indexing.out | 45 ++++++++++++++++++++++++++++++++++
 src/test/regress/sql/indexing.sql      | 26 ++++++++++++++++++++
 3 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 3a3223bffb..74ffd2c392 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -922,7 +922,6 @@ DefineIndex(Oid relationId,
 											   gettext_noop("could not convert row type"));
 				maplen = parentDesc->natts;
 
-
 				foreach(cell, childidxs)
 				{
 					Oid			cldidxid = lfirst_oid(cell);
@@ -993,7 +992,24 @@ DefineIndex(Oid relationId,
 				{
 					IndexStmt  *childStmt = copyObject(stmt);
 					bool		found_whole_row;
+					ListCell   *lc;
 
+					/* Adjust Vars to match new table's column numbering */
+					foreach(lc, childStmt->indexParams)
+					{
+						IndexElem *ielem = lfirst(lc);
+
+						/*
+						 * If the index parameter is an expression, we must
+						 * translate it to contain child Vars.
+						 */
+						if (ielem->expr)
+							ielem->expr =
+									map_variable_attnos((Node *) ielem->expr,
+														1, 0, attmap, maplen,
+														InvalidOid,
+														&found_whole_row);
+					}
 					childStmt->whereClause =
 						map_variable_attnos(stmt->whereClause, 1, 0,
 											attmap, maplen,
diff --git a/src/test/regress/expected/indexing.out b/src/test/regress/expected/indexing.out
index 2c2bf44aa8..5a9c3ee943 100644
--- a/src/test/regress/expected/indexing.out
+++ b/src/test/regress/expected/indexing.out
@@ -1337,3 +1337,48 @@ insert into covidxpart values (4, 1);
 insert into covidxpart values (4, 1);
 ERROR:  duplicate key value violates unique constraint "covidxpart4_a_b_idx"
 DETAIL:  Key (a)=(4) already exists.
+-- tests covering expression indexes in a partition tree with differing attributes
+create table idx_part (a int) partition by hash (a);
+create table idx_part1 partition of idx_part for values with (modulus 2, remainder 0);
+create table idx_part2 partition of idx_part for values with (modulus 2, remainder 1);
+alter table idx_part detach partition idx_part2;
+alter table idx_part2 drop a, add a int;
+alter table idx_part attach partition idx_part2 for values with (modulus 2, remainder 1);
+create index idx_part_expr on idx_part ((a + 1));
+\d idx_part2
+             Table "public.idx_part2"
+ Column |  Type   | Collation | Nullable | Default 
+--------+---------+-----------+----------+---------
+ a      | integer |           |          | 
+Partition of: idx_part FOR VALUES WITH (modulus 2, remainder 1)
+Indexes:
+    "idx_part2_expr_idx" btree ((a + 1))
+
+drop index idx_part_expr;
+alter table idx_part detach partition idx_part2;
+create index idx_part2_expr on idx_part2 ((a + 1));
+alter table idx_part attach partition idx_part2 for values with (modulus 2, remainder 1);
+create index idx_part_expr on idx_part ((a + 1));
+\d idx_part2
+             Table "public.idx_part2"
+ Column |  Type   | Collation | Nullable | Default 
+--------+---------+-----------+----------+---------
+ a      | integer |           |          | 
+Partition of: idx_part FOR VALUES WITH (modulus 2, remainder 1)
+Indexes:
+    "idx_part2_expr" btree ((a + 1))
+
+drop table idx_part2;
+create table idx_part2 (b int, a int);
+alter table idx_part2 drop b;
+alter table idx_part attach partition idx_part2 for values with (modulus 2, remainder 1);
+\d idx_part2;
+             Table "public.idx_part2"
+ Column |  Type   | Collation | Nullable | Default 
+--------+---------+-----------+----------+---------
+ a      | integer |           |          | 
+Partition of: idx_part FOR VALUES WITH (modulus 2, remainder 1)
+Indexes:
+    "idx_part2_expr_idx" btree ((a + 1))
+
+drop table idx_part;
diff --git a/src/test/regress/sql/indexing.sql b/src/test/regress/sql/indexing.sql
index 29333b31ef..b050781e00 100644
--- a/src/test/regress/sql/indexing.sql
+++ b/src/test/regress/sql/indexing.sql
@@ -720,3 +720,29 @@ create unique index on covidxpart4 (a);
 alter table covidxpart attach partition covidxpart4 for values in (4);
 insert into covidxpart values (4, 1);
 insert into covidxpart values (4, 1);
+
+-- tests covering expression indexes in a partition tree with differing attributes
+create table idx_part (a int) partition by hash (a);
+create table idx_part1 partition of idx_part for values with (modulus 2, remainder 0);
+create table idx_part2 partition of idx_part for values with (modulus 2, remainder 1);
+
+alter table idx_part detach partition idx_part2;
+alter table idx_part2 drop a, add a int;
+alter table idx_part attach partition idx_part2 for values with (modulus 2, remainder 1);
+create index idx_part_expr on idx_part ((a + 1));
+\d idx_part2
+
+drop index idx_part_expr;
+alter table idx_part detach partition idx_part2;
+create index idx_part2_expr on idx_part2 ((a + 1));
+alter table idx_part attach partition idx_part2 for values with (modulus 2, remainder 1);
+create index idx_part_expr on idx_part ((a + 1));
+\d idx_part2
+
+drop table idx_part2;
+create table idx_part2 (b int, a int);
+alter table idx_part2 drop b;
+alter table idx_part attach partition idx_part2 for values with (modulus 2, remainder 1);
+\d idx_part2;
+
+drop table idx_part;
-- 
2.11.0

#2Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#1)
1 attachment(s)
Re: bug with expression index on partition

On 2018/06/21 15:35, Amit Langote wrote:

So, CompareIndexInfo and generateClonedIndexStmt are both doing the right
thing, but DefineIndex is not. Attached is a patch to fix DefineIndex so
that it converts indexParams before recursing to create the index on a
partition.

I noticed that while CompareIndexInfo and generateClonedIndexStmt would
reject the case where index expressions contain a whole-row Var, my patch
didn't teach to do the same to DefineIndex, causing asymmetric behavior.
So, whereas ATTACH PARTITION would error out when trying to clone a
parent's index that contains a whole-row Var, recursively creating an
index on partition won't.

I updated the patch so that even DefineIndex will check if any whole-row
Vars were encountered during conversion and error out if so.

Thanks,
Amit

Attachments:

v2-0001-Convert-indexParams-to-partition-s-attnos-before-.patchtext/plain; charset=UTF-8; name=v2-0001-Convert-indexParams-to-partition-s-attnos-before-.patchDownload
From f28b5e4ab8aa2c41486ef9f64b2f52220cfbfcbf Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 21 Jun 2018 14:50:20 +0900
Subject: [PATCH v2] Convert indexParams to partition's attnos before recursing

---
 src/backend/commands/indexcmds.c       | 21 +++++++++++++
 src/test/regress/expected/indexing.out | 54 ++++++++++++++++++++++++++++++++++
 src/test/regress/sql/indexing.sql      | 34 +++++++++++++++++++++
 3 files changed, 109 insertions(+)

diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 3a3223bffb..7bbdeb9e66 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -993,7 +993,28 @@ DefineIndex(Oid relationId,
 				{
 					IndexStmt  *childStmt = copyObject(stmt);
 					bool		found_whole_row;
+					ListCell   *lc;
 
+					/* Adjust Vars to match new table's column numbering */
+					foreach(lc, childStmt->indexParams)
+					{
+						IndexElem *ielem = lfirst(lc);
+
+						/*
+						 * If the index parameter is an expression, we must
+						 * translate it to contain child Vars.
+						 */
+						if (ielem->expr)
+						{
+							ielem->expr =
+									map_variable_attnos((Node *) ielem->expr,
+														1, 0, attmap, maplen,
+														InvalidOid,
+														&found_whole_row);
+							if (found_whole_row)
+								elog(ERROR, "cannot convert whole-row table reference");
+						}
+					}
 					childStmt->whereClause =
 						map_variable_attnos(stmt->whereClause, 1, 0,
 											attmap, maplen,
diff --git a/src/test/regress/expected/indexing.out b/src/test/regress/expected/indexing.out
index 2c2bf44aa8..02c5a8e5ed 100644
--- a/src/test/regress/expected/indexing.out
+++ b/src/test/regress/expected/indexing.out
@@ -1337,3 +1337,57 @@ insert into covidxpart values (4, 1);
 insert into covidxpart values (4, 1);
 ERROR:  duplicate key value violates unique constraint "covidxpart4_a_b_idx"
 DETAIL:  Key (a)=(4) already exists.
+-- tests covering expression indexes in a partition tree with differing attributes
+create table idx_part (a int) partition by hash (a);
+create table idx_part1 partition of idx_part for values with (modulus 2, remainder 0);
+create table idx_part2 partition of idx_part for values with (modulus 2, remainder 1);
+alter table idx_part detach partition idx_part2;
+alter table idx_part2 drop a, add a int;
+alter table idx_part attach partition idx_part2 for values with (modulus 2, remainder 1);
+create index idx_part_expr on idx_part ((a + 1));
+\d idx_part2
+             Table "public.idx_part2"
+ Column |  Type   | Collation | Nullable | Default 
+--------+---------+-----------+----------+---------
+ a      | integer |           |          | 
+Partition of: idx_part FOR VALUES WITH (modulus 2, remainder 1)
+Indexes:
+    "idx_part2_expr_idx" btree ((a + 1))
+
+drop index idx_part_expr;
+alter table idx_part detach partition idx_part2;
+create index idx_part2_expr on idx_part2 ((a + 1));
+alter table idx_part attach partition idx_part2 for values with (modulus 2, remainder 1);
+create index idx_part_expr on idx_part ((a + 1));
+\d idx_part2
+             Table "public.idx_part2"
+ Column |  Type   | Collation | Nullable | Default 
+--------+---------+-----------+----------+---------
+ a      | integer |           |          | 
+Partition of: idx_part FOR VALUES WITH (modulus 2, remainder 1)
+Indexes:
+    "idx_part2_expr" btree ((a + 1))
+
+drop table idx_part2;
+create table idx_part2 (b int, a int);
+alter table idx_part2 drop b;
+alter table idx_part attach partition idx_part2 for values with (modulus 2, remainder 1);
+\d idx_part2;
+             Table "public.idx_part2"
+ Column |  Type   | Collation | Nullable | Default 
+--------+---------+-----------+----------+---------
+ a      | integer |           |          | 
+Partition of: idx_part FOR VALUES WITH (modulus 2, remainder 1)
+Indexes:
+    "idx_part2_expr_idx" btree ((a + 1))
+
+-- whole-row Vars in expressions are detected when converting and reported as unsupported
+drop table idx_part;
+create table idx_part (a int) partition by hash (a);
+create table idx_part1 (a int);
+create index idx_part_wholerow on idx_part ((idx_part));
+alter table idx_part attach partition idx_part1 for values with (modulus 2, remainder 0); -- error
+ERROR:  cannot convert whole-row table reference
+DETAIL:  Index "idx_part_wholerow" contains a whole-row table reference.
+drop table idx_part1;
+drop table idx_part;
diff --git a/src/test/regress/sql/indexing.sql b/src/test/regress/sql/indexing.sql
index 29333b31ef..1b1b0af029 100644
--- a/src/test/regress/sql/indexing.sql
+++ b/src/test/regress/sql/indexing.sql
@@ -720,3 +720,37 @@ create unique index on covidxpart4 (a);
 alter table covidxpart attach partition covidxpart4 for values in (4);
 insert into covidxpart values (4, 1);
 insert into covidxpart values (4, 1);
+
+-- tests covering expression indexes in a partition tree with differing attributes
+create table idx_part (a int) partition by hash (a);
+create table idx_part1 partition of idx_part for values with (modulus 2, remainder 0);
+create table idx_part2 partition of idx_part for values with (modulus 2, remainder 1);
+
+alter table idx_part detach partition idx_part2;
+alter table idx_part2 drop a, add a int;
+alter table idx_part attach partition idx_part2 for values with (modulus 2, remainder 1);
+create index idx_part_expr on idx_part ((a + 1));
+\d idx_part2
+
+drop index idx_part_expr;
+alter table idx_part detach partition idx_part2;
+create index idx_part2_expr on idx_part2 ((a + 1));
+alter table idx_part attach partition idx_part2 for values with (modulus 2, remainder 1);
+create index idx_part_expr on idx_part ((a + 1));
+\d idx_part2
+
+drop table idx_part2;
+create table idx_part2 (b int, a int);
+alter table idx_part2 drop b;
+alter table idx_part attach partition idx_part2 for values with (modulus 2, remainder 1);
+\d idx_part2;
+
+-- whole-row Vars in expressions are detected when converting and reported as unsupported
+drop table idx_part;
+create table idx_part (a int) partition by hash (a);
+create table idx_part1 (a int);
+create index idx_part_wholerow on idx_part ((idx_part));
+alter table idx_part attach partition idx_part1 for values with (modulus 2, remainder 0); -- error
+drop table idx_part1;
+
+drop table idx_part;
-- 
2.11.0

#3Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#2)
Re: bug with expression index on partition

On 2018/06/21 16:19, Amit Langote wrote:

I updated the patch so that even DefineIndex will check if any whole-row
Vars were encountered during conversion and error out if so.

I first thought of starting a new thread for this, but thought I'd just
reply here because the affected code is nearby.

I was wondering if it wouldn't hurt to allow whole-row vars to be present
in the expressions of inherited indexes. If we did allow it, the query
shown in the example below is able to use the indexes on partitions.

create table p (a int) partition by hash (a);
create table p1 partition of p for values with (modulus 3, remainder 0);
create table p2 partition of p for values with (modulus 3, remainder 1);
create table p3 partition of p for values with (modulus 3, remainder 2);
create index on p ((p));

explain (costs off) select p from p order by p;
QUERY PLAN
---------------------------------------
Merge Append
Sort Key: ((p1.*)::p)
-> Index Scan using p1_p_idx on p1
-> Index Scan using p2_p_idx on p2
-> Index Scan using p3_p_idx on p3
(5 rows)

After applying the patch in my last email, each of
generateClonedIndexStmt, CompareIndexInfo, and DefineIndex reject
inheriting an index if its expressions are found to contain whole-row
vars. Now, one can create those indexes on partitions individually, but
they cannot be matched to an ORDER BY clause of a query accessing those
partitions via the parent table.

drop index p_p_idx;
create index on p1 ((p1));
create index on p2 ((p2));
create index on p3 ((p3));

explain (costs off) select p from p order by p;
QUERY PLAN
----------------------------
Sort
Sort Key: ((p1.*)::p)
-> Append
-> Seq Scan on p1
-> Seq Scan on p2
-> Seq Scan on p3
(6 rows)

It is of course usable if partition's accessed directly.

explain (costs off) select p1 from p1 order by p1;
QUERY PLAN
----------------------------------
Index Scan using p1_p1_idx on p1
(1 row)

OTOH, an inherited index with whole-row vars (if we decide to start
allowing them as I'm proposing) cannot be used if partition's accessed
directly.

drop index p1_p1_idx;
create index on p ((p));
explain (costs off) select p1 from p1 order by p1;
QUERY PLAN
----------------------
Sort
Sort Key: p1.*
-> Seq Scan on p1
(3 rows)

but maybe that's tolerable.

Thoughts?

Thanks,
Amit

#4Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Amit Langote (#2)
Re: bug with expression index on partition

On 2018-Jun-21, Amit Langote wrote:

On 2018/06/21 15:35, Amit Langote wrote:

So, CompareIndexInfo and generateClonedIndexStmt are both doing the right
thing, but DefineIndex is not. Attached is a patch to fix DefineIndex so
that it converts indexParams before recursing to create the index on a
partition.

I noticed that while CompareIndexInfo and generateClonedIndexStmt would
reject the case where index expressions contain a whole-row Var, my patch
didn't teach to do the same to DefineIndex, causing asymmetric behavior.
So, whereas ATTACH PARTITION would error out when trying to clone a
parent's index that contains a whole-row Var, recursively creating an
index on partition won't.

I updated the patch so that even DefineIndex will check if any whole-row
Vars were encountered during conversion and error out if so.

Thanks. I pushed this version, although I tweaked the test so that this
condition is verified in the test that is supposed to do so, rather than
creating a whole new set of tables for this purpose. The way it would
fail with unpatched code is perhaps less noisy that what you proposed,
but I think it's quite enough.

I think the whole-row vars issue is worthy of some more thinking. I'm
letting that one go for now.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#5Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Amit Langote (#3)
Re: bug with expression index on partition

On 2018-Jun-21, Amit Langote wrote:

explain (costs off) select p from p order by p;
QUERY PLAN
---------------------------------------
Merge Append
Sort Key: ((p1.*)::p)
-> Index Scan using p1_p_idx on p1
-> Index Scan using p2_p_idx on p2
-> Index Scan using p3_p_idx on p3
(5 rows)

Nice, but try adding a row > operator in the where clause.

I think it's clearly desirable to allow this row-based search to use indexes;
as I recall, we mostly enable pagination of results via this kind of
constructs. However, we're lacking planner or executor features apparently,
because a query using a row > operator does not use indexes:

create table partp (a int, b int) partition by range (a);
create table partp1 partition of partp for values from (0) to (35);
create table partp2 partition of partp for values from (35) to (100);
create index on partp1 ((partp1.*));
create index on partp2 ((partp2.*));
explain select * from partp where partp > row(0,0) order by partp limit 25 ;
QUERY PLAN
──────────────────────────────────────────────────────────────────────────
Limit (cost=6.69..6.75 rows=25 width=40)
-> Sort (cost=6.69..6.86 rows=66 width=40)
Sort Key: ((partp1.*)::partp)
-> Append (cost=0.00..4.83 rows=66 width=40)
-> Seq Scan on partp1 (cost=0.00..1.88 rows=23 width=40)
Filter: ((partp1.*)::partp > '(0,0)'::record)
-> Seq Scan on partp2 (cost=0.00..2.62 rows=43 width=40)
Filter: ((partp2.*)::partp > '(0,0)'::record)
(8 filas)

Note the indexes are ignored, as opposed to what it does in a non-partitioned
table:

create table p (a int, b int);
create index on p((p.*));
explain select * from p where p > row(0,0) order by p limit 25 ;
QUERY PLAN
───────────────────────────────────────────────────────────────────────────
Limit (cost=0.15..2.05 rows=25 width=40)
-> Index Scan using p_p_idx on p (cost=0.15..57.33 rows=753 width=40)
Index Cond: (p.* > '(0,0)'::record)
(3 filas)

So it would be good to fix this, but there are more pieces missing. Or
there is some trick to enable the indexes to be used in that example --
if so I'm all ears.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#6Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Alvaro Herrera (#4)
Re: bug with expression index on partition

On 2018/06/23 5:54, Alvaro Herrera wrote:

On 2018-Jun-21, Amit Langote wrote:

On 2018/06/21 15:35, Amit Langote wrote:

So, CompareIndexInfo and generateClonedIndexStmt are both doing the right
thing, but DefineIndex is not. Attached is a patch to fix DefineIndex so
that it converts indexParams before recursing to create the index on a
partition.

I noticed that while CompareIndexInfo and generateClonedIndexStmt would
reject the case where index expressions contain a whole-row Var, my patch
didn't teach to do the same to DefineIndex, causing asymmetric behavior.
So, whereas ATTACH PARTITION would error out when trying to clone a
parent's index that contains a whole-row Var, recursively creating an
index on partition won't.

I updated the patch so that even DefineIndex will check if any whole-row
Vars were encountered during conversion and error out if so.

Thanks. I pushed this version, although I tweaked the test so that this
condition is verified in the test that is supposed to do so, rather than
creating a whole new set of tables for this purpose. The way it would
fail with unpatched code is perhaps less noisy that what you proposed,
but I think it's quite enough.

Thanks. The test changes you committed seem fine.

Regards,
Amit

#7Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Alvaro Herrera (#5)
Re: bug with expression index on partition

On 2018/06/23 6:51, Alvaro Herrera wrote:

On 2018-Jun-21, Amit Langote wrote:

explain (costs off) select p from p order by p;
QUERY PLAN
---------------------------------------
Merge Append
Sort Key: ((p1.*)::p)
-> Index Scan using p1_p_idx on p1
-> Index Scan using p2_p_idx on p2
-> Index Scan using p3_p_idx on p3
(5 rows)

Nice, but try adding a row > operator in the where clause.

I think it's clearly desirable to allow this row-based search to use indexes;
as I recall, we mostly enable pagination of results via this kind of
constructs. However, we're lacking planner or executor features apparently,
because a query using a row > operator does not use indexes:

create table partp (a int, b int) partition by range (a);
create table partp1 partition of partp for values from (0) to (35);
create table partp2 partition of partp for values from (35) to (100);
create index on partp1 ((partp1.*));
create index on partp2 ((partp2.*));
explain select * from partp where partp > row(0,0) order by partp limit 25 ;
QUERY PLAN
──────────────────────────────────────────────────────────────────────────
Limit (cost=6.69..6.75 rows=25 width=40)
-> Sort (cost=6.69..6.86 rows=66 width=40)
Sort Key: ((partp1.*)::partp)
-> Append (cost=0.00..4.83 rows=66 width=40)
-> Seq Scan on partp1 (cost=0.00..1.88 rows=23 width=40)
Filter: ((partp1.*)::partp > '(0,0)'::record)
-> Seq Scan on partp2 (cost=0.00..2.62 rows=43 width=40)
Filter: ((partp2.*)::partp > '(0,0)'::record)
(8 filas)

Note the indexes are ignored, as opposed to what it does in a non-partitioned
table:

Ah, yes.

IIUC, that happens because any whole-row Vars in WHERE quals and
EquivalenceClass expressions corresponding to child relations each has a
ConvertRowtypeExpr on top, whereas, a child index's expressions read off
pg_index doesn't contain ConvertRowtypeExpr expressions. So, WHERE quals
and/or ORDER BY expressions containing references to the parent's
whole-row Vars cannot be matched to a child index containing same
whole-row Vars.

It's a bit unfortunate that the WHERE quals and EC expressions are
transformed such that they contain ConvertRowtypeExpr nodes at a point
where they're perhaps not necessary (such as the point when a WHERE clause
or EC expression is matched with an index expression). A related
discussion is underway on a nearby thread titled "Expression errors with
"FOR UPDATE" and postgres_fdw with partition wise join enabled", so it'd
be nice if that thread concludes such that whole-row child indexes start
becoming useful.

Thanks,
Amit