Partitioned tables and covering indexes

Started by Jaime Casanovaalmost 8 years ago22 messageshackers

jcasanov@systemguards.com.ec

almost 8 years ago

Hi,

Trying covering indexes on partitioned tables i get this error
"""
postgres=# create index on t1_part (i) include (t);
ERROR: cache lookup failed for opclass 0
"""

To reproduce:

create table t1_part (i int, t text) partition by hash (i);
create table t1_part_0 partition of t1_part for values with (modulus
2, remainder 0);
create table t1_part_1 partition of t1_part for values with (modulus
2, remainder 1);
insert into t1_part values (1, repeat('abcdefquerty', 20));

create index on t1_part (i) include (t);

--
Jaime Casanova www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

almost 8 years ago

In reply to: Jaime Casanova (#1)

Re: Partitioned tables and covering indexes

On 2018/04/10 16:07, Jaime Casanova wrote:

Hi,

Trying covering indexes on partitioned tables i get this error
"""
postgres=# create index on t1_part (i) include (t);
ERROR: cache lookup failed for opclass 0
"""

To reproduce:

create table t1_part (i int, t text) partition by hash (i);
create table t1_part_0 partition of t1_part for values with (modulus
2, remainder 0);
create table t1_part_1 partition of t1_part for values with (modulus
2, remainder 1);
insert into t1_part values (1, repeat('abcdefquerty', 20));

create index on t1_part (i) include (t);

It seems that the bug is caused due to the original IndexStmt that
DefineIndex receives being overwritten when processing the INCLUDE
columns. Also, the original copy of it should be used when recursing for
defining the index in partitions.

Does the attached fix look correct? Haven't checked the fix with ATTACH
PARTITION though.

Maybe add this to open items list?

Thanks,
Amit

Alexander Korotkov

aekorotkov@gmail.com

almost 8 years ago

In reply to: Amit Langote (#2)

Re: Partitioned tables and covering indexes

On Tue, Apr 10, 2018 at 12:47 PM, Amit Langote <
Langote_Amit_f8@lab.ntt.co.jp> wrote:

On 2018/04/10 16:07, Jaime Casanova wrote:

Hi,

Trying covering indexes on partitioned tables i get this error
"""
postgres=# create index on t1_part (i) include (t);
ERROR: cache lookup failed for opclass 0
"""

To reproduce:

create table t1_part (i int, t text) partition by hash (i);
create table t1_part_0 partition of t1_part for values with (modulus
2, remainder 0);
create table t1_part_1 partition of t1_part for values with (modulus
2, remainder 1);
insert into t1_part values (1, repeat('abcdefquerty', 20));

create index on t1_part (i) include (t);

It seems that the bug is caused due to the original IndexStmt that
DefineIndex receives being overwritten when processing the INCLUDE
columns. Also, the original copy of it should be used when recursing for
defining the index in partitions.

Does the attached fix look correct? Haven't checked the fix with ATTACH
PARTITION though.

Attached patch seems to fix the problem. However, I would rather get
rid of modifying stmt->indexParams. That seems to be more logical
for me. Also, it would be good to check some covering indexes on
partitioned tables. See the attached patch.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Teodor Sigaev

teodor@sigaev.ru

almost 8 years ago

In reply to: Alexander Korotkov (#3)

Re: Partitioned tables and covering indexes

Does the attached fix look correct?О©╫ Haven't checked the fix with ATTACH
PARTITION though.

Attached patch seems to fix the problem.О©╫ However, I would rather get
rid of modifyingО©╫stmt->indexParams.О©╫ That seems to be more logical
for me.О©╫ Also, it would be good to check some covering indexes on
partitioned tables.О©╫ See the attached patch.

Seems right way, do not modify incoming object and do not copy rather large and
deep nested structure as suggested by Amit.

But it will be better to have a ATTACH PARTITION test too.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

Jaime Casanova

jcasanov@systemguards.com.ec

almost 8 years ago

In reply to: Teodor Sigaev (#4)

Re: Partitioned tables and covering indexes

On 10 April 2018 at 10:36, Teodor Sigaev <teodor@sigaev.ru> wrote:

Does the attached fix look correct? Haven't checked the fix with
ATTACH
PARTITION though.

Attached patch seems to fix the problem. However, I would rather get
rid of modifying stmt->indexParams. That seems to be more logical
for me. Also, it would be good to check some covering indexes on
partitioned tables. See the attached patch.

Seems right way, do not modify incoming object and do not copy rather large
and deep nested structure as suggested by Amit.

But it will be better to have a ATTACH PARTITION test too.

the patch worked for me, i also tried some combinations using ATTACH
PARTITION and found no problems

--
Jaime Casanova www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

almost 8 years ago

In reply to: Teodor Sigaev (#4)

Re: Partitioned tables and covering indexes

Hi.

On 2018/04/11 0:36, Teodor Sigaev wrote:

О©╫О©╫О©╫ Does the attached fix look correct?О©╫ Haven't checked the fix with
ATTACH
О©╫О©╫О©╫ PARTITION though.

Attached patch seems to fix the problem.О©╫ However, I would rather get
rid of modifyingО©╫stmt->indexParams.О©╫ That seems to be more logical
for me.О©╫ Also, it would be good to check some covering indexes on
partitioned tables.О©╫ See the attached patch.

Seems right way, do not modify incoming object and do not copy rather
large and deep nested structure as suggested by Amit.

Yeah, Alexander's suggested way of using a separate variable for
indexParams is better.

But it willО©╫ be better to have a ATTACH PARTITION test too.

I have added tests. Actually, instead of modifying existing tests, I
think it might be better to have a separate section at the end of
indexing.sql to test covering indexes feature for partitioned tables.

Attached find updated patch.

Thanks,
Amit

Teodor Sigaev

teodor@sigaev.ru

almost 8 years ago

In reply to: Amit Langote (#6)

Re: Partitioned tables and covering indexes

Thank you, pushed

Amit Langote wrote:

Hi.

On 2018/04/11 0:36, Teodor Sigaev wrote:

О©╫О©╫О©╫ Does the attached fix look correct?О©╫ Haven't checked the fix with
ATTACH
О©╫О©╫О©╫ PARTITION though.

Attached patch seems to fix the problem.О©╫ However, I would rather get
rid of modifyingО©╫stmt->indexParams.О©╫ That seems to be more logical
for me.О©╫ Also, it would be good to check some covering indexes on
partitioned tables.О©╫ See the attached patch.

Seems right way, do not modify incoming object and do not copy rather
large and deep nested structure as suggested by Amit.

Yeah, Alexander's suggested way of using a separate variable for
indexParams is better.

But it willО©╫ be better to have a ATTACH PARTITION test too.

I have added tests. Actually, instead of modifying existing tests, I
think it might be better to have a separate section at the end of
indexing.sql to test covering indexes feature for partitioned tables.

Attached find updated patch.

Thanks,
Amit

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

Teodor Sigaev

teodor@sigaev.ru

almost 8 years ago

In reply to: Amit Langote (#6)

Re: Partitioned tables and covering indexes

Patch makes buildfarm almost red, patch is temporary reverted.

Actually, discovered bug is not related to patch except new test faces with it,
problem is: CompareIndexInfo() checks rd_opfamily for equality for all
attributes, not only for key attribute.

Obviously, CompareIndexInfo() needs more work:
for (i = 0; i < info1->ii_NumIndexAttrs; i++)
{
if (maplen < info2->ii_KeyAttrNumbers[i])

Seems, we can go out from ii_KeyAttrNumbers array.

Amit Langote wrote:

Hi.

On 2018/04/11 0:36, Teodor Sigaev wrote:

О©╫О©╫О©╫ Does the attached fix look correct?О©╫ Haven't checked the fix with
ATTACH
О©╫О©╫О©╫ PARTITION though.

Attached patch seems to fix the problem.О©╫ However, I would rather get
rid of modifyingО©╫stmt->indexParams.О©╫ That seems to be more logical
for me.О©╫ Also, it would be good to check some covering indexes on
partitioned tables.О©╫ See the attached patch.

Seems right way, do not modify incoming object and do not copy rather
large and deep nested structure as suggested by Amit.

Yeah, Alexander's suggested way of using a separate variable for
indexParams is better.

But it willО©╫ be better to have a ATTACH PARTITION test too.

I have added tests. Actually, instead of modifying existing tests, I
think it might be better to have a separate section at the end of
indexing.sql to test covering indexes feature for partitioned tables.

Attached find updated patch.

Thanks,
Amit

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

Teodor Sigaev

teodor@sigaev.ru

almost 8 years ago

In reply to: Teodor Sigaev (#8)

Re: Partitioned tables and covering indexes

Actually, discovered bug is not related to patch except new test faces
with it,
problem is: CompareIndexInfo() checks rd_opfamily for equality for all
attributes, not only for key attribute.

Patch attached. But it seems to me, field's names of
IndexInfo structure are a bit confused now:
int ii_NumIndexAttrs; /* total number of columns in index */
int ii_NumIndexKeyAttrs; /* number of key columns in
index */
AttrNumber ii_KeyAttrNumbers[INDEX_MAX_KEYS];

ii_KeyAttrNumbers contains all columns, i.e. it contains
ii_NumIndexAttrs number of columns, not a ii_NumIndexKeyAttrs number as
easy to think.

I suggest rename ii_KeyAttrNumbers to ii_AttrNumbers or
ii_IndexAttrNumbers. Opinions?

О©╫О©╫О©╫ for (i = 0; i < info1->ii_NumIndexAttrs; i++)
О©╫О©╫О©╫ {
О©╫О©╫О©╫О©╫О©╫О©╫О©╫ if (maplen < info2->ii_KeyAttrNumbers[i])

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#10

Alvaro Herrera

alvherre@2ndquadrant.com

almost 8 years ago

In reply to: Teodor Sigaev (#9)

Re: Partitioned tables and covering indexes

Teodor Sigaev wrote:

Patch attached.

I wonder why this is a problem in opfamilies but not collations.
If we don't compare collations, wouldn't it make more sense to break out
of the loop once the number of keys is reached?

When this code was written, there was no question as to what length the
opfamilies/collations the arrays were; it was obvious that they must be
of the length of the index attributes. It's not obvious now. Maybe add
a comment about that?

But it seems to me, field's names of
IndexInfo structure are a bit confused now:
int ii_NumIndexAttrs; /* total number of columns in index */
int ii_NumIndexKeyAttrs; /* number of key columns in index */
AttrNumber ii_KeyAttrNumbers[INDEX_MAX_KEYS];

ii_KeyAttrNumbers contains all columns, i.e. it contains ii_NumIndexAttrs
number of columns, not a ii_NumIndexKeyAttrs number as easy to think.

I suggest rename ii_KeyAttrNumbers to ii_AttrNumbers or ii_IndexAttrNumbers.
Opinions?

Yeah, the current situation seems very odd.

--
ï¿½lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#11

Alexander Korotkov

aekorotkov@gmail.com

almost 8 years ago

In reply to: Alvaro Herrera (#10)

Re: Partitioned tables and covering indexes

On Wed, Apr 11, 2018 at 10:47 PM, Alvaro Herrera <alvherre@alvh.no-ip.org>
wrote:

Teodor Sigaev wrote:

Patch attached.

I wonder why this is a problem in opfamilies but not collations.
If we don't compare collations, wouldn't it make more sense to break out
of the loop once the number of keys is reached?

It appears that INCLUDE columns might have collation defined.
For instance, following query is working:

create index t_s1_s2_idx on t (s1) include (s2 collate "en_US.UTF-8");

However, I don't see any point in defining collations here, because
INCLUDE attributes exist solely for index-only scans. So, index just
can return value of INCLUDE attribute "as is", no point to do something
with collation.

So, I propose to disable collations for INCLUDE attributes.

When this code was written, there was no question as to what length the

opfamilies/collations the arrays were; it was obvious that they must be
of the length of the index attributes. It's not obvious now. Maybe add
a comment about that?

In b3b7f789 Tom have resized one array size from total number of
attributes to number of key attributes. And I like idea of applying the
same to all other array: make them sized to total number of attributes
with filling of absent attributes with 0.

But it seems to me, field's names of
IndexInfo structure are a bit confused now:
int ii_NumIndexAttrs; /* total number of columns in index

*/

int ii_NumIndexKeyAttrs; /* number of key columns in

index */

AttrNumber ii_KeyAttrNumbers[INDEX_MAX_KEYS];

ii_KeyAttrNumbers contains all columns, i.e. it contains ii_NumIndexAttrs
number of columns, not a ii_NumIndexKeyAttrs number as easy to think.

I suggest rename ii_KeyAttrNumbers to ii_AttrNumbers or

ii_IndexAttrNumbers.

Opinions?

Yeah, the current situation seems very odd.

+1 for ii_IndexAttrNumbers.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#12

Peter Geoghegan

pg@bowt.ie

almost 8 years ago

In reply to: Alexander Korotkov (#11)

Re: Partitioned tables and covering indexes

On Wed, Apr 11, 2018 at 1:58 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:

It appears that INCLUDE columns might have collation defined.
For instance, following query is working:

create index t_s1_s2_idx on t (s1) include (s2 collate "en_US.UTF-8");

However, I don't see any point in defining collations here, because
INCLUDE attributes exist solely for index-only scans. So, index just
can return value of INCLUDE attribute "as is", no point to do something
with collation.

So, I propose to disable collations for INCLUDE attributes.

Hmm. I'm not sure that that's exactly the right thing to do. We seem
to want to have case-insensitive collations in the future. The fact
that you can spell out collation name in ON CONFLICT's unique index
inference specification suggests this, for example. I think that a
collation is theoretically allowed to affect the behavior of equality,
even though so far we've never tried to make that work for any
collatable datatype.

Maybe the right thing to do is to say that any collation will work
equally well (as will any opclass). Maybe that's the same thing as
what you said, though.

+1 for ii_IndexAttrNumbers.

--
Peter Geoghegan

#13

Peter Eisentraut

peter_e@gmx.net

almost 8 years ago

In reply to: Peter Geoghegan (#12)

Re: Partitioned tables and covering indexes

On 4/11/18 17:08, Peter Geoghegan wrote:

However, I don't see any point in defining collations here, because
INCLUDE attributes exist solely for index-only scans. So, index just
can return value of INCLUDE attribute "as is", no point to do something
with collation.

So, I propose to disable collations for INCLUDE attributes.

Hmm. I'm not sure that that's exactly the right thing to do. We seem
to want to have case-insensitive collations in the future. The fact
that you can spell out collation name in ON CONFLICT's unique index
inference specification suggests this, for example. I think that a
collation is theoretically allowed to affect the behavior of equality,
even though so far we've never tried to make that work for any
collatable datatype.

But in this case it doesn't even do equality comparison, it just returns
the value.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#14

Peter Geoghegan

pg@bowt.ie

almost 8 years ago

In reply to: Peter Eisentraut (#13)

Re: Partitioned tables and covering indexes

On Wed, Apr 11, 2018 at 2:29 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

But in this case it doesn't even do equality comparison, it just returns
the value.

That's the idea that I tried to express. The point is that we need to
tell the user that there is no need to worry about it, rather than
that they're wrong to ask about it. Though we should probably actually
just throw an error.

--
Peter Geoghegan

#15

Peter Eisentraut

peter_e@gmx.net

almost 8 years ago

In reply to: Peter Geoghegan (#14)

Re: Partitioned tables and covering indexes

On 4/11/18 17:38, Peter Geoghegan wrote:

On Wed, Apr 11, 2018 at 2:29 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

But in this case it doesn't even do equality comparison, it just returns
the value.

That's the idea that I tried to express. The point is that we need to
tell the user that there is no need to worry about it, rather than
that they're wrong to ask about it. Though we should probably actually
just throw an error.

Or maybe it should be the collation of the underlying table columns.
Otherwise the collation returned by an index-only scan would be
different from a table scan, no?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#16

Teodor Sigaev

teodor@sigaev.ru

almost 8 years ago

In reply to: Peter Geoghegan (#14)

Re: Partitioned tables and covering indexes

Peter Geoghegan wrote:

On Wed, Apr 11, 2018 at 2:29 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

But in this case it doesn't even do equality comparison, it just returns
the value.

That's the idea that I tried to express. The point is that we need to
tell the user that there is no need to worry about it, rather than
that they're wrong to ask about it. Though we should probably actually
just throw an error.

Any operation over including columns is a deal for filter, not an index,
so collation will not be used somewhere inside index.

2Alexander: ComputeIndexAttrs() may use whole vectors (for including
columns too) just to simplify coding, in other places, seems, better to
have exact size of vectors. Using full-sized vectors could mask a error,
for exmaple, with setting opfamily to InvalidOid for key column. But if
you insist on that, I'll modify attached patch to use full-sized vectors
anywhere.

In attached path I did:
1) fix a bug in CompareIndexInfo() which checks opfamily for including
column
2) correct size for all vectors
3) Add assertion in various places to be sure that we don't try to use
including column as key column
4) per Peter Geoghegan idea add a error message if somebody adds options
to include column instead silently ignore it.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#17

Teodor Sigaev

teodor@sigaev.ru

almost 8 years ago

In reply to: Peter Eisentraut (#15)

Re: Partitioned tables and covering indexes

That's the idea that I tried to express. The point is that we need to
tell the user that there is no need to worry about it, rather than
that they're wrong to ask about it. Though we should probably actually
just throw an error.

Or maybe it should be the collation of the underlying table columns.
Otherwise the collation returned by an index-only scan would be
different from a table scan, no?

+1, dangerous
-- 
Teodor Sigaev                      E-mail: teodor@sigaev.ru
                                       WWW: http://www.sigaev.ru/

#18

Alexander Korotkov

aekorotkov@gmail.com

almost 8 years ago

In reply to: Teodor Sigaev (#17)

Re: Partitioned tables and covering indexes

On Thu, Apr 12, 2018 at 1:14 AM, Teodor Sigaev <teodor@sigaev.ru> wrote:

That's the idea that I tried to express. The point is that we need to

tell the user that there is no need to worry about it, rather than
that they're wrong to ask about it. Though we should probably actually
just throw an error.

Or maybe it should be the collation of the underlying table columns.
Otherwise the collation returned by an index-only scan would be
different from a table scan, no?

+1, dangerous

I'm OK with collation of included columns to be the same as collation
of underlying table columns. But I still think we should throw an error
when user is trying to specify his own collation of included columns.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#19

Alexander Korotkov

aekorotkov@gmail.com

almost 8 years ago

In reply to: Teodor Sigaev (#16)

Re: Partitioned tables and covering indexes

On Thu, Apr 12, 2018 at 1:14 AM, Teodor Sigaev <teodor@sigaev.ru> wrote:

Peter Geoghegan wrote:

On Wed, Apr 11, 2018 at 2:29 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

But in this case it doesn't even do equality comparison, it just returns
the value.

That's the idea that I tried to express. The point is that we need to
tell the user that there is no need to worry about it, rather than
that they're wrong to ask about it. Though we should probably actually
just throw an error.

Any operation over including columns is a deal for filter, not an index,
so collation will not be used somewhere inside index.

2Alexander: ComputeIndexAttrs() may use whole vectors (for including
columns too) just to simplify coding, in other places, seems, better to
have exact size of vectors. Using full-sized vectors could mask a error,
for exmaple, with setting opfamily to InvalidOid for key column. But if you
insist on that, I'll modify attached patch to use full-sized vectors
anywhere.

In attached path I did:
1) fix a bug in CompareIndexInfo() which checks opfamily for including
column
2) correct size for all vectors
3) Add assertion in various places to be sure that we don't try to use
including column as key column
4) per Peter Geoghegan idea add a error message if somebody adds options
to include column instead silently ignore it.

I don't insist on using full-sized vectors everywhere. Attached patch
looks OK for me. I haven't test it though.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#20

Teodor Sigaev

teodor@sigaev.ru

almost 8 years ago

In reply to: Teodor Sigaev (#16)

Re: Partitioned tables and covering indexes

Thanks to everyone, this part is pushed. I will waiting a bit before pushing
topic-stater patch to have a time to look on buildfarm.

Alvaro, could you add a comment to CompareIndexInfo() to clarify why it doesn't
compare indoptions (like DESC/ASC etc)? It doesn't matter for uniqueness of
index but changes an order and it's not clear at least for me why we don't pay
attention for that.

In attached path I did:
1) fix a bug in CompareIndexInfo() which checks opfamily for including column
2) correct size for all vectors
3) Add assertion in various places to be sure that we don't try to use including
column as key column
4) per Peter Geoghegan idea add a error message if somebody adds options to
include column instead silently ignore it.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#21

Teodor Sigaev

teodor@sigaev.ru

almost 8 years ago

In reply to: Teodor Sigaev (#7)

#22