missing indexes in indexlist with partitioned tables
Hi,
I did a bit of testing today and noticed that we don't set indexlist properly at the right time in some cases when using partitioned tables.
I attached a simple case where the indexlist doesn't seems to be set at the right time. get_relation_info in plancat.c seems to process it only after analyzejoins.c checked for it.
Can someone explain to me why it is deferred at all?
Regards
Arne
Attachments:
Hi,
I stumbled across a few places that depend on the inheritance appends being applied at a later date, so I quickly abandoned that idea. I thought a bit about the indexlist, in particular the inhparent, and I am not sure what depends on get_relation_info working in that way.
Therefore I propose a new attribute partIndexlist of RelOptInfo to include information about uniqueness, in the case the executor can't use the structure that causes the uniqueness to begin with. Said attribute can be used by relation_has_unique_index_for and rel_supports_distinctness.
The attached patch takes that route. I'd appreciate feedback!
Regards
Arne
Attachments:
partIndexlistClean.patchtext/x-patch; name=partIndexlistClean.patchDownload+277-229
Hi,
On Thu, Oct 28, 2021 at 01:44:31PM +0000, Arne Roland wrote:
The attached patch takes that route. I'd appreciate feedback!
The cfbot reports that the patch doesn't apply anymore:
http://cfbot.cputube.org/patch_36_3452.log
=== Applying patches on top of PostgreSQL commit ID 025b920a3d45fed441a0a58fdcdf05b321b1eead ===
=== applying patch ./partIndexlistClean.patch
patching file src/backend/access/heap/vacuumlazy.c
Hunk #1 FAILED at 2375.
1 out of 1 hunk FAILED -- saving rejects to file src/backend/access/heap/vacuumlazy.c.rej
patching file src/backend/access/transam/xlog.c
Hunk #1 succeeded at 911 with fuzz 1 (offset 5 lines).
Hunk #2 FAILED at 5753.
[...]
1 out of 6 hunks FAILED -- saving rejects to file src/backend/access/transam/xlog.c.rej
[...]
patching file src/backend/commands/publicationcmds.c
Hunk #1 FAILED at 813.
1 out of 1 hunk FAILED -- saving rejects to file src/backend/commands/publicationcmds.c.rej
patching file src/include/nodes/pathnodes.h
Hunk #9 FAILED at 1516.
[...]
1 out of 17 hunks FAILED -- saving rejects to file src/include/nodes/pathnodes.h.rej
Could you send a rebased version? In the meantime I will switch the cf entry
to Waiting on Author.
Hi,
thank you for the heads up! Those files ended up accidentally in the dump from me running pg_indent. The file count was truly excessive, I should have noticed this sooner.
I attached the patch without the excessive modifications. That should be way easier to read.
Regards
Arne
Attachments:
0002-partIndexlistClean.patchtext/x-patch; name=0002-partIndexlistClean.patchDownload+90-51
Using some valuable feedback from Zhihong Yu, I fixed a flipped negation error and updated the comments.
Regards
Arne
________________________________
From: Arne Roland
Sent: Monday, January 17, 2022 12:25
To: Julien Rouhaud
Cc: pgsql-hackers
Subject: Re: missing indexes in indexlist with partitioned tables
Hi,
thank you for the heads up! Those files ended up accidentally in the dump from me running pg_indent. The file count was truly excessive, I should have noticed this sooner.
I attached the patch without the excessive modifications. That should be way easier to read.
Regards
Arne
Attachments:
0003-partIndexlistClean.patchtext/x-patch; name=0003-partIndexlistClean.patchDownload+91-62
Hmm, can you show cases of queries for which having this new
partIndexlist changes plans?
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Saca el libro que tu religión considere como el indicado para encontrar la
oración que traiga paz a tu alma. Luego rebootea el computador
y ve si funciona" (Carlos Duclós)
Hi!
Afaiac the join pruning where the outer table is a partitioned table is the relevant case.
I am not sure whether there are other cases.
The join pruning, which works great for plain relations since 9.0, falls short for partitioned tables, since the optimizer fails to prove uniqueness there.
In practical cases inner and outer tables are almost surely different ones, but I reattached a simpler example. It's the one, I came up with back in September.
I've seen this can be a reason to avoid partitioning for the time being, if the application relies on join pruning. I think generic views make it almost necessary to have it. If you had a different answer in mind, please don't hesitate to ask again.
Regards
Arne
________________________________
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Sent: Monday, January 17, 2022 7:16:08 PM
To: Arne Roland
Cc: Julien Rouhaud; pgsql-hackers
Subject: Re: missing indexes in indexlist with partitioned tables
Hmm, can you show cases of queries for which having this new
partIndexlist changes plans?
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Saca el libro que tu religión considere como el indicado para encontrar la
oración que traiga paz a tu alma. Luego rebootea el computador
y ve si funciona" (Carlos Duclós)
Attachments:
Hi,
On Mon, Jan 17, 2022 at 08:32:40PM +0000, Arne Roland wrote:
Afaiac the join pruning where the outer table is a partitioned table is the relevant case.
The last version of the patch now fails on all platform, with plan changes.
For instance:
https://cirrus-ci.com/task/4825629131538432
https://api.cirrus-ci.com/v1/artifact/task/4825629131538432/regress_diffs/src/test/regress/regression.diffs
diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/partition_join.out /tmp/cirrus-ci-build/src/test/regress/results/partition_join.out
--- /tmp/cirrus-ci-build/src/test/regress/expected/partition_join.out 2022-01-17 23:08:47.158198249 +0000
+++ /tmp/cirrus-ci-build/src/test/regress/results/partition_join.out 2022-01-17 23:12:34.163488567 +0000
@@ -4887,37 +4887,23 @@
SET enable_partitionwise_join = on;
EXPLAIN (COSTS OFF)
SELECT * FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY id ASC LIMIT 10;
- QUERY PLAN
------------------------------------------------------------------------
+ QUERY PLAN
+-----------------------------------------------------------------
Limit
- -> Merge Append
- Sort Key: x.id
- -> Merge Left Join
- Merge Cond: (x_1.id = y_1.id)
- -> Index Only Scan using fract_t0_pkey on fract_t0 x_1
- -> Index Only Scan using fract_t0_pkey on fract_t0 y_1
- -> Merge Left Join
- Merge Cond: (x_2.id = y_2.id)
- -> Index Only Scan using fract_t1_pkey on fract_t1 x_2
- -> Index Only Scan using fract_t1_pkey on fract_t1 y_2
-(11 rows)
+ -> Append
+ -> Index Only Scan using fract_t0_pkey on fract_t0 x_1
+ -> Index Only Scan using fract_t1_pkey on fract_t1 x_2
+(4 rows)
EXPLAIN (COSTS OFF)
SELECT * FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY id DESC LIMIT 10;
- QUERY PLAN
---------------------------------------------------------------------------------
+ QUERY PLAN
+--------------------------------------------------------------------------
Limit
- -> Merge Append
- Sort Key: x.id DESC
- -> Nested Loop Left Join
- -> Index Only Scan Backward using fract_t0_pkey on fract_t0 x_1
- -> Index Only Scan using fract_t0_pkey on fract_t0 y_1
- Index Cond: (id = x_1.id)
- -> Nested Loop Left Join
- -> Index Only Scan Backward using fract_t1_pkey on fract_t1 x_2
- -> Index Only Scan using fract_t1_pkey on fract_t1 y_2
- Index Cond: (id = x_2.id)
-(11 rows)
+ -> Append
+ -> Index Only Scan Backward using fract_t1_pkey on fract_t1 x_2
+ -> Index Only Scan Backward using fract_t0_pkey on fract_t0 x_1
+(4 rows)
On 2022-Jan-18, Julien Rouhaud wrote:
SET enable_partitionwise_join = on; EXPLAIN (COSTS OFF) SELECT * FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY id ASC LIMIT 10; - QUERY PLAN ------------------------------------------------------------------------ + QUERY PLAN +----------------------------------------------------------------- Limit - -> Merge Append - Sort Key: x.id - -> Merge Left Join - Merge Cond: (x_1.id = y_1.id) - -> Index Only Scan using fract_t0_pkey on fract_t0 x_1 - -> Index Only Scan using fract_t0_pkey on fract_t0 y_1 - -> Merge Left Join - Merge Cond: (x_2.id = y_2.id) - -> Index Only Scan using fract_t1_pkey on fract_t1 x_2 - -> Index Only Scan using fract_t1_pkey on fract_t1 y_2 -(11 rows) + -> Append + -> Index Only Scan using fract_t0_pkey on fract_t0 x_1 + -> Index Only Scan using fract_t1_pkey on fract_t1 x_2 +(4 rows)
Hmm, these plan changes look valid to me. A left self-join using the
primary key column? That looks optimizable all right.
I suspect that the author of partition-wise joins would want to change
these queries so that whatever was being tested by these self-joins is
tested by some other means (maybe just create an identical partitioned
table via CREATE TABLE fract_t2 ... ; INSERT INTO fract_t2 SELECT FROM
fract_t). But at the same time, the author of this patch should a) make
sure that the submitted patch updates these test results so that the
test pass, and also b) add some test cases to verify that his desired
behavior is tested somewhere, not just in a test case that's
incidentally broken by his patch.
What I still don't know is whether this patch is actually desirable or
not. If the only cases it affects is self-joins, is there much actual
value?
--
Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/
Hi!
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
[...]
Hmm, these plan changes look valid to me. A left self-join using the
primary key column? That looks optimizable all right.
[...]
What I still don't know is whether this patch is actually desirable or
not. If the only cases it affects is self-joins, is there much actual
value?
This is not really about self joins. That was just the most simple example, because otherwise we need a second table.
It's unique, it's not relevant whether it's the same table. In most cases it won't. I was talking about join pruning.
I suspect that the author of partition-wise joins would want to change
these queries so that whatever was being tested by these self-joins is
tested by some other means (maybe just create an identical partitioned
table via CREATE TABLE fract_t2 ... ; INSERT INTO fract_t2 SELECT FROM
fract_t). But at the same time, the author of this patch should
Your suggestion doesn't work, because with my patch we solve that case as well. We solve the general join pruning case. If we make the index non-unique however, we should be able to test the fractional case the same way.
b) add some test cases to verify that his desired
behavior is tested somewhere, not just in a test case that's
incidentally broken by his patch.
My patch already includes such a test, look at @@ -90,6 +90,13 @@ src/test/regress/sql/partition_join.sql
Since the selfjoin part was confusing to you, it might be worthwhile to do test that with two different tables. While I see no need to test in that way, I will adjust the patch so. Just to make it more clear for people looking at those tests in the future.
a) make
sure that the submitted patch updates these test results so that the
test pass [...]
Just for the record: I did run the tests, but I did miss that the commit of Tomas fix for fractional optimization is already on the master. Please note that this is a very new test case from a patch committed less than one week ago.
I'm glad Julien Rouhaud pointed out, that Tomas patch is committed it by now. That was very helpful to me, as I can now integrate the two tests.
@Álvaro Herrera:
If you want to help me, please don't put forward an abstract list of responsibilities. If anything I obviously need practical help, on how I can catch on recent changes quicker and without manual intervention. I don't have a modified buildfarm animal running, that tries to apply my patch and run regression tests for my patch on a daily basis.
Is there a simple way for me to check for that?
I will probably integrate those two tests, since they can work of similar structures without need to recreate the tables again and again. I have clear understanding how that new test works. I have to attend a few calls now, but I should be able to update the tests later.
Regards
Arne
Hi,
I came up with a slightly more intuitive way to force the different outcome for the fractional test, that does hardly change anything.
I'm not sure, whether the differentiation between fract_x and fract_t is worth it, since there shouldn't be any difference, but as mentioned before I added it for potential future clarity.
Thanks for your feedback again!
Regards
Arne
________________________________
From: Arne Roland
Sent: Wednesday, January 19, 2022 10:13:55 PM
To: Alvaro Herrera; Julien Rouhaud
Cc: pgsql-hackers
Subject: Re: missing indexes in indexlist with partitioned tables
Hi!
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
[...]
Hmm, these plan changes look valid to me. A left self-join using the
primary key column? That looks optimizable all right.
[...]
What I still don't know is whether this patch is actually desirable or
not. If the only cases it affects is self-joins, is there much actual
value?
This is not really about self joins. That was just the most simple example, because otherwise we need a second table.
It's unique, it's not relevant whether it's the same table. In most cases it won't. I was talking about join pruning.
I suspect that the author of partition-wise joins would want to change
these queries so that whatever was being tested by these self-joins is
tested by some other means (maybe just create an identical partitioned
table via CREATE TABLE fract_t2 ... ; INSERT INTO fract_t2 SELECT FROM
fract_t). But at the same time, the author of this patch should
Your suggestion doesn't work, because with my patch we solve that case as well. We solve the general join pruning case. If we make the index non-unique however, we should be able to test the fractional case the same way.
b) add some test cases to verify that his desired
behavior is tested somewhere, not just in a test case that's
incidentally broken by his patch.
My patch already includes such a test, look at @@ -90,6 +90,13 @@ src/test/regress/sql/partition_join.sql
Since the selfjoin part was confusing to you, it might be worthwhile to do test that with two different tables. While I see no need to test in that way, I will adjust the patch so. Just to make it more clear for people looking at those tests in the future.
a) make
sure that the submitted patch updates these test results so that the
test pass [...]
Just for the record: I did run the tests, but I did miss that the commit of Tomas fix for fractional optimization is already on the master. Please note that this is a very new test case from a patch committed less than one week ago.
I'm glad Julien Rouhaud pointed out, that Tomas patch is committed it by now. That was very helpful to me, as I can now integrate the two tests.
@Álvaro Herrera:
If you want to help me, please don't put forward an abstract list of responsibilities. If anything I obviously need practical help, on how I can catch on recent changes quicker and without manual intervention. I don't have a modified buildfarm animal running, that tries to apply my patch and run regression tests for my patch on a daily basis.
Is there a simple way for me to check for that?
I will probably integrate those two tests, since they can work of similar structures without need to recreate the tables again and again. I have clear understanding how that new test works. I have to attend a few calls now, but I should be able to update the tests later.
Regards
Arne
Attachments:
0004-partIndexlistClean.patchtext/x-patch; name=0004-partIndexlistClean.patchDownload+126-70
On Wed, Jan 19, 2022 at 1:50 PM Arne Roland <A.Roland@index.de> wrote:
Hi,
I came up with a slightly more intuitive way to force the different
outcome for the fractional test, that does hardly change anything.I'm not sure, whether the differentiation between fract_x and fract_t is
worth it, since there shouldn't be any difference, but as mentioned before
I added it for potential future clarity.Thanks for your feedback again!
Regards
Arne
------------------------------
*From:* Arne Roland
*Sent:* Wednesday, January 19, 2022 10:13:55 PM
*To:* Alvaro Herrera; Julien Rouhaud
*Cc:* pgsql-hackers
*Subject:* Re: missing indexes in indexlist with partitioned tablesHi!
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
[...]
Hmm, these plan changes look valid to me. A left self-join using the
primary key column? That looks optimizable all right.
[...]
What I still don't know is whether this patch is actually desirable or
not. If the only cases it affects is self-joins, is there much actual
value?This is not really about self joins. That was just the most simple
example, because otherwise we need a second table.
It's unique, it's not relevant whether it's the same table. In most cases
it won't. I was talking about join pruning.I suspect that the author of partition-wise joins would want to change
these queries so that whatever was being tested by these self-joins is
tested by some other means (maybe just create an identical partitioned
table via CREATE TABLE fract_t2 ... ; INSERT INTO fract_t2 SELECT FROM
fract_t). But at the same time, the author of this patch shouldYour suggestion doesn't work, because with my patch we solve that case as
well. We solve the general join pruning case. If we make the index
non-unique however, we should be able to test the fractional case the
same way.b) add some test cases to verify that his desired
behavior is tested somewhere, not just in a test case that's
incidentally broken by his patch.My patch already includes such a test, look at @@ -90,6 +90,13 @@
src/test/regress/sql/partition_join.sql
Since the selfjoin part was confusing to you, it might be worthwhile to do
test that with two different tables. While I see no need to test in that
way, I will adjust the patch so. Just to make it more clear for people
looking at those tests in the future.a) make
sure that the submitted patch updates these test results so that the
test pass [...]Just for the record: I did run the tests, but I did miss that the commit
of Tomas fix for fractional optimization is already on the master. Please
note that this is a very new test case from a patch committed less than one
week ago.I'm glad Julien Rouhaud pointed out, that Tomas patch is committed it by
now. That was very helpful to me, as I can now integrate the two tests.@Álvaro Herrera:
If you want to help me, please don't put forward an abstract list of
responsibilities. If anything I obviously need practical help, on how I can
catch on recent changes quicker and without manual intervention. I don't
have a modified buildfarm animal running, that tries to apply my patch and
run regression tests for my patch on a daily basis.
Is there a simple way for me to check for that?I will probably integrate those two tests, since they can work of similar
structures without need to recreate the tables again and again. I have
clear understanding how that new test works. I have to attend a few calls
now, but I should be able to update the tests later.Regards
ArneHi,
- if (indexRelation->rd_rel->relkind == RELKIND_PARTITIONED_INDEX)
+ if (inhparent && (!index->indisunique ||
indexRelation->rd_rel->relkind != RELKIND_PARTITIONED_INDEX))
The check on RELKIND_PARTITIONED_INDEX seems to negate what the comment
above says:
+ * Don't add partitioned indexes to the indexlist
Cheers
On 2022-Jan-19, Arne Roland wrote:
a) make sure that the submitted patch updates these test results so
that the test pass [...]Just for the record: I did run the tests, but I did miss that the
commit of Tomas fix for fractional optimization is already on the
master. Please note that this is a very new test case from a patch
committed less than one week ago.
Ah, apologies, I didn't realize that that test was so new.
If anything I obviously need practical help, on how
I can catch on recent changes quicker and without manual intervention.
I don't have a modified buildfarm animal running, that tries to apply
my patch and run regression tests for my patch on a daily basis.
See src/tools/ci/README (for multi-platform testing of patches on
several platforms) and http://commitfest.cputube.org/
--
Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/
Hi!
From: Zhihong Yu <zyu@yugabyte.com>
Subject: Re: missing indexes in indexlist with partitioned tablesHi,
- if (indexRelation->rd_rel->relkind == RELKIND_PARTITIONED_INDEX) + if (inhparent && (!index->indisunique || indexRelation->rd_rel->relkind != RELKIND_PARTITIONED_INDEX))The check on RELKIND_PARTITIONED_INDEX seems to negate what the comment above says:
+ * Don't add partitioned indexes to the indexlist
Cheers
The comment at my end goes on:
/*
* Don't add partitioned indexes to the indexlist, since they are
* not usable by the executor. If they are unique add them to the
* partindexlist instead, to use for further pruning. If they
* aren't that either, simply skip them.
*/
Regarding the structure: I think, that we probably should remove the first two sentences here. They reoccur 50 lines below anyways, which seems a dubious practice. The logic that enforces the first two sentences is mainly down below, so that place is probably on one to keep.
Regarding the semantics: This is sort of what the statement checks for (skip for inhparent, if not unique or not partitioned index), i.e. it checks for the case, where the index shouldn't be added to either list.
Side note: I personally think the name inhparent is mildly confusing, since it's not really about inheritance. I don't have a significantly better idea though.
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Sent: Wednesday, January 19, 2022 23:26
Ah, apologies, I didn't realize that that test was so new.
No offense taken. Unless one was involved in the creation of the corresponding patch, it's unreasonable to know that. I like the second part of your message very much:
See src/tools/ci/README (for multi-platform testing of patches on
several platforms) and http://commitfest.cputube.org/
Thanks, I didn't know of cputube. Neat! That's pretty much what I was looking for!
Is there a way to get an email notification if some machine fails (turns bright red)? For the threads I'm explicitly subscribed to, that would seem helpful to me.
Regards
Arne
Hi,
On Mon, Jan 24, 2022 at 9:30 PM Arne Roland <A.Roland@index.de> wrote:
The comment at my end goes on:
/*
* Don't add partitioned indexes to the indexlist, since they are
* not usable by the executor. If they are unique add them to the
* partindexlist instead, to use for further pruning. If they
* aren't that either, simply skip them.
*/
"partindexlist" really made me think about a list of "partial indexes"
for some reason. I think maybe "partedindexlist" is what you are
looking for; "parted" is commonly used as short for "partitioned" when
naming variables.
The comment only mentions "further pruning" as to what partitioned
indexes are to be remembered in RelOptInfo, but it's not clear what
that means. It may help to be more specific.
Finally, I don't understand why we need a separate field to store
indexes found in partitioned base relations. AFAICS, nothing but the
sites you are interested in (relation_has_unique_index_for() and
rel_supports_distinctness()) would ever bother to look at a
partitioned base relation's indexlist. Do you think putting them into
in indexlist might break something?
Regarding the semantics: This is sort of what the statement checks for (skip for inhparent, if not unique or not partitioned index), i.e. it checks for the case, where the index shouldn't be added to either list.
Side note: I personally think the name inhparent is mildly confusing, since it's not really about inheritance. I don't have a significantly better idea though.
Partitioned tables are "inheritance parent", so share the same code as
what traditional inheritance parents have always used for planning.
--
Amit Langote
EDB: http://www.enterprisedb.com
Hi!
From: Amit Langote <amitlangote09@gmail.com>
Sent: Tuesday, January 25, 2022 09:04
Subject: Re: missing indexes in indexlist with partitioned tables
[...]
"partindexlist" really made me think about a list of "partial indexes"
for some reason. I think maybe "partedindexlist" is what you are
looking for; "parted" is commonly used as short for "partitioned" when
naming variables.The comment only mentions "further pruning" as to what partitioned
indexes are to be remembered in RelOptInfo, but it's not clear what
that means. It may help to be more specific.
Thanks for the feedback! I've changed that. The current version is attached.
Finally, I don't understand why we need a separate field to store
indexes found in partitioned base relations. AFAICS, nothing but the
sites you are interested in (relation_has_unique_index_for() and
rel_supports_distinctness()) would ever bother to look at a
partitioned base relation's indexlist. Do you think putting them into
in indexlist might break something?
I have thought about that before. AFAICT there is nothing in core, which breaks. However I am not sure, I want to mix those two kinds of index nodes. First of all the structure is different, partedIndexes don't have physical attributes after all. This is technical implementation detail relating to the current promise, that entries of the indexlist are indexes we can use. And by use, I mean use for statistics or the executor.
I'm more concerned about future changes regarding the order and optimization of processing harder here. The order in which we do things in the planner is a bit messy, and I wouldn't mind seeing details about that change. Looking at the current wacky order in the optimizer, I'm not convinced, that nothing will want to have a look at the indexlist, before partitioned tables are unpacked.
Since it would be easy to introduce this new variable later, wouldn't mind adding it to the indexlist directly for now. But changing the underlying promise of what it contains, seems noteworthy and more intrusive to me.
Side note: I personally think the name inhparent is mildly confusing, since it's not really about inheritance. I don't have a significantly better idea though.
Partitioned tables are "inheritance parent", so share the same code as
what traditional inheritance parents have always used for planning.
I recall that manual partitioning via inheritance, that was cumbersome. Though that minor historical detail was not, what I was referring to. There are a lot of other cases, that cause us to set inhparent. IIRC we use this flag in some ddl commands, which have nothing to do with inheritance. It essentially is used as a variant to skip the indexlist creation. If such hacks weren't there, we could simply check for the relkind and indisunique.
Regards
Arne
Attachments:
0005-partIndexlistClean.patchtext/x-patch; name=0005-partIndexlistClean.patchDownload+127-70
Hi!
Attached a rebased version of the patch.
Regards
Arne
________________________________
From: Arne Roland
Sent: Monday, January 31, 2022 19:14
To: Amit Langote
Cc: Zhihong Yu; Alvaro Herrera; Julien Rouhaud; pgsql-hackers
Subject: Re: missing indexes in indexlist with partitioned tables
Hi!
From: Amit Langote <amitlangote09@gmail.com>
Sent: Tuesday, January 25, 2022 09:04
Subject: Re: missing indexes in indexlist with partitioned tables
[...]
"partindexlist" really made me think about a list of "partial indexes"
for some reason. I think maybe "partedindexlist" is what you are
looking for; "parted" is commonly used as short for "partitioned" when
naming variables.The comment only mentions "further pruning" as to what partitioned
indexes are to be remembered in RelOptInfo, but it's not clear what
that means. It may help to be more specific.
Thanks for the feedback! I've changed that. The current version is attached.
Finally, I don't understand why we need a separate field to store
indexes found in partitioned base relations. AFAICS, nothing but the
sites you are interested in (relation_has_unique_index_for() and
rel_supports_distinctness()) would ever bother to look at a
partitioned base relation's indexlist. Do you think putting them into
in indexlist might break something?
I have thought about that before. AFAICT there is nothing in core, which breaks. However I am not sure, I want to mix those two kinds of index nodes. First of all the structure is different, partedIndexes don't have physical attributes after all. This is technical implementation detail relating to the current promise, that entries of the indexlist are indexes we can use. And by use, I mean use for statistics or the executor.
I'm more concerned about future changes regarding the order and optimization of processing harder here. The order in which we do things in the planner is a bit messy, and I wouldn't mind seeing details about that change. Looking at the current wacky order in the optimizer, I'm not convinced, that nothing will want to have a look at the indexlist, before partitioned tables are unpacked.
Since it would be easy to introduce this new variable later, wouldn't mind adding it to the indexlist directly for now. But changing the underlying promise of what it contains, seems noteworthy and more intrusive to me.
Side note: I personally think the name inhparent is mildly confusing, since it's not really about inheritance. I don't have a significantly better idea though.
Partitioned tables are "inheritance parent", so share the same code as
what traditional inheritance parents have always used for planning.
I recall that manual partitioning via inheritance, that was cumbersome. Though that minor historical detail was not, what I was referring to. There are a lot of other cases, that cause us to set inhparent. IIRC we use this flag in some ddl commands, which have nothing to do with inheritance. It essentially is used as a variant to skip the indexlist creation. If such hacks weren't there, we could simply check for the relkind and indisunique.
Regards
Arne
Attachments:
0006-partIndexlistClean.patchtext/x-patch; name=0006-partIndexlistClean.patchDownload+107-49
On Wed, 3 Aug 2022 at 11:07, Arne Roland <A.Roland@index.de> wrote:
Attached a rebased version of the patch.
Firstly, I agree that we should fix the issue of join removals not
working with partitioned tables.
I had a quick look over this and the first thing that I thought was
the same as what Amit mentioned in:
On Tue, 25 Jan 2022 at 21:04, Amit Langote <amitlangote09@gmail.com> wrote:
Finally, I don't understand why we need a separate field to store
indexes found in partitioned base relations. AFAICS, nothing but the
sites you are interested in (relation_has_unique_index_for() and
rel_supports_distinctness()) would ever bother to look at a
partitioned base relation's indexlist. Do you think putting them into
in indexlist might break something?
I kinda disagree with Alvaro's fix in 05fb5d661. I think indexlist is
the place to store these details. That commit added the following
comment:
/*
* Ignore partitioned indexes, since they are not usable for
* queries.
*/
But neither are hypothetical indexes either, yet they're added to
RelOptInfo.indexlist.
I think the patch should be changed so that the existing list is used
and we find another fix for the problems Alvaro fixed in 05fb5d661.
Unfortunately, there was no discussion marked on that commit message,
so it's not quite clear what the problem was. I'm unsure if there was
anything other than CLUSTER that was broken. I see that cfdd03f45
added CLUSTER for partitioned tables in v15. I think the patch would
need to go over the usages of RelOptInfo.indexlist to make sure that
we don't need to add any further conditions to skip their usage for
partitioned tables.
I wrote the attached patch as I wanted to see what would break if we
did this. The only problem I got from running make check was in
get_actual_variable_range(), so I just changed that so it returns
false when the given rel is a partitioned table. I only quickly did
the changes to get_relation_info() and didn't give much thought to
what the can* bool flags should be set to. I just mostly skipped all
that code because it was crashing on
relation->rd_tableam->scan_bitmap_next_block due to the rd_tableam
being NULL.
Also, just a friendly tip, Arne; I saw you named your patch 0006 for
version 6. You'll see many 000n patches around the list, but those
are generally done with git format-patch. That number normally means
the patch in the patch series, not the version of the patch. I'm not
sure if it'll help any, but my workflow for writing new patches
against master tends to be:
git checkout master
git checkout -b some_feature_branch
# write some code
git commit -a
# maybe more code
git commit -a
git format-patch -v1 master
That'll create v1-0001 and v1-0002 patches. When I'm onto v2, I just
change the version number from -v1 to -v2.
David
Attachments:
v7_include_partitioned_indexes_in_indexlist.patchtext/plain; charset=US-ASCII; name=v7_include_partitioned_indexes_in_indexlist.patchDownload+135-115
On 2022-Sep-16, David Rowley wrote:
I kinda disagree with Alvaro's fix in 05fb5d661. I think indexlist is
the place to store these details. That commit added the following
comment:/*
* Ignore partitioned indexes, since they are not usable for
* queries.
*/But neither are hypothetical indexes either, yet they're added to
RelOptInfo.indexlist.I think the patch should be changed so that the existing list is used
and we find another fix for the problems Alvaro fixed in 05fb5d661.
Unfortunately, there was no discussion marked on that commit message,
so it's not quite clear what the problem was. I'm unsure if there was
anything other than CLUSTER that was broken.
After a bit of trawling through the archives, I found it here:
/messages/by-id/20180124162006.pmapfiznhgngwtjf@alvherre.pgsql
I think there was insufficient discussion and you're probably right that
it wasn't the best fix. I don't object to finding another fix.
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
On 2022-Sep-16, David Rowley wrote:
I kinda disagree with Alvaro's fix in 05fb5d661. I think indexlist is
the place to store these details. That commit added the following
comment:/*
* Ignore partitioned indexes, since they are not usable for
* queries.
*/But neither are hypothetical indexes either, yet they're added to
RelOptInfo.indexlist.I think the patch should be changed so that the existing list is used
and we find another fix for the problems Alvaro fixed in 05fb5d661.
Unfortunately, there was no discussion marked on that commit message,
so it's not quite clear what the problem was. I'm unsure if there was
anything other than CLUSTER that was broken.
After a bit of trawling through the archives, I found it here:
/messages/by-id/20180124162006.pmapfiznhgngwtjf@alvherre.pgsql
I think there was insufficient discussion and you're probably right that
it wasn't the best fix. I don't object to finding another fix.
FWIW, I don't see any big problem with what you did. We'd need to
do something more like what David suggests if the planner ever has
a reason to consider partitioned indexes. But as long as it does
not, why expend the time to build data structures representing them?
And we'd have to add code in quite a few places to ignore them,
once they're in indexlist.
regards, tom lane