pgsql: Clarify use of temporary tables within partition trees

Started by Michael Paquierover 7 years ago28 messages
#1Michael Paquier
michael@paquier.xyz

Clarify use of temporary tables within partition trees

Since their introduction, partition trees have been a bit lossy
regarding temporary relations. Inheritance trees respect the following
patterns:
1) a child relation can be temporary if the parent is permanent.
2) a child relation can be temporary if the parent is temporary.
3) a child relation cannot be permanent if the parent is temporary.
4) The use of temporary relations also imply that when both parent and
child need to be from the same sessions.

Partitions share many similar patterns with inheritance, however the
handling of the partition bounds make the situation a bit tricky for
case 1) as the partition code bases a lot of its lookup code upon
PartitionDesc which does not really look after relpersistence. This
causes for example a temporary partition created by session A to be
visible by another session B, preventing this session B to create an
extra partition which overlaps with the temporary one created by A with
a non-intuitive error message. There could be use-cases where mixing
permanent partitioned tables with temporary partitions make sense, but
that would be a new feature. Partitions respect 2), 3) and 4) already.

It is a bit depressing to see those error checks happening in
MergeAttributes() whose purpose is different, but that's left as future
refactoring work.

Back-patch down to 10, which is where partitioning has been introduced,
except that default partitions do not apply there. Documentation also
includes limitations related to the use of temporary tables with
partition trees.

Reported-by: David Rowley
Author: Amit Langote, Michael Paquier
Reviewed-by: Ashutosh Bapat, Amit Langote, Michael Paquier
Discussion: /messages/by-id/CAKJS1f94Ojk0og9GMkRHGt8wHTW=ijq5KzJKuoBoqWLwSVwGmw@mail.gmail.com

Branch
------
REL_10_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/5862174ec78a173c41710c5ef33feb993ae45cc7

Modified Files
--------------
doc/src/sgml/ddl.sgml | 10 ++++++++++
src/backend/commands/tablecmds.c | 21 +++++++++++++++++++++
src/test/regress/expected/alter_table.out | 16 ++++++++++++++++
src/test/regress/expected/create_table.out | 10 ++++++++++
src/test/regress/expected/foreign_data.out | 12 ++++++++++++
src/test/regress/sql/alter_table.sql | 15 +++++++++++++++
src/test/regress/sql/create_table.sql | 9 +++++++++
src/test/regress/sql/foreign_data.sql | 11 +++++++++++
8 files changed, 104 insertions(+)

#2David Rowley
david.rowley@2ndquadrant.com
In reply to: Michael Paquier (#1)
1 attachment(s)
Re: pgsql: Clarify use of temporary tables within partition trees

On 20 June 2018 at 13:53, Michael Paquier <michael@paquier.xyz> wrote:

Clarify use of temporary tables within partition trees

Hi,

Thanks for committing this fix.

I think slightly more should have been done. There's still some dead
code in expand_partitioned_rtentry that I think should be removed.

The code won't cost much performance wise, but it may mislead someone
into thinking they can add some other condition there to skip
partitions.

The attached removes it.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

remove_dead_code_from_expand_partitioned_rtentry.patchapplication/octet-stream; name=remove_dead_code_from_expand_partitioned_rtentry.patchDownload
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 2d470240d5..4a9f54f1b0 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -1681,7 +1681,6 @@ expand_partitioned_rtentry(PlannerInfo *root, RangeTblEntry *parentrte,
 	int			i;
 	RangeTblEntry *childrte;
 	Index		childRTindex;
-	bool		has_child = false;
 	PartitionDesc partdesc = RelationGetPartitionDesc(parentrel);
 
 	check_stack_depth();
@@ -1707,6 +1706,13 @@ expand_partitioned_rtentry(PlannerInfo *root, RangeTblEntry *parentrte,
 									top_parentrc, parentrel,
 									appinfos, &childrte, &childRTindex);
 
+	/* if there are no partitions then treat this as non-inheritance case. */
+	if (partdesc->nparts == 0)
+	{
+		parentrte->inh = false;
+		return;
+	}
+
 	for (i = 0; i < partdesc->nparts; i++)
 	{
 		Oid			childOID = partdesc->oids[i];
@@ -1715,16 +1721,6 @@ expand_partitioned_rtentry(PlannerInfo *root, RangeTblEntry *parentrte,
 		/* Open rel; we already have required locks */
 		childrel = heap_open(childOID, NoLock);
 
-		/* As in expand_inherited_rtentry, skip non-local temp tables */
-		if (RELATION_IS_OTHER_TEMP(childrel))
-		{
-			heap_close(childrel, lockmode);
-			continue;
-		}
-
-		/* We have a real partition. */
-		has_child = true;
-
 		expand_single_inheritance_child(root, parentrte, parentRTindex,
 										parentrel, top_parentrc, childrel,
 										appinfos, &childrte, &childRTindex);
@@ -1738,14 +1734,6 @@ expand_partitioned_rtentry(PlannerInfo *root, RangeTblEntry *parentrte,
 		/* Close child relation, but keep locks */
 		heap_close(childrel, NoLock);
 	}
-
-	/*
-	 * If the partitioned table has no partitions or all the partitions are
-	 * temporary tables from other backends, treat this as non-inheritance
-	 * case.
-	 */
-	if (!has_child)
-		parentrte->inh = false;
 }
 
 /*
#3Robert Haas
robertmhaas@gmail.com
In reply to: David Rowley (#2)
Re: pgsql: Clarify use of temporary tables within partition trees

On Fri, Jun 29, 2018 at 5:13 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

On 20 June 2018 at 13:53, Michael Paquier <michael@paquier.xyz> wrote:

Clarify use of temporary tables within partition trees

Thanks for committing this fix.

I think slightly more should have been done. There's still some dead
code in expand_partitioned_rtentry that I think should be removed.

The code won't cost much performance wise, but it may mislead someone
into thinking they can add some other condition there to skip
partitions.

The attached removes it.

I'd rather keep an elog(ERROR) than completely remove the check.

Also, for the record, I think the subject line of Michael's commit
message was pretty unclear about what it was actually doing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#4Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#3)
Re: pgsql: Clarify use of temporary tables within partition trees

On Mon, Jul 02, 2018 at 02:07:37PM -0400, Robert Haas wrote:

I'd rather keep an elog(ERROR) than completely remove the check.

+1.

Also, for the record, I think the subject line of Michael's commit
message was pretty unclear about what it was actually doing.

How would you formulate it? Perhaps the error message did not emphasize
enough on the fast that it actually blocked a behavior, say "Block mix
of temporary and permanent relations in partition trees" or such?
--
Michael

#5David Rowley
david.rowley@2ndquadrant.com
In reply to: Michael Paquier (#4)
1 attachment(s)
Re: pgsql: Clarify use of temporary tables within partition trees

On 3 July 2018 at 10:16, Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Jul 02, 2018 at 02:07:37PM -0400, Robert Haas wrote:

I'd rather keep an elog(ERROR) than completely remove the check.

+1.

Attached

Also, for the record, I think the subject line of Michael's commit
message was pretty unclear about what it was actually doing.

How would you formulate it? Perhaps the error message did not emphasize
enough on the fast that it actually blocked a behavior, say "Block mix
of temporary and permanent relations in partition trees" or such?

For me, reading the subject line of the commit I'd have expected a doc
change, or improved/new code comments.

This is really more "Disallow mixed temp/permanent partitioned hierarchies".

"Clarify" does not really involve a change of behaviour. It's an
explanation of what the behaviour is.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

remove_dead_code_from_expand_partitioned_rtentry_v2.patchapplication/octet-stream; name=remove_dead_code_from_expand_partitioned_rtentry_v2.patchDownload
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 2d470240d5..cc0014c2d6 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -1681,7 +1681,6 @@ expand_partitioned_rtentry(PlannerInfo *root, RangeTblEntry *parentrte,
 	int			i;
 	RangeTblEntry *childrte;
 	Index		childRTindex;
-	bool		has_child = false;
 	PartitionDesc partdesc = RelationGetPartitionDesc(parentrel);
 
 	check_stack_depth();
@@ -1707,6 +1706,13 @@ expand_partitioned_rtentry(PlannerInfo *root, RangeTblEntry *parentrte,
 									top_parentrc, parentrel,
 									appinfos, &childrte, &childRTindex);
 
+	/* if there are no partitions then treat this as non-inheritance case. */
+	if (partdesc->nparts == 0)
+	{
+		parentrte->inh = false;
+		return;
+	}
+
 	for (i = 0; i < partdesc->nparts; i++)
 	{
 		Oid			childOID = partdesc->oids[i];
@@ -1714,16 +1720,12 @@ expand_partitioned_rtentry(PlannerInfo *root, RangeTblEntry *parentrte,
 
 		/* Open rel; we already have required locks */
 		childrel = heap_open(childOID, NoLock);
-
-		/* As in expand_inherited_rtentry, skip non-local temp tables */
+		/*
+		 * Temp partitions belonging to other sessions should have been
+		 * disallowed, but for paranoia's sake, let's double check.
+		 */
 		if (RELATION_IS_OTHER_TEMP(childrel))
-		{
-			heap_close(childrel, lockmode);
-			continue;
-		}
-
-		/* We have a real partition. */
-		has_child = true;
+			elog(ERROR, "temporary partition found belonging to another session");
 
 		expand_single_inheritance_child(root, parentrte, parentRTindex,
 										parentrel, top_parentrc, childrel,
@@ -1738,14 +1740,6 @@ expand_partitioned_rtentry(PlannerInfo *root, RangeTblEntry *parentrte,
 		/* Close child relation, but keep locks */
 		heap_close(childrel, NoLock);
 	}
-
-	/*
-	 * If the partitioned table has no partitions or all the partitions are
-	 * temporary tables from other backends, treat this as non-inheritance
-	 * case.
-	 */
-	if (!has_child)
-		parentrte->inh = false;
 }
 
 /*
#6Robert Haas
robertmhaas@gmail.com
In reply to: David Rowley (#5)
Re: pgsql: Clarify use of temporary tables within partition trees

On Mon, Jul 2, 2018 at 8:59 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

How would you formulate it? Perhaps the error message did not emphasize
enough on the fast that it actually blocked a behavior, say "Block mix
of temporary and permanent relations in partition trees" or such?

Yes.

For me, reading the subject line of the commit I'd have expected a doc
change, or improved/new code comments.

This is really more "Disallow mixed temp/permanent partitioned hierarchies".

Yes.

"Clarify" does not really involve a change of behaviour. It's an
explanation of what the behaviour is.

And yes.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#7Michael Paquier
michael@paquier.xyz
In reply to: David Rowley (#5)
1 attachment(s)
Re: pgsql: Clarify use of temporary tables within partition trees

On Tue, Jul 03, 2018 at 12:59:33PM +1200, David Rowley wrote:

On 3 July 2018 at 10:16, Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Jul 02, 2018 at 02:07:37PM -0400, Robert Haas wrote:

I'd rather keep an elog(ERROR) than completely remove the check.

+1.

Attached

Okay, the patch looks logically correct to me, I just tweaked the
comments as per the attached. I would also back-patch that down to v11
to keep the code consistent with HEAD.. What do you think?
--
Michael

Attachments:

remove_dead_code_from_expand_partitioned_rtentry_v3.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 2d470240d5..71e256f8fd 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -1681,7 +1681,6 @@ expand_partitioned_rtentry(PlannerInfo *root, RangeTblEntry *parentrte,
 	int			i;
 	RangeTblEntry *childrte;
 	Index		childRTindex;
-	bool		has_child = false;
 	PartitionDesc partdesc = RelationGetPartitionDesc(parentrel);
 
 	check_stack_depth();
@@ -1707,6 +1706,16 @@ expand_partitioned_rtentry(PlannerInfo *root, RangeTblEntry *parentrte,
 									top_parentrc, parentrel,
 									appinfos, &childrte, &childRTindex);
 
+	/*
+	 * If the partitioned table has no partitions, treat this as the
+	 * non-inheritance case.
+	 */
+	if (partdesc->nparts == 0)
+	{
+		parentrte->inh = false;
+		return;
+	}
+
 	for (i = 0; i < partdesc->nparts; i++)
 	{
 		Oid			childOID = partdesc->oids[i];
@@ -1715,15 +1724,13 @@ expand_partitioned_rtentry(PlannerInfo *root, RangeTblEntry *parentrte,
 		/* Open rel; we already have required locks */
 		childrel = heap_open(childOID, NoLock);
 
-		/* As in expand_inherited_rtentry, skip non-local temp tables */
+		/*
+		 * Temporation partitions belonging to other sessions should have been
+		 * disallowed at definition, but for paranoia's sake, let's double
+		 * check.
+		 */
 		if (RELATION_IS_OTHER_TEMP(childrel))
-		{
-			heap_close(childrel, lockmode);
-			continue;
-		}
-
-		/* We have a real partition. */
-		has_child = true;
+			elog(ERROR, "temporary relation from another session found as partition");
 
 		expand_single_inheritance_child(root, parentrte, parentRTindex,
 										parentrel, top_parentrc, childrel,
@@ -1738,14 +1745,6 @@ expand_partitioned_rtentry(PlannerInfo *root, RangeTblEntry *parentrte,
 		/* Close child relation, but keep locks */
 		heap_close(childrel, NoLock);
 	}
-
-	/*
-	 * If the partitioned table has no partitions or all the partitions are
-	 * temporary tables from other backends, treat this as non-inheritance
-	 * case.
-	 */
-	if (!has_child)
-		parentrte->inh = false;
 }
 
 /*
#8David Rowley
david.rowley@2ndquadrant.com
In reply to: Michael Paquier (#7)
Re: pgsql: Clarify use of temporary tables within partition trees

On 3 July 2018 at 16:55, Michael Paquier <michael@paquier.xyz> wrote:

Okay, the patch looks logically correct to me, I just tweaked the
comments as per the attached. I would also back-patch that down to v11
to keep the code consistent with HEAD.. What do you think?

Thanks for fixing it up. It looks fine apart from "Temporation" should
be "Temporary".

I think it should be backpatched to v11 and v10. Your original commit
went there too. I don't see any reason to do any different here than
what you did with the original commit.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#9Michael Paquier
michael@paquier.xyz
In reply to: David Rowley (#8)
Re: pgsql: Clarify use of temporary tables within partition trees

On Tue, Jul 03, 2018 at 06:00:46PM +1200, David Rowley wrote:

Thanks for fixing it up. It looks fine apart from "Temporation" should
be "Temporary".

Of course, thanks.

I think it should be backpatched to v11 and v10. Your original commit
went there too. I don't see any reason to do any different here than
what you did with the original commit.

expand_partitioned_rtentry is new as of v11. Or you mean to tweak
expand_inherited_rtentry() perhaps? I am not sure that it is worth it
as the code has already diverged between 10 and 11.
--
Michael

#10David Rowley
david.rowley@2ndquadrant.com
In reply to: Michael Paquier (#9)
Re: pgsql: Clarify use of temporary tables within partition trees

On 3 July 2018 at 18:11, Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Jul 03, 2018 at 06:00:46PM +1200, David Rowley wrote:

I think it should be backpatched to v11 and v10. Your original commit
went there too. I don't see any reason to do any different here than
what you did with the original commit.

expand_partitioned_rtentry is new as of v11. Or you mean to tweak
expand_inherited_rtentry() perhaps? I am not sure that it is worth it
as the code has already diverged between 10 and 11.

Oh right. I'd forgotten that changed in v11. I think the v10 code is
fine as is then.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#11Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: David Rowley (#10)
Re: pgsql: Clarify use of temporary tables within partition trees

On 2018/07/03 15:16, David Rowley wrote:

On 3 July 2018 at 18:11, Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Jul 03, 2018 at 06:00:46PM +1200, David Rowley wrote:

I think it should be backpatched to v11 and v10. Your original commit
went there too. I don't see any reason to do any different here than
what you did with the original commit.

expand_partitioned_rtentry is new as of v11. Or you mean to tweak
expand_inherited_rtentry() perhaps? I am not sure that it is worth it
as the code has already diverged between 10 and 11.

Oh right. I'd forgotten that changed in v11. I think the v10 code is
fine as is then.

Sorry for jumping in late here. I have a comment on the patch.

+	/* if there are no partitions then treat this as non-inheritance case. */
+	if (partdesc->nparts == 0)
+	{
+		parentrte->inh = false;
+		return;
+	}
+

Why is this not near the beginning of expand_partitioned_rtentry()?

Also, ISTM, this code would be unreachable because
expand_inherited_rtentry would not call here if the above if statement is
true, no?

I see the following two blocks in expand_inherited_rtentry before one gets
to the call to expand_partitioned_rtentry:

if (!has_subclass(parentOID))
{
/* Clear flag before returning */
rte->inh = false;
return;
}

and

if (list_length(inhOIDs) < 2)
{
/* Clear flag before returning */
rte->inh = false;
return;
}

Thanks,
Amit

#12Michael Paquier
michael@paquier.xyz
In reply to: Amit Langote (#11)
Re: pgsql: Clarify use of temporary tables within partition trees

On Tue, Jul 03, 2018 at 03:29:36PM +0900, Amit Langote wrote:

Why is this not near the beginning of expand_partitioned_rtentry()?

Also, ISTM, this code would be unreachable because
expand_inherited_rtentry would not call here if the above if statement is
true, no?

FWIW, I understood that the intention here is to be careful,
particularly if expand_partitioned_rtentry begins to get called from a
different code path in the future, which is something that would likely
happen. We could replace that by an assertion or even an elog(), and
change again this code in the future, now what's proposed here makes
quite some sense to me as well.
--
Michael

#13Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#11)
Re: pgsql: Clarify use of temporary tables within partition trees

Just realized something...

On 2018/07/03 15:29, Amit Langote wrote:

Sorry for jumping in late here. I have a comment on the patch.

+	/* if there are no partitions then treat this as non-inheritance case. */
+	if (partdesc->nparts == 0)
+	{
+		parentrte->inh = false;
+		return;
+	}
+

Why is this not near the beginning of expand_partitioned_rtentry()?

This one still stands I think.

Also, ISTM, this code would be unreachable because
expand_inherited_rtentry would not call here if the above if statement is
true, no?

I forgot that expand_partitioned_rtentry() will recursively call itself if
a partition is itself a partitioned table, in which case the above code helps.

Thanks,
Amit

#14Michael Paquier
michael@paquier.xyz
In reply to: Amit Langote (#13)
Re: pgsql: Clarify use of temporary tables within partition trees

On Tue, Jul 03, 2018 at 03:49:44PM +0900, Amit Langote wrote:

I forgot that expand_partitioned_rtentry() will recursively call itself if
a partition is itself a partitioned table, in which case the above
code helps.

Actually look at the coverage reports:
https://coverage.postgresql.org/src/backend/optimizer/prep/prepunion.c.gcov.html
1742 : /*
1743 : * If the partitioned table has no partitions or all the partitions are
1744 : * temporary tables from other backends, treat this as non-inheritance
1745 : * case.
1746 : */
1747 4920 : if (!has_child)
1748 0 : parentrte->inh = false;
1749 4920 : }

expand_partitioned_rtentry() never disables this flag on recursive calls
with a multi-level tree. Could it be possible to get a test which
closes the gap?
--
Michael

#15Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: David Rowley (#2)
Re: pgsql: Clarify use of temporary tables within partition trees

On 2018/07/03 17:31, Amit Langote wrote:

On 2018/07/03 16:05, Michael Paquier wrote:

On Tue, Jul 03, 2018 at 03:49:44PM +0900, Amit Langote wrote:

I forgot that expand_partitioned_rtentry() will recursively call itself if
a partition is itself a partitioned table, in which case the above
code helps.

Actually look at the coverage reports:
https://coverage.postgresql.org/src/backend/optimizer/prep/prepunion.c.gcov.html
1742 : /*
1743 : * If the partitioned table has no partitions or all the partitions are
1744 : * temporary tables from other backends, treat this as non-inheritance
1745 : * case.
1746 : */
1747 4920 : if (!has_child)
1748 0 : parentrte->inh = false;
1749 4920 : }

expand_partitioned_rtentry() never disables this flag on recursive calls
with a multi-level tree. Could it be possible to get a test which
closes the gap?

I guess that it will be hard as you mentioned before on the thread that
led to this commit. We can't write regression tests which require using
temporary partitions from other sessions.

Anyway, I just wanted to say that I was wrong when I said that the block
added by the patch is unreachable. It *is* reachable for multi-level
partitioning. For example, it will execute in the following case:

create table p (a int, b int) partition by list (a);
create table pd partition of p default partition by list (b);
select * from p;

expand_partitioned_rtentry will get called twice and the newly added code
would result in early return from the function in the 2nd invocation which
is for 'pd'.

But,

1. I still insist that it's better for the newly added code to be near the
top of the function body than in the middle, which brings me to...

2. While we're at fixing the code around here, I think we should think
about trying to get rid of the *non-dead* code that produces a structure
that isn't used anywhere, which I was under the impression, 0a480502b09
[1] already did (cc'ing Ashutosh). To clarify, we still unnecessarily
create a "duplicate" RTE for partitioned tables in a partition tree
(non-leaf tables) in its role as a child. So, for the above query, there
end up being created 4 entries in the query's range table (2 for 'p' and 2
for 'pd'). That makes sense for plain inheritance, because even the root
parent table in a plain inheritance tree is a regular table containing
data. That's not true for partition inheritance trees, where non-leaf
tables contain no data, so we don't create a plan to scan them (see
d3cc37f1d80 [2]), which in turn means we don't need to create the
redundant "duplicate" child RTEs for them either.

See attached my delta patch to address both 1 and 2.

Thanks,
Amit

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=0a480502b09

[2] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=d3cc37f1d80

For some reason, ML address got removed from the list of address when
sending the above message.

Thanks,
Amit

#16David Rowley
david.rowley@2ndquadrant.com
In reply to: David Rowley (#2)
Re: pgsql: Clarify use of temporary tables within partition trees

(re-adding committers list)

On 3 July 2018 at 20:31, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:

1. I still insist that it's better for the newly added code to be near the
top of the function body than in the middle, which brings me to...

That will cause the AppendRelInfo not to be built for a partitioned
table with no partitions. I'm not personally certain that doing that
comes without consequence. Are you certain of that? If not 100%,
then I think that's more of a task for PG12. I'm trying to propose
removing dead code for PG11 and on.

My current thoughts are that moving this test up a few lines is
unlikely to improve performance in a real-world situation, so it does
not seem worth the risk for a backpatch to me.

2. While we're at fixing the code around here, I think we should think
about trying to get rid of the *non-dead* code that produces a structure
that isn't used anywhere, which I was under the impression, 0a480502b09
[1] already did (cc'ing Ashutosh). To clarify, we still unnecessarily
create a "duplicate" RTE for partitioned tables in a partition tree
(non-leaf tables) in its role as a child. So, for the above query, there
end up being created 4 entries in the query's range table (2 for 'p' and 2
for 'pd'). That makes sense for plain inheritance, because even the root
parent table in a plain inheritance tree is a regular table containing
data. That's not true for partition inheritance trees, where non-leaf
tables contain no data, so we don't create a plan to scan them (see
d3cc37f1d80 [2]), which in turn means we don't need to create the
redundant "duplicate" child RTEs for them either.

See attached my delta patch to address both 1 and 2.

I recently saw this too and wondered about it. I then wrote a patch to
just have it create a single RangeTblEntry for each partitioned table.
The tests all passed.

I'd categorise this one the same as I have #1 above, i.e. not
backpatch material. It seems like something useful to look into for
v12 though. I assumed this was done for a reason and that I just
didn't understand what that reason was. I don't recall any comments to
explain the reason why we build two RangeTblEntrys for each
partitioned table.

In light of what Amit has highlighted, I'm still standing by the v3
patch assuming the typo is fixed.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#17Michael Paquier
michael@paquier.xyz
In reply to: David Rowley (#16)
Re: pgsql: Clarify use of temporary tables within partition trees

On Tue, Jul 03, 2018 at 09:05:41PM +1200, David Rowley wrote:

[...]

I'd categorise this one the same as I have #1 above, i.e. not
backpatch material. It seems like something useful to look into for
v12 though. I assumed this was done for a reason and that I just
didn't understand what that reason was. I don't recall any comments to
explain the reason why we build two RangeTblEntrys for each
partitioned table.

I agree. Please let's keep v11 stable, and discuss further more on
future optimizations like the previous two items for v12, which has
plenty of time to be broken.

In light of what Amit has highlighted, I'm still standing by the v3
patch assuming the typo is fixed.

Yeah. Actually I'd like to add a test as well to test the recursion
call of expand_partitioned_rtentry. If you have an idea, please let me
know or I'll figure out one by myself and add it probably in
create_table.sql.
--
Michael

#18David Rowley
david.rowley@2ndquadrant.com
In reply to: Michael Paquier (#17)
Re: pgsql: Clarify use of temporary tables within partition trees

On 3 July 2018 at 21:15, Michael Paquier <michael@paquier.xyz> wrote:

Yeah. Actually I'd like to add a test as well to test the recursion
call of expand_partitioned_rtentry. If you have an idea, please let me
know or I'll figure out one by myself and add it probably in
create_table.sql.

What specifically do you want to test? There are plenty of partitioned
tests with sub-partitioned tables. Going by [1]https://coverage.postgresql.org/src/backend/optimizer/prep/prepunion.c.gcov.html, there's no shortage
of coverage.

Of course, the dead code I'm proposing we remove is not covered.
There's no way to cover it... it's dead.

[1]: https://coverage.postgresql.org/src/backend/optimizer/prep/prepunion.c.gcov.html

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#19Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: David Rowley (#16)
Re: pgsql: Clarify use of temporary tables within partition trees

On 2018/07/03 18:05, David Rowley wrote:

(re-adding committers list)

On 3 July 2018 at 20:31, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:

1. I still insist that it's better for the newly added code to be near the
top of the function body than in the middle, which brings me to...

That will cause the AppendRelInfo not to be built for a partitioned
table with no partitions. I'm not personally certain that doing that
comes without consequence. Are you certain of that? If not 100%,
then I think that's more of a task for PG12. I'm trying to propose
removing dead code for PG11 and on.

What to use the AppendRelInfo for if there is no partition?

My current thoughts are that moving this test up a few lines is
unlikely to improve performance in a real-world situation, so it does
not seem worth the risk for a backpatch to me.

It's just that needless structures get created which we could avoid, and
once I realized that, I found out about the 2 below.

If you look at all of what expand_single_inheritance_child does, I'm a bit
surprised you don't think we should try to avoid calling it if we don't
need any of that processing.

2. While we're at fixing the code around here, I think we should think
about trying to get rid of the *non-dead* code that produces a structure
that isn't used anywhere, which I was under the impression, 0a480502b09
[1] already did (cc'ing Ashutosh). To clarify, we still unnecessarily
create a "duplicate" RTE for partitioned tables in a partition tree
(non-leaf tables) in its role as a child. So, for the above query, there
end up being created 4 entries in the query's range table (2 for 'p' and 2
for 'pd'). That makes sense for plain inheritance, because even the root
parent table in a plain inheritance tree is a regular table containing
data. That's not true for partition inheritance trees, where non-leaf
tables contain no data, so we don't create a plan to scan them (see
d3cc37f1d80 [2]), which in turn means we don't need to create the
redundant "duplicate" child RTEs for them either.

See attached my delta patch to address both 1 and 2.

I recently saw this too and wondered about it. I then wrote a patch to
just have it create a single RangeTblEntry for each partitioned table.
The tests all passed.

Which is how I was thinking the commit 0a480502b09 had done it, but
apparently not. That's a commit in PG11. None of what I'm suggesting is
for PG 10.

I'd categorise this one the same as I have #1 above, i.e. not
backpatch material. It seems like something useful to look into for
v12 though. I assumed this was done for a reason and that I just
didn't understand what that reason was. I don't recall any comments to
explain the reason why we build two RangeTblEntrys for each
partitioned table.

Maybe because that's what's done for the root parent in a plain
inheritance hierarchy, which is always a plain table. In that case, one
RTE is for its role as the parent (with rte->inh = true) and another is
for its role as a child (with rte->inh = false). The former is processed
as an append rel and the latter as a plain rel that will get assigned scan
paths such as SeqScan, etc.

For partitioned table parent(s), we need only the former because they can
only be processed as append rels. That's why I'm proposing we could
adjust the commit in PG 11 that introduced expand_partitioned_rtentry such
that the duplicate child RTE and other objects are not created.

Thanks,
Amit

#20Michael Paquier
michael@paquier.xyz
In reply to: David Rowley (#18)
Re: pgsql: Clarify use of temporary tables within partition trees

On Tue, Jul 03, 2018 at 09:30:20PM +1200, David Rowley wrote:

On 3 July 2018 at 21:15, Michael Paquier <michael@paquier.xyz> wrote:

Yeah. Actually I'd like to add a test as well to test the recursion
call of expand_partitioned_rtentry. If you have an idea, please let me
know or I'll figure out one by myself and add it probably in
create_table.sql.

What specifically do you want to test? There are plenty of partitioned
tests with sub-partitioned tables. Going by [1], there's no shortage
of coverage.

Of course, the dead code I'm proposing we remove is not covered.
There's no way to cover it... it's dead.

Your patch removes this part:
- /*
- * If the partitioned table has no partitions or all the partitions are
- * temporary tables from other backends, treat this as non-inheritance
- * case.
- */
- if (!has_child)
- parentrte->inh = false;

And adds this equivalent part:
+   /*
+    * If the partitioned table has no partitions, treat this as the
+    * non-inheritance case.
+    */
+   if (partdesc->nparts == 0)
+   {
+       parentrte->inh = false;
+       return;
+   }

As far as I can see from the coverage report, the former is not tested,
and corresponds to the case of a partition leaf which is itself
partitioned but has no partitions, and the new portion is equivalent to
the part removed. That ought to be tested, particularly as Amit
mentions that there could be improvements with moving it around in
future versions.
--
Michael

#21David Rowley
david.rowley@2ndquadrant.com
In reply to: Michael Paquier (#20)
1 attachment(s)
Re: pgsql: Clarify use of temporary tables within partition trees

On 3 July 2018 at 21:53, Michael Paquier <michael@paquier.xyz> wrote:

Your patch removes this part:
- /*
- * If the partitioned table has no partitions or all the partitions are
- * temporary tables from other backends, treat this as non-inheritance
- * case.
- */
- if (!has_child)
- parentrte->inh = false;

And adds this equivalent part:
+   /*
+    * If the partitioned table has no partitions, treat this as the
+    * non-inheritance case.
+    */
+   if (partdesc->nparts == 0)
+   {
+       parentrte->inh = false;
+       return;
+   }

As far as I can see from the coverage report, the former is not tested,
and corresponds to the case of a partition leaf which is itself
partitioned but has no partitions, and the new portion is equivalent to
the part removed. That ought to be tested, particularly as Amit
mentions that there could be improvements with moving it around in
future versions.

Oh okay. Yeah, you can hit that with a partitionless sub-partitioned table.

I've added a test in the attached v4.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

remove_dead_code_from_expand_partitioned_rtentry_v4.patchapplication/octet-stream; name=remove_dead_code_from_expand_partitioned_rtentry_v4.patchDownload
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 2d470240d5..7d75e1eda9 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -1681,7 +1681,6 @@ expand_partitioned_rtentry(PlannerInfo *root, RangeTblEntry *parentrte,
 	int			i;
 	RangeTblEntry *childrte;
 	Index		childRTindex;
-	bool		has_child = false;
 	PartitionDesc partdesc = RelationGetPartitionDesc(parentrel);
 
 	check_stack_depth();
@@ -1707,6 +1706,16 @@ expand_partitioned_rtentry(PlannerInfo *root, RangeTblEntry *parentrte,
 									top_parentrc, parentrel,
 									appinfos, &childrte, &childRTindex);
 
+	/*
+	 * If the partitioned table has no partitions, treat this as the
+	 * non-inheritance case.
+	 */
+	if (partdesc->nparts == 0)
+	{
+		parentrte->inh = false;
+		return;
+	}
+
 	for (i = 0; i < partdesc->nparts; i++)
 	{
 		Oid			childOID = partdesc->oids[i];
@@ -1715,15 +1724,13 @@ expand_partitioned_rtentry(PlannerInfo *root, RangeTblEntry *parentrte,
 		/* Open rel; we already have required locks */
 		childrel = heap_open(childOID, NoLock);
 
-		/* As in expand_inherited_rtentry, skip non-local temp tables */
+		/*
+		 * Temporary partitions belonging to other sessions should have been
+		 * disallowed at definition, but for paranoia's sake, let's double
+		 * check.
+		 */
 		if (RELATION_IS_OTHER_TEMP(childrel))
-		{
-			heap_close(childrel, lockmode);
-			continue;
-		}
-
-		/* We have a real partition. */
-		has_child = true;
+			elog(ERROR, "temporary relation from another session found as partition");
 
 		expand_single_inheritance_child(root, parentrte, parentRTindex,
 										parentrel, top_parentrc, childrel,
@@ -1738,14 +1745,6 @@ expand_partitioned_rtentry(PlannerInfo *root, RangeTblEntry *parentrte,
 		/* Close child relation, but keep locks */
 		heap_close(childrel, NoLock);
 	}
-
-	/*
-	 * If the partitioned table has no partitions or all the partitions are
-	 * temporary tables from other backends, treat this as non-inheritance
-	 * case.
-	 */
-	if (!has_child)
-		parentrte->inh = false;
 }
 
 /*
diff --git a/src/test/regress/expected/select.out b/src/test/regress/expected/select.out
index 1fab5136d2..319d03ca0b 100644
--- a/src/test/regress/expected/select.out
+++ b/src/test/regress/expected/select.out
@@ -951,3 +951,18 @@ select * from (values (2),(null),(1)) v(k) where k = k;
  1
 (2 rows)
 
+-- test partitioned tables behave sanely when there are no leaf partitions
+create table list_parted_tbl (a int,b int) partition by list (a);
+create table list_parted_tbl1 partition of list_parted_tbl for values in(1) partition by list(b);
+select * from list_parted_tbl;
+ a | b 
+---+---
+(0 rows)
+
+explain (costs off) select * from list_parted_tbl;
+        QUERY PLAN        
+--------------------------
+ Result
+   One-Time Filter: false
+(2 rows)
+
diff --git a/src/test/regress/sql/select.sql b/src/test/regress/sql/select.sql
index c80429e7d0..b93c0738f2 100644
--- a/src/test/regress/sql/select.sql
+++ b/src/test/regress/sql/select.sql
@@ -254,3 +254,9 @@ drop function sillysrf(int);
 -- (see bug #5084)
 select * from (values (2),(null),(1)) v(k) where k = k order by k;
 select * from (values (2),(null),(1)) v(k) where k = k;
+
+-- test partitioned tables behave sanely when there are no leaf partitions
+create table list_parted_tbl (a int,b int) partition by list (a);
+create table list_parted_tbl1 partition of list_parted_tbl for values in(1) partition by list(b);
+select * from list_parted_tbl;
+explain (costs off) select * from list_parted_tbl;
#22Michael Paquier
michael@paquier.xyz
In reply to: David Rowley (#21)
Re: pgsql: Clarify use of temporary tables within partition trees

On Tue, Jul 03, 2018 at 11:16:55PM +1200, David Rowley wrote:

Oh okay. Yeah, you can hit that with a partitionless sub-partitioned
table.

Thanks for the patch and fixing the typo ;)

+create table list_parted_tbl (a int,b int) partition by list (a);
+create table list_parted_tbl1 partition of list_parted_tbl for values
in(1) partition by list(b);
+select * from list_parted_tbl;
+explain (costs off) select * from list_parted_tbl;

I am not sure if it is much interesting to keep around this table set
for pg_upgrade, so I would drop it. Except for that, the result looks
fine. I'll double-check and wrap it tomorrow on HEAD and REL_11_STABLE.
The optimizations mentioned sound interesting, though I would recommend
to not risk the stability of v11 at this point, so let's keep them for
v12~.
--
Michael

#23Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Amit Langote (#19)
Re: pgsql: Clarify use of temporary tables within partition trees

On Tue, Jul 3, 2018 at 3:20 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

Maybe because that's what's done for the root parent in a plain
inheritance hierarchy, which is always a plain table. In that case, one
RTE is for its role as the parent (with rte->inh = true) and another is
for its role as a child (with rte->inh = false). The former is processed
as an append rel and the latter as a plain rel that will get assigned scan
paths such as SeqScan, etc.

Yes that's true. I remember we had some discussion about these two
RTEs and that the one marked as child was extraneous, but I can not
spot that in the mail thread. It's one of the things we did as part of
partition-wise join and that thread is pretty long. It was probably
kept without changing it because a. we wanted to get the bigger patch
committed without breaking anything and this was a small thing which
we couldn't decide whether was safe or not b. if it was safe not to
create that entry, it should have been done in a commit which avoided
creating scans for partitioned tables, but didn't

For partitioned table parent(s), we need only the former because they can
only be processed as append rels. That's why I'm proposing we could
adjust the commit in PG 11 that introduced expand_partitioned_rtentry such
that the duplicate child RTE and other objects are not created.

FWIW, I think this would be ok before beta, but not now. I see it as a
PG12 item.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ashutosh Bapat (#23)
Re: pgsql: Clarify use of temporary tables within partition trees

Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:

On Tue, Jul 3, 2018 at 3:20 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

Maybe because that's what's done for the root parent in a plain
inheritance hierarchy, which is always a plain table. In that case, one
RTE is for its role as the parent (with rte->inh = true) and another is
for its role as a child (with rte->inh = false). The former is processed
as an append rel and the latter as a plain rel that will get assigned scan
paths such as SeqScan, etc.

Yes that's true.

Yes, that's exactly why there are two RTEs for the parent table in normal
inheritance cases. I concur with the idea that it shouldn't be necessary
to create a child RTE for a partitioning parent table --- we should really
only need the appendrel RTE plus RTEs for tables that will be scanned.

However, it's not clear to me that this is a trivial change for multilevel
partitioning cases. Do we need RTEs for the intermediate nonleaf levels?
In the abstract, the planner and executor might not need them. But the
code that deals with partitioning constraint management might expect them
to exist.

Another point is that executor-start-time privilege checking is driven
off the RTE list, so we need an RTE for any table that requires priv
checks, so we might need RTEs for intermediate levels just for that.

Also, IIRC, the planner sets up the near-duplicate RTEs for inheritance
cases so that only one of them is privilege-checked. Be careful that
you don't end up with zero privilege checks on the partition root :-(

regards, tom lane

#25Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Ashutosh Bapat (#23)
Re: pgsql: Clarify use of temporary tables within partition trees

On 2018/07/03 21:15, Ashutosh Bapat wrote:

On Tue, Jul 3, 2018 at 3:20 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

Maybe because that's what's done for the root parent in a plain
inheritance hierarchy, which is always a plain table. In that case, one
RTE is for its role as the parent (with rte->inh = true) and another is
for its role as a child (with rte->inh = false). The former is processed
as an append rel and the latter as a plain rel that will get assigned scan
paths such as SeqScan, etc.

Yes that's true. I remember we had some discussion about these two
RTEs and that the one marked as child was extraneous, but I can not
spot that in the mail thread. It's one of the things we did as part of
partition-wise join and that thread is pretty long. It was probably
kept without changing it because a. we wanted to get the bigger patch
committed without breaking anything and this was a small thing which
we couldn't decide whether was safe or not b. if it was safe not to
create that entry, it should have been done in a commit which avoided
creating scans for partitioned tables, but didn't

About (b), maybe yes. Perhaps, I/we decided to put it off until we got
around to writing a patch for making inheritance expansion step-wise for
partitioned tables. We didn't get to that until 11dev branch opened up
for development. The patch that I had proposed for step-wise expansion
was such that the duplicate RTE for parents would not get created, but it
wasn't committed. That was one of the things that was different from your
patch for step-wise expansion which was committed.

Thanks,
Amit

#26Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Tom Lane (#24)
Re: pgsql: Clarify use of temporary tables within partition trees

On 2018/07/04 1:21, Tom Lane wrote:

Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:

On Tue, Jul 3, 2018 at 3:20 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

Maybe because that's what's done for the root parent in a plain
inheritance hierarchy, which is always a plain table. In that case, one
RTE is for its role as the parent (with rte->inh = true) and another is
for its role as a child (with rte->inh = false). The former is processed
as an append rel and the latter as a plain rel that will get assigned scan
paths such as SeqScan, etc.

Yes that's true.

Yes, that's exactly why there are two RTEs for the parent table in normal
inheritance cases. I concur with the idea that it shouldn't be necessary
to create a child RTE for a partitioning parent table --- we should really
only need the appendrel RTE plus RTEs for tables that will be scanned.

However, it's not clear to me that this is a trivial change for multilevel
partitioning cases. Do we need RTEs for the intermediate nonleaf levels?
In the abstract, the planner and executor might not need them. But the
code that deals with partitioning constraint management might expect them
to exist.

We do need RTEs for *all* parents (non-leaf tables) in a partition tree,
each of which we need to process as an append rel (partition pruning is
invoked separately for each non-leaf table). What we *don't* need for
each of them is the duplicate RTE with inh = false, because we don't need
to process them as plain rels.

Another point is that executor-start-time privilege checking is driven
off the RTE list, so we need an RTE for any table that requires priv
checks, so we might need RTEs for intermediate levels just for that.

Also, IIRC, the planner sets up the near-duplicate RTEs for inheritance
cases so that only one of them is privilege-checked.

Yeah, I see in prepunion.c that the child RTE's requiredPerms is set to 0,
with the following comment justifying it:

"Also, set requiredPerms to zero since all required permissions checks are
done on the original RTE."

Be careful that
you don't end up with zero privilege checks on the partition root :-(

The original RTE belongs to the partition root and it's already in the
range table, so its privileges *are* checked.

Thanks,
Amit

#27Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#22)
Re: pgsql: Clarify use of temporary tables within partition trees

On Tue, Jul 03, 2018 at 08:31:27PM +0900, Michael Paquier wrote:

I am not sure if it is much interesting to keep around this table set
for pg_upgrade, so I would drop it. Except for that, the result looks
fine. I'll double-check and wrap it tomorrow on HEAD and REL_11_STABLE.
The optimizations mentioned sound interesting, though I would recommend
to not risk the stability of v11 at this point, so let's keep them for
v12~.

So at the end I have dropped the table from the test, and pushed the
patch to HEAD and REL_11_STABLE. Thanks David for the patch, and others
for the reviews.
--
Michael

#28David Rowley
david.rowley@2ndquadrant.com
In reply to: Michael Paquier (#27)
Re: pgsql: Clarify use of temporary tables within partition trees

On 4 July 2018 at 13:48, Michael Paquier <michael@paquier.xyz> wrote:

So at the end I have dropped the table from the test, and pushed the
patch to HEAD and REL_11_STABLE. Thanks David for the patch, and others
for the reviews.

Thanks for pushing it.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services