Feedback on table expansion hook (including patch)

Started by Erik Nordströmover 5 years ago13 messages
#1Erik Nordström
erik@timescale.com
1 attachment(s)

Hi,

I am looking for feedback on the possibility of adding a table expansion
hook to PostgreSQL (see attached patch). The motivation for this is to
allow extensions to optimize table expansion. In particular, TimescaleDB
does its own table expansion in order to apply a number of optimizations,
including partition pruning (note that TimescaleDB uses inheritance since
PostgreSQL 9.6 rather than declarative partitioning ). There's currently no
official hook for table expansion, but TimescaleDB has been using the
get_relation_info hook for this purpose. Unfortunately, PostgreSQL 12 broke
this for us since it moved expansion to a later stage where we can no
longer control it without some pretty bad hacks. Given that PostgreSQL 12
changed the expansion state of a table for the get_relation_info hook, we
are thinking about this as a regression and are wondering if this could be
considered against the head of PG 12 or maybe even PG 13 (although we
realize feature freeze has been reached)?

The attached patch is against PostgreSQL master (commit fb544735) and is
about ~10 lines of code. It doesn't change any existing behavior; it only
allows getting control of expand_inherited_rtentry, which would make a huge
difference for TimescaleDB.

Best regards,

Erik
Engineering team lead
Timescale

Attachments:

table-expansion-hook.patchtext/x-patch; charset=US-ASCII; name=table-expansion-hook.patchDownload
From 775102469a4c7e480c167e30ffd66d6555163f40 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Erik=20Nordstr=C3=B6m?= <erik@timescale.com>
Date: Thu, 16 Apr 2020 13:20:48 +0200
Subject: [PATCH] Add a table expansion hook

This change adds a table expansion hook that allows extensions to
control expansion of inheritence relations (e.g., tables using
inheritence or partitioning). Extensions might want to control
expansion to enable optimizations to constraint exclusion or for
access control on certain subtables.

In PostgreSQL 12, table expansion moved to a much later stage of
planning, which means that extensions that relied on expansion
happening prior to, e.g., get_relation_info_hook no longer have the
ability to control or editorialize table expansion.
---
 src/backend/optimizer/plan/initsplan.c | 7 ++++++-
 src/backend/optimizer/util/inherit.c   | 3 +++
 src/include/optimizer/inherit.h        | 8 ++++++++
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index e978b491f6..358adf7413 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -159,7 +159,12 @@ add_other_rels_to_query(PlannerInfo *root)
 
 		/* If it's marked as inheritable, look for children. */
 		if (rte->inh)
-			expand_inherited_rtentry(root, rel, rte, rti);
+		{
+			if (expand_inherited_rtentry_hook)
+				(*expand_inherited_rtentry_hook) (root, rel, rte, rti);
+			else
+				expand_inherited_rtentry(root, rel, rte, rti);
+		}
 	}
 }
 
diff --git a/src/backend/optimizer/util/inherit.c b/src/backend/optimizer/util/inherit.c
index 3132fd35a5..2673e1a285 100644
--- a/src/backend/optimizer/util/inherit.c
+++ b/src/backend/optimizer/util/inherit.c
@@ -35,6 +35,9 @@
 #include "utils/rel.h"
 
 
+/* Hook for plugins to get control in expand_inherited_rtentry() */
+expand_inherited_rtentry_hook_type expand_inherited_rtentry_hook = NULL;
+
 static void expand_partitioned_rtentry(PlannerInfo *root, RelOptInfo *relinfo,
 									   RangeTblEntry *parentrte,
 									   Index parentRTindex, Relation parentrel,
diff --git a/src/include/optimizer/inherit.h b/src/include/optimizer/inherit.h
index 0ba6132652..f0ccff2039 100644
--- a/src/include/optimizer/inherit.h
+++ b/src/include/optimizer/inherit.h
@@ -16,6 +16,14 @@
 
 #include "nodes/pathnodes.h"
 
+/* Hook for plugins to get control over expand_inherited_rtentry() */
+typedef void (*expand_inherited_rtentry_hook_type)(PlannerInfo *root,
+												   RelOptInfo *rel,
+												   RangeTblEntry *rte,
+												   Index rti);
+
+extern PGDLLIMPORT expand_inherited_rtentry_hook_type expand_inherited_rtentry_hook;
+
 
 extern void expand_inherited_rtentry(PlannerInfo *root, RelOptInfo *rel,
 									 RangeTblEntry *rte, Index rti);
-- 
2.25.1

In reply to: Erik Nordström (#1)
Re: Feedback on table expansion hook (including patch)

that sounds really really useful.
i can see a ton of use cases for that.
we also toyed with the idea recently of having pluggable FSM strategies.
that one could be quite useful as well.

regards,

hans

On 07.05.2020, at 10:11, Erik Nordström <erik@timescale.com> wrote:

Hi,

I am looking for feedback on the possibility of adding a table expansion hook to PostgreSQL (see attached patch). The motivation for this is to allow extensions to optimize table expansion. In particular, TimescaleDB does its own table expansion in order to apply a number of optimizations, including partition pruning (note that TimescaleDB uses inheritance since PostgreSQL 9.6 rather than declarative partitioning ). There's currently no official hook for table expansion, but TimescaleDB has been using the get_relation_info hook for this purpose. Unfortunately, PostgreSQL 12 broke this for us since it moved expansion to a later stage where we can no longer control it without some pretty bad hacks. Given that PostgreSQL 12 changed the expansion state of a table for the get_relation_info hook, we are thinking about this as a regression and are wondering if this could be considered against the head of PG 12 or maybe even PG 13 (although we realize feature freeze has been reached)?

The attached patch is against PostgreSQL master (commit fb544735) and is about ~10 lines of code. It doesn't change any existing behavior; it only allows getting control of expand_inherited_rtentry, which would make a huge difference for TimescaleDB.

Best regards,

Erik
Engineering team lead
Timescale

<table-expansion-hook.patch>

--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com <https://www.cybertec-postgresql.com/&gt;

#3Euler Taveira
euler.taveira@2ndquadrant.com
In reply to: Erik Nordström (#1)
Re: Feedback on table expansion hook (including patch)

On Thu, 7 May 2020 at 05:11, Erik Nordström <erik@timescale.com> wrote:

I am looking for feedback on the possibility of adding a table expansion
hook to PostgreSQL (see attached patch). The motivation for this is to
allow extensions to optimize table expansion. In particular, TimescaleDB
does its own table expansion in order to apply a number of optimizations,
including partition pruning (note that TimescaleDB uses inheritance since
PostgreSQL 9.6 rather than declarative partitioning ). There's currently no
official hook for table expansion, but TimescaleDB has been using the
get_relation_info hook for this purpose. Unfortunately, PostgreSQL 12 broke
this for us since it moved expansion to a later stage where we can no
longer control it without some pretty bad hacks. Given that PostgreSQL 12
changed the expansion state of a table for the get_relation_info hook, we
are thinking about this as a regression and are wondering if this could be
considered against the head of PG 12 or maybe even PG 13 (although we
realize feature freeze has been reached)?

I reviewed your patch and it looks good to me. You mentioned that it would
be useful for partitioning using table inheritance but it could also be
used for declarative partitioning (at least until someone decides to detach
it from inheritance infrastructure).

Unfortunately, you showed up late here. Even though the hook is a
straightforward feature, Postgres does not add new features to released
versions or after the feature freeze.

The only point that I noticed was that you chose "control over" but similar
code uses "control in".

--
Euler Taveira http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#4Anastasia Lubennikova
a.lubennikova@postgrespro.ru
In reply to: Euler Taveira (#3)
Re: Feedback on table expansion hook (including patch)

Status update for a commitfest entry.

This patch implements useful improvement and the reviewer approved the code. It lacks a test, but looking at previously committed hooks, I think it is not mandatory.
So, I move it to RFC.

The new status of this patch is: Ready for Committer

#5Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Erik Nordström (#1)
Re: Feedback on table expansion hook (including patch)

On 07.05.20 10:11, Erik Nordström wrote:

I am looking for feedback on the possibility of adding a table expansion
hook to PostgreSQL (see attached patch). The motivation for this is to
allow extensions to optimize table expansion. In particular, TimescaleDB
does its own table expansion in order to apply a number of
optimizations, including partition pruning (note that TimescaleDB uses
inheritance since PostgreSQL 9.6 rather than declarative partitioning ).
There's currently no official hook for table expansion, but TimescaleDB
has been using the get_relation_info hook for this purpose.
Unfortunately, PostgreSQL 12 broke this for us since it moved expansion
to a later stage where we can no longer control it without some pretty
bad hacks.

Unlike the get_relation_info_hook, your proposed hook would *replace*
expand_inherited_rtentry() rather than just tack on additional actions.
That seems awfully fragile. Could you do with a hook that does
additional things rather than replace a whole chunk of built-in code?

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#5)
Re: Feedback on table expansion hook (including patch)

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

On 07.05.20 10:11, Erik Nordström wrote:

I am looking for feedback on the possibility of adding a table expansion
hook to PostgreSQL (see attached patch).

Unlike the get_relation_info_hook, your proposed hook would *replace*
expand_inherited_rtentry() rather than just tack on additional actions.
That seems awfully fragile. Could you do with a hook that does
additional things rather than replace a whole chunk of built-in code?

I suppose Erik is assuming that he could call expand_inherited_rtentry
(or better, the previous hook occupant) when his special case doesn't
apply. But I'm suspicious that he'd still end up duplicating large
chunks of optimizer/util/inherit.c in order to carry out the special
case, since almost all of that is private/static functions. It
does seem like a more narrowly-scoped hook might be better.

Would it be unreasonable of us to ask for a worked-out example making
use of the proposed hook? That'd go a long way towards resolving the
question of whether you can do anything useful without duplicating
lots of code.

I've also been wondering, given the table-AM projects that are
going on, whether we shouldn't refactor things to give partitioned
tables a special access method, and then shove most of the planner
and executor's hard-wired partitioning logic into access method
callbacks. That would make it a lot more feasible for extensions
to implement custom partitioning-like behavior ... or so I guess.

regards, tom lane

#7David Fetter
david@fetter.org
In reply to: Tom Lane (#6)
Re: Feedback on table expansion hook (including patch)

On Sat, Mar 06, 2021 at 01:09:10PM -0500, Tom Lane wrote:

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

On 07.05.20 10:11, Erik Nordstr�m wrote:

I am looking for feedback on the possibility of adding a table expansion
hook to PostgreSQL (see attached patch).

Unlike the get_relation_info_hook, your proposed hook would *replace*
expand_inherited_rtentry() rather than just tack on additional actions.
That seems awfully fragile. Could you do with a hook that does
additional things rather than replace a whole chunk of built-in code?

I suppose Erik is assuming that he could call expand_inherited_rtentry
(or better, the previous hook occupant) when his special case doesn't
apply. But I'm suspicious that he'd still end up duplicating large
chunks of optimizer/util/inherit.c in order to carry out the special
case, since almost all of that is private/static functions. It
does seem like a more narrowly-scoped hook might be better.

Would it be unreasonable of us to ask for a worked-out example making
use of the proposed hook? That'd go a long way towards resolving the
question of whether you can do anything useful without duplicating
lots of code.

I've also been wondering, given the table-AM projects that are
going on, whether we shouldn't refactor things to give partitioned
tables a special access method, and then shove most of the planner
and executor's hard-wired partitioning logic into access method
callbacks. That would make it a lot more feasible for extensions
to implement custom partitioning-like behavior ... or so I guess.

That seems pretty reasonable. I suspect that that process will expose
bits of the planning and execution machinery that have gotten less
isolated than they should be.

More generally, and I'll start a separate thread on this, we should be
working up to including a reference implementation, however tiny, of
every extension point we supply in order to ensure that our APIs are
at a minimum reasonably usable and remain so.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#8Erik Nordström
erik@timescale.com
In reply to: David Fetter (#7)
Re: Feedback on table expansion hook (including patch)

Thank you all for the feedback and insights.

Yes, the intention is to *replace* expand_inherited_rtentry() in the same
way planner_hook replaces standard_planner().

Some background: TimescaleDB implements its own partitioning based on
inheritance that predates declarative partitioning. The extension would use
the table expansion hook to do its own table expansion based on
extension-specific metadata. There was a pretty clean (but still hacky) way
to do it via the get_relation_info_hook(), but PostgreSQL 12 changed the
order of events so that this hook no longer works for this purpose. For
PostgreSQL 12+, we'd have to copy/replace a lot of PostgreSQL code to make
our custom expansion still work, and the proposed hook would allow us to
get rid of this ugliness.

With the proposed table expansion hook, you could of course also first call
expand_inherited_rtentry() yourself and then modify the result or do
additional things. However, the way we'd like to use this in TimescaleDB is
to more-or-less replace the current expansion code since we do not rely on
declarative partitioning. I am not sure a more narrowly-scoped hook makes
sense, because it would tie you to a certain way of doing things. That
would defeat the purpose of the hook. Note that expand_inherited_rtenry()
immediately branches off based on type of relation: one branch for
inheritance, one for partitioning, and so on. So, doing this in a more
narrow scope would probably require one hook per relation type or at least
a common hook with some extra info on where you are in that expansion.
Another way of looking at this is to view TimescaleDB as offering a new
relation type for partitioning, so it is natural that it would have its own
expansion branch, just like inheritance and partitioning. There are a
couple of functions that might be useful to expose publicly, however, like
expand_single_inheritance_child() since it is called from both the
inheritance and the partitioning branch.

Looking at this problem more generally, though, it does seem to me that
PostgreSQL would benefit from a more general and official
table-partitioning API that would allow custom partitioning
implementations, or make it part of the table access method API. However,
while that is an interesting thing to explore, I think this table expansion
hook goes a long way until such an API is available.

I can provide a code example of how we'd use the table expansion hook in
TimescaleDB. A more standalone example probably requires a bit more work
though.

Best,

Erik

On Sat, Mar 6, 2021 at 7:38 PM David Fetter <david@fetter.org> wrote:

Show quoted text

On Sat, Mar 06, 2021 at 01:09:10PM -0500, Tom Lane wrote:

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

On 07.05.20 10:11, Erik Nordström wrote:

I am looking for feedback on the possibility of adding a table

expansion

hook to PostgreSQL (see attached patch).

Unlike the get_relation_info_hook, your proposed hook would *replace*
expand_inherited_rtentry() rather than just tack on additional

actions.

That seems awfully fragile. Could you do with a hook that does
additional things rather than replace a whole chunk of built-in code?

I suppose Erik is assuming that he could call expand_inherited_rtentry
(or better, the previous hook occupant) when his special case doesn't
apply. But I'm suspicious that he'd still end up duplicating large
chunks of optimizer/util/inherit.c in order to carry out the special
case, since almost all of that is private/static functions. It
does seem like a more narrowly-scoped hook might be better.

Would it be unreasonable of us to ask for a worked-out example making
use of the proposed hook? That'd go a long way towards resolving the
question of whether you can do anything useful without duplicating
lots of code.

I've also been wondering, given the table-AM projects that are
going on, whether we shouldn't refactor things to give partitioned
tables a special access method, and then shove most of the planner
and executor's hard-wired partitioning logic into access method
callbacks. That would make it a lot more feasible for extensions
to implement custom partitioning-like behavior ... or so I guess.

That seems pretty reasonable. I suspect that that process will expose
bits of the planning and execution machinery that have gotten less
isolated than they should be.

More generally, and I'll start a separate thread on this, we should be
working up to including a reference implementation, however tiny, of
every extension point we supply in order to ensure that our APIs are
at a minimum reasonably usable and remain so.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#9Aleksander Alekseev
aleksander@timescale.com
In reply to: Erik Nordström (#8)
Re: Feedback on table expansion hook (including patch)

Hi Erik,

Thank you all for the feedback and insights.

Yes, the intention is to *replace* expand_inherited_rtentry() in the same way planner_hook replaces standard_planner().

This patch probably doesn't need yet another reviewer, but since there
is a little controversy about if the hook should replace a procedure
or be called after it, I decided to put my two cents in. The proposed
approach is very flexible - it allows to modify the arguments, the
result, to completely replace the procedure, etc. I don't think that
calling a hook after the procedure was called (or before) will be very
useful.

The patch applies to `master` branch (6d177e28) and passes all the
tests on MacOS.

--
Best regards,
Aleksander Alekseev

#10yuzuko
yuzukohosoya@gmail.com
In reply to: Erik Nordström (#8)
Re: Feedback on table expansion hook (including patch)

Hello,

Thank you all for the feedback and insights.

Yes, the intention is to *replace* expand_inherited_rtentry() in the same way planner_hook replaces standard_planner().

This patch is really useful. We are working on developing hypothetical
partitioning as a feature of HypoPG[1]https://github.com/HypoPG/hypopg/tree/REL1_STABLE[2]https://github.com/HypoPG/hypopg/tree/master -- Best regards, Yuzuko Hosoya NTT Open Source Software Center, but we hit the same problem
as TimescaleDB. Therefore we would also be thrilled to have that hook.

Hypothetical partitioning allows users to define multiple partitioning
schemes on real tables and real data hypothetically, and shows resulting
queries' plan/cost with EXPLAIN using hypothetical partitioning schemes.
Users can quickly check how their queries would behave if some tables
were partitioned, and try different partitioning schemes. HypoPG does
table expansion again according to the defined hypothetical partitioning
schemes. For this purpose, we used get_relation_info hook, but in PG12,
table expansion was moved, so we cannot do that using
get_relation_info hook. This is exactly the same problem Erik has.
Therefore the proposed hook would allow us to support hypothetical partitioning.

[1]: https://github.com/HypoPG/hypopg/tree/REL1_STABLE
[2]: https://github.com/HypoPG/hypopg/tree/master -- Best regards, Yuzuko Hosoya NTT Open Source Software Center
--
Best regards,
Yuzuko Hosoya
NTT Open Source Software Center

#11Julien Rouhaud
rjuju123@gmail.com
In reply to: yuzuko (#10)
Re: Feedback on table expansion hook (including patch)

On Wed, May 12, 2021 at 04:48:29PM +0900, yuzuko wrote:

Hello,

Thank you all for the feedback and insights.

Yes, the intention is to *replace* expand_inherited_rtentry() in the same way planner_hook replaces standard_planner().

This patch is really useful. We are working on developing hypothetical
partitioning as a feature of HypoPG[1][2], but we hit the same problem
as TimescaleDB. Therefore we would also be thrilled to have that hook.

Hypothetical partitioning allows users to define multiple partitioning
schemes on real tables and real data hypothetically, and shows resulting
queries' plan/cost with EXPLAIN using hypothetical partitioning schemes.
Users can quickly check how their queries would behave if some tables
were partitioned, and try different partitioning schemes. HypoPG does
table expansion again according to the defined hypothetical partitioning
schemes. For this purpose, we used get_relation_info hook, but in PG12,
table expansion was moved, so we cannot do that using
get_relation_info hook. This is exactly the same problem Erik has.
Therefore the proposed hook would allow us to support hypothetical partitioning.

Sorry for missing that thread until now. And yes as Hosoya-san just mentioned,
we faced the exact same problem when implementing hypothetical partitioning,
and eventually had to stop as the changes in pg12 prevented it. So +1 for
introducing such a hook, it would also be useful for that usecase.

#12Amit Langote
amitlangote09@gmail.com
In reply to: Tom Lane (#6)
Re: Feedback on table expansion hook (including patch)

(Sorry about being very late to this thread.)

On Sun, Mar 7, 2021 at 3:09 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

On 07.05.20 10:11, Erik Nordström wrote:

I am looking for feedback on the possibility of adding a table expansion
hook to PostgreSQL (see attached patch).

Unlike the get_relation_info_hook, your proposed hook would *replace*
expand_inherited_rtentry() rather than just tack on additional actions.
That seems awfully fragile. Could you do with a hook that does
additional things rather than replace a whole chunk of built-in code?

I suppose Erik is assuming that he could call expand_inherited_rtentry
(or better, the previous hook occupant) when his special case doesn't
apply. But I'm suspicious that he'd still end up duplicating large
chunks of optimizer/util/inherit.c in order to carry out the special
case, since almost all of that is private/static functions. It
does seem like a more narrowly-scoped hook might be better.

Yeah, I do wonder if all of the things that are now done under
expand_inherited_rtentry() are not necessary for Timescale child
relations for the queries to work correctly? In 428b260f87, the
commit in v12 responsible for this discussion AFAICS, and more
recently in 86dc9005, we introduced a bunch of logic in the
exapnd_inherited_rtentry() path to do with adding junk columns to the
top-level query targetlist that was not there earlier. So I'd think
that expand_inherited_rtentry()'s job used to be much simpler pre-v12
so that an extension dealing with inheritance child relations could
more easily replicate its functionality, but that may not necessarily
be true anymore. Granted, a lot of that new logic is to account for
foreign table children, which perhaps doesn't matter to most
extensions. But I'd be more careful about the stuff added in
86dc9005, like add_row_identity_var/columns().

Would it be unreasonable of us to ask for a worked-out example making
use of the proposed hook? That'd go a long way towards resolving the
question of whether you can do anything useful without duplicating
lots of code.

I've also been wondering, given the table-AM projects that are
going on, whether we shouldn't refactor things to give partitioned
tables a special access method, and then shove most of the planner
and executor's hard-wired partitioning logic into access method
callbacks. That would make it a lot more feasible for extensions
to implement custom partitioning-like behavior ... or so I guess.

Interesting proposition...

--
Amit Langote
EDB: http://www.enterprisedb.com

#13Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Amit Langote (#12)
Re: Feedback on table expansion hook (including patch)

On Wed, May 12, 2021 at 10:19:17PM +0900, Amit Langote wrote:

(Sorry about being very late to this thread.)

Would it be unreasonable of us to ask for a worked-out example making
use of the proposed hook? That'd go a long way towards resolving the
question of whether you can do anything useful without duplicating
lots of code.

I've also been wondering, given the table-AM projects that are
going on, whether we shouldn't refactor things to give partitioned
tables a special access method, and then shove most of the planner
and executor's hard-wired partitioning logic into access method
callbacks. That would make it a lot more feasible for extensions
to implement custom partitioning-like behavior ... or so I guess.

Interesting proposition...

Since there is no clear definition here, we seems to be expecting an
example of how the hook will be used and there have been no activity
since may.

I suggest we move this to Returned with feedback. Which I'll do in a
couple hours.

--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL