Another way to fix inherited UPDATE/DELETE

Started by Tom Lanealmost 7 years ago16 messages
#1Tom Lane
Tom Lane
tgl@sss.pgh.pa.us

While contemplating the wreckage of
https://commitfest.postgresql.org/22/1778/
I had the beginnings of an idea of another way to fix that problem.

The issue largely arises from the fact that for UPDATE, we expect
the plan tree to emit a tuple that's ready to be stored back into
the target rel ... well, almost, because it also has a CTID or some
other row-identity column, so we have to do some work on it anyway.
But the point is this means we potentially need a different
targetlist for each child table in an inherited UPDATE.

What if we dropped that idea, and instead defined the plan tree as
returning only the columns that are updated by SET, plus the row
identity? It would then be the ModifyTable node's job to fetch the
original tuple using the row identity (which it must do anyway) and
form the new tuple by combining the updated columns from the plan
output with the non-updated columns from the original tuple.

DELETE would be even simpler, since it only needs the row identity
and nothing else.

Having done that, we could toss inheritance_planner into the oblivion
it so richly deserves, and just treat all types of inheritance or
partitioning queries as expand-at-the-bottom, as SELECT has always
done it.

Arguably, this would be more efficient even for non-inheritance join
situations, as less data (typically) would need to propagate through the
join tree. I'm not sure exactly how it'd shake out for trivial updates;
we might be paying for two tuple deconstructions not one, though perhaps
there's a way to finesse that. (One easy way would be to stick to the
old approach when there is no inheritance going on.)

In the case of a standard inheritance or partition tree, this seems to
go through really easily, since all the children could share the same
returned CTID column (I guess you'd also need a TABLEOID column so you
could figure out which table to direct the update back into). It gets
a bit harder if the tree contains some foreign tables, because they might
have different concepts of row identity, but I'd think in most cases you
could still combine those into a small number of output columns.

I have no idea how this might play with the pluggable-storage work.

Obviously this'd be a major rewrite with no chance of making it into v12,
but it doesn't sound too big to get done during v13.

Thoughts?

regards, tom lane

#2Andres Freund
Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#1)
Re: Another way to fix inherited UPDATE/DELETE

Hi,

On 2019-02-19 16:48:55 -0500, Tom Lane wrote:

I have no idea how this might play with the pluggable-storage work.

I don't think it'd have a meaningful impact, except for needing changes
to an overlapping set of lines. But given the different timeframes, I'd
not expect a problem with that.

Greetings,

Andres Freund

#3David Rowley
David Rowley
david.rowley@2ndquadrant.com
In reply to: Tom Lane (#1)
Re: Another way to fix inherited UPDATE/DELETE

On Wed, 20 Feb 2019 at 10:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:

What if we dropped that idea, and instead defined the plan tree as
returning only the columns that are updated by SET, plus the row
identity? It would then be the ModifyTable node's job to fetch the
original tuple using the row identity (which it must do anyway) and
form the new tuple by combining the updated columns from the plan
output with the non-updated columns from the original tuple.

DELETE would be even simpler, since it only needs the row identity
and nothing else.

While I didn't look at the patch in great detail, I think this is how
Pavan must have made MERGE work for partitioned targets. I recall
seeing the tableoid being added to the target list and a lookup of the
ResultRelInfo by tableoid.

Maybe Pavan can provide more useful details than I can.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#4Amit Langote
Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Tom Lane (#1)
Re: Another way to fix inherited UPDATE/DELETE

Hi,

On 2019/02/20 6:48, Tom Lane wrote:

While contemplating the wreckage of
https://commitfest.postgresql.org/22/1778/
I had the beginnings of an idea of another way to fix that problem.

The issue largely arises from the fact that for UPDATE, we expect
the plan tree to emit a tuple that's ready to be stored back into
the target rel ... well, almost, because it also has a CTID or some
other row-identity column, so we have to do some work on it anyway.
But the point is this means we potentially need a different
targetlist for each child table in an inherited UPDATE.

What if we dropped that idea, and instead defined the plan tree as
returning only the columns that are updated by SET, plus the row
identity? It would then be the ModifyTable node's job to fetch the
original tuple using the row identity (which it must do anyway) and
form the new tuple by combining the updated columns from the plan
output with the non-updated columns from the original tuple.

DELETE would be even simpler, since it only needs the row identity
and nothing else.

I had bookmarked link to an archived email of yours from about 5 years
ago, in which you described a similar attack plan for UPDATE planning:

/messages/by-id/1598.1399826841@sss.pgh.pa.us

It's been kind of in the back of my mind for a while, even considered
implementing it based on your sketch back then, but didn't have solutions
for some issues surrounding optimization of updates of foreign partitions
(see below). Maybe I should've mentioned that on this thread at some point.

Having done that, we could toss inheritance_planner into the oblivion
it so richly deserves, and just treat all types of inheritance or
partitioning queries as expand-at-the-bottom, as SELECT has always
done it.

Arguably, this would be more efficient even for non-inheritance join
situations, as less data (typically) would need to propagate through the
join tree. I'm not sure exactly how it'd shake out for trivial updates;
we might be paying for two tuple deconstructions not one, though perhaps
there's a way to finesse that. (One easy way would be to stick to the
old approach when there is no inheritance going on.)

In the case of a standard inheritance or partition tree, this seems to
go through really easily, since all the children could share the same
returned CTID column (I guess you'd also need a TABLEOID column so you
could figure out which table to direct the update back into). It gets
a bit harder if the tree contains some foreign tables, because they might
have different concepts of row identity, but I'd think in most cases you
could still combine those into a small number of output columns.

Regarding child target relations that are foreign tables, the
expand-target-inheritance-at-the-bottom approach perhaps leaves no way to
allow pushing the update (possibly with joins) to remote side?

-- no inheritance
explain (costs off, verbose) update ffoo f set a = f.a + 1 from fbar b
where f.a = b.a;
QUERY PLAN

──────────────────────────────────────────────────────────────────────────────────────────────────────
Update on public.ffoo f
-> Foreign Update
Remote SQL: UPDATE public.foo r1 SET a = (r1.a + 1) FROM
public.bar r2 WHERE ((r1.a = r2.a))
(3 rows)

-- inheritance
explain (costs off, verbose) update p set aa = aa + 1 from ffoo f where
p.aa = f.a;
QUERY PLAN

───────────────────────────────────────────────────────────────────────────────────────────────────────────
Update on public.p
Update on public.p1
Update on public.p2
Foreign Update on public.p3
-> Nested Loop
Output: (p1.aa + 1), p1.ctid, f.*
-> Seq Scan on public.p1
Output: p1.aa, p1.ctid
-> Foreign Scan on public.ffoo f
Output: f.*, f.a
Remote SQL: SELECT a FROM public.foo WHERE (($1::integer = a))
-> Nested Loop
Output: (p2.aa + 1), p2.ctid, f.*
-> Seq Scan on public.p2
Output: p2.aa, p2.ctid
-> Foreign Scan on public.ffoo f
Output: f.*, f.a
Remote SQL: SELECT a FROM public.foo WHERE (($1::integer = a))
-> Foreign Update
Remote SQL: UPDATE public.base3 r5 SET aa = (r5.aa + 1) FROM
public.foo r2 WHERE ((r5.aa = r2.a))
(20 rows)

Does that seem salvageable?

Thanks,
Amit

#5Amit Langote
Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#4)
Re: Another way to fix inherited UPDATE/DELETE

On 2019/02/20 10:55, Amit Langote wrote:

Maybe I should've mentioned that on this thread at some point.

I meant the other thread where we're discussing my patches.

Thanks,
Amit

#6Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Langote (#4)
Re: Another way to fix inherited UPDATE/DELETE

Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:

On 2019/02/20 6:48, Tom Lane wrote:

What if we dropped that idea, and instead defined the plan tree as
returning only the columns that are updated by SET, plus the row
identity? It would then be the ModifyTable node's job to fetch the
original tuple using the row identity (which it must do anyway) and
form the new tuple by combining the updated columns from the plan
output with the non-updated columns from the original tuple.

Regarding child target relations that are foreign tables, the
expand-target-inheritance-at-the-bottom approach perhaps leaves no way to
allow pushing the update (possibly with joins) to remote side?

That's something we'd need to think about. Obviously, anything
along this line breaks the existing FDW update APIs, but let's assume
that's acceptable. Is it impossible, or even hard, for an FDW to
support this definition of UPDATE rather than the existing one?
I don't think so --- it seems like it's just different --- but
I might well be missing something.

regards, tom lane

#7Pavan Deolasee
Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: David Rowley (#3)
Re: Another way to fix inherited UPDATE/DELETE

On Wed, Feb 20, 2019 at 4:23 AM David Rowley <david.rowley@2ndquadrant.com>
wrote:

On Wed, 20 Feb 2019 at 10:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:

What if we dropped that idea, and instead defined the plan tree as
returning only the columns that are updated by SET, plus the row
identity? It would then be the ModifyTable node's job to fetch the
original tuple using the row identity (which it must do anyway) and
form the new tuple by combining the updated columns from the plan
output with the non-updated columns from the original tuple.

DELETE would be even simpler, since it only needs the row identity
and nothing else.

While I didn't look at the patch in great detail, I think this is how
Pavan must have made MERGE work for partitioned targets. I recall
seeing the tableoid being added to the target list and a lookup of the
ResultRelInfo by tableoid.

Maybe Pavan can provide more useful details than I can.

Yes, that's the approach I took in MERGE, primarily because of the hurdles
I faced in handling partitioned tables, which take entirely different route
for UPDATE/DELETE vs INSERT and in MERGE we had to do all three together.
But the approach also showed significant performance improvements.
UPDATE/DELETE via MERGE is far quicker as compared to regular UPDATE/DELETE
when there are non-trivial number of partitions. That's also a reason why I
recommended doing the same for regular UPDATE/DELETE, but that got lost in
the MERGE discussions. So +1 for the approach.

We will need to consider how this affects EvalPlanQual which currently
doesn't have to do anything special for partitioned tables. I solved that
via tracking the expanded-at-the-bottom child in a separate
mergeTargetRelation, but that approach has been criticised. May be Tom's
idea doesn't have the same problem or most likely he will have a far better
approach to address that.

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#8Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavan Deolasee (#7)
Re: Another way to fix inherited UPDATE/DELETE

Pavan Deolasee <pavan.deolasee@gmail.com> writes:

We will need to consider how this affects EvalPlanQual which currently
doesn't have to do anything special for partitioned tables. I solved that
via tracking the expanded-at-the-bottom child in a separate
mergeTargetRelation, but that approach has been criticised. May be Tom's
idea doesn't have the same problem or most likely he will have a far better
approach to address that.

I did spend a few seconds thinking about that, and my gut says that
this wouldn't change anything interesting for EPQ. But the devil
is in the details as always, so maybe working out the patch would
find problems ...

regards, tom lane

#9Amit Langote
Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Tom Lane (#6)
Re: Another way to fix inherited UPDATE/DELETE

On 2019/02/20 13:54, Tom Lane wrote:

Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:

On 2019/02/20 6:48, Tom Lane wrote:

What if we dropped that idea, and instead defined the plan tree as
returning only the columns that are updated by SET, plus the row
identity? It would then be the ModifyTable node's job to fetch the
original tuple using the row identity (which it must do anyway) and
form the new tuple by combining the updated columns from the plan
output with the non-updated columns from the original tuple.

Regarding child target relations that are foreign tables, the
expand-target-inheritance-at-the-bottom approach perhaps leaves no way to
allow pushing the update (possibly with joins) to remote side?

That's something we'd need to think about. Obviously, anything
along this line breaks the existing FDW update APIs, but let's assume
that's acceptable. Is it impossible, or even hard, for an FDW to
support this definition of UPDATE rather than the existing one?
I don't think so --- it seems like it's just different --- but
I might well be missing something.

IIUC, in the new approach, only the root of the inheritance tree (target
table specified in the query) will appear in the query's join tree, not
the child target tables, so pushing updates with joins to the remote side
seems a bit hard, because we're not going to consider child joins. Maybe
I'm missing something though.

Thanks,
Amit

#10Etsuro Fujita
Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Tom Lane (#1)
Re: Another way to fix inherited UPDATE/DELETE

(2019/02/20 6:48), Tom Lane wrote:

In the case of a standard inheritance or partition tree, this seems to
go through really easily, since all the children could share the same
returned CTID column (I guess you'd also need a TABLEOID column so you
could figure out which table to direct the update back into). It gets
a bit harder if the tree contains some foreign tables, because they might
have different concepts of row identity, but I'd think in most cases you
could still combine those into a small number of output columns.

If this is the direction we go in, we should work on the row ID issue
[1]: /messages/by-id/1590.1542393315@sss.pgh.pa.us

Best regards,
Etsuro Fujita

[1]: /messages/by-id/1590.1542393315@sss.pgh.pa.us

#11Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Langote (#9)
Re: Another way to fix inherited UPDATE/DELETE

Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:

On 2019/02/20 13:54, Tom Lane wrote:

That's something we'd need to think about. Obviously, anything
along this line breaks the existing FDW update APIs, but let's assume
that's acceptable. Is it impossible, or even hard, for an FDW to
support this definition of UPDATE rather than the existing one?
I don't think so --- it seems like it's just different --- but
I might well be missing something.

IIUC, in the new approach, only the root of the inheritance tree (target
table specified in the query) will appear in the query's join tree, not
the child target tables, so pushing updates with joins to the remote side
seems a bit hard, because we're not going to consider child joins. Maybe
I'm missing something though.

Hm. Even if that's true (I'm not convinced), I don't think it's such a
significant use-case as to be considered a blocker.

regards, tom lane

#12Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Etsuro Fujita (#10)
Re: Another way to fix inherited UPDATE/DELETE

Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> writes:

(2019/02/20 6:48), Tom Lane wrote:

In the case of a standard inheritance or partition tree, this seems to
go through really easily, since all the children could share the same
returned CTID column (I guess you'd also need a TABLEOID column so you
could figure out which table to direct the update back into). It gets
a bit harder if the tree contains some foreign tables, because they might
have different concepts of row identity, but I'd think in most cases you
could still combine those into a small number of output columns.

If this is the direction we go in, we should work on the row ID issue
[1] before this?

Certainly, the more foreign tables can use a standardized concept of row
identity, the better this would work. What I'm loosely envisioning is
that we have one junk row-identity column for each distinct row-identity
datatype needed --- so, for instance, all ordinary tables could share
a TID column. Different FDWs might need different things, though one
would hope for no more than one datatype per FDW-type involved in the
inheritance tree. Where things could break down is if we have a lot
of tables that need a whole-row-variable for row identity, and they're
all different rowtypes --- eventually we'd run out of available columns.
So we'd definitely wish to encourage FDWs to have some more efficient
row-identity scheme than that one.

I don't see that as being something that constrains those two projects
to be done in a particular order; it'd just be a nice-to-have improvement.

regards, tom lane

#13Etsuro Fujita
Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Tom Lane (#12)
Re: Another way to fix inherited UPDATE/DELETE

(2019/02/21 0:14), Tom Lane wrote:

Etsuro Fujita<fujita.etsuro@lab.ntt.co.jp> writes:

(2019/02/20 6:48), Tom Lane wrote:

In the case of a standard inheritance or partition tree, this seems to
go through really easily, since all the children could share the same
returned CTID column (I guess you'd also need a TABLEOID column so you
could figure out which table to direct the update back into). It gets
a bit harder if the tree contains some foreign tables, because they might
have different concepts of row identity, but I'd think in most cases you
could still combine those into a small number of output columns.

If this is the direction we go in, we should work on the row ID issue
[1] before this?

Certainly, the more foreign tables can use a standardized concept of row
identity, the better this would work. What I'm loosely envisioning is
that we have one junk row-identity column for each distinct row-identity
datatype needed --- so, for instance, all ordinary tables could share
a TID column. Different FDWs might need different things, though one
would hope for no more than one datatype per FDW-type involved in the
inheritance tree. Where things could break down is if we have a lot
of tables that need a whole-row-variable for row identity, and they're
all different rowtypes --- eventually we'd run out of available columns.
So we'd definitely wish to encourage FDWs to have some more efficient
row-identity scheme than that one.

Seems reasonable. I guess that that can address not only the issue [1]
but our restriction on FDW row locking that APIs for that currently only
allow TID for row identity, in a uniform way.

I don't see that as being something that constrains those two projects
to be done in a particular order; it'd just be a nice-to-have improvement.

OK, thanks for the explanation!

Best regards,
Etsuro Fujita

#14Amit Langote
Amit Langote
amitlangote09@gmail.com
In reply to: Tom Lane (#1)
Re: Another way to fix inherited UPDATE/DELETE

Hi Tom,

On Wed, Feb 20, 2019 at 6:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Obviously this'd be a major rewrite with no chance of making it into v12,
but it doesn't sound too big to get done during v13.

Are you planning to work on this?

Thanks,
Amit

#15Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Langote (#14)
Re: Another way to fix inherited UPDATE/DELETE

Amit Langote <amitlangote09@gmail.com> writes:

On Wed, Feb 20, 2019 at 6:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Obviously this'd be a major rewrite with no chance of making it into v12,
but it doesn't sound too big to get done during v13.

Are you planning to work on this?

It's on my list, but so are a lot of other things. If you'd like to
work on it, feel free.

regards, tom lane

#16Amit Langote
Amit Langote
amitlangote09@gmail.com
In reply to: Tom Lane (#15)
Re: Another way to fix inherited UPDATE/DELETE

On Wed, Jul 3, 2019 at 10:50 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Amit Langote <amitlangote09@gmail.com> writes:

On Wed, Feb 20, 2019 at 6:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Obviously this'd be a major rewrite with no chance of making it into v12,
but it doesn't sound too big to get done during v13.

Are you planning to work on this?

It's on my list, but so are a lot of other things. If you'd like to
work on it, feel free.

Thanks for the reply. Let me see if I can get something done for the
September CF.

Regards,
Amit