9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

Started by Rukh Meskialmost 12 years ago27 messages
#1Rukh Meski
Rukh Meski
rukh.meski@yahoo.ca
1 attachment(s)

Hello hackers,

I know you're busy wrapping up the 9.4 release, so please ignore this patch.

Attachments:

update_delete_order_by_limit_v0.diffapplication/octet-stream; name=update_delete_order_by_limit_v0.diff
#2Peter Geoghegan
Peter Geoghegan
pg@heroku.com
In reply to: Rukh Meski (#1)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

On Sat, Feb 22, 2014 at 3:46 PM, Rukh Meski <rukh.meski@yahoo.ca> wrote:

I know you're busy wrapping up the 9.4 release, so please ignore this patch.

I think you should describe what the patch does, why you believe the
feature is necessary, and perhaps how it compares to other, similar
things. You have documentation changes here, but that doesn't really
tell us why this is important.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Rukh Meski
Rukh Meski
rukh.meski@yahoo.ca
In reply to: Peter Geoghegan (#2)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

On Saturday, February 22, 2014 11:57:06 PM, Peter Geoghegan <pg@heroku.com> wrote:

I think you should describe what the patch does, why you believe the
feature is necessary, and perhaps how it compares to other, similar
things. You have documentation changes here, but that doesn't really
tell us why this is important.

Sorry, I wanted to minimize the attention my message attracts.  I mostly posted it to let people know I plan on working on this for 9.5 to avoid duplicated effort.  I will post more documentation and my reasons for wanting this feature in postgre later, if that's all right.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Rukh Meski (#3)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

On Sat, Feb 22, 2014 at 7:02 PM, Rukh Meski <rukh.meski@yahoo.ca> wrote:

Sorry, I wanted to minimize the attention my message attracts. I mostly posted it to let people know I plan on working on this for 9.5 to avoid duplicated effort. I will post more documentation and my reasons for wanting this feature in postgre later, if that's all right.

I've wanted this more than once. I suspect it's a pretty hard project, though.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Pavel Golub
Pavel Golub
pavel@microolap.com
In reply to: Robert Haas (#4)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

Hello, Robert.

You wrote:

RH> On Sat, Feb 22, 2014 at 7:02 PM, Rukh Meski <rukh.meski@yahoo.ca> wrote:

Sorry, I wanted to minimize the attention my message attracts. I mostly posted it to let people know I plan on working on this for 9.5 to avoid duplicated effort. I will post more documentation and my reasons for wanting this feature in postgre later, if that's all right.

RH> I've wanted this more than once. I suspect it's a pretty hard project, though.

+1 from me. This is the exciting functionality. There was even a poll
in my blog year ago: http://pgolub.wordpress.com/2012/11/23/do-we-need-limit-clause-in-update-and-delete-statements-for-postgresql/

So the results were (for those who don't want check the post):
Yes, for functionality: 194 (61.4%)
No way! 78 (24.7%)
Do not care 44 (13.9%)

RH> --
RH> Robert Haas
RH> EnterpriseDB: http://www.enterprisedb.com
RH> The Enterprise PostgreSQL Company

--
With best wishes,
Pavel mailto:pavel@gf.microolap.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Rukh Meski
Rukh Meski
rukh.meski@yahoo.ca
In reply to: Rukh Meski (#1)
1 attachment(s)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

Hi,

Here's an updated patch.  I had to push the LIMIT processing into ModifyTable to make the behaviour sane in parallel scenarios.  As usual, please ignore if you're busy with 9.4.  I will work on better docs and more tests from now on and am preparing to make a solid case for adding this.

Attachments:

update_delete_order_by_limit_v1.diffapplication/octet-stream; name=update_delete_order_by_limit_v1.diff
#7Rukh Meski
Rukh Meski
rukh.meski@yahoo.ca
In reply to: Rukh Meski (#6)
1 attachment(s)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

Oops.  Of course shouldn't try and change how INSERT works.  Latest version attached.

Attachments:

update_delete_order_by_limit_v2.diffapplication/octet-stream; name=update_delete_order_by_limit_v2.diff
#8Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: Rukh Meski (#7)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

On Thu, Mar 13, 2014 at 3:49 AM, Rukh Meski <rukh.meski@yahoo.ca> wrote:

Oops. Of course shouldn't try and change how INSERT works. Latest version attached.

I had given a brief look into this patch and found that the
implementation for Update .. ORDER BY is not appropriate for
inheritance tables.

It just tries to sort for individual tables in inheritance hierarchy
which can give wrong behaviour.

As an example try below case:
CREATE TABLE cities (
name text,
population real,
altitude int
);

CREATE TABLE capitals (
state char(2)
) INHERITS (cities);

insert rows in both tables and then try to see the plan of below
Update statement
postgres=# explain update cities set population=150 where altitude<25 order by n
ame limit 1;
QUERY PLAN
------------------------------------------------------------------------
Update on cities (cost=2.65..28.80 rows=396 width=47)
-> Sort (cost=2.65..2.74 rows=39 width=42)
Sort Key: cities.name
-> Seq Scan on cities (cost=0.00..2.45 rows=39 width=42)
Filter: (altitude < 25)
-> Sort (cost=25.16..26.05 rows=357 width=48)
Sort Key: capitals.name
-> Seq Scan on capitals (cost=0.00..23.38 rows=357 width=48)
Filter: (altitude < 25)
Planning time: 0.292 ms
(10 rows)

Here as it sorts for individual tables, the final result could be wrong.

As far as I can trace from the previous discussion of this feature,
you need to find the solution for below 2 key problems for
UPDATE ... ORDER BY:

1. How will you sort the rows from different tables in inheritance
hierarchy especially when they contain different columns as in
above example.

2. How would ModifyTable knows which table row came from.

Tom Lane has explained these problems in a very clear manner
in his below mail and shared his opinion about this feature as
well.
/messages/by-id/26819.1291133045@sss.pgh.pa.us

So I think if you want to implement this feature, then lets first
try to find a solution for above problems.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Simon Riggs
Simon Riggs
simon@2ndQuadrant.com
In reply to: Amit Kapila (#8)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

On 11 May 2014 07:37, Amit Kapila <amit.kapila16@gmail.com> wrote:

Tom Lane has explained these problems in a very clear manner
in his below mail and shared his opinion about this feature as
well.
/messages/by-id/26819.1291133045@sss.pgh.pa.us

I don't have Tom's wonderfully articulate way of saying things, so
I'll say it my way:

If you want to do this, you already can already write a query that has
the same effect. But supporting the syntax directly to execute a
statement with an undefinable outcome is a pretty bad idea, and worse
than that, there's a ton of useful things that we *do* want that would
be a much higher priority for work than this. If you support Postgres,
prioritise, please.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Andres Freund
Andres Freund
andres@2ndquadrant.com
In reply to: Simon Riggs (#9)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

On 2014-05-11 10:33:10 +0200, Simon Riggs wrote:

On 11 May 2014 07:37, Amit Kapila <amit.kapila16@gmail.com> wrote:

Tom Lane has explained these problems in a very clear manner
in his below mail and shared his opinion about this feature as
well.
/messages/by-id/26819.1291133045@sss.pgh.pa.us

I don't have Tom's wonderfully articulate way of saying things, so
I'll say it my way:

If you want to do this, you already can already write a query that has
the same effect. But supporting the syntax directly to execute a
statement with an undefinable outcome is a pretty bad idea, and worse
than that, there's a ton of useful things that we *do* want that would
be a much higher priority for work than this. If you support Postgres,
prioritise, please.

I don't know. I'd find UPDATE/DELETE ORDER BY something rather
useful. It's required to avoid deadlocks in many scenarios and it's not
that obvious how to write the queries in a correct manner.
LIMIT would be a nice bonus for queues, especially if we can get SKIP
LOCKED.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Rukh Meski
Rukh Meski
rukh.meski@gmail.com
In reply to: Simon Riggs (#9)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

On Sun, May 11, 2014 at 10:33 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 11 May 2014 07:37, Amit Kapila <amit.kapila16@gmail.com> wrote:

Tom Lane has explained these problems in a very clear manner
in his below mail and shared his opinion about this feature as
well.
/messages/by-id/26819.1291133045@sss.pgh.pa.us

I don't have Tom's wonderfully articulate way of saying things, so
I'll say it my way:

If you want to do this, you already can already write a query that has
the same effect. But supporting the syntax directly to execute a
statement with an undefinable outcome is a pretty bad idea, and worse
than that, there's a ton of useful things that we *do* want that would
be a much higher priority for work than this. If you support Postgres,
prioritise, please.

Yes, you can already achieve what this feature can do by other means,
but this feature makes these cases 1) more efficient in terms of how
much work has to be done 2) simpler and more elegant to write. But
frankly I find it a bit insulting that I shouldn't work on this based
on other people's priorities. This is a high priority item for me.

I'll look at the problem reported by Amit and come back with a
solution and my rationale for adding this feature.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Simon Riggs
Simon Riggs
simon@2ndQuadrant.com
In reply to: Andres Freund (#10)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

On 11 May 2014 11:18, Andres Freund <andres@2ndquadrant.com> wrote:

On 2014-05-11 10:33:10 +0200, Simon Riggs wrote:

On 11 May 2014 07:37, Amit Kapila <amit.kapila16@gmail.com> wrote:

Tom Lane has explained these problems in a very clear manner
in his below mail and shared his opinion about this feature as
well.
/messages/by-id/26819.1291133045@sss.pgh.pa.us

I don't have Tom's wonderfully articulate way of saying things, so
I'll say it my way:

If you want to do this, you already can already write a query that has
the same effect. But supporting the syntax directly to execute a
statement with an undefinable outcome is a pretty bad idea, and worse
than that, there's a ton of useful things that we *do* want that would
be a much higher priority for work than this. If you support Postgres,
prioritise, please.

I don't know. I'd find UPDATE/DELETE ORDER BY something rather
useful.

Perhaps if an index exists to provide an ordering that makes it clear
what this means, then yes.

Forcing Rukh to implement ORDER BY when an index doesn't exist sounds
like useless work though, so if we were to accept that this statement
gives an ERROR in the absence of such an index, it would make sense
all round.

It's required to avoid deadlocks in many scenarios and it's not
that obvious how to write the queries in a correct manner.
LIMIT would be a nice bonus for queues, especially if we can get SKIP
LOCKED.

With SKIP LOCKED it begins to have some use. Thanks for the nudge.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#12)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

Simon Riggs <simon@2ndQuadrant.com> writes:

On 11 May 2014 11:18, Andres Freund <andres@2ndquadrant.com> wrote:

I don't know. I'd find UPDATE/DELETE ORDER BY something rather
useful.

Perhaps if an index exists to provide an ordering that makes it clear
what this means, then yes.

The $64 question is whether we'd accept an implementation that fails
if the target table has children (ie, is partitioned). That seems
to me to not be up to the project's usual quality expectations, but
maybe if there's enough demand for a partial solution we should do so.

It strikes me that a big part of the problem here is that the current
support for this case assumes that the children don't all have the
same rowtype. Which is important if you think of "child table" as
meaning "subclass in the OO sense". But for ordinary partitioning
cases it's just useless complexity, and ModifyTable isn't the only
thing that suffers from that useless complexity.

If we had a notion of "partitioned table" that involved a restriction
that all the child tables have the exact same rowtype, we could implement
UPDATE/DELETE in a much saner fashion --- just one plan tree, not one
per child table --- and it would be possible to support UPDATE/DELETE
ORDER BY LIMIT with no more work than for the single-table case.
So that might shift the calculation as to whether we're willing to
accept a partial implementation.

Another idea is that the main reason we do things like this is the
assumption that for UPDATE, ModifyTable receives complete new rows
that only need to be pushed back into the table (and hence have
to already match the rowtype of the specific child table). What if
we got rid of that and had the incoming tuples just have the target
row identifier (tableoid+TID) and the values for the updated columns?
ModifyTable then would have to visit the old row (something it must
do anyway, NB), pull out the values for the not-to-be-updated columns,
form the final tuple and store it. It could implement this separately
for each child table, with a different mapping of which columns receive
the updates. This eliminates the whole multiple-plan-tree business
at a stroke ... and TBH, it's not immediately obvious that this would
not be as efficient or more so than the way we do UPDATEs today, even
in the single-target-table case. Pumping all those not-updated column
values through the plan tree isn't free. The more I think about it,
the more promising this sounds --- though I confess to being badly
undercaffeinated at the moment, so maybe there's some fatal problem
I'm missing.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Andres Freund
Andres Freund
andres@2ndQuadrant.com
In reply to: Tom Lane (#13)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

On 2014-05-11 12:47:21 -0400, Tom Lane wrote:

Another idea is that the main reason we do things like this is the
assumption that for UPDATE, ModifyTable receives complete new rows
that only need to be pushed back into the table (and hence have
to already match the rowtype of the specific child table). What if
we got rid of that and had the incoming tuples just have the target
row identifier (tableoid+TID) and the values for the updated columns?
ModifyTable then would have to visit the old row (something it must
do anyway, NB), pull out the values for the not-to-be-updated columns,
form the final tuple and store it. It could implement this separately
for each child table, with a different mapping of which columns receive
the updates. This eliminates the whole multiple-plan-tree business
at a stroke ... and TBH, it's not immediately obvious that this would
not be as efficient or more so than the way we do UPDATEs today, even
in the single-target-table case. Pumping all those not-updated column
values through the plan tree isn't free. The more I think about it,
the more promising this sounds --- though I confess to being badly
undercaffeinated at the moment, so maybe there's some fatal problem
I'm missing.

Yes, that sounds like a rather good plan. There's probably some fun
keeping the executor state straight when switching more frequently than
now and we'd probably need some (implicitly?) added type coercions? I
also agree, while there probably are some cases where'd be slower, that the
majority of cases will be faster.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: Tom Lane (#13)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

On Sun, May 11, 2014 at 10:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon@2ndQuadrant.com> writes:

On 11 May 2014 11:18, Andres Freund <andres@2ndquadrant.com> wrote:

I don't know. I'd find UPDATE/DELETE ORDER BY something rather
useful.

Perhaps if an index exists to provide an ordering that makes it clear
what this means, then yes.

The $64 question is whether we'd accept an implementation that fails
if the target table has children (ie, is partitioned). That seems
to me to not be up to the project's usual quality expectations, but
maybe if there's enough demand for a partial solution we should do so.

It strikes me that a big part of the problem here is that the current
support for this case assumes that the children don't all have the
same rowtype. Which is important if you think of "child table" as
meaning "subclass in the OO sense". But for ordinary partitioning
cases it's just useless complexity, and ModifyTable isn't the only
thing that suffers from that useless complexity.

If we had a notion of "partitioned table" that involved a restriction
that all the child tables have the exact same rowtype, we could implement
UPDATE/DELETE in a much saner fashion --- just one plan tree, not one
per child table --- and it would be possible to support UPDATE/DELETE
ORDER BY LIMIT with no more work than for the single-table case.
So that might shift the calculation as to whether we're willing to
accept a partial implementation.

I think there are many use cases where current inheritance mechanism
is used for partitioning the table without adding new columns in child
table, so if we could support UPDATE/DELETE .. ORDER BY for
those cases, then it will be quite useful, but not sure if it is viable to
see simpler implementation for this case along with keeping current logic.

Another idea is that the main reason we do things like this is the
assumption that for UPDATE, ModifyTable receives complete new rows
that only need to be pushed back into the table (and hence have
to already match the rowtype of the specific child table). What if
we got rid of that and had the incoming tuples just have the target
row identifier (tableoid+TID) and the values for the updated columns?
ModifyTable then would have to visit the old row (something it must
do anyway, NB), pull out the values for the not-to-be-updated columns,
form the final tuple and store it. It could implement this separately
for each child table, with a different mapping of which columns receive
the updates.

How about sorting step, are you thinking to have MergeAppend
node for it beneath ModifyTable?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Kapila (#15)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

Amit Kapila <amit.kapila16@gmail.com> writes:

How about sorting step, are you thinking to have MergeAppend
node for it beneath ModifyTable?

Well yeah, that's pretty much the point.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Rukh Meski
Rukh Meski
rukh.meski@gmail.com
In reply to: Tom Lane (#13)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

On Sun, May 11, 2014 at 4:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The $64 question is whether we'd accept an implementation that fails
if the target table has children (ie, is partitioned). That seems
to me to not be up to the project's usual quality expectations, but
maybe if there's enough demand for a partial solution we should do so.

It strikes me that a big part of the problem here is that the current
support for this case assumes that the children don't all have the
same rowtype. Which is important if you think of "child table" as
meaning "subclass in the OO sense". But for ordinary partitioning
cases it's just useless complexity, and ModifyTable isn't the only
thing that suffers from that useless complexity.

If we had a notion of "partitioned table" that involved a restriction
that all the child tables have the exact same rowtype, we could implement
UPDATE/DELETE in a much saner fashion --- just one plan tree, not one
per child table --- and it would be possible to support UPDATE/DELETE
ORDER BY LIMIT with no more work than for the single-table case.
So that might shift the calculation as to whether we're willing to
accept a partial implementation.

None of the use cases I have in mind would ever (have to) use this on
a parent table; in the worst case it might make sense to do it on the
child tables individually. Personally, I think that just refusing to
operate on tables with children is a reasonable start. I have no
interest in working on improving partitioning, but I don't think
pushing this feature back in the hopes that someone else will would
help anyone.

But definitely only my two cents on this issue.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: Tom Lane (#16)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

On Tue, May 13, 2014 at 8:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Amit Kapila <amit.kapila16@gmail.com> writes:

How about sorting step, are you thinking to have MergeAppend
node for it beneath ModifyTable?

Well yeah, that's pretty much the point.

IIUC, the way new design will work is that for new tuple we will now
get tableoid+TID, modified column values as an input (for inheritance
tables we will get this for all child tables as well) for ModifyTable
and get old tuple (which in current case will be provided by MergeAppend
or in general by some scan node) from some node beneath the
ModifyTable. It then matches the tableoid from old tuple with appropriate
tableoid incase of child tables and then form the new tuple for that
tableoid using old tuple and modified column values.
In this case can we safely assume that we will always get tableoid from
old tuple, ideally it should be there but just not sure and another minor
point is won't we get TID from old tuple (tuple we get from node beneath
ModifyTable), what's the need to pass for new tuple?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#13)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

On Sun, May 11, 2014 at 12:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon@2ndQuadrant.com> writes:

On 11 May 2014 11:18, Andres Freund <andres@2ndquadrant.com> wrote:

I don't know. I'd find UPDATE/DELETE ORDER BY something rather
useful.

Perhaps if an index exists to provide an ordering that makes it clear
what this means, then yes.

The $64 question is whether we'd accept an implementation that fails
if the target table has children (ie, is partitioned).

I'd say "no". Partitioning is important, and we need to make it more
seamless and better-integrated, not add new warts.

That seems
to me to not be up to the project's usual quality expectations, but
maybe if there's enough demand for a partial solution we should do so.

I like this feature, but if I were searching for places where it makes
sense to loosen our project's usual quality expectations, this isn't
where I'd start.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Jeff Janes
Jeff Janes
jeff.janes@gmail.com
In reply to: Robert Haas (#19)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

On Wed, May 14, 2014 at 8:26 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Sun, May 11, 2014 at 12:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon@2ndQuadrant.com> writes:

On 11 May 2014 11:18, Andres Freund <andres@2ndquadrant.com> wrote:

I don't know. I'd find UPDATE/DELETE ORDER BY something rather
useful.

Perhaps if an index exists to provide an ordering that makes it clear
what this means, then yes.

The $64 question is whether we'd accept an implementation that fails
if the target table has children (ie, is partitioned).

I'd say "no". Partitioning is important, and we need to make it more
seamless and better-integrated, not add new warts.

I think the importance of partitioning argues the other way on this issue.
Where I most wanted a LIMIT clause on DELETE is where I was moving tuples
from one partition to a different one in a transactional way using
bite-size chunks that wouldn't choke the live system with locks or with IO.

So the DELETE was always running against either a child by name, or against
ONLY parent, not against the whole inheritance tree. Not being able to do
this on single partitions makes partitioning harder, not easier.

Sure, I can select the nth smallest ctid and then "WITH T AS (DELETE FROM
ONLY foo WHERE ctid < :that RETURNING *) INSERT INTO bar SELECT * from T",
but how annoying.

That seems
to me to not be up to the project's usual quality expectations, but
maybe if there's enough demand for a partial solution we should do so.

I like this feature, but if I were searching for places where it makes
sense to loosen our project's usual quality expectations, this isn't
where I'd start.

In this case I'd much rather have half a loaf rather than none at all. We
wouldn't be adding warts to partitioning, but removing warts from the
simpler case.

Cheers,

Jeff

#21Heikki Linnakangas
Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Rukh Meski (#17)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

On 05/13/2014 10:45 PM, Rukh Meski wrote:

On Sun, May 11, 2014 at 4:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The $64 question is whether we'd accept an implementation that fails
if the target table has children (ie, is partitioned). That seems
to me to not be up to the project's usual quality expectations, but
maybe if there's enough demand for a partial solution we should do so.

It strikes me that a big part of the problem here is that the current
support for this case assumes that the children don't all have the
same rowtype. Which is important if you think of "child table" as
meaning "subclass in the OO sense". But for ordinary partitioning
cases it's just useless complexity, and ModifyTable isn't the only
thing that suffers from that useless complexity.

If we had a notion of "partitioned table" that involved a restriction
that all the child tables have the exact same rowtype, we could implement
UPDATE/DELETE in a much saner fashion --- just one plan tree, not one
per child table --- and it would be possible to support UPDATE/DELETE
ORDER BY LIMIT with no more work than for the single-table case.
So that might shift the calculation as to whether we're willing to
accept a partial implementation.

None of the use cases I have in mind would ever (have to) use this on
a parent table; in the worst case it might make sense to do it on the
child tables individually. Personally, I think that just refusing to
operate on tables with children is a reasonable start. I have no
interest in working on improving partitioning, but I don't think
pushing this feature back in the hopes that someone else will would
help anyone.

IMHO this needs to work with inheritance if we are to accept it. It
would be a rather strange limitation for no apparent reason, other than
that we didn't bother to implement it. It doesn't seem very difficult in
theory to add the table OID to the plan as a junk column, and use that
in the ModifyTable node to know which table a row came from.

In any case, the patch as it stands is clearly not acceptable, because
it just produces wrong results with inheritance. I'm marking it as
returned with feedback in the commitfest app. I'd suggest that you solve
the inheritance problems and resubmit.

Per the docs in the patch:

+  <para>
+   If the <literal>LIMIT</> (or <literal>FETCH FIRST</>) clause
+   is present, processing will stop after the system has attempted
+   to delete the specified amount of rows.  In particular, if a row
+   was concurrently changed not to match the given <literal>WHERE</>
+   clause, it will count towards the <literal>LIMIT</> despite it
+   not being actually deleted.  Unlike in <literal>SELECT</>, the
+   <literal>OFFSET</literal> clause is not available in
+   <literal>DELETE</>.
+  </para>

That behavior with READ COMMITTED mode and concurrent changes is
surprising. Do we really want it to behave like that, and if so, why?

Why is OFFSET not supported? Not that I care much for that, but I'm curious.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#22Marko Tiikkaja
Marko Tiikkaja
marko@joh.to
In reply to: Amit Kapila (#18)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

Hi Amit,

On 5/14/14 6:41 AM, Amit Kapila wrote:

IIUC, the way new design will work is that for new tuple we will now
get tableoid+TID, modified column values as an input (for inheritance
tables we will get this for all child tables as well) for ModifyTable
and get old tuple (which in current case will be provided by MergeAppend
or in general by some scan node) from some node beneath the
ModifyTable. It then matches the tableoid from old tuple with appropriate
tableoid incase of child tables and then form the new tuple for that
tableoid using old tuple and modified column values.

Having now read the discussion upthread a bit more carefully, I think
one of us is confused. AIUI, what was suggested was that the plan nodes
below the ModifyTable node would only give you back the modified
columns, the tableoid and the TID of the tuple, and no "old values" at
all. This might be a reasonable approach, but I haven't given it that
much thought yet.

In this case can we safely assume that we will always get tableoid from
old tuple, ideally it should be there but just not sure

It has to be there or otherwise the scheme won't work. Is there a
specific case you're worried about?

and another minor
point is won't we get TID from old tuple (tuple we get from node beneath
ModifyTable), what's the need to pass for new tuple?

I don't understand this part, could you rephrase?

.marko

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#23Marko Tiikkaja
Marko Tiikkaja
marko@joh.to
In reply to: Tom Lane (#13)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

On 5/11/14 6:47 PM, Tom Lane wrote:

The $64 question is whether we'd accept an implementation that fails
if the target table has children (ie, is partitioned). That seems
to me to not be up to the project's usual quality expectations, but
maybe if there's enough demand for a partial solution we should do so.

I think that partial support is better than no support unless there are
concerns about forwards compatibility. I don't see such concerns having
been expressed for this feature.

.marko

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#24Rukh Meski
Rukh Meski
rukh.meski@gmail.com
In reply to: Heikki Linnakangas (#21)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

On Tue, Jun 24, 2014 at 04:08 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:

IMHO this needs to work with inheritance if we are to accept it. It would be
a rather strange limitation for no apparent reason, other than that we
didn't bother to implement it. It doesn't seem very difficult in theory to
add the table OID to the plan as a junk column, and use that in the
ModifyTable node to know which table a row came from.

I can have a go at that, but I'm somewhat afraid of the performance
implications this might have. And it's not just users of this feature
that would pay the penalty, it would be everyone.

Per the docs in the patch:

+  <para>
+   If the <literal>LIMIT</> (or <literal>FETCH FIRST</>) clause
+   is present, processing will stop after the system has attempted
+   to delete the specified amount of rows.  In particular, if a row
+   was concurrently changed not to match the given <literal>WHERE</>
+   clause, it will count towards the <literal>LIMIT</> despite it
+   not being actually deleted.  Unlike in <literal>SELECT</>, the
+   <literal>OFFSET</literal> clause is not available in
+   <literal>DELETE</>.
+  </para>

That behavior with READ COMMITTED mode and concurrent changes is surprising.
Do we really want it to behave like that, and if so, why?

Oh, oops. Looks like I didn't submit the latest version of the patch
for the commit fest, where I had fixed the documentation. It doesn't
work that way anymore, as we really don't want it to work that way.

Why is OFFSET not supported? Not that I care much for that, but I'm curious.

I thought it seemed weird. But it's supported for FOR UPDATE, so
maybe we should support it here as well.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#25Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: Marko Tiikkaja (#22)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

On Wed, Jul 9, 2014 at 8:42 PM, Marko Tiikkaja <marko@joh.to> wrote:

Hi Amit,

On 5/14/14 6:41 AM, Amit Kapila wrote:

IIUC, the way new design will work is that for new tuple we will now
get tableoid+TID, modified column values as an input (for inheritance
tables we will get this for all child tables as well) for ModifyTable
and get old tuple (which in current case will be provided by MergeAppend
or in general by some scan node) from some node beneath the
ModifyTable. It then matches the tableoid from old tuple with

appropriate

tableoid incase of child tables and then form the new tuple for that
tableoid using old tuple and modified column values.

Having now read the discussion upthread a bit more carefully, I think one

of us is confused. AIUI, what was suggested was that the plan nodes below
the ModifyTable node would only give you back the modified columns, the
tableoid and the TID of the tuple, and no "old values" at all.

Plan node below ModifyTable will be a scan node, it will give you old
tuple, whats the use of getting modified columns from it. We need
modified columns for new tuple which can be input for ModifyTuple.

In this case can we safely assume that we will always get tableoid from
old tuple, ideally it should be there but just not sure

It has to be there or otherwise the scheme won't work. Is there a

specific case you're worried about?

and another minor
point is won't we get TID from old tuple (tuple we get from node beneath
ModifyTable), what's the need to pass for new tuple?

I don't understand this part, could you rephrase?

Basically, I wanted to say that apart from modified columns, we just
need to pass table OID. If I am reading correctly, the same is
mentioned by Heikki as well.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#26Marko Tiikkaja
Marko Tiikkaja
marko@joh.to
In reply to: Amit Kapila (#25)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

On 7/10/14 5:44 AM, Amit Kapila wrote:

Basically, I wanted to say that apart from modified columns, we just
need to pass table OID. If I am reading correctly, the same is
mentioned by Heikki as well.

Yes, Heikki was talking about that approach. I was more interested in
the significantly more invasive approach Tom and Andres talked about
upthread, which your email was a response to.

.marko

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#27Marko Tiikkaja
Marko Tiikkaja
marko@joh.to
In reply to: Heikki Linnakangas (#21)
Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

On 2014-06-24 11:08, Heikki Linnakangas wrote:

IMHO this needs to work with inheritance if we are to accept it. It
would be a rather strange limitation for no apparent reason, other than
that we didn't bother to implement it. It doesn't seem very difficult in
theory to add the table OID to the plan as a junk column, and use that
in the ModifyTable node to know which table a row came from.

So I've been trying to look at this with Rukh, and it doesn't seem at
all as easy as you make it out to be. Consider the following example:

=# create table foo(a int);
=# create table foo_c1(a int, b int) inherits (foo);
=# create table foo_c2(a int, c text) inherits (foo);

=# update foo set a = a+1;

Currently, the way that works is that the ModifyTable node knows of
three plans, all with different target lists, targeting each of the
tables separately. To make $SUBJECT work with inheritance, we would
somehow have to get all three different types of tuples through an
Append node to the ModifyTable (with ctid and tableoid as junk columns).
I can see how the approach Tom and Andres talk about upthread could
work, but that's a helluva lot of work just so we can check the
inheritance check box for this feature, even though it's not clear
anyone would actually want to use this on inheritance sets. Other
approach I can think of would be to create some kind of a Frankenstein
tuple which would satisfy the needs of all tables in the set, but that
sounds really awful.

Or we could just forget that we ever had inheritance and assume that
partitioning is the only interesting use case. But it's not clear to me
that we could assume the output to be the same in every case even then.

If someone has a clear idea on how this could be implemented in a
reasonable time, we'd be happy to hear about it.

.marko

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers