[PATCH] Add UPDATE WHERE OFFSET IN clause

Started by Boszormenyi Zoltanabout 4 years ago4 messageshackers
Jump to latest

Hi!

A couple of years ago I was working on PostgreSQL
features and extensions. Some of them was in ECPG,
mainly in the area of improving Informix compatibility.

One of the patchsets I invented was the cursor readahead
support to improve performance in ECPG which used FETCH N.
This work was published at
https://github.com/zboszor/ecpg-readahead/commits

However, there was one use case for this patchset actually
decreased performance, namely if the cursor was used to
modify rows via UPDATE WHERE CURRENT OF. It was because
the cursor position on the server side was different from
the application so the ECPG readahead code had to use
MOVE then UPDATE WHERE CURRENT OF.

I brewed this idea for a while and yesterday I decided
to do something about it and here's the (admittedly, limited)
result. I added a new PostgreSQL extension to the UPDATE syntax:

UPDATE ... WHERE OFFSET n IN cursor;

This new syntax changes the feature disparity between
FETCH N and UPDATE WHERE CURRENT OF that existed in
PostgreSQL forever.

I only implemented this for cursors that use SELECT FOR UPDATE.
This part was quite straightforward to implement.

The behaviour of this syntax is as follows:
* the offset value may be 0 or negative, since it is a virtual
index into the rows returned by the last FETCH statement
and negative indexes are felt natural relative to the
cursor position and the cursor direction
* for offset value 0, the behaviour is identical to WHERE CURRENT OF
* negative indexes allow UPDATEs on previous rows even if
the cursor reached the end of the result set

I need clues for how to extend this for cursors that aren't
using FOR UPDATE queries, if it's possible at all.

Please, review.

Best regards,
Zoltán Böszörményi

Attachments:

0001-Add-UPDATE-.-WHERE-OFFSET-IN-clause.patchtext/x-patch; charset=UTF-8; name=0001-Add-UPDATE-.-WHERE-OFFSET-IN-clause.patchDownload+213-21
#2Chapman Flack
chap@anastigmatix.net
In reply to: Boszormenyi Zoltan (#1)
Re: [PATCH] Add UPDATE WHERE OFFSET IN clause

Hi!

On 02/07/22 00:59, Böszörményi Zoltán wrote:

UPDATE ... WHERE OFFSET n IN cursor;

If added to UPDATE, should this be added to DELETE also?

Regards,
-Chap

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Chapman Flack (#2)
Re: [PATCH] Add UPDATE WHERE OFFSET IN clause

Chapman Flack <chap@anastigmatix.net> writes:

On 02/07/22 00:59, Böszörményi Zoltán wrote:

UPDATE ... WHERE OFFSET n IN cursor;

If added to UPDATE, should this be added to DELETE also?

FWIW, I think this is a really horrid hack. For one thing, it's not
robust against not-strictly-linear FETCH/MOVE of the cursor. It looks
to me like "OFFSET n" really means "the row that we read N reads ago",
not "the row N before the current cursor position". I see that the
documentation does explain it that way, but it fails to account for
optimizations such as whether we implement moves by reading backwards
or rewind-and-read-forwards. I don't think we want to expose that
sort of implementation detail.

I'm also pretty displeased with causing unbounded memory consumption for
every use of nodeLockRows, whether it has anything to do with a cursor or
not (never mind whether the cursor will ever be used for WHERE OFFSET IN).
Yeah, it's only a few bytes per row, but that will add up in queries that
process lots of rows.

regards, tom lane

In reply to: Tom Lane (#3)
Re: [PATCH] Add UPDATE WHERE OFFSET IN clause

2022. 02. 08. 2:05 keltezéssel, Tom Lane írta:

Chapman Flack <chap@anastigmatix.net> writes:

On 02/07/22 00:59, Böszörményi Zoltán wrote:

UPDATE ... WHERE OFFSET n IN cursor;

If added to UPDATE, should this be added to DELETE also?

Yes, it should be added, too.

FWIW, I think this is a really horrid hack.

Thanks for your kind words. :-D

For one thing, it's not
robust against not-strictly-linear FETCH/MOVE of the cursor. It looks
to me like "OFFSET n" really means "the row that we read N reads ago",
not "the row N before the current cursor position". I see that the
documentation does explain it that way, but it fails to account for
optimizations such as whether we implement moves by reading backwards
or rewind-and-read-forwards. I don't think we want to expose that
sort of implementation detail.

I'm also pretty displeased with causing unbounded memory consumption for
every use of nodeLockRows, whether it has anything to do with a cursor or
not (never mind whether the cursor will ever be used for WHERE OFFSET IN).
Yeah, it's only a few bytes per row, but that will add up in queries that
process lots of rows.

Does PostgreSQL have SQL hints now? I.e. some kind of "pragma"
parsed from SQL comments to indicate the subsequent usage pattern?
Such a hint would allow using either storing the single row information
for DELETE/UPDATE or the list.

Dumping the list to a disk file will be added later so memory
usage is not unbounded.

I was just testing the waters for the idea.

Best regards,
Zoltán Böszörményi

Show quoted text

regards, tom lane