delete statement returning too many results
Hello everyone,
I am seeing weird behaviour of a delete statement that is returning more results than I am expecting.
This is the query:
DELETE FROM queue
WHERE
id IN (
SELECT id
FROM queue
ORDER BY id
LIMIT 1
FOR UPDATE
SKIP LOCKED
)
RETURNING *;
My understanding is that the limit in the sub-select should prevent this query from ever
returning more than one result. Sadly I am seeing cases where there is more than one result.
This repository has a Java setup that pretty reliably reproduces my issue:
https://github.com/ArloL/postgres-query-error-demo
I checked the docs for select and delete and couldn’t find any hint for cases
where the behaviour of limit might be surprising.
Am I missing something?
Thanks,
Arlo
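(Editor's note: for readers who want to try this without the Java setup, here is a minimal sketch. The schema below is an assumption; the linked demo repository defines the real one. Run the DELETE concurrently from two or more sessions to provoke the problem.)

```sql
-- Hypothetical minimal setup (assumed; the demo repository defines the real schema):
CREATE TABLE queue (
    id      bigserial PRIMARY KEY,
    payload text NOT NULL
);

INSERT INTO queue (payload)
SELECT 'job ' || g FROM generate_series(1, 100) AS g;

-- Run this from several concurrent sessions. Each run should pop at most
-- one row, yet occasionally more than one row comes back:
DELETE FROM queue
WHERE id IN (
    SELECT id
    FROM queue
    ORDER BY id
    LIMIT 1
    FOR UPDATE
    SKIP LOCKED
)
RETURNING *;
```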
On 11/28/22 07:29, Arlo Louis O'Keeffe wrote:
Hello everyone,
I am seeing weird behaviour of a delete statement that is returning more results than I am expecting.
This is the query:
DELETE FROM queue
WHERE
id IN (
SELECT id
FROM queue
ORDER BY id
LIMIT 1
FOR UPDATE
SKIP LOCKED
)
RETURNING *;
My understanding is that the limit in the sub-select should prevent this query from ever
returning more than one result. Sadly I am seeing cases where there is more than one result.
This repository has a Java setup that pretty reliably reproduces my issue:
https://github.com/ArloL/postgres-query-error-demo
I checked the docs for select and delete and couldn’t find any hint for cases
where the behaviour of limit might be surprising.
Am I missing something?
More than one row will be deleted if there is more than one record in
"queue" for the specific value of "id" (i.e., "id" is not unique).
--
Angular momentum makes the world go 'round.
On Mon, Nov 28, 2022 at 7:18 AM Ron <ronljohnsonjr@gmail.com> wrote:
On 11/28/22 07:29, Arlo Louis O'Keeffe wrote:
Hello everyone,
I am seeing weird behaviour of a delete statement that is returning more
results than I am expecting.
This is the query:
DELETE FROM queue
WHERE
id IN (
SELECT id
FROM queue
ORDER BY id
LIMIT 1
FOR UPDATE
SKIP LOCKED
)
RETURNING *;
My understanding is that the limit in the sub-select should prevent this
query from ever
returning more than one result. Sadly I am seeing cases where there is
more than one result.
This repository has a Java setup that pretty reliably reproduces my
issue:
https://github.com/ArloL/postgres-query-error-demo
I checked the docs for select and delete and couldn’t find any hint for
cases
where the behaviour of limit might be surprising.
Am I missing something?
More than one row will be deleted if there is more than one record in
"queue" for the specific value of "id" (i.e., "id" is not unique).
Given that the example code provided has "ID" as a PK on the queue table,
this fact, while true, is unhelpful for this specific question.
There is a nice big caution regarding the default read committed isolation
mode, order by, and for update, in the documentation, but I cannot work out
exactly why this example seems to be triggering it.
https://www.postgresql.org/docs/current/sql-select.html
David J.
"David G. Johnston" <david.g.johnston@gmail.com> writes:
There is a nice big caution regarding the default read committed isolation
mode, order by, and for update, in the documentation, but I cannot work out
exactly why this example seems to be triggering it.
The <caution> is talking about a rather different scenario.
I managed to reproduce this locally. I find that initially, with an
empty queue table, you get a query plan like
Delete on queue (cost=0.38..8.42 rows=1 width=38)
-> Nested Loop (cost=0.38..8.42 rows=1 width=38)
-> HashAggregate (cost=0.23..0.24 rows=1 width=40)
Group Key: "ANY_subquery".id
-> Subquery Scan on "ANY_subquery" (cost=0.15..0.22 rows=1 width=40)
-> Limit (cost=0.15..0.21 rows=1 width=14)
-> LockRows (cost=0.15..74.15 rows=1200 width=14)
-> Index Scan using queue_pkey on queue queue_1 (cost=0.15..62.15 rows=1200 width=14)
-> Index Scan using queue_pkey on queue (cost=0.15..8.17 rows=1 width=14)
Index Cond: (id = "ANY_subquery".id)
which is fine because the LockRows bit will be run only once.
However, after the table's been stomped on for awhile (and probably
not till after autovacuum runs), that switches to
Delete on queue (cost=0.25..16.31 rows=1 width=38)
-> Nested Loop Semi Join (cost=0.25..16.31 rows=1 width=38)
Join Filter: (queue.id = "ANY_subquery".id)
-> Index Scan using queue_pkey on queue (cost=0.12..8.14 rows=1 width=14)
-> Subquery Scan on "ANY_subquery" (cost=0.12..8.16 rows=1 width=40)
-> Limit (cost=0.12..8.15 rows=1 width=14)
-> LockRows (cost=0.12..8.15 rows=1 width=14)
-> Index Scan using queue_pkey on queue queue_1 (cost=0.12..8.14 rows=1 width=14)
and then you start to get failures, because each re-execution of
the subquery produces a fresh row thanks to the silent SKIP LOCKED.
So basically it's unsafe to run the sub-select more than once,
but the query as written leaves it up to the planner whether
to do that. I'd suggest rephrasing as
WITH target_rows AS MATERIALIZED (
SELECT id
FROM queue
ORDER BY id
LIMIT 1
FOR UPDATE
SKIP LOCKED
)
DELETE FROM queue
WHERE id IN (SELECT * FROM target_rows)
RETURNING *;
regards, tom lane
On Mon, Nov 28, 2022 at 12:11:53PM -0500, Tom Lane wrote:
So basically it's unsafe to run the sub-select more than once,
but the query as written leaves it up to the planner whether
to do that. I'd suggest rephrasing as
WITH target_rows AS MATERIALIZED (
SELECT id
FROM queue
ORDER BY id
LIMIT 1
FOR UPDATE
SKIP LOCKED
)
DELETE FROM queue
WHERE id IN (SELECT * FROM target_rows)
RETURNING *;
Thanks for the explanation and suggested fix, Tom.
I'm not the original poster, but I do use similar constructions for simple
postgres queues. I've been trying for a while, but I don't understand where the
extra rows come from, or what's "silent" about SKIP LOCKED.
Because we get different results depending on the plan postgres picks, I can
see two options: either the query is broken, or postgres is broken. Assuming it's
the former, would there be a way to make it clearer that the "obvious" (to me)
way to use SKIP LOCKED is wrong?
Thanks!
Harmen
Harmen <harmen@lijzij.de> writes:
On Mon, Nov 28, 2022 at 12:11:53PM -0500, Tom Lane wrote:
So basically it's unsafe to run the sub-select more than once,
but the query as written leaves it up to the planner whether
to do that. I'd suggest rephrasing as [...]
I'm not the original poster, but I do use similar constructions for simple
postgres queues. I've been trying for a while, but I don't understand where the
extra rows come from, or what's "silent" about SKIP LOCKED.
Sorry, I should not have blamed SKIP LOCKED in particular; this
construction will misbehave with or without that. The issue is with
using SELECT FOR UPDATE inside a DELETE or UPDATE that then modifies
the row that the subquery returned. The next execution of the subquery
will, or should, return a different row: either some not-deleted row,
or the modified row. So in this context, the result of the subquery
is volatile. The point of putting it in a MATERIALIZED CTE is to
lock the result down regardless of that.
Because we get different results depending on the plan postgres picks, I can
see two options: either the query is broken, or postgres is broken.
You can argue that the planner should treat volatile subqueries
differently than it does today. But the only reasonable way of
tightening the semantics would be to force re-execution of such a
subquery every time, even when it's not visibly dependent on the
outer query. That would be pretty bad for performance, and I doubt
it would make the OP happy in this example, because what it would
mean is that his query "fails" every time not just sometimes.
(Because of that, I don't have too much trouble concluding that
the query is broken, whether or not you feel that postgres is
also broken.)
The bigger picture here is that we long ago decided that the planner
should not inquire too closely into the volatility of subqueries,
primarily because there are use-cases where people intentionally rely
on them not to be re-executed. As an example, these queries give
different results:
regression=# select random() from generate_series(1,3);
random
---------------------
0.7637195395988317
0.09569374432524946
0.490132093120365
(3 rows)
regression=# select (select random()) from generate_series(1,3);
random
--------------------
0.9730230633436501
0.9730230633436501
0.9730230633436501
(3 rows)
In the second case, the sub-select is deemed to be independent
of the outer query and executed only once. You can argue that
if that's what you want you should be forced to put the sub-select
in a materialized CTE to make that plain. But we felt that that
would make many more people unhappy than happy, so we haven't
done it. Maybe the question could be revisited once all PG
versions lacking the MATERIALIZED syntax are long dead.
regards, tom lane
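(Editor's note: a corollary of the explanation above, shown as a sketch. If the sub-select visibly depends on the outer query, the planner treats it as correlated and re-executes it per row, so each row gets its own random() value. The dummy dependence on g below is a deliberate trick, not something from the thread.)

```sql
-- Independent sub-select: may be evaluated once, so all rows share one value.
SELECT (SELECT random()) FROM generate_series(1, 3) AS g;

-- Referencing the outer row forces re-execution per row, so each row gets a
-- different value. The "g * 0" term exists only to create the dependence.
SELECT (SELECT random() + g * 0) FROM generate_series(1, 3) AS g;
```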
On Mon, Nov 28, 2022 at 9:18 AM Ron <ronljohnsonjr@gmail.com> wrote:
On 11/28/22 07:29, Arlo Louis O'Keeffe wrote:
Hello everyone,
I am seeing weird behaviour of a delete statement that is returning more
results than I am expecting.
This is the query:
DELETE FROM queue
WHERE
id IN (
SELECT id
FROM queue
ORDER BY id
LIMIT 1
FOR UPDATE
SKIP LOCKED
)
RETURNING *;
My understanding is that the limit in the sub-select should prevent this
query from ever
returning more than one result. Sadly I am seeing cases where there is
more than one result.
This repository has a Java setup that pretty reliably reproduces my
issue:
https://github.com/ArloL/postgres-query-error-demo
I checked the docs for select and delete and couldn’t find any hint for
cases
where the behaviour of limit might be surprising.
Am I missing something?
If I reduce your delete statement to:
DELETE FROM queue WHERE ID IN (123);
And there are 2 rows with ID 123... Should it not delete both rows?
And if I wanted queue-like behavior in that situation, I would use a
cursor FOR UPDATE.
Then inside that cursor, use DELETE WHERE CURRENT OF?
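(Editor's note: that approach might look like the following sketch. It is untested and the cursor name is made up; DECLARE ... FOR UPDATE together with DELETE ... WHERE CURRENT OF is documented PostgreSQL syntax, but verify it against your own schema.)

```sql
BEGIN;

-- Lock rows as the cursor fetches them; skip ones other workers hold:
DECLARE next_job CURSOR FOR
    SELECT * FROM queue
    ORDER BY id
    FOR UPDATE SKIP LOCKED;

FETCH 1 FROM next_job;                        -- locks and returns one row
DELETE FROM queue WHERE CURRENT OF next_job;  -- deletes exactly that row

COMMIT;  -- closes the cursor and releases the lock
```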
On 29. Nov 2022, at 18:35, Tom Lane <tgl@sss.pgh.pa.us> wrote:
[...]
Thanks for the thorough explanation. That seems very reasonable.
The CTE query works well for my use case.
Thanks!