Consequence of changes to CTE's in 12

Started by Steve Baldwinabout 5 years ago7 messagesgeneral
Jump to latest
#1Steve Baldwin
steve.baldwin@gmail.com

Hi,

I realise this is probably an edge case, but would appreciate some advice
or suggestions.

I have a table that has rows to be processed:

postgres=# create table lock_test (id uuid primary key default
gen_random_uuid(), lock_id bigint);
CREATE TABLE
postgres=# insert into lock_test (lock_id) values
(10),(10),(20),(30),(30),(30);
INSERT 0 6
postgres=#* select * from lock_test;
id | lock_id
--------------------------------------+---------
326a2d34-ecec-4c01-94bb-40f43f244d40 | 10
8ed1d680-6304-4fb2-a47c-9427c6d48622 | 10
04482ba1-7193-4e7f-a507-71fe6a351781 | 20
34003468-e959-4c7b-a48c-97195c43982e | 30
2d20394b-c79b-4867-8d0a-72044c370543 | 30
bef7b880-e7a0-4c07-8eab-182c9c1bd33a | 30
(6 rows)

My business rule says I need to process rows by lock_id in descending order
of the number of rows. In my test data, that would mean rows with a lock_id
of 30 would be processed first.

If another 'processor' wakes up while lock_id 30 is being processed, it
moves on to lock_id 10, etc.

My pre-12 solution was a view something like this:

postgres=# create or replace view lock_test_v
as
with g as (
select lock_id, count(*) as n_rows
from lock_test
group by lock_id
order by n_rows desc
), l as (
select lock_id
from g
where pg_try_advisory_xact_lock(lock_id)
limit 1)
select t.*
from lock_test as t
join l on t.lock_id = l.lock_id
;
CREATE VIEW

This works fine, and only creates one advisory lock (or zero) when querying
the view:

postgres=# begin;
BEGIN
postgres=#* select classid, objid from pg_locks where locktype = 'advisory'
and pid = pg_backend_pid();
classid | objid
---------+-------
(0 rows)

postgres=#* select * from lock_test_v;
id | lock_id
--------------------------------------+---------
34003468-e959-4c7b-a48c-97195c43982e | 30
2d20394b-c79b-4867-8d0a-72044c370543 | 30
bef7b880-e7a0-4c07-8eab-182c9c1bd33a | 30
(3 rows)

postgres=#* select classid, objid from pg_locks where locktype = 'advisory'
and pid = pg_backend_pid();
classid | objid
---------+-------
0 | 30
(1 row)

However in 12, the same view returns the same data, but generates multiple
advisory locks:

sns_publisher=# begin;
BEGIN
sns_publisher=#* select version();
version
-------------------------------------------------------------------------------------------------------
PostgreSQL 12.5 on x86_64-pc-linux-musl, compiled by gcc (Alpine
10.2.1_pre1) 10.2.1 20201203, 64-bit
(1 row)

sns_publisher=#* select classid, objid from pg_locks where locktype =
'advisory' and pid = pg_backend_pid();
classid | objid
---------+-------
(0 rows)

sns_publisher=#* select * from lock_test_v;
id | lock_id
--------------------------------------+---------
1a9f3f77-fcdc-4779-8fd9-30f274825e15 | 30
ac670997-9c23-44da-8eb8-e055f02a5f19 | 30
b5f939ac-7c7d-4975-811a-9af26aaa3a31 | 30
(3 rows)

sns_publisher=#* select classid, objid from pg_locks where locktype =
'advisory' and pid = pg_backend_pid();
classid | objid
---------+-------
0 | 20
0 | 30
0 | 10
(3 rows)

If I use 'as materialized' for my 'g' cte, I get the same outcome as with
pre-12 versions.

My 'dilemma' is that this functionality is packaged and the database it is
bundled into could be running on a pre-12 version or 12+. Is there any way
I can rewrite my view to achieve the same outcome (i.e. only creating 0 or
1 advisory locks) regardless of the server version? I realise I could have
two installation scripts but if it is installed into a pre-12 DB and that
DB is subsequently upgraded to 12+, my behaviour is broken.

Any suggestions greatly appreciated.

Steve

#2Michael Lewis
mlewis@entrata.com
In reply to: Steve Baldwin (#1)
Re: Consequence of changes to CTE's in 12

This functionality seems more a candidate for a set-returning function
rather than a view, but I like my views to be side effect free and read
only. It would be trivial to implement in plpgsql I believe.

If you move the limit 1 to the first CTE, does it not give you the same
behavior in both versions?

Show quoted text
#3David G. Johnston
david.g.johnston@gmail.com
In reply to: Steve Baldwin (#1)
Re: Consequence of changes to CTE's in 12

On Thu, Feb 11, 2021 at 5:07 PM Steve Baldwin <steve.baldwin@gmail.com>
wrote:

My 'dilemma' is that this functionality is packaged and the database it is
bundled into could be running on a pre-12 version or 12+. Is there any way
I can rewrite my view to achieve the same outcome (i.e. only creating 0 or
1 advisory locks) regardless of the server version? I realise I could have
two installation scripts but if it is installed into a pre-12 DB and that
DB is subsequently upgraded to 12+, my behaviour is broken.

Pretty sure you will need to choose a location in which to make the
installation behave version-dependently. Within PostgreSQL itself that
would be most easily done by writing a pl/pgsql function that conditionally
adds the MATERIALIZED indicator on the textual representation of the query
before executing it. You can hide that function call within a view if
desired.

David J.

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Lewis (#2)
Re: Consequence of changes to CTE's in 12

Michael Lewis <mlewis@entrata.com> writes:

If you move the limit 1 to the first CTE, does it not give you the same
behavior in both versions?

Not sure if that's exactly the same, but certainly adding a traditional
optimization fence (OFFSET 0) to the first CTE should do the trick.

regards, tom lane

#5Steve Baldwin
steve.baldwin@gmail.com
In reply to: Michael Lewis (#2)
Re: Consequence of changes to CTE's in 12

Thanks all. The fact that this is a view is not really relevant. I only
bundled as a view here to make testing simpler. The underlying query still
behaves differently pre-12 and 12+.

Is there a chance that the query optimiser should 'notice' the
pg_try_advisory_xact_lock function, and not be so clever when it sees it?

It makes me wonder what other queries we might have that are inadvertently
relying on the default materializing behaviour of pre-12.

Steve

On Fri, Feb 12, 2021 at 11:24 AM Michael Lewis <mlewis@entrata.com> wrote:

Show quoted text

This functionality seems more a candidate for a set-returning function
rather than a view, but I like my views to be side effect free and read
only. It would be trivial to implement in plpgsql I believe.

If you move the limit 1 to the first CTE, does it not give you the same
behavior in both versions?

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Steve Baldwin (#5)
Re: Consequence of changes to CTE's in 12

Steve Baldwin <steve.baldwin@gmail.com> writes:

Is there a chance that the query optimiser should 'notice' the
pg_try_advisory_xact_lock function, and not be so clever when it sees it?

The general policy with respect to volatile functions in WHERE quals is
"here be dragons". You don't have enough control over when a WHERE clause
will be evaluated to be sure about what the semantics will be; and we
don't want to tie the optimizer's hands to the extent that would be needed
to make it fully predictable.

In this particular case, you can make it fairly safe by making sure there
are optimization fences both above and below where the WHERE clause is.
You have one above from the LIMIT 1, but (with the new interpretation of
CTEs) not one below it. Adding a fence -- either OFFSET 0 or LIMIT ALL --
to the first CTE should fix it in a reasonably version-independent
fashion.

regards, tom lane

#7Steve Baldwin
steve.baldwin@gmail.com
In reply to: Tom Lane (#6)
Re: Consequence of changes to CTE's in 12

Thanks Tom. This optimization fences concept is a new one to me, so great
to know about.

This does indeed give me a nice version-independent solution, and make me a
very happy camper ;-)

Steve

On Fri, Feb 12, 2021 at 11:45 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Show quoted text

Steve Baldwin <steve.baldwin@gmail.com> writes:

Is there a chance that the query optimiser should 'notice' the
pg_try_advisory_xact_lock function, and not be so clever when it sees it?

The general policy with respect to volatile functions in WHERE quals is
"here be dragons". You don't have enough control over when a WHERE clause
will be evaluated to be sure about what the semantics will be; and we
don't want to tie the optimizer's hands to the extent that would be needed
to make it fully predictable.

In this particular case, you can make it fairly safe by making sure there
are optimization fences both above and below where the WHERE clause is.
You have one above from the LIMIT 1, but (with the new interpretation of
CTEs) not one below it. Adding a fence -- either OFFSET 0 or LIMIT ALL --
to the first CTE should fix it in a reasonably version-independent
fashion.

regards, tom lane