Function Volatility and Views Unexpected Behavior

Started by David Kohnover 8 years ago4 messages

djk447@gmail.com

over 8 years ago

(Not sure whether I should be sending this to hackers or somewhere else,
let me know if I should move, feel like I should say long time listener
first time caller or something)

I encountered some unexpected behavior when debugging a query that was
taking longer than expected, basically, a volatile function that makes a
column in a view is called even when that column is not selected in the
query, making it so that the function is called for every row in the view,
I'm not sure that that would necessarily be the expected behavior, as it
was my understanding that columns that are not selected are not evaluated,
for instance if there was a join in a view that produced some columns and
said columns were not selected, I would expect it to be optimized away. I
understand that for volatile functions, that might not be possible, but it
might still be unexpected. See the below example:
```
--setup
create table table1 (id int primary key, data1 float);
create table table2 (id int primary key, table1_id int references table1
(id), data2 float);
insert into table1 select generate_series(1, 10000), random();
insert into table2 select generate_series(1,100000),
floor(random()*(10000-2))+1, random();

create view table1_view as select * from table1;
create or replace function something_silly(int) returns float as $$select
0.002342::float from (select pg_sleep(0.001)) b $$ language SQL;
create view table1_silly_view as select id, data1, something_silly(id) from
table1;
```

`explain analyze select t1.data1 from table2 t2 left join table1_silly_view
t1 on t2.table1_id = t1.id where t2.data2 >0.5 and t2.data2<0.52; `
Hash Left Join (cost=2880.00..4949.79 rows=2094 width=8) (actual
time=22813.018..22841.189 rows=1995 loops=1)
Hash Cond: (t2.table1_id = t1.id)
-> Seq Scan on table2 t2 (cost=0.00..2041.00 rows=2094 width=4) (actual
time=0.018..26.836 rows=1995 loops=1)
Filter: ((data2 > '0.5'::double precision) AND (data2 <
'0.52'::double precision))
Rows Removed by Filter: 98005
-> Hash (cost=2755.00..2755.00 rows=10000 width=12) (actual
time=22812.977..22812.977 rows=10000 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 558kB
-> Subquery Scan on t1 (cost=0.00..2755.00 rows=10000 width=12)
(actual time=2.447..22768.199 rows=10000 loops=1)
-> Seq Scan on table1 (cost=0.00..2655.00 rows=10000
width=20) (actual time=2.446..22751.809 rows=10000 loops=1)
Planning time: 0.476 ms
Execution time: 22841.362 ms

```
create or replace function something_less_silly(int) returns float as
$$select 0.002342::float from (select pg_sleep(0.001)) b $$ language SQL
stable;
create view table1_slightly_less_silly_view as select id, data1,
something_less_silly(id) from table1;
```
`explain analyze select data1 from table2 t2 left join
table1_slightly_less_silly_view t1 on t2.table1_id = t1.id where t2.data2

0.5 and t2.data2<0.52;`

Hash Left Join (cost=280.00..2349.79 rows=2094 width=8) (actual
time=8.634..48.834 rows=1995 loops=1)
Hash Cond: (t2.table1_id = table1.id)
-> Seq Scan on table2 t2 (cost=0.00..2041.00 rows=2094 width=4) (actual
time=0.027..38.253 rows=1995 loops=1)
Filter: ((data2 > '0.5'::double precision) AND (data2 <
'0.52'::double precision))
Rows Removed by Filter: 98005
-> Hash (cost=155.00..155.00 rows=10000 width=12) (actual
time=8.579..8.579 rows=10000 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 558kB
-> Seq Scan on table1 (cost=0.00..155.00 rows=10000 width=12)
(actual time=0.012..3.984 rows=10000 loops=1)
Planning time: 0.468 ms
Execution time: 49.068 ms

The other problem is that the function call does not appear in the query
plan. You see lots of extra time in the sequential scan part, but no idea
of why that might be occurring, so it can be hard to determine that the
culprit is the function call in the view in a column that you're not
selecting. It is also unclear to me whether the function would be evaluated
for every row in the view even when a where clause restricts what you are
retrieving from the view. Anyway, I'm not sure whether this is a bug or a
feature, but I think it might be good to document it somewhere, I had a
real case where someone had made a function to do some calculations on a
json blob, and hadn't marked it stable/immutable, which it should have
been, they weren't selecting that column from the view, but the query was
taking 700ms when a query to the underlying table took 1.5ms. We marked the
function stable and the view started working fine, it was just really hard
to figure out that that was what was going on.

Would it be possible to have the explain output show that function call? It
appears that the planner knows about it because it estimates a
significantly higher cost for the seq scan on table1, but just doesn't
mention that that's why.

That might be a longer term fix, in the meantime, given that the default
volatility category is volatile, I can imagine a newer user being truly
perplexed as to what was going on. Does anyone have thoughts as to where
documentation on that could be added to provide users some guidance? CREATE
FUNCTION? CREATE VIEW? I'm happy to write up a note on that behavior if
people think that would be useful.

Best,
David Kohn

Tom Lane

tgl@sss.pgh.pa.us

over 8 years ago

In reply to: David Kohn (#1)

Re: Function Volatility and Views Unexpected Behavior

David Kohn <djk447@gmail.com> writes:

I encountered some unexpected behavior when debugging a query that was
taking longer than expected, basically, a volatile function that makes a
column in a view is called even when that column is not selected in the
query, making it so that the function is called for every row in the view,
I'm not sure that that would necessarily be the expected behavior, as it
was my understanding that columns that are not selected are not evaluated,
for instance if there was a join in a view that produced some columns and
said columns were not selected, I would expect it to be optimized away.

No, this is the expected behavior; we don't like optimization to change
the number of calls of a volatile function from what would occur in naive
evaluation of the query. If that prospect doesn't bother you, it's
likely because your function isn't really volatile ...

The other problem is that the function call does not appear in the query
plan.

I think "explain verbose" will fix that for you.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

David Kohn

djk447@gmail.com

over 8 years ago

In reply to: Tom Lane (#2)

Re: Function Volatility and Views Unexpected Behavior

Thanks for the reminder about explain verbose, that's helpful.

But optimization does massively change the number of calls of a volatile
function in a naive evaluation of a query:

`explain analyze verbose select data1 from table1_silly_view where id >=10
and id <= 100;`

does an index scan and only runs the volatile function for rows in the view
where id >= 10 and id <=100

Subquery Scan on table1_silly_view (cost=0.29..33.77 rows=91 width=8)
(actual time=2.552..206.563 rows=91 loops=1)

Output: table1_silly_view.data1

-> Index Scan using table1_pkey on public.table1 (cost=0.29..32.86
rows=91 width=20) (actual time=2.550..206.425 rows=91 loops=1)

Output: NULL::integer, table1.data1, something_silly(table1.id)

Index Cond: ((table1.id >= 10) AND (table1.id <= 100))

Planning time: 0.526 ms

Execution time: 206.724 ms

whereas

`explain analyze verbose select data1 from table1_silly_view where id in (
select id from table1 where id >= 10 and id <=100);`

does a full sequential scan, over the view, producing whatever side effects
the volatile function does for every row in the view even though they
produce the same output and have what should be equivalent quals.

Hash Semi Join (cost=11.24..2793.50 rows=91 width=8) (actual
time=23.603..22759.297 rows=91 loops=1)

Output: table1_1.data1

Hash Cond: (table1_1.id = table1.id)

-> Seq Scan on public.table1 table1_1 (cost=0.00..2655.00 rows=10000
width=20) (actual time=2.468..22720.942 rows=10000 loops=1)

Output: table1_1.id, table1_1.data1, something_silly(table1_1.id)

-> Hash (cost=10.11..10.11 rows=91 width=4) (actual time=0.484..0.484
rows=91 loops=1)

Output: table1.id

Buckets: 1024 Batches: 1 Memory Usage: 12kB

-> Index Only Scan using table1_pkey on public.table1
(cost=0.29..10.11 rows=91 width=4) (actual time=0.383..0.430 rows=91
loops=1)

Output: table1.id

Index Cond: ((table1.id >= 10) AND (table1.id <= 100))

Heap Fetches: 91

Planning time: 0.877 ms

Execution time: 22759.448 ms

I recognize that it is an anti-pattern to put a volatile function call in a
view, and don't know that there's a better way of dealing with it, as not
using indexes in a view that has a volatile function call in it at all
seems like a very bad choice, but still think it might be something to
document better.

-David

On Wed, Jul 12, 2017 at 3:23 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Show quoted text

David Kohn <djk447@gmail.com> writes:

I encountered some unexpected behavior when debugging a query that was
taking longer than expected, basically, a volatile function that makes a
column in a view is called even when that column is not selected in the
query, making it so that the function is called for every row in the

view,

I'm not sure that that would necessarily be the expected behavior, as it
was my understanding that columns that are not selected are not

evaluated,

for instance if there was a join in a view that produced some columns and
said columns were not selected, I would expect it to be optimized away.

No, this is the expected behavior; we don't like optimization to change
the number of calls of a volatile function from what would occur in naive
evaluation of the query. If that prospect doesn't bother you, it's
likely because your function isn't really volatile ...

The other problem is that the function call does not appear in the query
plan.

I think "explain verbose" will fix that for you.

regards, tom lane

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Tom Lane (#2)

Re: Function Volatility and Views Unexpected Behavior

On Wed, Jul 12, 2017 at 3:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

David Kohn <djk447@gmail.com> writes:

I encountered some unexpected behavior when debugging a query that was
taking longer than expected, basically, a volatile function that makes a
column in a view is called even when that column is not selected in the
query, making it so that the function is called for every row in the view,
I'm not sure that that would necessarily be the expected behavior, as it
was my understanding that columns that are not selected are not evaluated,
for instance if there was a join in a view that produced some columns and
said columns were not selected, I would expect it to be optimized away.

No, this is the expected behavior; we don't like optimization to change
the number of calls of a volatile function from what would occur in naive
evaluation of the query. If that prospect doesn't bother you, it's
likely because your function isn't really volatile ...

I don't think I agree with that. If something is VOLATILE, that means
you want it to be recalculated each time, but it doesn't necessarily
mean that you want it calculated if it in no way changes the result
set.

I guess maybe there's a difference between a VOLATILE function like
random(), which is expected to produce a different answer each time
but probably has no side effects that you care about (unless you care
about the fact that the state of the PRNG has changed) and pg_sleep(),
whose return value is always the same but whose side effects are of
critical importance. Maybe we need separate terms for
volatile-because-the-answer-is-unstable and
volatile-because-it-has-side-effects.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers