BUG #14715: Constraint exclusion isn't used in function using language sql

Started by Clive Evansalmost 9 years ago4 messagesbugs
Jump to latest
#1Clive Evans
cliveevans@ntlworld.com

The following bug has been logged on the website:

Bug reference: 14715
Logged by: Clive Evans
Email address: cliveevans@ntlworld.com
PostgreSQL version: 9.6.3
Operating system: CentOS Linux release 7.3.1611
Description:

The same query written using PL/PGSQL will only scan the expected partition
tables.

For example:
partition_test=# DO

$$
BEGIN
CREATE TABLE customer_reviews(
customer_id TEXT,
review_date DATE,
review_rating INTEGER,
review_votes INTEGER,
review_helpful_votes INTEGER,
product_id CHAR(10),
product_title TEXT,

product_sales_rank BIGINT,
product_group TEXT,
product_category TEXT,
product_subcategory TEXT,
similar_product_ids CHAR(10)[]
);
FOR n in 2000..2004
LOOP
EXECUTE 'CREATE TABLE customer_reviews_' || n || ' (check (review_date >=
''' || n || '-01-01'' AND review_date < ''' || n + 1 || '-01-01'')) INHERITS
(customer_reviews) ;';
END LOOP;
END
$$ ;
DO
Time: 15.647 ms
partition_test=# CREATE OR REPLACE FUNCTION title_vs_review_sql(start_date
DATE, end_date DATE)
RETURNS TABLE (
title_length_bucket INTEGER,
review_average NUMERIC,
count BIGINT
) AS
$BODY$
SELECT
width_bucket(length(product_title), 1, 50, 5) title_length_bucket,
round(avg(review_rating), 2) AS review_average,
count(*)
FROM
customer_reviews
WHERE
product_group = 'Book'
AND
review_date BETWEEN start_date AND end_date
GROUP BY
title_length_bucket
ORDER BY

title_length_bucket

$BODY$
LANGUAGE SQL;
CREATE FUNCTION
Time: 0.918 ms
partition_test=# CREATE OR REPLACE FUNCTION
title_vs_review_plpgsql(start_date DATE, end_date DATE)
RETURNS TABLE (
title_length_bucket INTEGER,
review_average NUMERIC,
count BIGINT
) AS
$BODY$
BEGIN
RETURN QUERY
SELECT
width_bucket(length(product_title), 1, 50, 5)
title_length_bucket,
round(avg(review_rating), 2) AS review_average,
count(*)
FROM
customer_reviews
WHERE
product_group = 'Book'
AND
review_date BETWEEN start_date AND end_date
GROUP BY
title_length_bucket
ORDER BY
title_length_bucket;
END;
$BODY$
LANGUAGE PLPGSQL;
CREATE FUNCTION
Time: 1.375 ms
partition_test=# set auto_explain.log_min_duration to 0;
SET
Time: 0.190 ms
partition_test=# set auto_explain.log_nested_statements to true ;
SET
Time: 0.197 ms
partition_test=# select title_vs_review_plpgsql('2001-01-01',
'2001-12-31');
title_vs_review_plpgsql
-------------------------
(0 rows)

Time: 0.899 ms
partition_test=# select title_vs_review_sql('2001-01-01', '2001-12-31');
title_vs_review_sql
---------------------
(0 rows)

Time: 0.949 ms

When we check the logs for the query plans, I expect them both to only scan
the one child table, however:
< 2017-06-21 13:40:21.086 UTC > LOG: duration: 0.013 ms plan:
Query Text: SELECT
width_bucket(length(product_title), 1, 50, 5) title_length_bucket,
round(avg(review_rating), 2) AS review_average,
count(*)
FROM
customer_reviews
WHERE
product_group = 'Book'
AND
review_date BETWEEN start_date AND end_date
GROUP BY
title_length_bucket
ORDER BY
title_length_bucket
GroupAggregate (cost=14.95..15.01 rows=2 width=44) (actual
time=0.011..0.011 rows=0 loops=1)
Group Key: (width_bucket((length(customer_reviews.product_title))::double
precision, '1'::double precision, '50'::double precision, 5))
-> Sort (cost=14.95..14.95 rows=2 width=8) (actual time=0.010..0.010
rows=0 loops=1)
Sort Key:
(width_bucket((length(customer_reviews.product_title))::double precision,
'1'::double precision, '50'::double precision, 5))
Sort Method: quicksort Memory: 25kB
-> Result (cost=0.00..14.94 rows=2 width=8) (actual
time=0.005..0.005 rows=0 loops=1)
-> Append (cost=0.00..14.90 rows=2 width=36) (actual
time=0.004..0.004 rows=0 loops=1)
-> Seq Scan on customer_reviews (cost=0.00..0.00
rows=1 width=36) (actual time=0.003..0.003 rows=0 loops=1)
Filter: ((review_date >= '2001-01-01'::date) AND
(review_date <= '2001-12-31'::date) AND (product_group = 'Book'::text))
-> Seq Scan on customer_reviews_2001
(cost=0.00..14.90 rows=1 width=36) (actual time=0.000..0.000 rows=0
loops=1)
Filter: ((review_date >= '2001-01-01'::date) AND
(review_date <= '2001-12-31'::date) AND (product_group = 'Book'::text))
< 2017-06-21 13:40:21.086 UTC > CONTEXT: PL/pgSQL function
title_vs_review_plpgsql(date,date) line 3 at RETURN QUERY
< 2017-06-21 13:40:21.086 UTC > LOG: duration: 0.640 ms plan:
Query Text: select title_vs_review_plpgsql('2001-01-01', '2001-12-31');
Result (cost=0.00..5.25 rows=1000 width=32) (actual time=0.636..0.636
rows=0 loops=1)
< 2017-06-21 13:40:26.869 UTC > LOG: duration: 0.030 ms plan:
Query Text:
SELECT
width_bucket(length(product_title), 1, 50, 5) title_length_bucket,
round(avg(review_rating), 2) AS review_average,
count(*)
FROM
customer_reviews
WHERE
product_group = 'Book'
AND
review_date BETWEEN start_date AND end_date
GROUP BY
title_length_bucket
ORDER BY
title_length_bucket

GroupAggregate (cost=74.68..74.88 rows=6 width=44) (actual
time=0.028..0.028 rows=0 loops=1)
Group Key: (width_bucket((length(customer_reviews.product_title))::double
precision, '1'::double precision, '50'::double precision, 5))
-> Sort (cost=74.68..74.70 rows=6 width=8) (actual time=0.026..0.026
rows=0 loops=1)
Sort Key:
(width_bucket((length(customer_reviews.product_title))::double precision,
'1'::double precision, '50'::double precision, 5))
Sort Method: quicksort Memory: 25kB
-> Result (cost=0.00..74.61 rows=6 width=8) (actual
time=0.007..0.007 rows=0 loops=1)
-> Append (cost=0.00..74.50 rows=6 width=36) (actual
time=0.007..0.007 rows=0 loops=1)
-> Seq Scan on customer_reviews (cost=0.00..0.00
rows=1 width=36) (actual time=0.003..0.003 rows=0 loops=1)
Filter: ((review_date >= $1) AND (review_date <=
$2) AND (product_group = 'Book'::text))
-> Seq Scan on customer_reviews_2000
(cost=0.00..14.90 rows=1 width=36) (actual time=0.001..0.001 rows=0
loops=1)
Filter: ((review_date >= $1) AND (review_date <=
$2) AND (product_group = 'Book'::text))
-> Seq Scan on customer_reviews_2001
(cost=0.00..14.90 rows=1 width=36) (actual time=0.001..0.001 rows=0
loops=1)
Filter: ((review_date >= $1) AND (review_date <=
$2) AND (product_group = 'Book'::text))
-> Seq Scan on customer_reviews_2002
(cost=0.00..14.90 rows=1 width=36) (actual time=0.000..0.000 rows=0
loops=1)
Filter: ((review_date >= $1) AND (review_date <=
$2) AND (product_group = 'Book'::text))
-> Seq Scan on customer_reviews_2003
(cost=0.00..14.90 rows=1 width=36) (actual time=0.001..0.001 rows=0
loops=1)
Filter: ((review_date >= $1) AND (review_date <=
$2) AND (product_group = 'Book'::text))
-> Seq Scan on customer_reviews_2004
(cost=0.00..14.90 rows=1 width=36) (actual time=0.001..0.001 rows=0
loops=1)
Filter: ((review_date >= $1) AND (review_date <=
$2) AND (product_group = 'Book'::text))
< 2017-06-21 13:40:26.869 UTC > CONTEXT: SQL function "title_vs_review_sql"
statement 1
< 2017-06-21 13:40:26.869 UTC > LOG: duration: 0.684 ms plan:
Query Text: select title_vs_review_sql('2001-01-01', '2001-12-31');
Result (cost=0.00..5.25 rows=1000 width=32) (actual time=0.680..0.680
rows=0 loops=1)

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#2Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Clive Evans (#1)
Re: BUG #14715: Constraint exclusion isn't used in function using language sql

Hello,

At Wed, 21 Jun 2017 14:00:56 +0000, cliveevans@ntlworld.com wrote in <20170621140056.27883.82221@wrigleys.postgresql.org>

PostgreSQL version: 9.6.3

...

The same query written using PL/PGSQL will only scan the expected partition
tables.

The two are different in that the parameters of the PL/PgSQL
function are regarded as constants at the time of planning of the
inner SQL statement, while those of the SQL function are
not. Constraint exclusion is considered while planning so the SQL
function doesn't get benefit of it.

This doesn't seem a bug, but I couldn't find such a description
in the documentation, though. Maybe it should be here.

https://www.postgresql.org/docs/9.6/static/ddl-partitioning.html#DDL-PARTITIONING-CAVEATS

regareds,

--
Kyotaro Horiguchi
NTT Open Source Software Center

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#3Clive Evans
cliveevans@ntlworld.com
In reply to: Kyotaro Horiguchi (#2)
Re: BUG #14715: Constraint exclusion isn't used in function using language sql

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<br>
<br>
<div class="moz-cite-prefix">On 23/06/17 01:39, Kyotaro HORIGUCHI
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:20170623.103957.31224862.horiguchi.kyotaro@lab.ntt.co.jp">
<pre wrap="">Hello,

At Wed, 21 Jun 2017 14:00:56 +0000, <a class="moz-txt-link-abbreviated" href="mailto:cliveevans@ntlworld.com">cliveevans@ntlworld.com</a> wrote in <a class="moz-txt-link-rfc2396E" href="mailto:20170621140056.27883.82221@wrigleys.postgresql.org">&lt;20170621140056.27883.82221@wrigleys.postgresql.org&gt;</a>
</pre>
<blockquote type="cite">
<pre wrap="">PostgreSQL version: 9.6.3
</pre>
</blockquote>
<pre wrap="">...
</pre>
<blockquote type="cite">
<pre wrap="">The same query written using PL/PGSQL will only scan the expected partition
tables.
</pre>
</blockquote>
<pre wrap="">
The two are different in that the parameters of the PL/PgSQL
function are regarded as constants at the time of planning of the
inner SQL statement, while those of the SQL function are
not. Constraint exclusion is considered while planning so the SQL
function doesn't get benefit of it.</pre>
</blockquote>
<br>
I understand this, and I understand why. You can't plan based on
what you don't yet know. I'm not entirely clear why the inner query
can treat the parameter as a constant, whereas the outer one is
forced to treat it as dynamic.<br>
<br>
I assume it's something to do with when the plan is created.
Possibly it's an attempt to save the planning time and re-use the
plan in the case of a 'normal' SQL function, although this seems
like a trade off that may well have significant downsides - in this
case obviously, but also others.<br>
<br>
Perhaps I'm looking for a new feature, rather than reporting a bug.
Something that allows lazy planning, where there's a good chance
that treating the parameters as constants is likely to improve the
produced plan sufficiently to offset the additional overhead of
repeatedly planning.<br>
<br>
thanks,<br>
-- <br>
Clive Evans<br>
</body>
</html>

#4Clive Evans
cliveevans@ntlworld.com
In reply to: Clive Evans (#3)
Re: BUG #14715: Constraint exclusion isn't used in function using language sql

On 23/06/17 08:59, Clive Evans wrote:

On 23/06/17 01:39, Kyotaro HORIGUCHI wrote:

Hello,

At Wed, 21 Jun 2017 14:00:56 +0000,cliveevans@ntlworld.com wrote in<20170621140056.27883.82221@wrigleys.postgresql.org>

PostgreSQL version: 9.6.3

...

The same query written using PL/PGSQL will only scan the expected partition
tables.

The two are different in that the parameters of the PL/PgSQL
function are regarded as constants at the time of planning of the
inner SQL statement, while those of the SQL function are
not. Constraint exclusion is considered while planning so the SQL
function doesn't get benefit of it.

I understand this, and I understand why. You can't plan based on what
you don't yet know. I'm not entirely clear why the inner query can
treat the parameter as a constant, whereas the outer one is forced to
treat it as dynamic.

I assume it's something to do with when the plan is created. Possibly
it's an attempt to save the planning time and re-use the plan in the
case of a 'normal' SQL function, although this seems like a trade off
that may well have significant downsides - in this case obviously, but
also others.

Perhaps I'm looking for a new feature, rather than reporting a bug.
Something that allows lazy planning, where there's a good chance that
treating the parameters as constants is likely to improve the produced
plan sufficiently to offset the additional overhead of repeatedly
planning.

Sorry, mail client went a bit rogue. Here it is again, (hopefully)
legibly ...

thanks,
Clive

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs