Regarding query minimizer (simplifier)
Hello, I am Jinho Jung, Phd Student from GeorgiaTech, and try to find any SQL queries that cause performance regression. While conducting evaluation, I found an interesting query which makes x80 times slower execution in version 10.5 than version 9.4. Please see the attached files, if you are interested.
To analyze the root cause of problem, I had to find minimal query that contains same problem. (i.e., same regression issue) The initial query was about 14K and I should manually remove part of query. (it took about 30 minutes to get 500 bytes query) So I am wondering if there are tools to automatically simplify statements to analyze the performance issue.
Thanks,
Jinho Jung
"Jung, Jinho" <jinho.jung@gatech.edu> writes:
Hello, I am Jinho Jung, Phd Student from GeorgiaTech, and try to find any SQL queries that cause performance regression. While conducting evaluation, I found an interesting query which makes x80 times slower execution in version 10.5 than version 9.4. Please see the attached files, if you are interested.
Hm, testing this in the regression database, it seems pretty speedy
across all supported branches, and indeed slower in 9.4 than later
branches (~25 ms vs ~10 ms).
It seems likely that you're testing in a very different database,
perhaps one with many more tables ... but if you don't explain the
test scenario, we aren't going to have much luck investigating.
regards, tom lane
Hello Tom,
Sorry for the misleading. Could you try these two queries? I made the query even slower in latest version of postgres. These are information about how we set up evaluation environment and query result.
Thanks,
Jinho Jung
Install Multiple version of DBs in one machine
======================================
# Install 10.5
$ wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
$ sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ xenial-pgdg main" > /etc/apt/sources.list.d/pgdg_xenial.list'
$ sudo apt update
$ sudo apt-get install postgresql-10
# Install 9.6
$ sudo apt-get install postgresql-9.6
# Install 9.5
$ sudo apt-get install postgresql-9.5
# Install 9.4
$ sudo apt-get install postgresql-9.4 postgresql-contrib-9.4 libpq-dev postgresql-server-dev-9.4
# check
$ pg_lsclusters
Original regression query
==========================
explain analyze
select
1
from
information_schema.role_usage_grants as ref_2,
lateral (
select
max((null)) over (partition by ref_3.amopfamily) as c8
from
pg_catalog.pg_amop as ref_3
) as subq_0
;
ORIGINAL querying time
on old version(9.4/9.5): 5.7ms
on latest version(10): 91.76ms
CORRELATED query to maximize error
===================================
explain analyze
select *
from information_schema.role_usage_grants f1
where grantor =
( select max(ref_2.grantor)
from
information_schema.role_usage_grants as ref_2,
lateral (
select
max((null)) over (partition by ref_3.amopfamily) as c8
from
pg_catalog.pg_amop as ref_3
) as subq_0
where ref_2.object_catalog = f1.object_catalog
)
;
CORRELATED querying time
on old version(9.4/9.5): 0.6s
on latest version(10): 113s
188 times slower
________________________________
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Saturday, October 13, 2018 5:59:06 PM
To: Jung, Jinho
Cc: pgsql-hackers@lists.postgresql.org
Subject: Re: Regarding query minimizer (simplifier)
"Jung, Jinho" <jinho.jung@gatech.edu> writes:
Hello, I am Jinho Jung, Phd Student from GeorgiaTech, and try to find any SQL queries that cause performance regression. While conducting evaluation, I found an interesting query which makes x80 times slower execution in version 10.5 than version 9.4. Please see the attached files, if you are interested.
Hm, testing this in the regression database, it seems pretty speedy
across all supported branches, and indeed slower in 9.4 than later
branches (~25 ms vs ~10 ms).
It seems likely that you're testing in a very different database,
perhaps one with many more tables ... but if you don't explain the
test scenario, we aren't going to have much luck investigating.
regards, tom lane
Hello,
We appreciate you taking time for test! When we do more evaluation, we
noticed that the previously attached query made regression only on DBs that
we installed from APT manager (i.e., apt-get command) not on DBs that we
built from the source code. But we also confirmed that there are many cases
that cause regression to all DBs (installed from APT and build from source
code)
Hope you can also test these queries too. These are the execution time on
our machine.
*1.sql*
10.5 : 20ms
9.4.19: 1,227ms
*4.sql*
10.5 : 13ms
9.4.19: 88,721ms
*20.sql*
10.5 : 271ms
9.4.19: 6,104ms
*22.sql*
10.5 : 8ms
9.4.19: 105ms
Jinho Jung
On Tue, Oct 23, 2018 at 9:52 AM Jung, Jinho <jinho.jung@gatech.edu> wrote:
Show quoted text
Hello Tom,
Sorry for the misleading. Could you try these two queries? I made the
query even slower in latest version of postgres. These are information
about how we set up evaluation environment and query result.Thanks,
Jinho Jung
Install Multiple version of DBs in one machine
======================================
# Install 10.5
$ wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc
| sudo apt-key add -
$ sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/
xenial-pgdg main" > /etc/apt/sources.list.d/pgdg_xenial.list'
$ sudo apt update
$ sudo apt-get install postgresql-10# Install 9.6
$ sudo apt-get install postgresql-9.6# Install 9.5
$ sudo apt-get install postgresql-9.5# Install 9.4
$ sudo apt-get install postgresql-9.4 postgresql-contrib-9.4 libpq-dev
postgresql-server-dev-9.4# check
$ pg_lsclustersOriginal regression query
==========================
explain analyze
select
1
from
information_schema.role_usage_grants as ref_2,
lateral (
select
max((null)) over (partition by ref_3.amopfamily) as c8
from
pg_catalog.pg_amop as ref_3
) as subq_0
;ORIGINAL querying time
on old version(9.4/9.5): 5.7ms
on latest version(10): 91.76msCORRELATED query to maximize error
===================================
explain analyze
select *
from information_schema.role_usage_grants f1
where grantor =
( select max(ref_2.grantor)
from
information_schema.role_usage_grants as ref_2,
lateral (
select
max((null)) over (partition by ref_3.amopfamily) as c8
from
pg_catalog.pg_amop as ref_3
) as subq_0
where ref_2.object_catalog = f1.object_catalog
)
;CORRELATED querying time
on old version(9.4/9.5): 0.6s
on latest version(10): 113s
188 times slower------------------------------
*From:* Tom Lane <tgl@sss.pgh.pa.us>
*Sent:* Saturday, October 13, 2018 5:59:06 PM
*To:* Jung, Jinho
*Cc:* pgsql-hackers@lists.postgresql.org
*Subject:* Re: Regarding query minimizer (simplifier)"Jung, Jinho" <jinho.jung@gatech.edu> writes:
Hello, I am Jinho Jung, Phd Student from GeorgiaTech, and try to find
any SQL queries that cause performance regression. While conducting
evaluation, I found an interesting query which makes x80 times slower
execution in version 10.5 than version 9.4. Please see the attached files,
if you are interested.Hm, testing this in the regression database, it seems pretty speedy
across all supported branches, and indeed slower in 9.4 than later
branches (~25 ms vs ~10 ms).It seems likely that you're testing in a very different database,
perhaps one with many more tables ... but if you don't explain the
test scenario, we aren't going to have much luck investigating.regards, tom lane
Attachments:
1.sqlapplication/sql; name=1.sqlDownload
EXPLAIN ANALYZE
WITH
jennifer_0 AS (select
subq_0.c3 as c0,
subq_0.c6 as c1,
subq_0.c0 as c2,
subq_0.c3 as c3,
(select pg_catalog.max(idx_scan) from pg_catalog.pg_stat_user_indexes)
as c4,
case when subq_0.c6 is NULL then pg_catalog.timeofday() else pg_catalog.timeofday() end
as c5
from
(select
ref_0.typanalyze as c0,
ref_0.typispreferred as c1,
ref_0.typtypmod as c2,
ref_0.typrelid as c3,
ref_0.typcollation as c4,
ref_0.typdelim as c5,
ref_0.typnotnull as c6
from
pg_catalog.pg_type as ref_0
where case when ((cast(null as anyrange) <> cast(null as anyrange))
and (((cast(null as path) @> cast(null as point))
or ((ref_0.typtypmod <= ref_0.typndims)
and ((ref_0.typtype is not NULL)
and (((false)
and (cast(null as tsvector) @@@ cast(null as tsquery)))
or (cast(null as record) *<> cast(null as record))))))
and (cast(null as box) ?# cast(null as box))))
and (ref_0.typname >= ref_0.typname) then cast(null as polygon) else cast(null as polygon) end
<@ pg_catalog.polygon(
cast(ref_0.typndims as int4),
cast(cast(null as circle) as circle))
limit 45) as subq_0
where (pg_catalog.tintervalin(
cast(case when (true)
and ((cast(null as circle) |>> cast(null as circle))
and (true)) then cast(null as cstring) else cast(null as cstring) end
as cstring)) #> case when subq_0.c2 is not NULL then case when true then cast(null as reltime) else cast(null as reltime) end
else case when true then cast(null as reltime) else cast(null as reltime) end
end
)
or ((select agginitval from pg_catalog.pg_aggregate limit 1 offset 5)
>= (select provider from pg_catalog.pg_shseclabel limit 1 offset 6)
))
select
subq_1.c0 as c0,
subq_1.c0 as c1,
subq_1.c0 as c2,
subq_1.c1 as c3,
subq_1.c1 as c4,
subq_1.c0 as c5
from
(select
ref_1.relname as c0,
ref_1.relname as c1
from
pg_catalog.pg_statio_sys_sequences as ref_1
inner join information_schema.role_routine_grants as ref_2
on (ref_1.blks_read >= ref_1.blks_hit)
inner join pg_catalog.pg_type as ref_3
right join information_schema.user_mapping_options as ref_4
on ((select seq_scan from pg_catalog.pg_stat_xact_sys_tables limit 1 offset 2)
<= ref_3.typlen)
on (ref_2.specific_name = ref_4.authorization_identifier )
where ref_1.blks_hit is not NULL) as subq_1
where true
limit 107
;;*Order is reversed. *
*1.sql*
9.4.19: 20ms
10.5 : 1,227ms
*4.sql*
9.4.19: 13ms
10.5 : 88,721ms
*20.sql*
9.4.19: 271ms
10.5 : 6,104ms
*22.sql*
9.4.19: 8ms
10.5 : 105ms
On Tue, Oct 23, 2018 at 3:15 PM Jinho Jung <visusee@gmail.com> wrote:
Show quoted text
Hello,
We appreciate you taking time for test! When we do more evaluation, we
noticed that the previously attached query made regression only on DBs that
we installed from APT manager (i.e., apt-get command) not on DBs that we
built from the source code. But we also confirmed that there are many cases
that cause regression to all DBs (installed from APT and build from source
code)Hope you can also test these queries too. These are the execution time on
our machine.*1.sql*
10.5 : 20ms
9.4.19: 1,227ms*4.sql*
10.5 : 13ms
9.4.19: 88,721ms*20.sql*
10.5 : 271ms
9.4.19: 6,104ms*22.sql*
10.5 : 8ms
9.4.19: 105msJinho Jung
On Tue, Oct 23, 2018 at 9:52 AM Jung, Jinho <jinho.jung@gatech.edu> wrote:
Hello Tom,
Sorry for the misleading. Could you try these two queries? I made the
query even slower in latest version of postgres. These are information
about how we set up evaluation environment and query result.Thanks,
Jinho Jung
Install Multiple version of DBs in one machine
======================================
# Install 10.5
$ wget --quiet -O -
https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -$ sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/
xenial-pgdg main" > /etc/apt/sources.list.d/pgdg_xenial.list'
$ sudo apt update
$ sudo apt-get install postgresql-10# Install 9.6
$ sudo apt-get install postgresql-9.6# Install 9.5
$ sudo apt-get install postgresql-9.5# Install 9.4
$ sudo apt-get install postgresql-9.4 postgresql-contrib-9.4
libpq-dev postgresql-server-dev-9.4# check
$ pg_lsclustersOriginal regression query
==========================
explain analyze
select
1
from
information_schema.role_usage_grants as ref_2,
lateral (
select
max((null)) over (partition by ref_3.amopfamily) as c8
from
pg_catalog.pg_amop as ref_3
) as subq_0
;ORIGINAL querying time
on old version(9.4/9.5): 5.7ms
on latest version(10): 91.76msCORRELATED query to maximize error
===================================
explain analyze
select *
from information_schema.role_usage_grants f1
where grantor =
( select max(ref_2.grantor)
from
information_schema.role_usage_grants as ref_2,
lateral (
select
max((null)) over (partition by ref_3.amopfamily) as c8
from
pg_catalog.pg_amop as ref_3
) as subq_0
where ref_2.object_catalog = f1.object_catalog
)
;CORRELATED querying time
on old version(9.4/9.5): 0.6s
on latest version(10): 113s
188 times slower------------------------------
*From:* Tom Lane <tgl@sss.pgh.pa.us>
*Sent:* Saturday, October 13, 2018 5:59:06 PM
*To:* Jung, Jinho
*Cc:* pgsql-hackers@lists.postgresql.org
*Subject:* Re: Regarding query minimizer (simplifier)"Jung, Jinho" <jinho.jung@gatech.edu> writes:
Hello, I am Jinho Jung, Phd Student from GeorgiaTech, and try to find
any SQL queries that cause performance regression. While conducting
evaluation, I found an interesting query which makes x80 times slower
execution in version 10.5 than version 9.4. Please see the attached files,
if you are interested.Hm, testing this in the regression database, it seems pretty speedy
across all supported branches, and indeed slower in 9.4 than later
branches (~25 ms vs ~10 ms).It seems likely that you're testing in a very different database,
perhaps one with many more tables ... but if you don't explain the
test scenario, we aren't going to have much luck investigating.regards, tom lane