Regarding query minimizer (simplifier)

Started by Jung, Jinhoover 7 years ago5 messages
#1Jung, Jinho
jinho.jung@gatech.edu
2 attachment(s)

Hello, I am Jinho Jung, Phd Student from GeorgiaTech, and try to find any SQL queries that cause performance regression. While conducting evaluation, I found an interesting query which makes x80 times slower execution in version 10.5 than version 9.4. Please see the attached files, if you are interested.

To analyze the root cause of problem, I had to find minimal query that contains same problem. (i.e., same regression issue) The initial query was about 14K and I should manually remove part of query. (it took about 30 minutes to get 500 bytes query) So I am wondering if there are tools to automatically simplify statements to analyze the performance issue.

Thanks,
Jinho Jung

Attachments:

query6_originalapplication/octet-stream; name=query6_originalDownload
query6_manually_minimizeapplication/octet-stream; name=query6_manually_minimizeDownload
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jung, Jinho (#1)
Re: Regarding query minimizer (simplifier)

"Jung, Jinho" <jinho.jung@gatech.edu> writes:

Hello, I am Jinho Jung, Phd Student from GeorgiaTech, and try to find any SQL queries that cause performance regression. While conducting evaluation, I found an interesting query which makes x80 times slower execution in version 10.5 than version 9.4. Please see the attached files, if you are interested.

Hm, testing this in the regression database, it seems pretty speedy
across all supported branches, and indeed slower in 9.4 than later
branches (~25 ms vs ~10 ms).

It seems likely that you're testing in a very different database,
perhaps one with many more tables ... but if you don't explain the
test scenario, we aren't going to have much luck investigating.

regards, tom lane

#3Jung, Jinho
jinho.jung@gatech.edu
In reply to: Tom Lane (#2)
2 attachment(s)
Re: Regarding query minimizer (simplifier)

Hello Tom,

Sorry for the misleading. Could you try these two queries? I made the query even slower in latest version of postgres. These are information about how we set up evaluation environment and query result.

Thanks,

Jinho Jung

Install Multiple version of DBs in one machine
======================================
# Install 10.5
$ wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
$ sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ xenial-pgdg main" > /etc/apt/sources.list.d/pgdg_xenial.list'
$ sudo apt update
$ sudo apt-get install postgresql-10

# Install 9.6
$ sudo apt-get install postgresql-9.6

# Install 9.5
$ sudo apt-get install postgresql-9.5

# Install 9.4
$ sudo apt-get install postgresql-9.4 postgresql-contrib-9.4 libpq-dev postgresql-server-dev-9.4

# check
$ pg_lsclusters

Original regression query
==========================
explain analyze
select
1
from
information_schema.role_usage_grants as ref_2,
lateral (
select
max((null)) over (partition by ref_3.amopfamily) as c8
from
pg_catalog.pg_amop as ref_3
) as subq_0
;

ORIGINAL querying time
on old version(9.4/9.5): 5.7ms
on latest version(10): 91.76ms

CORRELATED query to maximize error
===================================
explain analyze
select *
from information_schema.role_usage_grants f1
where grantor =
( select max(ref_2.grantor)
from
information_schema.role_usage_grants as ref_2,
lateral (
select
max((null)) over (partition by ref_3.amopfamily) as c8
from
pg_catalog.pg_amop as ref_3
) as subq_0
where ref_2.object_catalog = f1.object_catalog
)
;

CORRELATED querying time
on old version(9.4/9.5): 0.6s
on latest version(10): 113s
188 times slower

________________________________
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Saturday, October 13, 2018 5:59:06 PM
To: Jung, Jinho
Cc: pgsql-hackers@lists.postgresql.org
Subject: Re: Regarding query minimizer (simplifier)

"Jung, Jinho" <jinho.jung@gatech.edu> writes:

Hello, I am Jinho Jung, Phd Student from GeorgiaTech, and try to find any SQL queries that cause performance regression. While conducting evaluation, I found an interesting query which makes x80 times slower execution in version 10.5 than version 9.4. Please see the attached files, if you are interested.

Hm, testing this in the regression database, it seems pretty speedy
across all supported branches, and indeed slower in 9.4 than later
branches (~25 ms vs ~10 ms).

It seems likely that you're testing in a very different database,
perhaps one with many more tables ... but if you don't explain the
test scenario, we aren't going to have much luck investigating.

regards, tom lane

Attachments:

query_regressionapplication/octet-stream; name=query_regressionDownload
query_much_regressionapplication/octet-stream; name=query_much_regressionDownload
#4Jinho Jung
visusee@gmail.com
In reply to: Jung, Jinho (#3)
4 attachment(s)
Re: Regarding query minimizer (simplifier)

Hello,

We appreciate you taking time for test! When we do more evaluation, we
noticed that the previously attached query made regression only on DBs that
we installed from APT manager (i.e., apt-get command) not on DBs that we
built from the source code. But we also confirmed that there are many cases
that cause regression to all DBs (installed from APT and build from source
code)

Hope you can also test these queries too. These are the execution time on
our machine.

*1.sql*
10.5 : 20ms
9.4.19: 1,227ms

*4.sql*
10.5 : 13ms
9.4.19: 88,721ms

*20.sql*
10.5 : 271ms
9.4.19: 6,104ms

*22.sql*
10.5 : 8ms
9.4.19: 105ms

Jinho Jung

On Tue, Oct 23, 2018 at 9:52 AM Jung, Jinho <jinho.jung@gatech.edu> wrote:

Show quoted text

Hello Tom,

Sorry for the misleading. Could you try these two queries? I made the
query even slower in latest version of postgres. These are information
about how we set up evaluation environment and query result.

Thanks,

Jinho Jung

Install Multiple version of DBs in one machine
======================================
# Install 10.5
$ wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc
| sudo apt-key add -
$ sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/
xenial-pgdg main" > /etc/apt/sources.list.d/pgdg_xenial.list'
$ sudo apt update
$ sudo apt-get install postgresql-10

# Install 9.6
$ sudo apt-get install postgresql-9.6

# Install 9.5
$ sudo apt-get install postgresql-9.5

# Install 9.4
$ sudo apt-get install postgresql-9.4 postgresql-contrib-9.4 libpq-dev
postgresql-server-dev-9.4

# check
$ pg_lsclusters

Original regression query
==========================
explain analyze
select
1
from
information_schema.role_usage_grants as ref_2,
lateral (
select
max((null)) over (partition by ref_3.amopfamily) as c8
from
pg_catalog.pg_amop as ref_3
) as subq_0
;

ORIGINAL querying time
on old version(9.4/9.5): 5.7ms
on latest version(10): 91.76ms

CORRELATED query to maximize error
===================================
explain analyze
select *
from information_schema.role_usage_grants f1
where grantor =
( select max(ref_2.grantor)
from
information_schema.role_usage_grants as ref_2,
lateral (
select
max((null)) over (partition by ref_3.amopfamily) as c8
from
pg_catalog.pg_amop as ref_3
) as subq_0
where ref_2.object_catalog = f1.object_catalog
)
;

CORRELATED querying time
on old version(9.4/9.5): 0.6s
on latest version(10): 113s
188 times slower

------------------------------
*From:* Tom Lane <tgl@sss.pgh.pa.us>
*Sent:* Saturday, October 13, 2018 5:59:06 PM
*To:* Jung, Jinho
*Cc:* pgsql-hackers@lists.postgresql.org
*Subject:* Re: Regarding query minimizer (simplifier)

"Jung, Jinho" <jinho.jung@gatech.edu> writes:

Hello, I am Jinho Jung, Phd Student from GeorgiaTech, and try to find

any SQL queries that cause performance regression. While conducting
evaluation, I found an interesting query which makes x80 times slower
execution in version 10.5 than version 9.4. Please see the attached files,
if you are interested.

Hm, testing this in the regression database, it seems pretty speedy
across all supported branches, and indeed slower in 9.4 than later
branches (~25 ms vs ~10 ms).

It seems likely that you're testing in a very different database,
perhaps one with many more tables ... but if you don't explain the
test scenario, we aren't going to have much luck investigating.

regards, tom lane

Attachments:

4.sqlapplication/sql; name=4.sqlDownload
1.sqlapplication/sql; name=1.sqlDownload
EXPLAIN ANALYZE

WITH 
jennifer_0 AS (select  
    subq_0.c3 as c0, 
    subq_0.c6 as c1, 
    subq_0.c0 as c2, 
    subq_0.c3 as c3, 
    (select pg_catalog.max(idx_scan) from pg_catalog.pg_stat_user_indexes)
       as c4, 
    case when subq_0.c6 is NULL then pg_catalog.timeofday() else pg_catalog.timeofday() end
       as c5
  from 
    (select  
          ref_0.typanalyze as c0, 
          ref_0.typispreferred as c1, 
          ref_0.typtypmod as c2, 
          ref_0.typrelid as c3, 
          ref_0.typcollation as c4, 
          ref_0.typdelim as c5, 
          ref_0.typnotnull as c6
        from 
          pg_catalog.pg_type as ref_0
        where case when ((cast(null as anyrange) <> cast(null as anyrange)) 
                and (((cast(null as path) @> cast(null as point)) 
                    or ((ref_0.typtypmod <= ref_0.typndims) 
                      and ((ref_0.typtype is not NULL) 
                        and (((false) 
                            and (cast(null as tsvector) @@@ cast(null as tsquery))) 
                          or (cast(null as record) *<> cast(null as record)))))) 
                  and (cast(null as box) ?# cast(null as box)))) 
              and (ref_0.typname >= ref_0.typname) then cast(null as polygon) else cast(null as polygon) end
             <@ pg_catalog.polygon(
            cast(ref_0.typndims as int4),
            cast(cast(null as circle) as circle))
        limit 45) as subq_0
  where (pg_catalog.tintervalin(
        cast(case when (true) 
            and ((cast(null as circle) |>> cast(null as circle)) 
              and (true)) then cast(null as cstring) else cast(null as cstring) end
           as cstring)) #> case when subq_0.c2 is not NULL then case when true then cast(null as reltime) else cast(null as reltime) end
           else case when true then cast(null as reltime) else cast(null as reltime) end
           end
        ) 
    or ((select agginitval from pg_catalog.pg_aggregate limit 1 offset 5)
         >= (select provider from pg_catalog.pg_shseclabel limit 1 offset 6)
        ))
select  
    subq_1.c0 as c0, 
    subq_1.c0 as c1, 
    subq_1.c0 as c2, 
    subq_1.c1 as c3, 
    subq_1.c1 as c4, 
    subq_1.c0 as c5
  from 
    (select  
          ref_1.relname as c0, 
          ref_1.relname as c1
        from 
          pg_catalog.pg_statio_sys_sequences as ref_1
              inner join information_schema.role_routine_grants as ref_2
              on (ref_1.blks_read >= ref_1.blks_hit)
            inner join pg_catalog.pg_type as ref_3
              right join information_schema.user_mapping_options as ref_4
              on ((select seq_scan from pg_catalog.pg_stat_xact_sys_tables limit 1 offset 2)
                     <= ref_3.typlen)
            on (ref_2.specific_name = ref_4.authorization_identifier )
        where ref_1.blks_hit is not NULL) as subq_1
  where true
  limit 107
;;
20.sqlapplication/sql; name=20.sqlDownload
22.sqlapplication/sql; name=22.sqlDownload
#5Jinho Jung
visusee@gmail.com
In reply to: Jinho Jung (#4)
Re: Regarding query minimizer (simplifier)

*Order is reversed. *

*1.sql*
9.4.19: 20ms
10.5 : 1,227ms

*4.sql*
9.4.19: 13ms
10.5 : 88,721ms

*20.sql*
9.4.19: 271ms
10.5 : 6,104ms

*22.sql*
9.4.19: 8ms
10.5 : 105ms

On Tue, Oct 23, 2018 at 3:15 PM Jinho Jung <visusee@gmail.com> wrote:

Show quoted text

Hello,

We appreciate you taking time for test! When we do more evaluation, we
noticed that the previously attached query made regression only on DBs that
we installed from APT manager (i.e., apt-get command) not on DBs that we
built from the source code. But we also confirmed that there are many cases
that cause regression to all DBs (installed from APT and build from source
code)

Hope you can also test these queries too. These are the execution time on
our machine.

*1.sql*
10.5 : 20ms
9.4.19: 1,227ms

*4.sql*
10.5 : 13ms
9.4.19: 88,721ms

*20.sql*
10.5 : 271ms
9.4.19: 6,104ms

*22.sql*
10.5 : 8ms
9.4.19: 105ms

Jinho Jung

On Tue, Oct 23, 2018 at 9:52 AM Jung, Jinho <jinho.jung@gatech.edu> wrote:

Hello Tom,

Sorry for the misleading. Could you try these two queries? I made the
query even slower in latest version of postgres. These are information
about how we set up evaluation environment and query result.

Thanks,

Jinho Jung

Install Multiple version of DBs in one machine
======================================
# Install 10.5
$ wget --quiet -O -
https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -

$ sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/
xenial-pgdg main" > /etc/apt/sources.list.d/pgdg_xenial.list'
$ sudo apt update
$ sudo apt-get install postgresql-10

# Install 9.6
$ sudo apt-get install postgresql-9.6

# Install 9.5
$ sudo apt-get install postgresql-9.5

# Install 9.4
$ sudo apt-get install postgresql-9.4 postgresql-contrib-9.4
libpq-dev postgresql-server-dev-9.4

# check
$ pg_lsclusters

Original regression query
==========================
explain analyze
select
1
from
information_schema.role_usage_grants as ref_2,
lateral (
select
max((null)) over (partition by ref_3.amopfamily) as c8
from
pg_catalog.pg_amop as ref_3
) as subq_0
;

ORIGINAL querying time
on old version(9.4/9.5): 5.7ms
on latest version(10): 91.76ms

CORRELATED query to maximize error
===================================
explain analyze
select *
from information_schema.role_usage_grants f1
where grantor =
( select max(ref_2.grantor)
from
information_schema.role_usage_grants as ref_2,
lateral (
select
max((null)) over (partition by ref_3.amopfamily) as c8
from
pg_catalog.pg_amop as ref_3
) as subq_0
where ref_2.object_catalog = f1.object_catalog
)
;

CORRELATED querying time
on old version(9.4/9.5): 0.6s
on latest version(10): 113s
188 times slower

------------------------------
*From:* Tom Lane <tgl@sss.pgh.pa.us>
*Sent:* Saturday, October 13, 2018 5:59:06 PM
*To:* Jung, Jinho
*Cc:* pgsql-hackers@lists.postgresql.org
*Subject:* Re: Regarding query minimizer (simplifier)

"Jung, Jinho" <jinho.jung@gatech.edu> writes:

Hello, I am Jinho Jung, Phd Student from GeorgiaTech, and try to find

any SQL queries that cause performance regression. While conducting
evaluation, I found an interesting query which makes x80 times slower
execution in version 10.5 than version 9.4. Please see the attached files,
if you are interested.

Hm, testing this in the regression database, it seems pretty speedy
across all supported branches, and indeed slower in 9.4 than later
branches (~25 ms vs ~10 ms).

It seems likely that you're testing in a very different database,
perhaps one with many more tables ... but if you don't explain the
test scenario, we aren't going to have much luck investigating.

regards, tom lane