Query with correlated join having slow performance

Started by saket bansalover 6 years ago6 messagesgeneral
Jump to latest
#1saket bansal
saket.tcs@gmail.com

Hi Postgres Experts,

Please help me on a query tuning.
Postgres verson: 11.5
This database has been migrated from oracle 12c to postgres. In Oracle
query executes in 2-3 secs, but in postgres it hangs forever. There are no
transactions at this time, I am stuck at first run after migration.

My analysis:

I have done vacuum full , analyze , even with 100% samples using a much
higher value of default_statistics_target.
Also tried different hints using pg_hint_plan extension. Overall cost
reduces, but actual run hangs forever.
I think problem is with correlated subquery join condition.
If "and pdtaltrelt*%*_.tenant_dkby=pdtaltrelt0_.tenant_dkby" condition is
removed from both subqueries, result comes in secs(I understand that would
be skipping correlated join)

SQL> select count(*) from pdtalt_rel_to_tenant_rel;
267216

SQL> select count(distinct tenant_dkby) from pdtalt_rel_to_tenant_rel;
3

Table DDLs , query plan and parameter configuration available at below git
link:
https://github.com/bansalsaket/PG_correlated_subquery_slowness

I have 16 GB , 4 CPU , rhel 7 machine.

Thanks for help in advance, let me know if any additional information is
required

#2Michael Lewis
mlewis@entrata.com
In reply to: saket bansal (#1)
Re: Query with correlated join having slow performance

I'd suggest re-writing your query to avoid ORs whenever possible. Is this
generated by an ORM or subject to change with filters selected in
application or can you totally control it on DB side?

It may be hugely more performant to simply rewrite this as (almost) the
same query twice UNION ALL'd together to separate the
productalt1_.alt_pdt_dbky OR productalt1_.orgnl_pdt_dbky conditions.

#3Justin
zzzzz.graf@gmail.com
In reply to: saket bansal (#1)
Re: Query with correlated join having slow performance

Hi Saket

The first filter condition seems to be duplicated it appears this can be
simplified from

and ( pdtaltrelt0_.status_typ_dbky=102
and ( pdtaltrelt0_.rule_status_typ_dbky is null )
or pdtaltrelt0_.status_typ_dbky in ( 19 )
or pdtaltrelt0_.status_typ_dbky in (20 )
)
and ( pdtaltrelt0_.status_typ_dbky in (19 , 20)
or pdtaltrelt0_.status_typ_dbky=102
and (pdtaltrelt0_.rule_status_typ_dbky is null)
)
TO

and (
(pdtaltrelt0_.status_typ_dbky = 102 and pdtaltrelt0_.rule_status_typ_dbky
is null)
or pdtaltrelt0_.status_typ_dbky in (19, 20)
)

The Explain shows the filter seq filter like so
Filter: (
((status_typ_dbky = ANY ('{19,20}'::bigint[]))
OR ((status_typ_dbky = 102) AND (rule_status_typ_dbky IS NULL))
)
AND
(((status_typ_dbky = 102) AND (rule_status_typ_dbky IS NULL))
OR (status_typ_dbky = 19)
OR (status_typ_dbky = 20)
)
)

I can not see the difference between above/below the AND other than the
order of operations...

On Mon, Dec 9, 2019 at 1:33 PM saket bansal <saket.tcs@gmail.com> wrote:

Show quoted text

Hi Postgres Experts,

Please help me on a query tuning.
Postgres verson: 11.5
This database has been migrated from oracle 12c to postgres. In Oracle
query executes in 2-3 secs, but in postgres it hangs forever. There are no
transactions at this time, I am stuck at first run after migration.

My analysis:

I have done vacuum full , analyze , even with 100% samples using a much
higher value of default_statistics_target.
Also tried different hints using pg_hint_plan extension. Overall cost
reduces, but actual run hangs forever.
I think problem is with correlated subquery join condition.
If "and pdtaltrelt*%*_.tenant_dkby=pdtaltrelt0_.tenant_dkby" condition is
removed from both subqueries, result comes in secs(I understand that would
be skipping correlated join)

SQL> select count(*) from pdtalt_rel_to_tenant_rel;
267216

SQL> select count(distinct tenant_dkby) from pdtalt_rel_to_tenant_rel;
3

Table DDLs , query plan and parameter configuration available at below git
link:
https://github.com/bansalsaket/PG_correlated_subquery_slowness

I have 16 GB , 4 CPU , rhel 7 machine.

Thanks for help in advance, let me know if any additional information is
required

#4saket bansal
saket.tcs@gmail.com
In reply to: Justin (#3)
Re: Query with correlated join having slow performance

Thanks Justin for pointing this out.
More work for optimizer for nothing, I will remove it.

On Mon, Dec 9, 2019 at 2:48 PM Justin <zzzzz.graf@gmail.com> wrote:

Show quoted text

Hi Saket

The first filter condition seems to be duplicated it appears this can be
simplified from

and ( pdtaltrelt0_.status_typ_dbky=102
and ( pdtaltrelt0_.rule_status_typ_dbky is null )
or pdtaltrelt0_.status_typ_dbky in ( 19 )
or pdtaltrelt0_.status_typ_dbky in (20 )
)
and ( pdtaltrelt0_.status_typ_dbky in (19 , 20)
or pdtaltrelt0_.status_typ_dbky=102
and (pdtaltrelt0_.rule_status_typ_dbky is null)
)
TO

and (
(pdtaltrelt0_.status_typ_dbky = 102 and pdtaltrelt0_.rule_status_typ_dbky
is null)
or pdtaltrelt0_.status_typ_dbky in (19, 20)
)

The Explain shows the filter seq filter like so
Filter: (
((status_typ_dbky = ANY ('{19,20}'::bigint[]))
OR ((status_typ_dbky = 102) AND (rule_status_typ_dbky IS NULL))
)
AND
(((status_typ_dbky = 102) AND (rule_status_typ_dbky IS NULL))
OR (status_typ_dbky = 19)
OR (status_typ_dbky = 20)
)
)

I can not see the difference between above/below the AND other than the
order of operations...

On Mon, Dec 9, 2019 at 1:33 PM saket bansal <saket.tcs@gmail.com> wrote:

Hi Postgres Experts,

Please help me on a query tuning.
Postgres verson: 11.5
This database has been migrated from oracle 12c to postgres. In Oracle
query executes in 2-3 secs, but in postgres it hangs forever. There are no
transactions at this time, I am stuck at first run after migration.

My analysis:

I have done vacuum full , analyze , even with 100% samples using a much
higher value of default_statistics_target.
Also tried different hints using pg_hint_plan extension. Overall cost
reduces, but actual run hangs forever.
I think problem is with correlated subquery join condition.
If "and pdtaltrelt*%*_.tenant_dkby=pdtaltrelt0_.tenant_dkby" condition
is removed from both subqueries, result comes in secs(I understand that
would be skipping correlated join)

SQL> select count(*) from pdtalt_rel_to_tenant_rel;
267216

SQL> select count(distinct tenant_dkby) from pdtalt_rel_to_tenant_rel;
3

Table DDLs , query plan and parameter configuration available at below
git link:
https://github.com/bansalsaket/PG_correlated_subquery_slowness

I have 16 GB , 4 CPU , rhel 7 machine.

Thanks for help in advance, let me know if any additional information is
required

#5saket bansal
saket.tcs@gmail.com
In reply to: Michael Lewis (#2)
Re: Query with correlated join having slow performance

Thank you Michael. I re-wrote it and it does perform well. Modified query
at:

https://github.com/bansalsaket/PG_correlated_subquery_slowness/blob/master/Modified%20query%20-%20performs%20faster.txt

Our app team is checking with their vendor whether this can be modified at
source code level or not.
But question remains somewhat valid. Data volume is not huge and original
query wasn't very badly written either. Operating system level resources
are similar.
Do you know of any bugs associated with using co-related sub queries in
postgres. In Oracle, it runs in a sec, while in postgres it does not give
result at all , even after 8-9 hours.
I understand both database engines work differently, but such drastic
change in performance is a surprise!
We have lot of migrations planned from oracle to postgres, this could be a
show stopper. :(
Any suggestions...

On Mon, Dec 9, 2019 at 1:49 PM Michael Lewis <mlewis@entrata.com> wrote:

Show quoted text

I'd suggest re-writing your query to avoid ORs whenever possible. Is this
generated by an ORM or subject to change with filters selected in
application or can you totally control it on DB side?

It may be hugely more performant to simply rewrite this as (almost) the
same query twice UNION ALL'd together to separate the
productalt1_.alt_pdt_dbky OR productalt1_.orgnl_pdt_dbky conditions.

#6Pavel Stehule
pavel.stehule@gmail.com
In reply to: saket bansal (#5)
Re: Query with correlated join having slow performance

po 9. 12. 2019 v 21:05 odesílatel saket bansal <saket.tcs@gmail.com> napsal:

Thank you Michael. I re-wrote it and it does perform well. Modified query
at:

https://github.com/bansalsaket/PG_correlated_subquery_slowness/blob/master/Modified%20query%20-%20performs%20faster.txt

Our app team is checking with their vendor whether this can be modified at
source code level or not.
But question remains somewhat valid. Data volume is not huge and original
query wasn't very badly written either. Operating system level resources
are similar.
Do you know of any bugs associated with using co-related sub queries in
postgres. In Oracle, it runs in a sec, while in postgres it does not give
result at all , even after 8-9 hours.
I understand both database engines work differently, but such drastic
change in performance is a surprise!
We have lot of migrations planned from oracle to postgres, this could be a
show stopper. :(
Any suggestions...

There was more times discussion about rewriting OR conditions to UNION -
but nobody did this work what I know. Usually Postgres process OR
conditions well due bitmap scans, but it doesn't work well in some special
cases. To this time this issue was fixed by manual query rewriting.

Regards

Pavel

Show quoted text

On Mon, Dec 9, 2019 at 1:49 PM Michael Lewis <mlewis@entrata.com> wrote:

I'd suggest re-writing your query to avoid ORs whenever possible. Is this
generated by an ORM or subject to change with filters selected in
application or can you totally control it on DB side?

It may be hugely more performant to simply rewrite this as (almost) the
same query twice UNION ALL'd together to separate the
productalt1_.alt_pdt_dbky OR productalt1_.orgnl_pdt_dbky conditions.