Problem with join plan
Hi,
I have PostgreSQL 14 on Linux with a 15 TB database, 20 cores, 150 GB of
RAM, and all-NVMe SSD storage. Currently one user :-)
A join between 2 big tables, followed by another join with a smaller third
one, takes less than a minute and returns a result of 15 M rows. Fine.
If I add a third join with a 30k-row table, using a simple equality as the
join clause, it runs almost as fast: EXPLAIN ANALYZE indicates 45 seconds.
Same if I do the very same join with another equality clause: EXPLAIN
ANALYZE indicates 140 seconds.
But if I do the same with clause 1 OR clause 2, I have to kill the request
after an hour, while seeing the filesystem show more than 140 MB of
increased usage.
Looking at the EXPLAIN plans with one clause alone versus the two ORed,
there are changes in the plan (of course).
With the fastest clause the estimated cost is 3 700 000, and with the
slightly slower one 3 900 000.
With both ORed, the estimated cost is 16 000 000. That sounds a bit
strange to me, as ORing the join clauses should roughly add the times, not
multiply them.
So, before providing the 3 EXPLAIN plans (I must anonymize everything,
which is somewhat tedious), I would like to know if there is some obvious
thing I am missing.
All tables have been VACUUM ANALYZEd.
thanks
Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com
On Wed, 21 Jun 2023 at 07:42, Marc Millas <marc.millas@mokadb.com> wrote:
> But if I do the same with clause one OR clause 2, I have to kill the
> request after an hour, seeing the filesystem showing more than 140 MB of
> increased usage.
I believe you can anonymise the explain with https://explain.depesz.com/
It's pretty hard to say until we see the query, but having an OR in
the join condition makes it impossible to Hash or Merge Join, so
perhaps it's slow due to a Nested Loop join.
You could consider rewriting the query to use a UNION or a UNION ALL
separating out each branch of the OR into a UNION of its own. That
would allow Hash and Merge join to work again. However, that's all
speculation until you provide more details.
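A sketch of that rewrite, with invented table and column names:

```sql
-- Original shape: the OR in the join condition rules out Hash and
-- Merge Join, so the planner falls back to a Nested Loop.
SELECT b.*
FROM bigtab b
JOIN smalltab s ON b.x = s.x OR b.y = s.y;

-- Rewritten: one branch per OR arm, each of which can be hash-joined.
SELECT b.*
FROM bigtab b
JOIN smalltab s ON b.x = s.x
UNION
SELECT b.*
FROM bigtab b
JOIN smalltab s ON b.y = s.y;
```

Note that UNION removes duplicate rows (including a row that matches both
arms); UNION ALL is cheaper, but such a row would then appear twice.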
https://wiki.postgresql.org/wiki/Guide_to_reporting_problems
David
On Tue, Jun 20, 2023 at 10:14 PM David Rowley <dgrowleyml@gmail.com> wrote:
> I believe you can anonymise the explain with https://explain.depesz.com/

Link to the anonymized plan of the request with one clause:
https://explain.depesz.com/s/TWp4
Marc MILLAS
On Wed, 21 Jun 2023 at 08:34, Marc Millas <marc.millas@mokadb.com> wrote:
> link to the anonymized plan of the request with one clause:
> https://explain.depesz.com/s/TWp4
It's quite difficult to know what the problem is you want to fix here.
Your initial post indicated it was the query with the OR condition
that was causing you the problems, but the plan you've posted has no
OR condition?!
You're more likely to get help here if you take time to properly
explain the situation and post the information that's actually
relevant to the problem you're having, or state the problem more
clearly, as there's a mismatch somewhere.
It might also be worth having a look at
https://wiki.postgresql.org/wiki/Slow_Query_Questions . EXPLAIN is not
going to tell us what part of the query is slow. I'll let the wiki
page guide you into what to do instead.
David
On Tue, Jun 20, 2023 at 11:19 PM David Rowley <dgrowleyml@gmail.com> wrote:
> Your initial post indicated it was the query with the OR condition
> that was causing you the problems, but the plan you've posted has no
> OR condition?!

Link to the plan with the second clause alone:
https://explain.depesz.com/s/byW5
Link to the plan with both clauses ORed (the one not finishing):
https://explain.depesz.com/s/jHO2

> It might also be worth having a look at
> https://wiki.postgresql.org/wiki/Slow_Query_Questions . EXPLAIN is not
> going to tell us what part of the query is slow.

I know that page. Obviously, as I have to kill the request, I cannot
provide an EXPLAIN ANALYZE...
On Wed, 21 Jun 2023 at 10:26, Marc Millas <marc.millas@mokadb.com> wrote:
> link to the plan with both clauses ORed (the one not finishing):
> https://explain.depesz.com/s/jHO2
I'd go with the UNION or UNION ALL idea I mentioned earlier.
David
On 6/21/23 00:26, Marc Millas wrote:
> link to the plan with both clauses ORed (the one not finishing):
> https://explain.depesz.com/s/jHO2

It's a bit weird the "victor" table is joined seemingly without any join
conditions, leading to a cross join (which massively inflates the cost
for joins above it). Maybe the anonymized plan mangles it somehow.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Jun 21, 2023 at 12:43 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
> It's a bit weird the "victor" table is joined seemingly without any join
> conditions, leading to a cross join (which massively inflates the cost
> for joins above it). Maybe the anonymized plan mangles it somehow.
The query does:

    select blabla from table1
    join table2 on (a list of 9 fields ANDed, corresponding to the indexes
                    of both table1 and table2)
    join table3 on table1.a = table3.a and table1.b = table3.b
    join table4 on (two clauses like table2.d = table4.e, and one clause
                    substr(table2.f) = table4.g, all ORed)

table1 and table2 are big (15 M and 60 M rows); table3 and table4 are
small (30k rows).
Basically, if I rewrite the query beginning with the join between table2
and table4, then join table1 and then table3, Postgres generates the same
plan, which doesn't end.
If instead of the 3 clauses of the last join I keep one equality clause,
the EXPLAIN plan looks the same, but executes in 45 seconds.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Marc MILLAS
Marc MILLAS
On Wed, Jun 21, 2023 at 12:43 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
> It's a bit weird the "victor" table is joined seemingly without any join
> conditions, leading to a cross join (which massively inflates the cost
> for joins above it). Maybe the anonymized plan mangles it somehow.
So I tried to simplify my problem.
I created a table with the result of the first 3 joins.
That table has 15 M rows; all tables have been VACUUM ANALYZEd.
Now, if I do an EXPLAIN ANALYZE of a simple join between that table and my
original table4 using a simple = clause, I get a result in about one
second, and the planner's row guesses seem in line with the observed
values.
If I use substr(table1.a) = table2.b, the EXPLAIN ANALYZE gets a result in
21 seconds, and the planner estimates a 65 M row result set while the
observed is 330k rows.
So here it's 20 times slower, and the discrepancy between the planner's
row guess and reality is a ratio of 200.
Now, if I try an EXPLAIN ANALYZE with a join on a=b or substr(c)=d or e=f,
then... I kill the query after a quarter of an hour without any answer.
If I just EXPLAIN the query, the planner's row guess becomes more than
2 billion...
The extremely simple query and plan are here, without automatic
obfuscation:
https://explain.depesz.com/s/b8Ll
Marc MILLAS
On Wed, Jun 21, 2023 at 12:10 PM Marc Millas <marc.millas@mokadb.com> wrote:
> if I use a substr(table1.a) = table2.b, the explain analyze gets a result
> in 21 seconds and the planner estimates a 65M rows result set while the
> observed is 330k rows
> Now, if I try an explain analyze with join on a=b or substr(c)=d or e=f
> then... I kill the query after a quarter of an hour without any answer.
> the extremely simple query and plan are here, without automatic
> obfuscation: https://explain.depesz.com/s/b8Ll
First, I am not sure why you cannot send us the EXPLAIN ANALYZE. But
moving on...

substr() is a function that mutilates a value such that the index becomes
useless... If you are looking for the LEFT() of the value, then an index
can be used.

I have COLLATION "C", and when I query:

    WHERE fld LIKE fld_b || '%'

the optimizer constructs a plan that uses the index on "fld". But when I
try:

    WHERE fld LIKE CONCAT_WS('', fld_b, '%')

it doesn't use the indexed version (because the function call is too
complicated to see through).

When using functions in WHERE clauses, indexes either have to be made on
those functions, or oftentimes the index cannot be used.

BTW, I noted the COLLATION. That turned out to be important, because my
first DB test did NOT use that collation, and the result of the LIKE was
the non-indexed version...

I hope you find something useful in here.

Also, WHERE fld <> 72... (unless you have a heavily skewed set of
statistics, I read that as: scan everything and check later, because this
should filter very few rows), whereas fld = 72 will be blazingly fast.
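For the substr() case specifically, an expression index is the usual way
out (a sketch; the table and column names are invented):

```sql
-- A plain index on fld cannot serve:  WHERE substr(fld, 1, 4) = s.g
-- An index on the same expression can:
CREATE INDEX ttt_fld_prefix_idx ON ttt (substr(fld, 1, 4));

-- Re-analyze so the planner picks up the statistics gathered on the
-- indexed expression, which should also improve the row estimates.
ANALYZE ttt;
```

The expression in the query has to match the indexed expression exactly
for the index to be considered.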
Kirk
On Tue, Jun 27, 2023 at 8:12 PM Kirk Wolak <wolakk@gmail.com> wrote:
> First, I am not sure why you cannot send us the explain analyze. But
> moving on...
Kirk, the EXPLAIN ANALYZE, with the SQL query, is directly accessible via
the explain.depesz link.
> substr() is a function that mutilates a value such that the index
> becomes useless... If you are looking for the LEFT() of the value, then
> an INDEX can be used.
There are NO indexes on those columns. One of the reasons is that the
simplest index on one column is 380 GB on disk.
So to put indexes on each criterion, I would have to add around 1 TB of
disk just for ttt...
The full scan is not a problem; it's fast. The problem is the nested
loop, which compares each of the 15 M rows of ttt to each of the 30k rows
of inc_pha_r.
That's an operation done 450 000 000 000 times, so if each comparison
takes 1 microsecond, the nested loop is 125 hours long.
And I am not sure that the comparison is done in 1 microsecond...
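(The arithmetic can be checked directly; 1 microsecond per comparison is,
of course, only an assumption:)

```sql
SELECT 15000000::bigint * 30000                   AS comparisons,  -- 450 billion
       15000000::bigint * 30000 / 1000000         AS seconds,      -- at 1 us each
       15000000::bigint * 30000 / 1000000 / 3600  AS hours;        -- 125
```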
Thanks for trying.