join question

Started by Grzegorz Jaśkiewiczover 17 years ago9 messagesgeneral
Jump to latest
#1Grzegorz Jaśkiewicz
gryzman@gmail.com

Hey folks,
I am trying to rewrite a query here, that takes 1.5m atm to finish. I got it
down to 20s, and still trying to pin it down.

basically, a query looks something like that atm:

select a.*, b.*
from a
join b on a.id = b.a_id and a.banned <> true
where
a.start <= now()
and
b.end > now();

that's 20s query, and now I got it down to 10s , by using something - which
in my eyes would be always wrong - and against all logic. So if someone
could please explain to me why is it faster:

select a.*, b.*
from foo a
join bar b on a.id = b.a_id
where
not exists (
select id from foo where foo.id = b.a_id and foo.banned <> true
)
and
a.start <= now()
and
b.end > now();

plans differ, obviously - second one uses index to lookup .banned in
foo, whilst first one goes for seq scan.
result is the same, but I was actually expecting quite opposite. So is join
on 1-2M rows a bad idea ?
The effect can be seen on both 8.1 and cvs head.

I would be grateful for someone clarifying that to me.

--
GJ

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Grzegorz Jaśkiewicz (#1)
Re: join question

"=?UTF-8?Q?Grzegorz_Ja=C5=9Bkiewicz?=" <gryzman@gmail.com> writes:

that's 20s query, and now I got it down to 10s , by using something - which
in my eyes would be always wrong - and against all logic. So if someone
could please explain to me why is it faster:

[ shrug... ] If you aren't going to show us EXPLAIN ANALYZE output,
we're even more in the dark than you are.

regards, tom lane

#3Grzegorz Jaśkiewicz
gryzman@gmail.com
In reply to: Tom Lane (#2)
Re: join question

"we're even more in the dark than you are."

:)

so here are the plans, that's the real table run.

QUERY PLAN after

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Sort (cost=37807.04..37807.05 rows=1 width=50) (actual
time=9788.642..9805.832 rows=20000 loops=1)

Sort Key: s.nodeid

Sort Method: external sort Disk: 1320kB

-> Nested Loop (cost=376.60..37807.03 rows=1 width=50) (actual
time=15.454..9629.198 rows=20000 loops=1)

-> Nested Loop Anti Join (cost=376.60..37800.27 rows=1 width=50)
(actual time=15.347..9077.445 rows=20000 loops=1)

-> Nested Loop (cost=376.60..37797.99 rows=1 width=50)
(actual time=15.308..8927.428 rows=20000 loops=1)

-> Hash Anti Join (cost=376.60..37791.22 rows=1
width=8) (actual time=15.195..8216.448 rows=20000 loops=1)

Hash Cond: (e.accountid = account.id)

-> Bitmap Heap Scan on efoo
e (cost=368.23..37709.63 rows=19523 width=8) (actual time=14.981..8166.262
rows=20000 loops=1)

Recheck Cond: (packageid = 497)

Filter: ((startdate <= now()) AND (enddate

now()))

-> Bitmap Index Scan on
efoo_packageid_idx (cost=0.00..363.35 rows=19523 width=0) (actual
time=9.694..9.694 rows=20000 loops=1)

Index Cond: (packageid = 497)

-> Hash (cost=8.35..8.35 rows=1 width=8)
(actual time=0.136..0.136 rows=1 loops=1)

-> Index Scan using account_banned_idx on
account (cost=0.00..8.35 rows=1 width=8) (actual time=0.129..0.131 rows=1
loops=1)

Index Cond: (banned = true)

Filter: banned

-> Index Scan using bbaccididx on bb
s (cost=0.00..6.76 rows=1 width=42) (actual time=0.030..0.032 rows=1
loops=20000)

Index Cond: (s.accountid = e.accountid)

-> Index Scan using bbar_bbid_key on bbar
b (cost=0.00..2.27 rows=1 width=11) (actual time=0.005..0.005 rows=0
loops=20000)

Index Cond: ((b.bbid)::text = (s.id)::text)

-> Index Scan using acct_ididx on account a (cost=0.00..6.75
rows=1 width=24) (actual time=0.024..0.025 rows=1 loops=20000)

Index Cond: (a.id = e.accountid)

Total runtime: 9815.280 ms

and before:

QUERY PLAN before

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Sort (cost=130129.98..130178.78 rows=19521 width=50) (actual
time=16156.145..16170.234 rows=20000 loops=1)

Sort Key: s.nodeid

Sort Method: external merge Disk: 1312kB

-> Hash Anti Join (cost=78755.00..128101.84 rows=19521 width=50)
(actual time=12836.008..16071.668 rows=20000 loops=1)

Hash Cond: ((s.id)::text = (b.bbid)::text)

-> Hash Join (cost=78752.17..127830.60 rows=19523 width=50)
(actual time=12825.755..16043.271 rows=20000 loops=1)

Hash Cond: (e.accountid = s.accountid)

-> Merge Join (cost=39100.97..79171.13 rows=19523 width=32)
(actual time=11496.544..12614.860 rows=20000 loops=1)

Merge Cond: (a.id = e.accountid)

-> Index Scan using acct_ididx on account
a (cost=0.00..37277.39 rows=1000002 width=24) (actual time=0.183..859.610
rows=999950 loops=1)

Filter: (banned <> true)

-> Sort (cost=39100.93..39149.73 rows=19523 width=8)
(actual time=11496.268..11507.031 rows=20000 loops=1)

Sort Key: e.accountid

Sort Method: external sort Disk: 472kB

-> Bitmap Heap Scan on efoo
e (cost=368.23..37709.63 rows=19523 width=8) (actual time=14.640..11395.226
rows=20000 loops=1)

Recheck Cond: (packageid = 497)

Filter: ((startdate <= now()) AND (enddate

now()))

-> Bitmap Index Scan on
efoo_packageid_idx (cost=0.00..363.35 rows=19523 width=0) (actual
time=9.377..9.377 rows=20000 loops=1)

Index Cond: (packageid = 497)

-> Hash (cost=18850.09..18850.09 rows=1000009 width=42)
(actual time=1326.158..1326.158 rows=1000009 loops=1)

-> Seq Scan on bb s (cost=0.00..18850.09 rows=1000009
width=42) (actual time=0.032..424.731 rows=1000009 loops=1)

-> Hash (cost=1.81..1.81 rows=81 width=11) (actual
time=10.111..10.111 rows=81 loops=1)

-> Seq Scan on bbar b (cost=0.00..1.81 rows=81 width=11)
(actual time=9.971..10.013 rows=81 loops=1)

Total runtime: 16206.639 ms

(24 rows)

Time: 16217,107 ms

--
GJ

--
GJ

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Grzegorz Jaśkiewicz (#3)
Re: join question

"=?UTF-8?Q?Grzegorz_Ja=C5=9Bkiewicz?=" <gryzman@gmail.com> writes:

so here are the plans, that's the real table run.

Hmm, well this rowcount estimate is way off:

-> Hash Anti Join (cost=376.60..37791.22 rows=1
width=8) (actual time=15.195..8216.448 rows=20000 loops=1)

The fact that it's getting a faster plan despite being completely wrong
about the rowcount means that the cost parameters are way off for your
situation. It looks like you are testing a case where the tables all
fit in memory. Do you expect that to be the reality for your production
use? If so, you might want to reduce random_page_cost to something
close to 1 to reflect it. If not, it'd be a good idea to test with more
realistically-sized tables before deciding what's "faster".

I'm not sure why the rowcount estimate is so far off, but the antijoin
code is all new and probably there's an estimation bug in there
somewhere. (You didn't get this plan, or anything at all like it,
from 8.1 ;-))

regards, tom lane

#5Grzegorz Jaśkiewicz
gryzman@gmail.com
In reply to: Tom Lane (#4)
Re: join question

On Thu, Oct 23, 2008 at 12:25 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"=?UTF-8?Q?Grzegorz_Ja=C5=9Bkiewicz?=" <gryzman@gmail.com> writes:

so here are the plans, that's the real table run.

Hmm, well this rowcount estimate is way off:

-> Hash Anti Join (cost=376.60..37791.22 rows=1
width=8) (actual time=15.195..8216.448 rows=20000 loops=1)

The fact that it's getting a faster plan despite being completely wrong
about the rowcount means that the cost parameters are way off for your
situation. It looks like you are testing a case where the tables all
fit in memory. Do you expect that to be the reality for your production
use? If so, you might want to reduce random_page_cost to something
close to 1 to reflect it. If not, it'd be a good idea to test with more
realistically-sized tables before deciding what's "faster".

tell me about it. even tho I am a rookie here, that cough my attention too.

I'm not sure why the rowcount estimate is so far off, but the antijoin
code is all new and probably there's an estimation bug in there
somewhere. (You didn't get this plan, or anything at all like it,
from 8.1 ;-))

nope, that's up2date cvs head. I always test stuff on cvs head first, only
run 8.1 in the office/production/testing - and I already suggested to the
powers to be, that we need to move to 8.3 pronto, for several million
reasons.
Thanks Tom for your opinion :)

--
GJ

#6marcin mank
marcin.mank@gmail.com
In reply to: Grzegorz Jaśkiewicz (#3)
Re: join question

Sort Method: external sort Disk: 1320kB

One simple speedup could be upping Your work_mem to 2M for this query,
so the sorts are in memory.

btw: Last time I used Postgres, it did not show the sort method. Cool.

Greetings
Marcin Mank

#7Grzegorz Jaśkiewicz
gryzman@gmail.com
In reply to: Tom Lane (#4)
Re: join question

On Thu, Oct 23, 2008 at 12:25 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

It looks like you are testing a case where the tables all
fit in memory. Do you expect that to be the reality for your production
use? If so, you might want to reduce random_page_cost to something
close to 1 to reflect it. If not, it'd be a good idea to test with more
realistically-sized tables before deciding what's "faster".

I am sure it at least reads some disc, because I can see peak in hd read -
up to about 10-20MB/s during that query's execution.

--
GJ

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Grzegorz Jaśkiewicz (#5)
Re: join question

"=?UTF-8?Q?Grzegorz_Ja=C5=9Bkiewicz?=" <gryzman@gmail.com> writes:

On Thu, Oct 23, 2008 at 12:25 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm not sure why the rowcount estimate is so far off, but the antijoin
code is all new and probably there's an estimation bug in there
somewhere. (You didn't get this plan, or anything at all like it,
from 8.1 ;-))

nope, that's up2date cvs head. I always test stuff on cvs head first,

I just committed a patch that might help a bit with that.

regards, tom lane

#9Grzegorz Jaśkiewicz
gryzman@gmail.com
In reply to: Tom Lane (#8)
Re: join question

thanks. I shall try it.Also, thanks for putting my name in cvs log ;)