Hash Join vs Nested Loops in 7.2.1 ...

Started by Ed L. · about 24 years ago · 16 messages · general
#1Ed L.
pggeneral@bluepolka.net

I have a 7.2.1 query with two peculiar characteristics and wondered if anyone
could offer some insight.

First, my query takes 90 seconds with a hash join, but only 1 second with
nested loops.

Second, the same query sometimes takes 10-50 seconds shortly after possibly a
dump or other high-data-volume queries are executed, after which it then
returns to 1 second execution time. Getting crowded out of shared memory?

Finally, I am inclined to turn off hash joins altogether. What should I
expect to lose in doing so?

Schema, query, and plans shown below...

Thanks,
Ed

-- 700,000 rows
CREATE TABLE story (
    value TEXT,
    key SERIAL
);
CREATE UNIQUE INDEX story_pkey ON story USING btree (key);

-- 700,000 rows
CREATE TABLE dict (
    word VARCHAR,
    key SERIAL
);
CREATE UNIQUE INDEX dict_pkey ON dict USING btree (key);
CREATE UNIQUE INDEX dict_word_key ON dict USING btree (word);

-- 28,000,000 rows
CREATE TABLE story_dict (
    dictkey INTEGER NOT NULL,
    storykey INTEGER NOT NULL
);
CREATE UNIQUE INDEX story_dict_pkey ON story_dict(dictkey, storykey);
CREATE INDEX story_dict_tk_idx ON story_dict(storykey);

-- Query:
-- ======

SELECT DISTINCT ftd1.storykey
FROM story_dict ftd1, dict d1
WHERE d1.word = 'foo'
  AND d1.key = ftd1.dictkey
  AND EXISTS (
      SELECT ft2.key
      FROM story ft2, story_dict ftd2, dict d2
      WHERE d2.word = 'bar'
        AND d2.key = ftd2.dictkey
        AND ftd2.storykey = ft2.key
        AND ftd2.storykey = ftd1.storykey)
  AND EXISTS (
      SELECT ft3.key
      FROM story ft3, story_dict ftd3, dict d3
      WHERE d3.word = 'baz'
        AND d3.key = ftd3.dictkey
        AND ftd3.storykey = ft3.key
        AND ftd3.storykey = ftd1.storykey)
ORDER BY ftd1.storykey
LIMIT 1000;
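(For reference: the PGOPTIONS values used for the plans that follow are backend -f "forbid plan type" flags; as I understand them, "-fn" forbids nested loops and "-fh" forbids hash joins. A sketch of the per-session equivalent using the planner's enable_* settings:)

```sql
-- Per-session equivalent of PGOPTIONS="-fn" (forbid nested loops):
SET enable_nestloop TO off;
-- ...or of PGOPTIONS="-fh" (forbid hash joins):
-- SET enable_hashjoin TO off;

-- Then inspect what the planner picks under each restriction
-- (abbreviated form of the query above):
EXPLAIN
SELECT DISTINCT ftd1.storykey
FROM story_dict ftd1, dict d1
WHERE d1.word = 'foo' AND d1.key = ftd1.dictkey;
```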

Plan with PGOPTIONS = "-fn":

Limit  (cost=15409053054.71..15409053054.73 rows=1 width=12)
  ->  Unique  (cost=15409053054.71..15409053054.73 rows=1 width=12)
        ->  Sort  (cost=15409053054.71..15409053054.71 rows=9 width=12)
              ->  Nested Loop  (cost=100000000.00..15409053054.57 rows=9 width=12)
                    ->  Index Scan using word_idx on dict d1  (cost=0.00..5.98 rows=1 width=4)
                    ->  Index Scan using story_dict_pkey on story_dict ftd1  (cost=0.00..15309052993.63 rows=4398 width=8)
                          SubPlan
                            ->  Merge Join  (cost=429157.62..435130.62 rows=1 width=16)
                                  ->  Sort  (cost=13.11..13.11 rows=1 width=12)
                                        ->  Hash Join  (cost=5.98..13.10 rows=1 width=12)
                                              ->  Index Scan using story_dict_tk_idx on story_dict ftd2  (cost=0.00..6.59 rows=106 width=8)
                                              ->  Hash  (cost=5.98..5.98 rows=1 width=4)
                                                    ->  Index Scan using word_idx on dict d2  (cost=0.00..5.98 rows=1 width=4)
                                  ->  Sort  (cost=429144.50..429144.50 rows=2389196 width=4)
                                        ->  Seq Scan on story ft2  (cost=0.00..86701.96 rows=2389196 width=4)
                            ->  Merge Join  (cost=429157.62..435130.62 rows=1 width=16)
                                  ->  Sort  (cost=13.11..13.11 rows=1 width=12)
                                        ->  Hash Join  (cost=5.98..13.10 rows=1 width=12)
                                              ->  Index Scan using story_dict_tk_idx on story_dict ftd3  (cost=0.00..6.59 rows=106 width=8)
                                              ->  Hash  (cost=5.98..5.98 rows=1 width=4)
                                                    ->  Index Scan using word_idx on dict d3  (cost=0.00..5.98 rows=1 width=4)
                                  ->  Sort  (cost=429144.50..429144.50 rows=2389196 width=4)
                                        ->  Seq Scan on story ft3  (cost=0.00..86701.96 rows=2389196 width=4)

Plan with PGOPTIONS = "-fh":

Limit  (cost=635283.31..635283.33 rows=1 width=12)
  ->  Unique  (cost=635283.31..635283.33 rows=1 width=12)
        ->  Sort  (cost=635283.31..635283.31 rows=9 width=12)
              ->  Nested Loop  (cost=0.00..635283.17 rows=9 width=12)
                    ->  Index Scan using word_idx on dict d1  (cost=0.00..5.98 rows=1 width=4)
                    ->  Index Scan using story_dict_pkey on story_dict ftd1  (cost=0.00..635222.22 rows=4398 width=8)
                          SubPlan
                            ->  Nested Loop  (cost=0.00..16.07 rows=1 width=16)
                                  ->  Nested Loop  (cost=0.00..12.00 rows=1 width=12)
                                        ->  Index Scan using word_idx on dict d2  (cost=0.00..5.98 rows=1 width=4)
                                        ->  Index Scan using story_dict_pkey on story_dict ftd2  (cost=0.00..6.01 rows=1 width=8)
                                  ->  Index Scan using story_pkey on story ft2  (cost=0.00..4.06 rows=1 width=4)
                            ->  Nested Loop  (cost=0.00..16.07 rows=1 width=16)
                                  ->  Nested Loop  (cost=0.00..12.00 rows=1 width=12)
                                        ->  Index Scan using word_idx on dict d3  (cost=0.00..5.98 rows=1 width=4)
                                        ->  Index Scan using story_dict_pkey on story_dict ftd3  (cost=0.00..6.01 rows=1 width=8)
                                  ->  Index Scan using story_pkey on story ft3  (cost=0.00..4.06 rows=1 width=4)

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ed L. (#1)
Re: Hash Join vs Nested Loops in 7.2.1 ...

Ed Loehr <pggeneral@bluepolka.net> writes:

I have a 7.2.1 query with two peculiar characteristics and wondered if
anyone could offer some insight.

First, my query takes 90 seconds with a hash join, but only 1 second with
nested loops.

Probably because the EXISTS subplans only need to fetch one row from
each table they access; that's more or less an ideal case for nestloop
indexscans. Nestloops do not scale very well to large retrieval sets,
however...

Second, the same query sometimes takes 10-50 seconds shortly after
possibly a dump or other high-data-volume queries are executed, after
which it then returns to 1 second execution time. Getting crowded out
of shared memory?

Sounds like it. What shared-buffers setting are you using? How much
RAM in the box?

Finally, I am inclined to turn off hash joins altogether.

That would be a remarkably foolish thing to do. Certainly this query
is not a reason to do so; AFAICS the planner will do this one just fine
without any thumb on the scales.

regards, tom lane

#3Ed L.
pggeneral@bluepolka.net
In reply to: Ed L. (#1)
Re: Hash Join vs Nested Loops in 7.2.1 ...

Tom Lane wrote:

Second, the same query sometimes takes 10-50 seconds shortly after
possibly a dump or other high-data-volume queries are executed, after
which it then returns to 1 second execution time. Getting crowded out
of shared memory?

Sounds like it. What shared-buffers setting are you using? How much
RAM in the box?

shared_buffers = 256
max_fsm_relations = 500
(defaults for the rest)

RAM: 2.4GB, maybe? Not that familiar with HPUX mem setup...

(OS: HP-UX B.11.00 U 9000/800)
$ swapinfo -mt
             Mb     Mb     Mb   PCT  START/      Mb
TYPE      AVAIL   USED   FREE  USED   LIMIT RESERVE  PRI  NAME
dev        2048    147   1901    7%       0       -    1  /dev/vg00/lvol2
reserve       -    312   -312
memory      369    351     18   95%
total      2417    810   1607   34%       -       0    -

AFAICS the planner will do this one just fine
without any thumb on the scales.

How do I find the thumb?

Ed

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ed L. (#3)
Re: Hash Join vs Nested Loops in 7.2.1 ...

Ed Loehr <pggeneral@bluepolka.net> writes:

Tom Lane wrote:
Second, the same query sometimes takes 10-50 seconds shortly after
possibly a dump or other high-data-volume queries are executed, after
which it then returns to 1 second execution time. Getting crowded out
of shared memory?

Sounds like it. What shared-buffers setting are you using? How much
RAM in the box?

shared_buffers = 256

That's not a lot --- 256*8K = 2MB. You might try something in the low
thousands.
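(As a sketch of what "low thousands" would look like in postgresql.conf; the values here are illustrative, not tuned recommendations from the thread:)

```
# postgresql.conf sketch -- each shared buffer is 8 KB, so:
shared_buffers = 4096        # 4096 * 8 KB = 32 MB of shared memory
# The kernel's shared-memory limit (SHMMAX) may need raising to match,
# and the postmaster must be restarted for this setting to take effect.
```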

RAM: 2.4GB, maybe? Not that familiar with HPUX mem setup...

swapinfo won't tell you anything about physical RAM. If you poke around
in SAM, I think there's some displays in there about hardware ...

regards, tom lane

#5Ed L.
pggeneral@bluepolka.net
In reply to: Ed L. (#1)
Re: Hash Join vs Nested Loops in 7.2.1 ...

Tom Lane wrote:

Second, the same query sometimes takes 10-50 seconds shortly after
possibly a dump or other high-data-volume queries are executed, after
which it then returns to 1 second execution time. Getting crowded out
of shared memory?

Sounds like it. What shared-buffers setting are you using? How much
RAM in the box?

shared_buffers = 256

That's not a lot --- 256*8K = 2MB. You might try something in the low
thousands.

RAM: 2.4GB, maybe? Not that familiar with HPUX mem setup...

SAM indicates 512MB of RAM. I upped the shared buffers from 256 to 4096, and
the hashjoin query came down from ~90 seconds to 10, still 10x slower than
the 1-sec nested loops. Is that a performance difference you'd expect
between hash and nested loops on this query because of EXISTS?

Ed

#6Ed L.
pggeneral@bluepolka.net
In reply to: Ed L. (#1)
Re: Hash Join vs Nested Loops in 7.2.1 ...

Ed Loehr wrote:

Second, the same query sometimes takes 10-50 seconds shortly after
possibly a dump or other high-data-volume queries are executed, after
which it then returns to 1 second execution time. Getting crowded out
of shared memory?

Sounds like it. What shared-buffers setting are you using? How much
RAM in the box?

shared_buffers = 256

That's not a lot --- 256*8K = 2MB. You might try something in the low
thousands.

SAM indicates 512MB of RAM. I upped the shared buffers from 256 to
4096, and the hashjoin query came down from ~90 seconds to 10, still 10x
slower than the 1-sec nested loops. Is that a performance difference
you'd expect between hash and nested loops on this query because of EXISTS?

What I neglected to mention was that the planner was *choosing* the slower
hashjoin plan over the much faster nested loop plan without any PGOPTIONS set
or any postgresql.conf changes to enable_*, thus the motivation for a "thumb
on the scales." After upping the number of shared buffers, it has begun
choosing the smart 1-second plan, apparently after a restart (not sure).
Thanks, Tom.

Ed

#7grant
grant@amadensor.com
In reply to: Ed L. (#1)
MDDB/MOLAP

Has there been any work on an MDDB extension to PostgreSQL? We are
moving to a MOLAP environment, and PostgreSQL is my favorite RDBMS. If
no one has worked on it, I guess I will have to see what I can do...

The only real obstacle is time.

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ed L. (#6)
Re: Hash Join vs Nested Loops in 7.2.1 ...

Ed Loehr <pggeneral@bluepolka.net> writes:

What I neglected to mention was that the planner was *choosing* the
slower hashjoin plan over the much faster nested loop plan without any
PGOPTIONS set or any postgresql.conf changes to enable_*, thus the
motivation for a "thumb on the scales." After upping the number of
shared buffers, it has begun choosing the smart 1-second plan,

Interesting. The estimated cost of indexscans is dependent on
shared_buffers, but not so dependent that I'd have expected it to make a
difference here. What were the EXPLAIN numbers you were getting, again?

regards, tom lane

#9Jean-Luc Lachance
jllachan@nsd.ca
In reply to: Ed L. (#1)
table alias not valid for delete

Hi all,

I can't specify an alias for the table I want to delete from.

Is this a bug or is that behavior mandated by the SQL standard?

JLL

#10Alexis Maldonado
amaldona@ctcd.cc.tx.us
In reply to: Ed L. (#1)
Sub-selects taking way too long..

Ok.. I know it's probably something I'm doing dumb..
but here it goes..

I have 2 tables that are the same:

"temp_table" and "table"

"temp_table" has 7,761 rows and "table" is empty

the columns for both tables are: ID (primary key sequence), index, column1,
column2

when I run:

Insert Into table
select index, column1, column2
from temp_table
where index NOT IN (select index from table)

it takes 40 MINUTES to execute..
I don't know what I'm doing wrong; both tables have "index" indexed..
help...

#11Ed L.
pggeneral@bluepolka.net
In reply to: Ed L. (#1)
Re: Hash Join vs Nested Loops in 7.2.1 ...

Tom Lane wrote:

What I neglected to mention was that the planner was *choosing* the
slower hashjoin plan over the much faster nested loop plan without any
PGOPTIONS set or any postgresql.conf changes to enable_*, thus the
motivation for a "thumb on the scales." After upping the number of
shared buffers, it has begun choosing the smart 1-second plan,

Interesting. The estimated cost of indexscans is dependent on
shared_buffers, but not so dependent that I'd have expected it to make a
difference here. What were the EXPLAIN numbers you were getting, again?

The default plan looked like the "-fn" plan below.

I guess I should also mention there are a number of columns in the 'story'
table that are not involved in the query or plans, but would add to the
'weight' of a row if that makes a difference to the planner. I omitted them
from my earlier listings thinking they were superfluous to this discussion.

Plan with PGOPTIONS = "-fn":

Limit  (cost=15409053054.71..15409053054.73 rows=1 width=12)
  ->  Unique  (cost=15409053054.71..15409053054.73 rows=1 width=12)
        ->  Sort  (cost=15409053054.71..15409053054.71 rows=9 width=12)
              ->  Nested Loop  (cost=100000000.00..15409053054.57 rows=9 width=12)
                    ->  Index Scan using word_idx on dict d1  (cost=0.00..5.98 rows=1 width=4)
                    ->  Index Scan using story_dict_pkey on story_dict ftd1  (cost=0.00..15309052993.63 rows=4398 width=8)
                          SubPlan
                            ->  Merge Join  (cost=429157.62..435130.62 rows=1 width=16)
                                  ->  Sort  (cost=13.11..13.11 rows=1 width=12)
                                        ->  Hash Join  (cost=5.98..13.10 rows=1 width=12)
                                              ->  Index Scan using story_dict_tk_idx on story_dict ftd2  (cost=0.00..6.59 rows=106 width=8)
                                              ->  Hash  (cost=5.98..5.98 rows=1 width=4)
                                                    ->  Index Scan using word_idx on dict d2  (cost=0.00..5.98 rows=1 width=4)
                                  ->  Sort  (cost=429144.50..429144.50 rows=2389196 width=4)
                                        ->  Seq Scan on story ft2  (cost=0.00..86701.96 rows=2389196 width=4)
                            ->  Merge Join  (cost=429157.62..435130.62 rows=1 width=16)
                                  ->  Sort  (cost=13.11..13.11 rows=1 width=12)
                                        ->  Hash Join  (cost=5.98..13.10 rows=1 width=12)
                                              ->  Index Scan using story_dict_tk_idx on story_dict ftd3  (cost=0.00..6.59 rows=106 width=8)
                                              ->  Hash  (cost=5.98..5.98 rows=1 width=4)
                                                    ->  Index Scan using word_idx on dict d3  (cost=0.00..5.98 rows=1 width=4)
                                  ->  Sort  (cost=429144.50..429144.50 rows=2389196 width=4)
                                        ->  Seq Scan on story ft3  (cost=0.00..86701.96 rows=2389196 width=4)

Plan with PGOPTIONS = "-fh":

Limit  (cost=635283.31..635283.33 rows=1 width=12)
  ->  Unique  (cost=635283.31..635283.33 rows=1 width=12)
        ->  Sort  (cost=635283.31..635283.31 rows=9 width=12)
              ->  Nested Loop  (cost=0.00..635283.17 rows=9 width=12)
                    ->  Index Scan using word_idx on dict d1  (cost=0.00..5.98 rows=1 width=4)
                    ->  Index Scan using story_dict_pkey on story_dict ftd1  (cost=0.00..635222.22 rows=4398 width=8)
                          SubPlan
                            ->  Nested Loop  (cost=0.00..16.07 rows=1 width=16)
                                  ->  Nested Loop  (cost=0.00..12.00 rows=1 width=12)
                                        ->  Index Scan using word_idx on dict d2  (cost=0.00..5.98 rows=1 width=4)
                                        ->  Index Scan using story_dict_pkey on story_dict ftd2  (cost=0.00..6.01 rows=1 width=8)
                                  ->  Index Scan using story_pkey on story ft2  (cost=0.00..4.06 rows=1 width=4)
                            ->  Nested Loop  (cost=0.00..16.07 rows=1 width=16)
                                  ->  Nested Loop  (cost=0.00..12.00 rows=1 width=12)
                                        ->  Index Scan using word_idx on dict d3  (cost=0.00..5.98 rows=1 width=4)
                                        ->  Index Scan using story_dict_pkey on story_dict ftd3  (cost=0.00..6.01 rows=1 width=8)
                                  ->  Index Scan using story_pkey on story ft3  (cost=0.00..4.06 rows=1 width=4)

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jean-Luc Lachance (#9)
Re: table alias not valid for delete

Jean-Luc Lachance <jllachan@nsd.ca> writes:

I can't specify an alias for the table I want to delete from.
Is this a bug or is that behavior mandated by the SQL standard?

No, and yes.

regards, tom lane

#13Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Jean-Luc Lachance (#9)
Re: table alias not valid for delete

On Tue, 9 Apr 2002, Jean-Luc Lachance wrote:

Hi all,

I can't specify an alias for the table I want to delete from.

Is this a bug or is that behavior mandated by the SQL standard?

Pretty sure it's at least SQL 92 spec...

<delete statement: searched> seems to have the form of:
DELETE FROM <table name> [WHERE <search condition>]
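A quick illustration against the schema upthread (7.2 behavior as described above; the key value is hypothetical):

```sql
-- Accepted: the standard searched DELETE names the table directly
DELETE FROM story WHERE story.key = 42;

-- Rejected by 7.2 (not in the SQL92 <delete statement: searched> form):
-- DELETE FROM story s WHERE s.key = 42;
```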

#14Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Alexis Maldonado (#10)
Re: Sub-selects taking way too long..

On Tue, 9 Apr 2002, Alexis Maldonado wrote:

Ok.. I know it's probably something I'm doing dumb..
but here it goes..

I have 2 tables that are the same:

"temp_table" and "table"

"temp_table" has 7,761 rows and "table" is empty

the columns for both tables are: ID (primary key sequence), index, column1,
column2

when I run:

Insert Into table
select index, column1, column2
from temp_table
where index NOT IN (select index from table)

IN is unfortunately implemented slowly (I think the FAQ answer has more
details)

You can often get better performance using exists, I think the equivalent
would be:
insert into table
select index, column1, column2 from temp_table
where NOT EXISTS (select * from table where table.index=temp_table.index)

#15Alexis Maldonado
amaldona@ctcd.cc.tx.us
In reply to: Stephan Szabo (#14)
Re: Sub-selects taking way too long..

thanks i'll try that :)

----- Original Message -----
From: "Stephan Szabo" <sszabo@megazone23.bigpanda.com>
To: "Alexis Maldonado" <amaldona@ctcd.cc.tx.us>
Cc: <pgsql-general@postgresql.org>
Sent: Tuesday, April 09, 2002 3:56 PM
Subject: Re: [GENERAL] Sub-selects taking way too long..

On Tue, 9 Apr 2002, Alexis Maldonado wrote:

Ok.. I know it's probably something I'm doing dumb..
but here it goes..

I have 2 tables that are the same:

"temp_table" and "table"

"temp_table" has 7,761 rows and "table" is empty

the columns for both tables are: ID (primary key sequence), index, column1, column2

when I run:

Insert Into table
select index, column1, column2
from temp_table
where index NOT IN (select index from table)

IN is unfortunately implemented slowly (I think the FAQ answer has more
details)

You can often get better performance using exists, I think the equivalent
would be:
insert into table
select index, column1, column2 from temp_table
where NOT EXISTS (select * from table where table.index=temp_table.index)


#16Alexis Maldonado
alex@ctc-disted.net
In reply to: Stephan Szabo (#14)
Re: Sub-selects taking way too long..

cool.. used the EXISTS and now it does it in 3 seconds instead of 40
minutes..
wow.. heheh

ThanX!! hehe

----- Original Message -----
From: "Alexis Maldonado" <amaldona@ctcd.cc.tx.us>
To: <pgsql-general@postgresql.org>
Sent: Tuesday, April 09, 2002 4:09 PM
Subject: Re: [GENERAL] Sub-selects taking way too long..

thanks i'll try that :)

----- Original Message -----
From: "Stephan Szabo" <sszabo@megazone23.bigpanda.com>
To: "Alexis Maldonado" <amaldona@ctcd.cc.tx.us>
Cc: <pgsql-general@postgresql.org>
Sent: Tuesday, April 09, 2002 3:56 PM
Subject: Re: [GENERAL] Sub-selects taking way too long..

On Tue, 9 Apr 2002, Alexis Maldonado wrote:

Ok.. I know it's probably something I'm doing dumb..
but here it goes..

I have 2 tables that are the same:

"temp_table" and "table"

"temp_table" has 7,761 rows and "table" is empty

the columns for both tables are: ID (primary key sequence), index, column1, column2

when I run:

Insert Into table
select index, column1, column2
from temp_table
where index NOT IN (select index from table)

IN is unfortunately implemented slowly (I think the FAQ answer has more
details)

You can often get better performance using exists, I think the equivalent
would be:
insert into table
select index, column1, column2 from temp_table
where NOT EXISTS (select * from table where table.index=temp_table.index)