Hash Join vs Nested Loops in 7.2.1 ...
I have a 7.2.1 query with two peculiar characteristics and wondered if anyone
could offer some insight.
First, my query takes 90 seconds with a hash join, but only 1 second with
nested loops.
Second, the same query sometimes takes 10-50 seconds shortly after possibly a
dump or other high-data-volume queries are executed, after which it then
returns to 1 second execution time. Getting crowded out of shared memory?
Finally, I am inclined to turn off hash joins altogether. What should I
expect to lose in doing so?
Schema, query, and plans shown below...
Thanks,
Ed
-- 700,000 rows
CREATE TABLE story (
    value TEXT,
    key SERIAL
);
CREATE UNIQUE INDEX story_pkey ON story USING btree (key);
-- 700,000 rows
CREATE TABLE dict (
    word VARCHAR,
    key SERIAL
);
CREATE UNIQUE INDEX dict_pkey ON dict USING btree (key);
CREATE UNIQUE INDEX dict_word_key ON dict USING btree (word);
-- 28,000,000 rows
CREATE TABLE story_dict (
    dictkey INTEGER NOT NULL,
    storykey INTEGER NOT NULL
);
CREATE UNIQUE INDEX story_dict_pkey ON story_dict(dictkey, storykey);
CREATE INDEX story_dict_tk_idx ON story_dict(storykey);
-- Query:
-- ======
SELECT DISTINCT ftd1.storykey
FROM story_dict ftd1, dict d1
WHERE d1.word = 'foo'
  AND d1.key = ftd1.dictkey
  AND EXISTS (
      SELECT ft2.key
      FROM story ft2, story_dict ftd2, dict d2
      WHERE d2.word = 'bar'
        AND d2.key = ftd2.dictkey
        AND ftd2.storykey = ft2.key
        AND ftd2.storykey = ftd1.storykey)
  AND EXISTS (
      SELECT ft3.key
      FROM story ft3, story_dict ftd3, dict d3
      WHERE d3.word = 'baz'
        AND d3.key = ftd3.dictkey
        AND ftd3.storykey = ft3.key
        AND ftd3.storykey = ftd1.storykey)
ORDER BY ftd1.storykey
LIMIT 1000;
Plan with PGOPTIONS = "-fn":
Limit  (cost=15409053054.71..15409053054.73 rows=1 width=12)
  ->  Unique  (cost=15409053054.71..15409053054.73 rows=1 width=12)
        ->  Sort  (cost=15409053054.71..15409053054.71 rows=9 width=12)
              ->  Nested Loop  (cost=100000000.00..15409053054.57 rows=9 width=12)
                    ->  Index Scan using word_idx on dict d1  (cost=0.00..5.98 rows=1 width=4)
                    ->  Index Scan using story_dict_pkey on story_dict ftd1  (cost=0.00..15309052993.63 rows=4398 width=8)
                          SubPlan
                            ->  Merge Join  (cost=429157.62..435130.62 rows=1 width=16)
                                  ->  Sort  (cost=13.11..13.11 rows=1 width=12)
                                        ->  Hash Join  (cost=5.98..13.10 rows=1 width=12)
                                              ->  Index Scan using story_dict_tk_idx on story_dict ftd2  (cost=0.00..6.59 rows=106 width=8)
                                              ->  Hash  (cost=5.98..5.98 rows=1 width=4)
                                                    ->  Index Scan using word_idx on dict d2  (cost=0.00..5.98 rows=1 width=4)
                                  ->  Sort  (cost=429144.50..429144.50 rows=2389196 width=4)
                                        ->  Seq Scan on story ft2  (cost=0.00..86701.96 rows=2389196 width=4)
                            ->  Merge Join  (cost=429157.62..435130.62 rows=1 width=16)
                                  ->  Sort  (cost=13.11..13.11 rows=1 width=12)
                                        ->  Hash Join  (cost=5.98..13.10 rows=1 width=12)
                                              ->  Index Scan using story_dict_tk_idx on story_dict ftd3  (cost=0.00..6.59 rows=106 width=8)
                                              ->  Hash  (cost=5.98..5.98 rows=1 width=4)
                                                    ->  Index Scan using word_idx on dict d3  (cost=0.00..5.98 rows=1 width=4)
                                  ->  Sort  (cost=429144.50..429144.50 rows=2389196 width=4)
                                        ->  Seq Scan on story ft3  (cost=0.00..86701.96 rows=2389196 width=4)
Plan with PGOPTIONS = "-fh":
Limit  (cost=635283.31..635283.33 rows=1 width=12)
  ->  Unique  (cost=635283.31..635283.33 rows=1 width=12)
        ->  Sort  (cost=635283.31..635283.31 rows=9 width=12)
              ->  Nested Loop  (cost=0.00..635283.17 rows=9 width=12)
                    ->  Index Scan using word_idx on dict d1  (cost=0.00..5.98 rows=1 width=4)
                    ->  Index Scan using story_dict_pkey on story_dict ftd1  (cost=0.00..635222.22 rows=4398 width=8)
                          SubPlan
                            ->  Nested Loop  (cost=0.00..16.07 rows=1 width=16)
                                  ->  Nested Loop  (cost=0.00..12.00 rows=1 width=12)
                                        ->  Index Scan using word_idx on dict d2  (cost=0.00..5.98 rows=1 width=4)
                                        ->  Index Scan using story_dict_pkey on story_dict ftd2  (cost=0.00..6.01 rows=1 width=8)
                                  ->  Index Scan using story_pkey on story ft2  (cost=0.00..4.06 rows=1 width=4)
                            ->  Nested Loop  (cost=0.00..16.07 rows=1 width=16)
                                  ->  Nested Loop  (cost=0.00..12.00 rows=1 width=12)
                                        ->  Index Scan using word_idx on dict d3  (cost=0.00..5.98 rows=1 width=4)
                                        ->  Index Scan using story_dict_pkey on story_dict ftd3  (cost=0.00..6.01 rows=1 width=8)
                                  ->  Index Scan using story_pkey on story ft3  (cost=0.00..4.06 rows=1 width=4)
Ed Loehr <pggeneral@bluepolka.net> writes:
I have a 7.2.1 query with two peculiar characteristics and wondered if
anyone could offer some insight.
First, my query takes 90 seconds with a hash join, but only 1 second with
nested loops.
Probably because the EXISTS subplans only need to fetch one row from
each table they access; that's more or less an ideal case for nestloop
indexscans. Nestloops do not scale very well to large retrieval sets,
however...
Second, the same query sometimes takes 10-50 seconds shortly after
possibly a dump or other high-data-volume queries are executed, after
which it then returns to 1 second execution time. Getting crowded out
of shared memory?
Sounds like it. What shared-buffers setting are you using? How much
RAM in the box?
Finally, I am inclined to turn off hash joins altogether.
That would be a remarkably foolish thing to do. Certainly this query
is not a reason to do so; AFAICS the planner will do this one just fine
without any thumb on the scales.
regards, tom lane
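Tom's "thumb on the scales" can also be applied per-session rather than through PGOPTIONS: the planner's enable_* settings can be flipped with SET for a single connection and restored afterwards. A minimal sketch (enable_hashjoin is a real 7.2 setting; the query fragment is abbreviated from the full query earlier in the thread):

```sql
-- Disable hash joins for this session only, inspect the plan, re-enable.
SET enable_hashjoin TO off;
EXPLAIN
SELECT DISTINCT ftd1.storykey
FROM story_dict ftd1, dict d1
WHERE d1.word = 'foo' AND d1.key = ftd1.dictkey;
SET enable_hashjoin TO on;
```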
Tom Lane wrote:
Second, the same query sometimes takes 10-50 seconds shortly after
possibly a dump or other high-data-volume queries are executed, after
which it then returns to 1 second execution time. Getting crowded out
of shared memory?
Sounds like it. What shared-buffers setting are you using? How much
RAM in the box?
shared_buffers = 256
max_fsm_relations = 500
(defaults for the rest)
RAM: 2.4GB, maybe? Not that familiar with HPUX mem setup...
(OS: HP-UX B.11.00 U 9000/800)
$ swapinfo -mt
               Mb      Mb      Mb   PCT  START/      Mb
TYPE        AVAIL    USED    FREE  USED   LIMIT RESERVE PRI  NAME
dev          2048     147    1901    7%       0       -   1  /dev/vg00/lvol2
reserve         -     312    -312
memory        369     351      18   95%
total        2417     810    1607   34%       -       0   -
AFAICS the planner will do this one just fine
without any thumb on the scales.
How do I find the thumb?
Ed
Ed Loehr <pggeneral@bluepolka.net> writes:
Tom Lane wrote:
Second, the same query sometimes takes 10-50 seconds shortly after
possibly a dump or other high-data-volume queries are executed, after
which it then returns to 1 second execution time. Getting crowded out
of shared memory?
Sounds like it. What shared-buffers setting are you using? How much
RAM in the box?
shared_buffers = 256
That's not a lot --- 256*8K = 2MB. You might try something in the low
thousands.
RAM: 2.4GB, maybe? Not that familiar with HPUX mem setup...
swapinfo won't tell you anything about physical RAM. If you poke around
in SAM, I think there's some displays in there about hardware ...
regards, tom lane
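In postgresql.conf terms, Tom's suggestion is a one-line change (4096 is the value Ed tries later in this thread; it is illustrative, not a tuned recommendation):

```
# postgresql.conf -- illustrative sizing only
shared_buffers = 4096    # 4096 * 8KB = 32MB of buffer cache (default 256 = 2MB)
```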
Tom Lane wrote:
Second, the same query sometimes takes 10-50 seconds shortly after
possibly a dump or other high-data-volume queries are executed, after
which it then returns to 1 second execution time. Getting crowded out
of shared memory?
Sounds like it. What shared-buffers setting are you using? How much
RAM in the box?
shared_buffers = 256
That's not a lot --- 256*8K = 2MB. You might try something in the low
thousands.
RAM: 2.4GB, maybe? Not that familiar with HPUX mem setup...
SAM indicates 512MB of RAM. I upped the shared buffers from 256 to 4096, and
the hashjoin query came down from ~90 seconds to 10, still 10x slower than
the 1-sec nested loops. Is that a performance difference you'd expect
between hash and nested loops on this query because of EXISTS?
Ed
Ed Loehr wrote:
Second, the same query sometimes takes 10-50 seconds shortly after
possibly a dump or other high-data-volume queries are executed, after
which it then returns to 1 second execution time. Getting crowded out
of shared memory?
Sounds like it. What shared-buffers setting are you using? How much
RAM in the box?
shared_buffers = 256
That's not a lot --- 256*8K = 2MB. You might try something in the low
thousands.
SAM indicates 512MB of RAM. I upped the shared buffers from 256 to
4096, and the hashjoin query came down from ~90 seconds to 10, still 10x
slower than the 1-sec nested loops. Is that a performance difference
you'd expect between hash and nested loops on this query because of EXISTS?
What I neglected to mention was that the planner was *choosing* the slower
hashjoin plan over the much faster nested loop plan without any PGOPTIONS set
or any postgresql.conf changes to enable_*, thus the motivation for a "thumb
on the scales." After upping the number of shared buffers, it has begun
choosing the smart 1-second plan, apparently after a restart (not sure).
Thanks, Tom.
Ed
Has there been any work on an MDDB extension to PostgreSQL? We are
moving to a MOLAP environment, and PostgreSQL is my favorite RDBMS. If
no one has worked on it, I guess I will have to see what I can do...
The only real obstacle is time.
Ed Loehr <pggeneral@bluepolka.net> writes:
What I neglected to mention was that the planner was *choosing* the
slower hashjoin plan over the much faster nested loop plan without any
PGOPTIONS set or any postgresql.conf changes to enable_*, thus the
motivation for a "thumb on the scales." After upping the number of
shared buffers, it has begun choosing the smart 1-second plan,
Interesting. The estimated cost of indexscans is dependent on
shared_buffers, but not so dependent that I'd have expected it to make a
difference here. What were the EXPLAIN numbers you were getting, again?
regards, tom lane
Hi all,
I can't specify an alias for the table I want to delete from.
Is this a bug or is that behavior mandated by the SQL standard?
JLL
Ok.. I know it's probably something I'm doing dumb..
but here it goes..
I have 2 tables that are the same:
"temp_table" and "table"
"temp_table" has 7,761 rows and "table" is empty
the columns for both tables are: ID (primary key sequence), index, column1,
column2
when I run:
Insert Into table
select index, column1, column2
from temp_table
where index NOT IN (select index from table)
it takes 40 MINUTES to execute..
I don't know what I'm doing wrong here; both tables have "index" indexed..
help...
Tom Lane wrote:
What I neglected to mention was that the planner was *choosing* the
slower hashjoin plan over the much faster nested loop plan without any
PGOPTIONS set or any postgresql.conf changes to enable_*, thus the
motivation for a "thumb on the scales." After upping the number of
shared buffers, it has begun choosing the smart 1-second plan,
Interesting. The estimated cost of indexscans is dependent on
shared_buffers, but not so dependent that I'd have expected it to make a
difference here. What were the EXPLAIN numbers you were getting, again?
The default plan looked like the "-fn" plan below.
I guess I should also mention there are a number of columns in the 'story'
table that are not involved in the query or plans, but would add to the
'weight' of a row if that makes a difference to the planner. I omitted them
from my earlier listings thinking they were superfluous to this discussion.
Plan with PGOPTIONS = "-fn":
Limit  (cost=15409053054.71..15409053054.73 rows=1 width=12)
  ->  Unique  (cost=15409053054.71..15409053054.73 rows=1 width=12)
        ->  Sort  (cost=15409053054.71..15409053054.71 rows=9 width=12)
              ->  Nested Loop  (cost=100000000.00..15409053054.57 rows=9 width=12)
                    ->  Index Scan using word_idx on dict d1  (cost=0.00..5.98 rows=1 width=4)
                    ->  Index Scan using story_dict_pkey on story_dict ftd1  (cost=0.00..15309052993.63 rows=4398 width=8)
                          SubPlan
                            ->  Merge Join  (cost=429157.62..435130.62 rows=1 width=16)
                                  ->  Sort  (cost=13.11..13.11 rows=1 width=12)
                                        ->  Hash Join  (cost=5.98..13.10 rows=1 width=12)
                                              ->  Index Scan using story_dict_tk_idx on story_dict ftd2  (cost=0.00..6.59 rows=106 width=8)
                                              ->  Hash  (cost=5.98..5.98 rows=1 width=4)
                                                    ->  Index Scan using word_idx on dict d2  (cost=0.00..5.98 rows=1 width=4)
                                  ->  Sort  (cost=429144.50..429144.50 rows=2389196 width=4)
                                        ->  Seq Scan on story ft2  (cost=0.00..86701.96 rows=2389196 width=4)
                            ->  Merge Join  (cost=429157.62..435130.62 rows=1 width=16)
                                  ->  Sort  (cost=13.11..13.11 rows=1 width=12)
                                        ->  Hash Join  (cost=5.98..13.10 rows=1 width=12)
                                              ->  Index Scan using story_dict_tk_idx on story_dict ftd3  (cost=0.00..6.59 rows=106 width=8)
                                              ->  Hash  (cost=5.98..5.98 rows=1 width=4)
                                                    ->  Index Scan using word_idx on dict d3  (cost=0.00..5.98 rows=1 width=4)
                                  ->  Sort  (cost=429144.50..429144.50 rows=2389196 width=4)
                                        ->  Seq Scan on story ft3  (cost=0.00..86701.96 rows=2389196 width=4)
Plan with PGOPTIONS = "-fh":
Limit  (cost=635283.31..635283.33 rows=1 width=12)
  ->  Unique  (cost=635283.31..635283.33 rows=1 width=12)
        ->  Sort  (cost=635283.31..635283.31 rows=9 width=12)
              ->  Nested Loop  (cost=0.00..635283.17 rows=9 width=12)
                    ->  Index Scan using word_idx on dict d1  (cost=0.00..5.98 rows=1 width=4)
                    ->  Index Scan using story_dict_pkey on story_dict ftd1  (cost=0.00..635222.22 rows=4398 width=8)
                          SubPlan
                            ->  Nested Loop  (cost=0.00..16.07 rows=1 width=16)
                                  ->  Nested Loop  (cost=0.00..12.00 rows=1 width=12)
                                        ->  Index Scan using word_idx on dict d2  (cost=0.00..5.98 rows=1 width=4)
                                        ->  Index Scan using story_dict_pkey on story_dict ftd2  (cost=0.00..6.01 rows=1 width=8)
                                  ->  Index Scan using story_pkey on story ft2  (cost=0.00..4.06 rows=1 width=4)
                            ->  Nested Loop  (cost=0.00..16.07 rows=1 width=16)
                                  ->  Nested Loop  (cost=0.00..12.00 rows=1 width=12)
                                        ->  Index Scan using word_idx on dict d3  (cost=0.00..5.98 rows=1 width=4)
                                        ->  Index Scan using story_dict_pkey on story_dict ftd3  (cost=0.00..6.01 rows=1 width=8)
                                  ->  Index Scan using story_pkey on story ft3  (cost=0.00..4.06 rows=1 width=4)
Jean-Luc Lachance <jllachan@nsd.ca> writes:
I can't specify an alias for the table I want to delete from.
Is this a bug or is that behavior mandated by the SQL standard?
No, and yes.
regards, tom lane
On Tue, 9 Apr 2002, Jean-Luc Lachance wrote:
Hi all,
I can't specify an alias for the table I want to delete from.
Is this a bug or is that behavior mandated by the SQL standard?
Pretty sure it's at least SQL 92 spec...
<delete statement: searched> seems to have the form of:
DELETE FROM <table name> [WHERE <search condition>]
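Given that grammar, the usual workaround is to qualify correlated references with the full table name instead of an alias. A sketch using the story/story_dict tables from earlier in the thread (the cleanup condition itself is hypothetical):

```sql
-- No alias is allowed on the DELETE target, but the bare table name
-- can still be referenced from a correlated subquery.
DELETE FROM story_dict
WHERE NOT EXISTS (SELECT 1 FROM story s WHERE s.key = story_dict.storykey);
```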
On Tue, 9 Apr 2002, Alexis Maldonado wrote:
Ok.. I know it's probably something I'm doing dumb..
but here it goes..
I have 2 tables that are the same:
"temp_table" and "table"
"temp_table" has 7,761 rows and "table" is empty
the columns for both tables are: ID (primary key sequence), index, column1,
column2
when I run:
Insert Into table
select index, column1, column2
from temp_table
where index NOT IN (select index from table)
IN is unfortunately implemented slowly (I think the FAQ answer has more
details)
You can often get better performance using exists, I think the equivalent
would be:
insert into table
select index, column1, column2 from temp_table
where NOT EXISTS (select * from table where table.index=temp_table.index)
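One way to confirm the rewrite helps is to EXPLAIN it and check that the correlated subplan probes the index on "index" instead of re-running a full sub-select per row. A sketch using the table names as given in the thread (note "table" and "index" look like placeholders; as real identifiers they are reserved words and would need quoting):

```sql
-- The EXISTS subplan should show an index scan on table.index,
-- executed once per temp_table row.
EXPLAIN
insert into table
select index, column1, column2 from temp_table
where NOT EXISTS (select * from table where table.index = temp_table.index);
```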
Thanks, I'll try that :)
----- Original Message -----
From: "Stephan Szabo" <sszabo@megazone23.bigpanda.com>
To: "Alexis Maldonado" <amaldona@ctcd.cc.tx.us>
Cc: <pgsql-general@postgresql.org>
Sent: Tuesday, April 09, 2002 3:56 PM
Subject: Re: [GENERAL] Sub-selects taking way too long..
On Tue, 9 Apr 2002, Alexis Maldonado wrote:
Ok.. I know it's probably something I'm doing dumb..
but here it goes..
I have 2 tables that are the same:
"temp_table" and "table"
"temp_table" has 7,761 rows and "table" is empty
the columns for both tables are: ID (primary key sequence), index, column1,
column2
when I run:
Insert Into table
select index, column1, column2
from temp_table
where index NOT IN (select index from table)
IN is unfortunately implemented slowly (I think the FAQ answer has more
details)
You can often get better performance using exists, I think the equivalent
would be:
insert into table
select index, column1, column2 from temp_table
where NOT EXISTS (select * from table where table.index=temp_table.index)
cool.. used the EXISTS and now it does it in 3 seconds instead of 40
minutes..
wow.. heheh
ThanX!! hehe
----- Original Message -----
From: "Alexis Maldonado" <amaldona@ctcd.cc.tx.us>
To: <pgsql-general@postgresql.org>
Sent: Tuesday, April 09, 2002 4:09 PM
Subject: Re: [GENERAL] Sub-selects taking way too long..
Thanks, I'll try that :)
----- Original Message -----
From: "Stephan Szabo" <sszabo@megazone23.bigpanda.com>
To: "Alexis Maldonado" <amaldona@ctcd.cc.tx.us>
Cc: <pgsql-general@postgresql.org>
Sent: Tuesday, April 09, 2002 3:56 PM
Subject: Re: [GENERAL] Sub-selects taking way too long..
On Tue, 9 Apr 2002, Alexis Maldonado wrote:
Ok.. I know it's probably something I'm doing dumb..
but here it goes..
I have 2 tables that are the same:
"temp_table" and "table"
"temp_table" has 7,761 rows and "table" is empty
the columns for both tables are: ID (primary key sequence), index, column1,
column2
when I run:
Insert Into table
select index, column1, column2
from temp_table
where index NOT IN (select index from table)
IN is unfortunately implemented slowly (I think the FAQ answer has more
details)
You can often get better performance using exists, I think the equivalent
would be:
insert into table
select index, column1, column2 from temp_table
where NOT EXISTS (select * from table where table.index=temp_table.index)