what's going on here?

Started by Ben · about 25 years ago · 11 messages · general
#1Ben
bench@silentmedia.com

We have two postgres-driven web pages, and we've noticed in the last few
days that one is loading about 10 times faster than the other. Looking
into it, we found out that postgres has very different execution paths for
what is essentially the same query, and this makes no sense to me.

Sorry for the deluge of info here, but maybe it'll explain why this is
happening...

First, here are the explains. Note the only parameter that changes is
playlist.stream, and yet for some reason the second query makes use of
indices which are unrelated to playlist.stream.

music=# explain select users.name as username, playlist.id as id,
paths.len as len, paths.path as path, timestamp(playlist.added) as added,
timestamp(playlist.played) as played from users, playlist, paths where
paths.md5=files.md5 and files.song=playlist.songid and
playlist.userid=users.id and playlist.played is null and playlist.stream=1
order by playlist.added, playlist.id;
NOTICE: QUERY PLAN:

Sort  (cost=10845.28..10845.28 rows=1857 width=80)
  ->  Hash Join  (cost=10007.93..10744.44 rows=1857 width=80)
        ->  Merge Join  (cost=10006.44..10668.17 rows=1857 width=64)
              ->  Sort  (cost=3754.03..3754.03 rows=2321 width=36)
                    ->  Hash Join  (cost=306.62..3624.28 rows=2321 width=36)
                          ->  Seq Scan on files  (cost=0.00..1072.13 rows=55413 width=16)
                          ->  Hash  (cost=300.81..300.81 rows=2321 width=20)
                                ->  Seq Scan on playlist  (cost=0.00..300.81 rows=2321 width=20)
              ->  Sort  (cost=6252.41..6252.41 rows=50617 width=28)
                    ->  Seq Scan on paths  (cost=0.00..1677.17 rows=50617 width=28)
        ->  Hash  (cost=1.39..1.39 rows=39 width=16)
              ->  Seq Scan on users  (cost=0.00..1.39 rows=39 width=16)

EXPLAIN
music=# explain select users.name as username, playlist.id as id,
paths.len as len, paths.path as path, timestamp(playlist.added) as added,
timestamp(playlist.played) as played from users, playlist, paths where
paths.md5=files.md5 and files.song=playlist.songid and
playlist.userid=users.id and playlist.played is null and playlist.stream=2
order by playlist.added, playlist.id;
NOTICE: QUERY PLAN:

Sort  (cost=1467.44..1467.44 rows=164 width=80)
  ->  Hash Join  (cost=1.49..1461.40 rows=164 width=80)
        ->  Nested Loop  (cost=0.00..1452.85 rows=164 width=64)
              ->  Nested Loop  (cost=0.00..771.76 rows=205 width=36)
                    ->  Seq Scan on playlist  (cost=0.00..300.81 rows=205 width=20)
                    ->  Index Scan using files_song_btree_key on files  (cost=0.00..2.28 rows=1 width=16)
              ->  Index Scan using paths_md5_btree_key on paths  (cost=0.00..3.31 rows=1 width=28)
        ->  Hash  (cost=1.39..1.39 rows=39 width=16)
              ->  Seq Scan on users  (cost=0.00..1.39 rows=39 width=16)

EXPLAIN

Okay, so now the relevant schema:

Table "users"
Attribute | Type | Modifier
-------------+-----------+------------------------------------------------
id | integer | not null default nextval('users_id_seq'::text)
name | text | not null
Indices: users_name_key,
users_pkey

Index "users_name_key"
Attribute | Type
-----------+------
name | text
unique btree

Index "users_pkey"
Attribute | Type
-----------+---------
id | integer
unique btree (primary key)

Table "playlist"
Attribute | Type | Modifier
-----------+---------+---------------------------------------------------
id | integer | not null default nextval('playlist_id_seq'::text)
songid | integer |
userid | integer |
added | integer |
played | integer |
stream | integer |
Indices: playlist_added_key,
playlist_pkey,
playlist_played_key,
playlist_songid_key

Index "playlist_added_key"
Attribute | Type
-----------+---------
added | integer
btree

Index "playlist_pkey"
Attribute | Type
-----------+---------
id | integer
unique btree (primary key)

Index "playlist_played_key"
Attribute | Type
-----------+---------
played | integer
btree

Index "playlist_songid_key"
Attribute | Type
-----------+---------
songid | integer
btree

Table "paths"
Attribute | Type | Modifier
-----------+-----------+----------
md5 | char(32) | not null
path | text | not null
len | integer |
Indices: paths_md5_btree_key,
paths_md5_key,
paths_path_key

Index "paths_md5_btree_key"
Attribute | Type
-----------+----------
md5 | char(32)
btree

Index "paths_md5_key"
Attribute | Type
-----------+----------
md5 | char(32)
hash

Index "paths_path_key"
Attribute | Type
-----------+------
path | text
btree

Table "files"
Attribute | Type | Modifier
-----------+----------+----------
song | integer | not null
md5 | char(32) | not null
Indices: files_md5_btree_key,
files_md5_key,
files_song_btree_key,
files_song_key

Index "files_md5_btree_key"
Attribute | Type
-----------+----------
md5 | char(32)
btree

Index "files_md5_key"
Attribute | Type
-----------+----------
md5 | char(32)
hash

Index "files_song_btree_key"
Attribute | Type
-----------+---------
song | integer
btree

Index "files_song_key"
Attribute | Type
-----------+---------
song | integer
hash

Interestingly, the sequential scan on playlist claims to be returning 2000
results for stream=1 and only 200 for stream=2. I'm not sure which part of
the where clause this guess comes from, because the playlist table has
equal numbers of entries for both streams.

#2Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Ben (#1)
Re: what's going on here?

Has vacuum analyze been run on both recently? What is the maximum number
of rows with a particular stream value, and how many does each of 1 and 2
actually have?

Interestingly, the sequential scan on playlist claims to be returning 2000
results for stream=1 and only 200 for stream=2. I'm not sure which part of
the where clause this guess comes from, because the playlist table has
equal numbers of entries for both streams.

#3Ben
bench@silentmedia.com
In reply to: Stephan Szabo (#2)
Re: what's going on here?

Every night. There are 6223 rows with stream=2 and 7041 rows with
stream=1. At any given time, there will be between 1 and 30 rows with
played=null for both values.

On Fri, 9 Mar 2001, Stephan Szabo wrote:

Has vacuum analyze been run on both recently? What is the maximum number
of rows with a particular stream value, and how many does each of 1 and 2
actually have?

Interestingly, the sequential scan on playlist claims to be returning 2000
results for stream=1 and only 200 for stream=2. I'm not sure which part of
the where clause this guess comes from, because the playlist table has
equal numbers of entries for both streams.

#4Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Ben (#3)
Re: what's going on here?

Hmm, I also notice that it's getting very different numbers for
rows from files as well.

On Fri, 9 Mar 2001, Ben wrote:

Every night. There are 6223 rows with stream=2 and 7041 rows with
stream=1. At any given time, there will be between 1 to 30 rows with
played=null for both values.

On Fri, 9 Mar 2001, Stephan Szabo wrote:

Has vacuum analyze been run on both recently? What is the maximum number
of rows with a particular stream value, and how many does each of 1 and 2
actually have?

Interestingly, the sequential scan on playlist claims to be returning 2000
results for stream=1 and only 200 for stream=2. I'm not sure which part of
the where clause this guess comes from, because the playlist table has
equal numbers of entries for both streams.

#5Ben
bench@silentmedia.com
In reply to: Stephan Szabo (#4)
Re: what's going on here?

I think that's because the stream=1 query is doing a sequential scan on
files, so it has to look at everything, whereas the index scan in the
stream=2 query only looks at the important rows. (Which in this case was
just 1 row.)

I think the question is: why is one using an index scan and the other
using a sequential scan?

On Fri, 9 Mar 2001, Stephan Szabo wrote:

Hmm, I also notice that it's getting very different numbers for
rows from files as well.

On Fri, 9 Mar 2001, Ben wrote:

Every night. There are 6223 rows with stream=2 and 7041 rows with
stream=1. At any given time, there will be between 1 to 30 rows with
played=null for both values.

On Fri, 9 Mar 2001, Stephan Szabo wrote:

Has vacuum analyze been run on both recently? What is the maximum number
of rows with a particular stream value, and how many does each of 1 and 2
actually have?

Interestingly, the sequential scan on playlist claims to be returning 2000
results for stream=1 and only 200 for stream=2. I'm not sure which part of
the where clause this guess comes from, because the playlist table has
equal numbers of entries for both streams.

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ben (#1)
Re: what's going on here?

Ben <bench@silentmedia.com> writes:

music=# explain ... where ... playlist.stream=1
-> Seq Scan on playlist (cost=0.00..300.81 rows=2321 width=20)

music=# explain ... where ... playlist.stream=2
-> Seq Scan on playlist (cost=0.00..300.81 rows=205 width=20)

I am betting that 1 is the most common value in playlist.stream, or at
least is being chosen as the most common value by VACUUM ANALYZE's
not-totally-accurate estimation process. The 2321 rowcount estimate
then falls out of the stored statistic for the most common value's
frequency. In the second case, the estimator knows that 2 is *not* the
most common value, but it has absolutely no statistical basis on which
to guess what the frequency really is. I think it uses 1/10th of the
most common frequency for anything that's not the most common value
(look in utils/adt/selfuncs.c to be sure). There's probably also some
contribution from the "playlist.played is null" clause, else the row
count estimate would be exactly 1/10th as much. However, I don't
believe that the thing currently makes any serious effort to gauge the
selectivity of IS NULL, which is a shame because that would critically
affect the results here. (You did say some thousands of rows matching
the stream=N clause, but only a few matching IS NULL, right?)

Given the fundamental difference in this initial row count estimate,
the large difference in the subsequent join plan structure is not too
surprising.

In short: another demonstration of the limitations of our current
statistics about data frequencies.

BTW, you didn't actually say which plan was faster. Since the second
one was closer to the true statistic (only a few rows returned from
playlist), I'm hoping it was faster...

regards, tom lane

#7Ben
bench@silentmedia.com
In reply to: Tom Lane (#6)
Re: what's going on here?

On Fri, 9 Mar 2001, Tom Lane wrote:

Ben <bench@silentmedia.com> writes:

music=# explain ... where ... playlist.stream=1
-> Seq Scan on playlist (cost=0.00..300.81 rows=2321 width=20)

music=# explain ... where ... playlist.stream=2
-> Seq Scan on playlist (cost=0.00..300.81 rows=205 width=20)

I am betting that 1 is the most common value in playlist.stream, or at
least is being chosen as the most common value by VACUUM ANALYZE's
not-totally-accurate estimation process. The 2321 rowcount estimate
then falls out of the stored statistic for the most common value's
frequency. In the second case, the estimator knows that 2 is *not* the

Yes, stream=1 is the most common value, but 2321? There are ~7000 rows
where stream=1.

most common value, but it has absolutely no statistical basis on which
to guess what the frequency really is. I think it uses 1/10th of the
most common frequency for anything that's not the most common value
(look in utils/adt/selfuncs.c to be sure). There's probably also some
contribution from the "playlist.played is null" clause, else the row
count estimate would be exactly 1/10th as much. However, I don't
believe that the thing currently makes any serious effort to gauge the
selectivity of IS NULL, which is a shame because that would critically
affect the results here. (You did say some thousands of rows matching
the stream=N clause, but only a few matching IS NULL, right?)

This is not good news. Most of the reason I have an index on
playlist.played is to figure out which are null, to limit the results this
query has to look at. Is this (probably very hard to fix) shortcoming
addressed in version 7.1, by chance?

Given the fundamental difference in this initial row count estimate,
the large difference in the subsequent join plan structure is not too
surprising.

This I don't follow. The stream=1 query is returning 10 times the rows -
shouldn't that make the use of an index even *more* likely?

BTW, you didn't actually say which plan was faster. Since the second
one was closer to the true statistic (only a few rows returned from
playlist), I'm hoping it was faster...

Sorry. Yes, the second one was far faster, thanks to the indices it used.

#8Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Ben (#7)
Re: what's going on here?

This I don't follow. The stream=1 query is returning 10 times the rows -
shouldn't that make the use of an index even *more* likely?

No, because if you're getting a large % of the rows of the table it'll be
faster to sequentially scan the heap rather than do so in random access by
the index order. (You have to go into the heap file for the info on
whether or not the tuple is valid, and you'd potentially be seeking around
the heap file which can end up being more expensive).
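The tradeoff Stephan describes can be sketched with a toy cost comparison (the constants and page counts below are illustrative assumptions, not PostgreSQL's actual cost model):

```python
# Toy cost model: a seq scan reads every heap page once, sequentially; an
# index scan pays a random heap fetch per matching row, because tuple
# validity can only be checked in the heap itself.
SEQ_PAGE_COST = 1.0      # cost of one sequential page read (assumed)
RANDOM_PAGE_COST = 4.0   # cost of one random page read (assumed)
HEAP_PAGES = 1000        # size of the table's heap file (assumed)

def seq_scan_cost() -> float:
    return HEAP_PAGES * SEQ_PAGE_COST

def index_scan_cost(matching_rows: int) -> float:
    return matching_rows * RANDOM_PAGE_COST

# A handful of matching rows: the index wins easily.
print(index_scan_cost(30) < seq_scan_cost())    # True
# A large fraction of the table: the seq scan is cheaper.
print(index_scan_cost(2000) < seq_scan_cost())  # False
```

So an estimate of ~2000 matching rows, accurate or not, is enough to push the planner away from the index.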

As a question, does an index on stream, played (both in the same
index) help at all?

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ben (#7)
Re: what's going on here?

Ben <bench@silentmedia.com> writes:

I am betting that 1 is the most common value in playlist.stream, or at
least is being chosen as the most common value by VACUUM ANALYZE's
not-totally-accurate estimation process. The 2321 rowcount estimate
then falls out of the stored statistic for the most common value's
frequency. In the second case, the estimator knows that 2 is *not* the

Yes, stream=1 is the most common value, but 2321? There are ~7000 rows
where stream=1.

Well, I did say not totally accurate ;-). But [digs in code] there is
a hardwired selectivity of 0.333 for the IS NULL clause, so it's really
estimating about 6960 rows matching the stream=1 clause. That's closer
than I'd have expected, actually, given the inaccuracies in VACUUM's
method of obtaining this estimate. So the real problem here is the lack
of estimation for IS NULL.
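The arithmetic above can be checked with a quick back-of-the-envelope sketch (the exact logic lives in utils/adt/selfuncs.c; the stored MCV frequency here is inferred from the plan, not read from the catalog):

```python
# Rough reconstruction of the 7.0-era estimate for the playlist scans.
total_rows = 7041 + 6223          # stream=1 rows + stream=2 rows (per Ben)
mcv_fraction = 6963 / total_rows  # inferred stored frequency of the MCV (1)
is_null_sel = 0.333               # hardwired IS NULL selectivity

# stream = 1 matches the most common value, so use its stored frequency:
est_stream1 = total_rows * mcv_fraction * is_null_sel       # ~2319; plan says 2321
# stream = 2 is not the MCV, so guess 1/10 of the MCV frequency:
est_stream2 = total_rows * mcv_fraction / 10 * is_null_sel  # ~232; plan says 205
```

The stream=2 figure lands near, but not exactly at, the plan's 205, so the real code evidently folds in a bit more than this sketch does.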

This is not good news. Most of the reason I have an index on
playlist.played is to figure out which are null, to limit the results this
query has to look at.

We don't currently use indexes for IS NULL at all.

Is this (probably very hard to fix) shortcoming
addressed in version 7.1, by chance?

No. The fact that the planner doesn't derive an estimate could be fixed
with not too much work (but it won't happen for 7.1). Making IS NULL an
indexable condition would affect considerably more code, and it's not
really what's making the difference here anyway --- although it would
allow us to replace the seqscan on playlist with an indexscan, which'd
be even faster.

This I don't follow. The stream=1 query is returning 10 times the rows -
shouldn't that make the use of an index even *more* likely?

No. Your second query is using nested loop with inner indexscan joins.
That's fine for *small* numbers of rows returned by the outer subquery,
but it goes to hell in a handbasket for large numbers of rows. The
planner is doing the right thing to switch to a heavier-duty plan with
more startup overhead ... or it would be if it had the right idea about
how many rows are going to come out of playlist, that is.

The real issue here is the lack of estimation of IS NULL selectivity.
I was aware that that was broken, but I don't recall having seen a
real-life example before where it really mattered. I'll try to move
it up a notch or two on my to-do list. Or if someone else wants to
dive in, feel free...

regards, tom lane

#10Ben
bench@silentmedia.com
In reply to: Stephan Szabo (#8)
Re: what's going on here?

Unfortunately, no.

On Fri, 9 Mar 2001, Stephan Szabo wrote:

As a question, does an index on stream, played (both in the same
index) help at all?

#11Ben
bench@silentmedia.com
In reply to: Tom Lane (#9)
Re: what's going on here?

On Fri, 9 Mar 2001, Tom Lane wrote:

No. The fact that the planner doesn't derive an estimate could be fixed
with not too much work (but it won't happen for 7.1). Making IS NULL an
indexable condition would affect considerably more code, and it's not
really what's making the difference here anyway --- although it would
allow us to replace the seqscan on playlist with an indexscan, which'd
be even faster.

It would be nice if postgres could index on IS NULL, or if it were
documented that indexes don't get used here. Otherwise I would have used
played = 0 to mean the same thing that I'm trying to get out of played IS
NULL... doing that lets explain give a very accurate picture.
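The sentinel workaround can be sketched like so, using SQLite purely for illustration (the table and values are hypothetical; the 7.0-era limitation being worked around is PostgreSQL-specific):

```python
import sqlite3

# Store 0 instead of NULL for "not yet played", so the filter becomes an
# ordinary equality that an index can serve and statistics can estimate.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE playlist (id INTEGER PRIMARY KEY, stream INTEGER, played INTEGER)"
)
conn.execute("CREATE INDEX playlist_played_key ON playlist (played)")
conn.executemany(
    "INSERT INTO playlist (stream, played) VALUES (?, ?)",
    [(1, 0), (1, 984050000), (2, 0), (2, 984050001)],  # 0 = unplayed sentinel
)
unplayed = conn.execute(
    "SELECT id FROM playlist WHERE played = 0 AND stream = 2"
).fetchall()
print(len(unplayed))  # 1
```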

No. Your second query is using nested loop with inner indexscan joins.
That's fine for *small* numbers of rows returned by the outer subquery,
but it goes to hell in a handbasket for large numbers of rows. The
planner is doing the right thing to switch to a heavier-duty plan with
more startup overhead ... or it would be if it had the right idea about
how many rows are going to come out of playlist, that is.

Ahhh, I see. Thanks for the explanations.

The real issue here is the lack of estimation of IS NULL selectivity.
I was aware that that was broken, but I don't recall having seen a
real-life example before where it really mattered. I'll try to move
it up a notch or two on my to-do list. Or if someone else wants to
dive in, feel free...

Yeah, like I said above, being able to use an accurate index for
estimating the size of the inner loop makes quite a difference. But at
least in this case, I had a value I could use in place of NULL, so it's
hardly a high priority.

Thanks again.