Please help to improve SELECT JOIN performance, EXPLAIN ANALYZE attached
Good evening,
I have a word game which uses PostgreSQL 13.2 and 80% of the code is
written as stored functions in PL/PgSQL or SQL.
Recently I have purchased some traffic and the number of daily games
increased from 100 to 700.
In the PostgreSQL log I have noticed that the duration of 2 particular
queries has increased, especially in the evenings:
2021-02-19 17:51:19.104 CET [68932] LOG: duration: 2356.723 ms execute
<unnamed>: SELECT words_stat_longest($1::int)
2021-02-19 17:55:23.290 CET [68602] LOG: duration: 2326.507 ms execute
<unnamed>: SELECT words_stat_longest($1::int)
2021-02-19 17:57:57.057 CET [68932] LOG: duration: 1257.773 ms execute
<unnamed>: SELECT out_uid AS uid, out_fcm AS fcm, out_apns AS
apns, out_adm AS adm, out_hms AS hms, out_social AS social,
out_sid AS sid, out_body AS body FROM words_play_game($1::int,
$2::int, $3::jsonb)
2021-02-19 18:02:51.359 CET [68603] LOG: duration: 2305.950 ms execute
<unnamed>: SELECT words_stat_longest($1::int)
2021-02-19 18:08:26.130 CET [68932] LOG: duration: 2375.713 ms execute
<unnamed>: SELECT words_stat_longest($1::int)
One of the queries is actually a simple SELECT ... LIMIT 15 to find the 15
longest words played by a user (they are displayed on the player profile
page).
I have run the EXPLAIN ANALYZE here: https://explain.depesz.com/s/IcqN
It is difficult for me to understand what could be missing there. Does
anybody have an idea, please?
Thank you for any input
Alex
P.S. Here are the tables involved in the SELECT query, with the
words_moves and words_geoip holding the most records:
# \d words_scores
Table "public.words_scores"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
mid | bigint | | not null |
gid | integer | | not null |
uid | integer | | not null |
word | text | | not null |
score | integer | | not null |
Indexes:
"words_scores_gid_idx" btree (gid)
"words_scores_length_mid_idx" btree (length(word) DESC, mid DESC)
"words_scores_uid_idx" btree (uid)
Check constraints:
"words_scores_score_check" CHECK (score >= 0)
"words_scores_word_check" CHECK (word ~ '^[А-Я]{2,}$'::text)
Foreign-key constraints:
"words_scores_gid_fkey" FOREIGN KEY (gid) REFERENCES words_games(gid)
ON DELETE CASCADE
"words_scores_mid_fkey" FOREIGN KEY (mid) REFERENCES words_moves(mid)
ON DELETE CASCADE
"words_scores_uid_fkey" FOREIGN KEY (uid) REFERENCES words_users(uid)
ON DELETE CASCADE
# \d words_moves
Table "public.words_moves"
Column | Type | Collation | Nullable |
Default
---------+--------------------------+-----------+----------+------------------------------------------
mid | bigint | | not null |
nextval('words_moves_mid_seq'::regclass)
action | text | | not null |
gid | integer | | not null |
uid | integer | | not null |
played | timestamp with time zone | | not null |
tiles | jsonb | | |
score | integer | | |
str | text | | |
hand | text | | |
letters | character(1)[] | | |
values | integer[] | | |
Indexes:
"words_moves_pkey" PRIMARY KEY, btree (mid)
"words_moves_gid_played_idx" btree (gid, played DESC)
"words_moves_uid_action_played_idx" btree (uid, action, played)
"words_moves_uid_idx" btree (uid)
Check constraints:
"words_moves_score_check" CHECK (score >= 0)
Foreign-key constraints:
"words_moves_gid_fkey" FOREIGN KEY (gid) REFERENCES words_games(gid) ON
DELETE CASCADE
"words_moves_uid_fkey" FOREIGN KEY (uid) REFERENCES words_users(uid) ON
DELETE CASCADE
Referenced by:
TABLE "words_puzzles" CONSTRAINT "words_puzzles_mid_fkey" FOREIGN KEY
(mid) REFERENCES words_moves(mid) ON DELETE CASCADE
TABLE "words_scores" CONSTRAINT "words_scores_mid_fkey" FOREIGN KEY
(mid) REFERENCES words_moves(mid) ON DELETE CASCADE
# \d words_geoip
Table "public.words_geoip"
Column | Type | Collation | Nullable | Default
--------+------------------+-----------+----------+---------
block | inet | | not null |
lat | double precision | | |
lng | double precision | | |
Indexes:
"words_geoip_pkey" PRIMARY KEY, btree (block)
"words_geoip_block_idx" spgist (block)
# \d words_games
Table "public.words_games"
Column | Type | Collation | Nullable |
Default
----------+--------------------------+-----------+----------+------------------------------------------
gid | integer | | not null |
nextval('words_games_gid_seq'::regclass)
created | timestamp with time zone | | not null |
finished | timestamp with time zone | | |
player1 | integer | | not null |
player2 | integer | | |
played1 | timestamp with time zone | | |
played2 | timestamp with time zone | | |
state1 | text | | |
state2 | text | | |
reason | text | | |
hint1 | text | | |
hint2 | text | | |
score1 | integer | | not null |
score2 | integer | | not null |
chat1 | integer | | not null |
chat2 | integer | | not null |
hand1 | character(1)[] | | not null |
hand2 | character(1)[] | | not null |
pile | character(1)[] | | not null |
letters | character(1)[] | | not null |
values | integer[] | | not null |
bid | integer | | not null |
diff1 | integer | | |
diff2 | integer | | |
open1 | boolean | | not null | false
open2 | boolean | | not null | false
Indexes:
"words_games_pkey" PRIMARY KEY, btree (gid)
"words_games_player1_coalesce_idx" btree (player1, COALESCE(finished,
'infinity'::timestamp with time zone))
"words_games_player2_coalesce_idx" btree (player2, COALESCE(finished,
'infinity'::timestamp with time zone))
Check constraints:
"words_games_chat1_check" CHECK (chat1 >= 0)
"words_games_chat2_check" CHECK (chat2 >= 0)
"words_games_check" CHECK (player1 <> player2)
"words_games_score1_check" CHECK (score1 >= 0)
"words_games_score2_check" CHECK (score2 >= 0)
Foreign-key constraints:
"words_games_bid_fkey" FOREIGN KEY (bid) REFERENCES words_boards(bid)
ON DELETE CASCADE
"words_games_player1_fkey" FOREIGN KEY (player1) REFERENCES
words_users(uid) ON DELETE CASCADE
"words_games_player2_fkey" FOREIGN KEY (player2) REFERENCES
words_users(uid) ON DELETE CASCADE
Referenced by:
TABLE "words_chat" CONSTRAINT "words_chat_gid_fkey" FOREIGN KEY (gid)
REFERENCES words_games(gid) ON DELETE CASCADE
TABLE "words_moves" CONSTRAINT "words_moves_gid_fkey" FOREIGN KEY (gid)
REFERENCES words_games(gid) ON DELETE CASCADE
TABLE "words_scores" CONSTRAINT "words_scores_gid_fkey" FOREIGN KEY
(gid) REFERENCES words_games(gid) ON DELETE CASCADE
# \d words_users
Table "public.words_users"
Column | Type | Collation | Nullable |
Default
-------------+--------------------------+-----------+----------+------------------------------------------
uid | integer | | not null |
nextval('words_users_uid_seq'::regclass)
created | timestamp with time zone | | not null |
visited | timestamp with time zone | | not null |
ip | inet | | not null |
fcm | text | | |
apns | text | | |
adm | text | | |
motto | text | | |
vip_until | timestamp with time zone | | |
grand_until | timestamp with time zone | | |
elo | integer | | not null |
medals | integer | | not null |
coins | integer | | not null |
avg_score | double precision | | |
avg_time | interval | | |
hms | text | | |
removed | boolean | | not null | false
muted | boolean | | not null | false
Indexes:
"words_users_pkey" PRIMARY KEY, btree (uid)
Check constraints:
"words_users_elo_check" CHECK (elo >= 0)
"words_users_medals_check" CHECK (medals >= 0)
Referenced by:
TABLE "words_chat" CONSTRAINT "words_chat_uid_fkey" FOREIGN KEY (uid)
REFERENCES words_users(uid) ON DELETE CASCADE
TABLE "words_games" CONSTRAINT "words_games_player1_fkey" FOREIGN KEY
(player1) REFERENCES words_users(uid) ON DELETE CASCADE
TABLE "words_games" CONSTRAINT "words_games_player2_fkey" FOREIGN KEY
(player2) REFERENCES words_users(uid) ON DELETE CASCADE
TABLE "words_moves" CONSTRAINT "words_moves_uid_fkey" FOREIGN KEY (uid)
REFERENCES words_users(uid) ON DELETE CASCADE
TABLE "words_puzzles" CONSTRAINT "words_puzzles_uid_fkey" FOREIGN KEY
(uid) REFERENCES words_users(uid) ON DELETE CASCADE
TABLE "words_reviews" CONSTRAINT "words_reviews_author_fkey" FOREIGN
KEY (author) REFERENCES words_users(uid) ON DELETE CASCADE
TABLE "words_reviews" CONSTRAINT "words_reviews_uid_fkey" FOREIGN KEY
(uid) REFERENCES words_users(uid) ON DELETE CASCADE
TABLE "words_scores" CONSTRAINT "words_scores_uid_fkey" FOREIGN KEY
(uid) REFERENCES words_users(uid) ON DELETE CASCADE
TABLE "words_social" CONSTRAINT "words_social_uid_fkey" FOREIGN KEY
(uid) REFERENCES words_users(uid) ON DELETE CASCADE
TABLE "words_stats" CONSTRAINT "words_stats_uid_fkey" FOREIGN KEY (uid)
REFERENCES words_users(uid) ON DELETE CASCADE
# \d words_social
Table "public.words_social"
Column | Type | Collation | Nullable | Default
--------+------------------+-----------+----------+---------
sid | text | | not null |
social | integer | | not null |
given | text | | not null |
family | text | | |
photo | text | | |
lat | double precision | | |
lng | double precision | | |
stamp | integer | | not null |
uid | integer | | not null |
Indexes:
"words_social_pkey" PRIMARY KEY, btree (sid, social)
"words_social_uid_stamp_idx" btree (uid, stamp DESC)
Check constraints:
"words_social_given_check" CHECK (given ~ '\S'::text)
"words_social_photo_check" CHECK (photo ~* '^https?://...'::text)
"words_social_sid_check" CHECK (sid ~ '\S'::text)
"words_social_social_check" CHECK (0 < social AND social <= 128)
Foreign-key constraints:
"words_social_uid_fkey" FOREIGN KEY (uid) REFERENCES words_users(uid)
ON DELETE CASCADE
Referenced by:
TABLE "words_payments" CONSTRAINT "words_payments_sid_fkey" FOREIGN KEY
(sid, social) REFERENCES words_social(sid, social) ON DELETE CASCADE
Your explain analyze is showing an example that runs in less than 15ms, and
your logs indicate queries taking 2-3 seconds. I am missing part of the
picture here. Can you clarify?
Yes, Michael, I have noticed that too, but I should have written more in my
original mail.
The query does run in 15ms when I try it, but the evening logs show the
query (I think only a small percentage of its executions) taking 1-3s.
At the same time my CentOS 8 server with 64 GB RAM is never loaded; the
load average shown by top is 0.35-0.45.
What could be the reason? Does PostgreSQL 13.2 maybe have some scheduled
task?
I have only changed a few lines in the postgresql.conf:
# diff postgresql.conf.OLD postgresql.conf
64c64
< max_connections = 100                 # (change requires restart)
---
> max_connections = 120                 # (change requires restart)
130c130
< #work_mem = 4MB                       # min 64kB
---
> work_mem = 8MB                        # min 64kB
132c132
< #maintenance_work_mem = 64MB          # min 1MB
---
> maintenance_work_mem = 128MB          # min 1MB
393c393
< #effective_cache_size = 4GB
---
> effective_cache_size = 8GB
410,411c410,411
< #from_collapse_limit = 8
< #join_collapse_limit = 8              # 1 disables collapsing of explicit
---
> from_collapse_limit = 24
> join_collapse_limit = 24              # 1 disables collapsing of explicit
469c469
< #log_min_messages = warning           # values in order of decreasing detail:
---
> log_min_messages = notice             # values in order of decreasing detail:
497c497
< #log_min_duration_statement = -1      # -1 is disabled, 0 logs all statements
---
> log_min_duration_statement = 1000     # -1 is disabled, 0 logs all statements
And I have pgbouncer in front of the PostgreSQL:
# diff pgbouncer.ini.OLD pgbouncer.ini
12a13,15
> words_de = host=/tmp user=xxx password=xxx dbname=words_de
> words_en = host=/tmp user=xxx password=xxx dbname=words_en
> words_ru = host=/tmp user=xxx password=xxx dbname=words_ru
115a119
> ;; auth_type = md5
148c152
< ;server_reset_query = DISCARD ALL
---
> server_reset_query = DISCARD ALL
156c160
< ;ignore_startup_parameters = extra_float_digits
---
> ignore_startup_parameters = extra_float_digits
180c184
< ;max_client_conn = 100
---
> max_client_conn = 1000
185c189
< ;default_pool_size = 20
---
> default_pool_size = 100
Or maybe everything is ok...? The daily gzipped dump keeps growing:
939343358 Dec 31 01:33 words_ru-Dec.sql.gz
1221682336 Jan 31 01:33 words_ru-Jan.sql.gz
1423324283 Feb 20 01:34 words_ru-Feb.sql.gz
and being an inexperienced pg admin, I am a bit worried.
Thank you
Alex
On 2021-02-20 10:24:36 +0100, Alexander Farber wrote:
The query when I try it does run in 15ms, but evening logs show the query
(I think only small percentage of it) running 1-3s.
At the same time my CentOS 8 server with 64 GB RAM is never loaded, the
load average show by top is 0.35-0.45.
What could be the reason, does PostgreSQL 13.2 has some scheduled task
maybe?
One very common reason for wildly different execution times is where the
data is: Accessing data in RAM is a lot faster than accessing it on disk
(even for SSDs, but especially for rotating disks).
So it might be that when you executed the EXPLAIN the data just happened
to be in RAM (that's almost certainly the case if you executed the same
query without EXPLAIN before that) while in those cases where it was
slow, the data (or a large part of it) had to be loaded from disk first.
Data might not be in RAM because it wasn't needed for some time, because
some other query or process which needed lots of RAM displaced it, or
because the server was just rebooted.
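One way to see which case you're in is EXPLAIN (ANALYZE, BUFFERS), which
reports per node how many shared buffers were found in RAM ("hit") versus
loaded from disk ("read"). A sketch, assuming an illustrative uid of 5 (the
real query is inside the stored function, so the WHERE clause here is only
an example of the access pattern, not the actual function body):

```sql
-- BUFFERS distinguishes cache hits from disk reads;
-- SET track_io_timing = on additionally reports I/O time per node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT word
FROM   words_scores
WHERE  uid = 5
ORDER  BY length(word) DESC
LIMIT  15;

-- In the plan output, compare lines such as:
--   Buffers: shared hit=1234 read=0     (everything came from cache)
--   Buffers: shared hit=12 read=9876    (mostly cold, read from disk)
```

If the slow evening runs show large "read" counts while your fast manual
runs show only "hit", the cache explanation above fits.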
I have only changed few lines in the postgresql.conf:
I don't see shared_buffers here. That's quite small by default, so you
might want to increase it. The usual recommendation is to start with 25%
of the memory (that would be 16 GB in your case) and then see whether it
gets better if you decrease or increase it.
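In postgresql.conf, that starting point would be a single line (a sketch;
the exact value is the thing to tune from here):

```ini
shared_buffers = 16GB    # ~25% of 64 GB RAM; change requires a restart
```

Note that unlike work_mem, changing shared_buffers only takes effect after
a server restart.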
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
On 2021-02-19 21:25:36 +0100, Alexander Farber wrote:
In the PostgreSQL log I have noticed that the duration for 2 particular
queries have increased, especially in the evenings:
[...]
One of the queries is actually a simple SELECT ... LIMIT 15 to find the 15
longest words played by a user (they are displayed at the player profile
page).
I have run the EXPLAIN ANALYZE here: https://explain.depesz.com/s/IcqN
Well, I wouldn't call a query with 8 joins "simple".
I notice two things:
1) It does a lot more than the description ("Select 15 longest words
played by the user id 5") suggests. It not only selects those words, but
also the games in which they were played and some information about both
players.
2) The query seems to spend most of the time in
https://explain.depesz.com/s/IcqN#l11, retrieving all the words ever
played by that user via a parallel bitmap heap scan, only to sort
them and discard all but the 15 longest. I think using an index
should make that much more efficient.
Therefore, three suggestions:
1) Add an index on words_scores(uid, length(word) desc).
2) If PostgreSQL still isn't using that, isolate the core query by
putting it into a CTE:
with s as (select * from words_scores where uid = 5
           order by length(word) desc limit 15)
select ...
from s
join words_moves m ...
3) If that shifts the bottleneck towards that extra per-user info
(location, photo, ...) try to cache that in the application. That
probably doesn't change very often and doesn't have to be retrieved
from the database every time.
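Concretely, the first two suggestions might look like this (a sketch: the
uid and the selected columns are placeholders, since the real query lives
inside the stored function; also note that PostgreSQL 12+ can inline a
plain CTE back into the join, so MATERIALIZED is what actually forces the
isolation):

```sql
-- 1) Index matching the "15 longest words for one uid" access pattern:
CREATE INDEX words_scores_uid_length_idx
    ON words_scores (uid, length(word) DESC);

-- 2) Isolate the core query; MATERIALIZED prevents the planner from
--    inlining the CTE into the surrounding join (default since PG 12).
WITH s AS MATERIALIZED (
    SELECT mid, gid, uid, word, score
    FROM   words_scores
    WHERE  uid = 5                      -- illustrative uid
    ORDER  BY length(word) DESC
    LIMIT  15
)
SELECT s.word, s.score, m.played        -- placeholder column list
FROM   s
JOIN   words_moves m USING (mid);
```

With the index in place, the CTE should become a 15-row index scan instead
of fetching and sorting every word the user ever played.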
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
Using auto_explain may be needed to catch the plan when it is slow, if the
other suggestions don't resolve things for you.
On 2021-02-20 06:43:28 -0700, Michael Lewis wrote:
Using auto_explain may be needed to catch the plan when it is slow, if the
other suggestions don't resolve things for you.
Yes, auto_explain is very helpful. Unfortunately, by default it only
does an explain, not the equivalent of an explain analyze. So you know
what the plan was and what the total time was, but you still don't know
where it spent that time.
You can set auto_explain.log_analyze = on, but the documentation warns
that "this can have an extremely negative impact on performance". Does
anybody have experience with trying this in a real-world workload? (I
was never brave enough)
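For reference, a minimal auto_explain setup along those lines might look
like this in postgresql.conf (a sketch; the duration and sample-rate
values are arbitrary starting points, not recommendations from this
thread):

```ini
shared_preload_libraries = 'auto_explain'   # change requires restart
auto_explain.log_min_duration = 1000        # ms; log plans of slow statements
auto_explain.log_analyze = on               # EXPLAIN ANALYZE, not just EXPLAIN
auto_explain.log_buffers = on               # include buffer hit/read counts
auto_explain.log_nested_statements = on     # catch queries inside PL/pgSQL
auto_explain.sample_rate = 0.01             # analyze only ~1% of statements
```

sample_rate is what keeps log_analyze affordable: only the sampled
fraction of statements pays the instrumentation overhead.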
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
Thank you all, I will try at first
shared_buffers = 16GB and
index on words_scores(uid, length(word) desc)
With shared_buffers=16GB, pg_top shows:
last pid: 2613;  load avg: 0.49, 0.45, 0.37;  up 0+00:19:21  16:41:16
16 processes: 16 sleeping
CPU states: 9.2% user, 0.0% nice, 0.2% system, 90.4% idle, 0.1% iowait
Memory: 21G used, 42G free, 29M buffers, 18G cached
DB activity: 39 tps, 0 rollbs/s, 0 buffer r/s, 100 hit%, 3625 row r/s, 11 row w/s
DB I/O: 0 reads/s, 0 KB/s, 0 writes/s, 0 KB/s
DB disk: 0.0 GB total, 0.0 GB free (100% used)
Swap: 32G free
PID  USERNAME PRI NICE SIZE RES   STATE TIME WCPU  CPU    COMMAND
2114 postgres 20  0    17G  6378M sleep 2:11 0.77% 44.80% postgres: words words_ru [local] idle
2012 postgres 20  0    17G  6005M sleep 0:55 0.45% 26.30% postgres: words words_ru [local] idle
2107 postgres 20  0    17G  6175M sleep 1:16 0.02% 0.97%  postgres: words words_ru [local] idle
1184 postgres 20  0    17G  22M   sleep 0:00 0.02% 0.97%  postgres: walwriter
2030 postgres 20  0    17G  5948M sleep 1:18 0.00% 0.00%  postgres: words words_ru [local] idle
2013 postgres 20  0    17G  5724M sleep 0:57 0.00% 0.00%  postgres: words words_ru [local] idle
2103 postgres 20  0    17G  5823M sleep 0:52 0.00% 0.00%  postgres: words words_ru [local] idle
2031 postgres 20  0    17G  5619M sleep 0:41 0.00% 0.00%  postgres: words words_ru [local] idle
2029 postgres 20  0    17G  1128M sleep 0:00 0.00% 0.00%  postgres: words words_ru [local] idle
2104 postgres 20  0    17G  948M  sleep 0:00 0.00% 0.00%  postgres: words words_ru [local] idle
2106 postgres 20  0    17G  1257M sleep 0:00 0.00% 0.00%  postgres: words words_ru [local] idle
1182 postgres 20  0    17G  231M  sleep 0:00 0.00% 0.00%  postgres: checkpointer
1183 postgres 20  0    17G  135M  sleep 0:00 0.00% 0.00%  postgres: background writer
1185 postgres 20  0    17G  8696K sleep 0:00 0.00% 0.00%  postgres: autovacuum launcher
2614 postgres 20  0    17G  22M   sleep 0:00 0.00% 0.00%  postgres: words words_ru [local] idle
1187 postgres 20  0    17G  6764K sleep 0:00 0.00% 0.00%  postgres: logical replication launcher
On Sat, Feb 20, 2021 at 4:15 PM Alexander Farber <alexander.farber@gmail.com>
wrote:
Thank you all, I will try at first
shared_buffers = 16GB and
index on words_scores(uid, length(word) desc)
No issues for us. We have used a low sample rate of 1% or so and gotten
some very useful data, particularly with logging nested statements, so we
can profile stored procs more easily than by adding a large number of
RAISE NOTICE statements.
On 2021-02-21 10:14:04 -0700, Michael Lewis wrote:
No issues for us. We have used a low sample rate of 1% or so and gotten some
very useful data.
Oh, somehow I never noticed the auto_explain.sample_rate parameter in
the docs. Good to know.
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"