performance: `tuplehash_iterate` and `group by` question, very slow

Started by Mingjun Yangabout 2 years ago1 messagesbugs
Jump to latest
#1Mingjun Yang
yangxiaoming2019@gmail.com

VERSION: PostgreSQL 16.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC)
11.4.1 20230605 (Red Hat 11.4.1-2), 64-bit
SYSTEAM KERNEL: Linux localhost.localdomain 5.14.0-362.13.1.el9_3.x86_64 #1
SMP PREEMPT_DYNAMIC Wed Dec 13 14:07:45 UTC 2023 x86_64 x86_64 x86_64
GNU/Linux
SYSTEAM INFO: Rocky Linux 9.3 (Blue Onyx)
CPU: Intel(R) Xeon(R) w5-2445
MOMROY: 256GB

I have a 460 GB table.
+------------------+--------------------------+-----------+
| Column           | Type                     | Modifiers |
|------------------+--------------------------+-----------|
| sha_256          | bytea                    |  not null |
| aaaaa_id         | bigint                   |  not null |
| bbbbb_timestamp  | timestamp with time zone |  not null |
| cccccc_id        | integer                  |  not null |
| ddddddddddddddd  | cert_purpose_type        |  not null |
+------------------+--------------------------+-----------+

I run the query:

```sql
SELECT
l1.cccccc_id,
l1.count
FROM (
SELECT
cccccc_id,
count(aaaaa_id) count
FROM (
SELECT
cccccc_id,
aaaaa_id
FROM
some_table cle
GROUP BY
cccccc_id,
aaaaa_id) cle_data
GROUP BY
cccccc_id) l1;
```

It's long time run, more than 25 hours, CPU keep 95.6%.
My table were create by timescaledb, the query first stage run about 2
hours , and 243.17G temp file created.
I d'not know what's happened, And how I debug or performance this question.
And who had the same question?

```text
PostgreSQL 16.2 - localhost.localdomain - xxxx@localhost:5432/xxxx - Ref.:
2s - Duration mode: query
* Global: 1 day, 22 hours and 13 minutes uptime, 5.38T dbs size - 0B/s
growth, - cache hit ratio, 0.00% rollback ratio
Sessions: 6/800 total, 2 active, 4 idle, 0 idle in txn, 0 idle in txn
abrt, 0 waiting
Activity: 3 tps, 0 insert/s, 0 update/s, 0 delete/s, 0 tuples
returned/s, 244 temp files, 243.17G temp size
* Worker processes: 0/16 total, 0/4 logical workers, 0/16 parallel workers
Other processes & info: 0/8 autovacuum workers, 0/0 wal senders, 0 wal
receivers, 0/10 repl. slots
* Mem.: 250.85G total, 20.69G (8.25%) free, 18.60G (7.42%) used, 211.55G
(84.34%) buff+cached
Swap: 0B total, 0B (-) free, 0B (-) used
IO: 1/s max iops, 0B/s - 0/s read, 7.93K/s - 1/s write
Load average: 1.11 1.11 1.15

PID DATABASE APP USER CLIENT
CPU% MEM% READ/s WRITE/s TIME+ Waiting IOW
state Query
522417 some_db psql some_user 127.0.0.1 95.6
21.1 0B 7.93K 25 h N active
xxxx
```

I run the `perf top -p 522417` monitor the thread, the `tuplehash_iterate`
use 98.79% CPU cycles:

```text
Samples: 327K of event 'cycles', 4000 Hz, Event count (approx.):
64246693339 lost: 0/0 drop: 0/0
Overhead Shared Object Symbol
98.79% postgres [.] tuplehash_iterate
0.94% libc.so.6 [.] __memset_avx512_unaligned_erms
0.03% [kernel] [k] sync_regs
0.02% [kernel] [k] perf_adjust_freq_unthr_context
0.01% [kernel] [k] nohz_balancer_kick
0.01% [kernel] [k] timekeeping_advance
0.01% [kernel] [k] arch_asym_cpu_priority
0.01% [kernel] [k] arch_scale_freq_tick
0.01% [kernel] [k] _raw_spin_lock_irqsave
0.01% [kernel] [k] _raw_spin_lock
0.01% [kernel] [k] __intel_pmu_enable_all.constprop.0
0.01% [kernel] [k] cpuacct_account_field
0.01% [kernel] [k] update_cfs_group
0.01% [kernel] [k] cpuacct_charge
0.01% [kernel] [k] native_sched_clock
0.00% [kernel] [k] lapic_next_deadline
0.00% [kernel] [k] __run_timers.part.0
0.00% [kernel] [k] ktime_get_update_offsets_now
......
```

```text
+-----------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN
                                             |
|-----------------------------------------------------------------------------------------------------------------------|
| Subquery Scan on l1  (cost=167901456.53..167901758.53 rows=200 width=20)
                                             |
|   InitPlan 1 (returns $0)
                                            |
|     ->  Result  (cost=0.00..0.01 rows=1 width=8)
                                             |
|   ->  GroupAggregate  (cost=167901456.52..167901758.52 rows=200 width=12)
                                            |
|         Group Key: cle_1.cccccc_id
                                             |
|         ->  Sort  (cost=167901456.52..167901556.52 rows=40000 width=12)
                                            |
|               Sort Key: cle_1.cccccc_id
                                            |
|               ->  HashAggregate  (cost=167897998.98..167898398.98
rows=40000 width=12)                                |
|                     Group Key: cle_1.cccccc_id, cle_1.aaaaa_id
                                             |
|                     ->  Append  (cost=0.00..141132044.98 rows=5353190799
width=12)                                    |
|                           ->  Seq Scan on _hyper_10_71_chunk cle_1
 (cost=0.00..35615836.68 rows=1667104768 width=12) |
|                           ->  Seq Scan on _hyper_10_72_chunk cle_2
 (cost=0.00..22931461.24 rows=1073370624 width=12) |
|                           ->  Seq Scan on _hyper_10_73_chunk cle_3
 (cost=0.00..12029957.12 rows=563086912 width=12)  |
|                           ->  Seq Scan on _hyper_10_74_chunk cle_4
 (cost=0.00..10949797.28 rows=512535328 width=12)  |
|                           ->  Seq Scan on _hyper_10_75_chunk cle_5
 (cost=0.00..13747554.76 rows=643480576 width=12)  |
|                           ->  Seq Scan on _hyper_10_76_chunk cle_6
 (cost=0.00..7800797.52 rows=365127552 width=12)   |
|                           ->  Seq Scan on _hyper_10_77_chunk cle_7
 (cost=0.00..3433002.24 rows=160687824 width=12)   |
|                           ->  Seq Scan on _hyper_10_78_chunk cle_8
 (cost=0.00..3689582.96 rows=172702896 width=12)   |
|                           ->  Seq Scan on _hyper_10_79_chunk cle_9
 (cost=0.00..3913904.52 rows=183197152 width=12)   |
|                           ->  Seq Scan on _hyper_10_102_chunk cle_10
 (cost=0.00..103.32 rows=4732 width=12)          |
|                           ->  Seq Scan on _hyper_10_103_chunk cle_11
 (cost=0.00..96.84 rows=4484 width=12)           |
|                           ->  Seq Scan on _hyper_10_104_chunk cle_12
 (cost=0.00..688.01 rows=32201 width=12)         |
|                           ->  Seq Scan on _hyper_10_105_chunk cle_13
 (cost=0.00..713.61 rows=33361 width=12)         |
|                           ->  Seq Scan on _hyper_10_106_chunk cle_14
 (cost=0.00..899.98 rows=42098 width=12)         |
|                           ->  Seq Scan on _hyper_10_107_chunk cle_15
 (cost=0.00..1264.84 rows=59184 width=12)        |
|                           ->  Seq Scan on _hyper_10_108_chunk cle_16
 (cost=0.00..2671.00 rows=125000 width=12)       |
|                           ->  Seq Scan on _hyper_10_109_chunk cle_17
 (cost=0.00..3344.34 rows=156534 width=12)       |
|                           ->  Seq Scan on _hyper_10_110_chunk cle_18
 (cost=0.00..243609.02 rows=11402202 width=12)   |
|                           ->  Seq Scan on _hyper_10_148_chunk cle_19
 (cost=0.00..375.74 rows=17574 width=12)         |
|                           ->  Seq Scan on _hyper_10_155_chunk cle_20
 (cost=0.00..42.39 rows=1939 width=12)           |
|                           ->  Seq Scan on _hyper_10_157_chunk cle_21
 (cost=0.00..1.46 rows=46 width=12)              |
|                           ->  Seq Scan on _hyper_10_158_chunk cle_22
 (cost=0.00..46.81 rows=2181 width=12)           |
|                           ->  Seq Scan on _hyper_10_159_chunk cle_23
 (cost=0.00..5.63 rows=263 width=12)             |
|                           ->  Seq Scan on _hyper_10_161_chunk cle_24
 (cost=0.00..100.79 rows=4679 width=12)          |
|                           ->  Seq Scan on _hyper_10_172_chunk cle_25
 (cost=0.00..232.89 rows=10689 width=12)         |
| JIT:
                                             |
|   Functions: 59
                                            |
|   Options: Inlining true, Optimization true, Expressions true, Deforming
true                                         |
+-----------------------------------------------------------------------------------------------------------------------+
EXPLAIN 38
```

```text
+-------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------->
| name | setting

|-------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------->
| BINDIR | /usr/local/pg16/bin

| DOCDIR | /usr/local/pg16/share/doc/postgresql

| HTMLDIR | /usr/local/pg16/share/doc/postgresql

| INCLUDEDIR | /usr/local/pg16/include

| PKGINCLUDEDIR | /usr/local/pg16/include/postgresql

| INCLUDEDIR-SERVER | /usr/local/pg16/include/postgresql/server

| LIBDIR | /usr/local/pg16/lib

| PKGLIBDIR | /usr/local/pg16/lib/postgresql

| LOCALEDIR | /usr/local/pg16/share/locale

| MANDIR | /usr/local/pg16/share/man

| SHAREDIR | /usr/local/pg16/share/postgresql

| SYSCONFDIR | /usr/local/pg16/etc/postgresql

| PGXS | /usr/local/pg16/lib/postgresql/pgxs/src/makefiles/
pgxs.mk

| CONFIGURE | '--prefix=/usr/local/pg16' '--with-openssl'
'-with-lz4' '--with-systemd' '--with-python' '--with-llvm'

| CC | gcc

| CPPFLAGS | -D_GNU_SOURCE

| CFLAGS | -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Werror=vla -Wendif-labels
-Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wcast-function-type
-Wshadow=compatible-local -Wfor>
| CFLAGS_SL | -fPIC

| LDFLAGS | -L/usr/lib64 -Wl,--as-needed
-Wl,-rpath,'/usr/local/pg16/lib',--enable-new-dtags

| LDFLAGS_EX |

| LDFLAGS_SL |

| LIBS | -lpgcommon -lpgport -llz4 -lssl -lcrypto -lz -ledit
-lm

| VERSION | PostgreSQL 16.2

+-------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------->
```