9.6 Beta 2 Performance Regression on Recursive CTE

Started by Nicholson, Brad (Toronto, ON, CA)almost 10 years ago2 messagesgeneral
Jump to latest

Hi,

I'm seeing a performance regression on 9.6 Beta 2 compared to 9.5.3. The query is question is a recursive query on graph data stored as an adjacency list. While this is example is fairly contrived, it mimics the behavior I am seeing on real data with more realistic queries. The example below uses the same data set, same DB configuration, same query plan and same host. But the 9.6 beta 2 query runs in 26 seconds and 9.5.3 runs in 13.4 seconds. This pattern holds up on repeated runs.

This seems to be related to the citext data type. If I re-run the test below with varchar as the data type, performance is similar between 9.5 and 9.6.

Both are running on CentOS 6.6 using the PG community RPMs.

Version details:

version
----------------------------------------------------------------------------------------------------------
PostgreSQL 9.5.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit

version
-------------------------------------------------------------------------------------------------------------
PostgreSQL 9.6beta2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit

Relevant config options (same on both)

cpu_index_tuple_cost=0.005
cpu_operator_cost=0.0025
cpu_tuple_cost=0.05
effective_cache_size=4GB
random_page_cost=4
seq_page_cost=1
shared_buffers=2GB
temp_buffers=8MB
work_mem=4MB

Very basic table structure:

\d adj_edge_citext
Table "public.adj_edge_citext"
Column | Type | Modifiers
--------+--------+-----------
parent | citext |
child | citext |

Import the Graph of Thrones data set from: https://www.macalester.edu/~abeverid/data/stormofswords.csv (ignore the weight column).

Run VACUUM ANALYZE on the table, and then run the following:

EXPLAIN (analyze, buffers, verbose) WITH RECURSIVE d AS (
SELECT
a.parent,
a.child,
1 AS LEVEL,
ARRAY[a.parent] AS path,
FALSE AS CYCLE
FROM
adj_edge_citext a
WHERE a.parent in (SELECT distinct(parent) FROM adj_edge_citext)
UNION ALL
SELECT
a1.parent,
a1.child,
d.LEVEL+1,
path || a1.parent,
a1.parent=ANY(path)
FROM
adj_edge_citext a1
JOIN d ON d.child=a1.parent
WHERE
NOT CYCLE
)
SELECT
parent,
child,
level,
path
FROM d;

9.5.3:
https://explain.depesz.com/s/HOO7
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
CTE Scan on d (cost=9766.98..12900.18 rows=31332 width=100) (actual time=0.810..13023.860 rows=1427997 loops=1)
Output: d.parent, d.child, d.level, d.path
Buffers: shared hit=6, temp read=31714 written=64240
CTE d
-> Recursive Union (cost=31.31..9766.98 rows=31332 width=48) (actual time=0.805..9613.617 rows=1427997 loops=1)
Buffers: shared hit=6, temp read=31714 written=31701
-> Hash Semi Join (cost=31.31..69.83 rows=352 width=12) (actual time=0.803..1.879 rows=352 loops=1)
Output: a.parent, a.child, 1, ARRAY[a.parent], false
Hash Cond: (a.parent = adj_edge_citext.parent)
Buffers: shared hit=4
-> Seq Scan on public.adj_edge_citext a (cost=0.00..19.60 rows=352 width=12) (actual time=0.021..0.177 rows=352 loops=1)
Output: a.parent, a.child
Buffers: shared hit=2
-> Hash (cost=27.58..27.58 rows=71 width=6) (actual time=0.760..0.760 rows=71 loops=1)
Output: adj_edge_citext.parent
Buckets: 1024 Batches: 1 Memory Usage: 11kB
Buffers: shared hit=2
-> HashAggregate (cost=20.48..24.03 rows=71 width=6) (actual time=0.638..0.681 rows=71 loops=1)
Output: adj_edge_citext.parent
Group Key: adj_edge_citext.parent
Buffers: shared hit=2
-> Seq Scan on public.adj_edge_citext (cost=0.00..19.60 rows=352 width=6) (actual time=0.010..0.168 rows=352 loops=1)
Output: adj_edge_citext.parent
Buffers: shared hit=2
-> Hash Join (cost=38.08..656.40 rows=3098 width=48) (actual time=0.054..303.806 rows=62072 loops=23)
Output: a1.parent, a1.child, (d_1.level + 1), (d_1.path || a1.parent), (a1.parent = ANY (d_1.path))
Hash Cond: (d_1.child = a1.parent)
Buffers: shared hit=2, temp read=31714 written=13
-> WorkTable Scan on d d_1 (cost=0.00..352.00 rows=1760 width=68) (actual time=0.013..35.953 rows=62087 loops=23)
Output: d_1.parent, d_1.child, d_1.level, d_1.path, d_1.cycle
Filter: (NOT d_1.cycle)
Buffers: temp read=31714 written=13
-> Hash (cost=19.60..19.60 rows=352 width=12) (actual time=0.496..0.496 rows=352 loops=1)
Output: a1.parent, a1.child
Buckets: 1024 Batches: 1 Memory Usage: 24kB
Buffers: shared hit=2
-> Seq Scan on public.adj_edge_citext a1 (cost=0.00..19.60 rows=352 width=12) (actual time=0.009..0.204 rows=352 loops=1)
Output: a1.parent, a1.child
Buffers: shared hit=2
Planning time: 0.543 ms
Execution time: 13415.280 ms

9.6 Beta 2:
https://explain.depesz.com/s/R7oR
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
CTE Scan on d (cost=9766.98..12900.18 rows=31332 width=100) (actual time=1.737..25688.068 rows=1427997 loops=1)
Output: d.parent, d.child, d.level, d.path
Buffers: shared hit=6, temp read=31714 written=64240
CTE d
-> Recursive Union (cost=31.31..9766.98 rows=31332 width=49) (actual time=1.731..22178.446 rows=1427997 loops=1)
Buffers: shared hit=6, temp read=31714 written=31701
-> Hash Semi Join (cost=31.31..69.83 rows=352 width=49) (actual time=1.728..3.862 rows=352 loops=1)
Output: a.parent, a.child, 1, ARRAY[a.parent], false
Hash Cond: (a.parent = adj_edge_citext.parent)
Buffers: shared hit=4
-> Seq Scan on index.adj_edge_citext a (cost=0.00..19.60 rows=352 width=12) (actual time=0.023..0.183 rows=352 loops=1)
Output: a.parent, a.child
Buffers: shared hit=2
-> Hash (cost=27.58..27.58 rows=71 width=6) (actual time=1.678..1.678 rows=71 loops=1)
Output: adj_edge_citext.parent
Buckets: 1024 Batches: 1 Memory Usage: 11kB
Buffers: shared hit=2
-> HashAggregate (cost=20.48..24.03 rows=71 width=6) (actual time=1.485..1.527 rows=71 loops=1)
Output: adj_edge_citext.parent
Group Key: adj_edge_citext.parent
Buffers: shared hit=2
-> Seq Scan on index.adj_edge_citext (cost=0.00..19.60 rows=352 width=6) (actual time=0.010..0.177 rows=352 loops=1)
Output: adj_edge_citext.parent, adj_edge_citext.child
Buffers: shared hit=2
-> Hash Join (cost=38.08..656.40 rows=3098 width=49) (actual time=0.095..845.508 rows=62072 loops=23)
Output: a1.parent, a1.child, (d_1.level + 1), (d_1.path || a1.parent), (a1.parent = ANY (d_1.path))
Hash Cond: (d_1.child = a1.parent)
Buffers: shared hit=2, temp read=31714 written=13
-> WorkTable Scan on d d_1 (cost=0.00..352.00 rows=1760 width=68) (actual time=0.021..41.653 rows=62087 loops=23)
Output: d_1.parent, d_1.child, d_1.level, d_1.path, d_1.cycle
Filter: (NOT d_1.cycle)
Buffers: temp read=31714 written=13
-> Hash (cost=19.60..19.60 rows=352 width=12) (actual time=0.757..0.757 rows=352 loops=1)
Output: a1.parent, a1.child
Buckets: 1024 Batches: 1 Memory Usage: 24kB
Buffers: shared hit=2
-> Seq Scan on index.adj_edge_citext a1 (cost=0.00..19.60 rows=352 width=12) (actual time=0.012..0.166 rows=352 loops=1)
Output: a1.parent, a1.child
Buffers: shared hit=2
Planning time: 0.785 ms
Execution time: 26062.927 ms

As an aside, the cost mis-estimates are very common with these types of queries on graph data, and frequently result in the planner making bad choices which result in queries taking seconds instead of milliseconds. I am aware of the whole CTE optimization fence issue, but it is a pretty big problem for graph data in Postgres.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nicholson, Brad (Toronto, ON, CA) (#1)
Re: 9.6 Beta 2 Performance Regression on Recursive CTE

"Nicholson, Brad (Toronto, ON, CA)" <bnicholson@hpe.com> writes:

I'm seeing a performance regression on 9.6 Beta 2 compared to 9.5.3.
The query is question is a recursive query on graph data stored as an
adjacency list.

FWIW, I can't reproduce any regression with this example. I get EXPLAIN
execution times like these, depending on which datatype and environment
I use:

9.5:
text, C locale: 9307.186 ms
citext, C locale: 12541.840 ms
citext, en_US.utf8 locale: 22551.332 ms

HEAD:
text, C locale: 9109.792 ms
citext, C locale: 12251.408 ms
citext, en_US.utf8 locale: 22129.683 ms

The plans are all the same of course. The extra runtime for citext is
presumably due to the cost of smashing strings to lowercase within
citext_eq().

This is with assert-enabled builds of current branch tips, not the
release versions you're working from, but I don't recall that we've
changed anything recently that would be likely to affect this.

Maybe you're comparing an assert-enabled beta build to a
not-assert-enabled 9.5 build?

As an aside, the cost mis-estimates are very common with these types of
queries on graph data,

The only error I'm noticing is a bad guess about the size of the recursive
union result, which is unsurprising since it is only a guess. If you've
heard of ways to estimate recursive union sizes more plausibly, maybe we
could do something about that.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general