[PATCH] Reduce pg_class scans in GRANT/REVOKE ON ALL TABLES IN SCHEMA
Hi hackers,
While reading aclchk.c I noticed that objectsInSchemaToOids(), used
by GRANT/REVOKE ... ON ALL TABLES IN SCHEMA, calls
getRelationsInNamespace() five times for the OBJECT_TABLE case --
once per relkind (RELATION, VIEW, MATVIEW, FOREIGN_TABLE,
PARTITIONED_TABLE):
case OBJECT_TABLE:
objs = getRelationsInNamespace(namespaceId, RELKIND_RELATION);
objects = list_concat(objects, objs);
objs = getRelationsInNamespace(namespaceId, RELKIND_VIEW);
...
objs = getRelationsInNamespace(namespaceId,
RELKIND_PARTITIONED_TABLE);
objects = list_concat(objects, objs);
break;
pg_class does have an index on (relname, relnamespace), but there
is no index matching (relnamespace, relkind), so each of those
per-relkind calls falls back to a full heap scan via
table_beginscan_catalog(). The work is just repeated five times.
The attached patch introduces a small helper
getRelationsInNamespaceMulti() that performs a single heap scan
filtered by relnamespace and distributes matching tuples into
per-relkind buckets supplied by the caller. Relkind filtering is
done in C after each tuple is read, which is trivially cheap. The
OBJECT_TABLE case uses the helper; OBJECT_SEQUENCE and
OBJECT_PROPGRAPH are left on the original getRelationsInNamespace()
helper because they only need a single relkind and benefit from the
second ScanKey.
Correctness / order preservation
--------------------------------
* Result order is identical. The underlying pg_class heap (and
thus its physical scan order) is the same regardless of how we
filter, so each bucket ends up holding the same OIDs in the same
relative order as a separate per-relkind heap scan would have
produced. Concatenating the buckets in the original relkind
order reproduces the previous list tuple-for-tuple.
I verified this empirically. On a schema with interleaved
relkinds (tables, views, matviews, partitioned tables) I ran two
equivalent SQL formulations while forcing seq scans on pg_class:
OLD-path model: UNION ALL of five
"SELECT oid FROM pg_class
WHERE relnamespace = X AND relkind = Y
ORDER BY ctid"
queries, one per relkind, in the same group order the code
uses.
NEW-path model: a single
"SELECT oid FROM pg_class
WHERE relnamespace = X
ORDER BY ctid"
bucketed by relkind and concatenated in the same group
order.
The two formulations produced identical OID sequences, element
by element. A positional FULL JOIN between them returned zero
rows.
* MVCC semantics are, if anything, a bit stricter. The old code
took five separate catalog scans, so in principle concurrent DDL
could commit between scan N and scan N+1 and be visible to one
but not another. With a single scan everything is collected
under one catalog snapshot.
* Locking is unchanged in kind: AccessShareLock on pg_class is
still taken, just once instead of five times.
Benchmark
---------
This is a targeted micro-optimization, not a dramatic speedup.
With 10,000 tables in a single schema (pg_class ~10,452 rows),
running GRANT/REVOKE SELECT ON ALL TABLES IN SCHEMA in a loop
(6 iterations, first dropped as warmup), I measured a consistent
~15% reduction in end-to-end time:
baseline patched delta
GRANT (avg) 88.2 ms 75.9 ms -14%
REVOKE (avg) 134.9 ms 115.7 ms -14%
Per-iteration numbers (ms):
baseline GRANT : 92, 87, 87, 85, 89
patched GRANT : 77, 72, 72, 76, 79, 79
baseline REVOKE: 145, 144, 132, 128, 130, 128
patched REVOKE: 114, 117, 112, 120, 112, 119
The absolute savings are small because most of the time in these
commands is spent updating per-relation ACL tuples, not scanning
pg_class. For schemas with only a handful of relations the effect
is not measurable. The change is aimed at multi-tenant /
partition-heavy installations that regularly issue
"... ON ALL TABLES IN SCHEMA ..." statements over large catalogs.
Testing
-------
Both `make check` and `make check-world` pass cleanly with the
patch applied on top of current master (all suites green, no new
failures). TAP tests were not exercised (tree configured without
--enable-tap-tests); I can rerun with TAP enabled if that is useful.
The patch is attached (against master). Feedback and review
welcome -- in particular I'd like to know if anyone sees a
correctness concern I missed, or prefers a different shape for the
helper (e.g. returning a single flat list rather than per-relkind
buckets).
Thanks,
charsyam
Attachments:
0001-Reduce-pg_class-scans-in-GRANT-REVOKE-ON-ALL-TABLES-.patchapplication/octet-stream; name=0001-Reduce-pg_class-scans-in-GRANT-REVOKE-ON-ALL-TABLES-.patchDownload+80-11
On Sun, Apr 12, 2026 at 04:22:24PM +0900, CharSyam wrote:
Benchmark
---------
This is a targeted micro-optimization, not a dramatic speedup.
With 10,000 tables in a single schema (pg_class ~10,452 rows),
running GRANT/REVOKE SELECT ON ALL TABLES IN SCHEMA in a loop
(6 iterations, first dropped as warmup), I measured a consistent
~15% reduction in end-to-end time:baseline patched delta
GRANT (avg) 88.2 ms 75.9 ms -14%
REVOKE (avg) 134.9 ms 115.7 ms -14%
I am pretty sure that there are users with millions of relations in a
single schema that could benefit from that. At least that would not
be surprising with partitioning these days, and foreign tables. What
kind of numbers do you get if you bump up the number of digits for
these tests. Let's say a comparison based on a few million relations
at least?
The change you are proposing looks simple enough, quickly skimming
through the patch. There may be more optimizations doable here, I
have not looked at that, still I tend to like such micro-optimization
proposals as they provide a silent benefit.
--
Michael
I ran a benchmark on the patch *Reduce pg_class scans in GRANT/REVOKE
ON ALL TABLES IN SCHEMA* (collapses 5 per-relkind pg_class heap scans
into 1 scan distributed into per-relkind buckets). Summary below.(It took
much time to tests)
## Assumptions
- Two builds of PostgreSQL 19devel from the same source tree (one
patched, one at master tip), identical compile flags, separate
--prefix.
- Separate data directories, run sequentially on an otherwise idle
host.
- GUCs: shared_buffers=2GB, max_locks_per_transaction=100000,
fsync=off, synchronous_commit=off, full_page_writes=off,
autovacuum=off.
- bench_s schema contains N empty tables (CREATE TABLE t_i()).
- Measured operations:
GRANT SELECT ON ALL TABLES IN SCHEMA bench_s TO bench_role
REVOKE ALL PRIVILEGES ON ALL TABLES IN SCHEMA bench_s FROM bench_role
- Best of 3 runs reported (seconds).
- Two scenarios:
A. Clean catalog — VACUUM FULL pg_class; VACUUM FULL pg_attribute
immediately before measurement. pg_class
densely packed.
B. Bloated catalog — pre-bloat phase: GRANT+REVOKE on all N tables
repeated C cycles, no VACUUM. Both patched
and master operate on catalogs with identical
relpages and n_dead_tup.
## Results — Scenario A (Clean catalog)
macOS (Apple Silicon), best of 3, seconds:
ntables | GRANT patched | GRANT master | REVOKE patched | REVOKE master
--------+---------------+--------------+----------------+--------------
20,000 | 0.116 | 0.115 | 0.231 | 0.226
40,000 | 0.250 | 0.250 | 0.460 | 0.460
100,000 | 0.730 | 0.678 | 1.193 | 1.193
Honestly, there is no measurable performance difference in the clean
state. Patched and master are statistically indistinguishable within
run-to-run noise. This matches the design of the patch: when pg_class
is densely packed, repeating a small seq scan five times is cheap, so
collapsing it into one has nothing meaningful to save. The patch adds
no overhead either — worst case is a tie.
## Results — Scenario B (Bloated catalog)
### Linux x86_64, C=20, best of 3, seconds
ntables | dead_tup | GRANT patched | GRANT master | Δ | REVOKE
patched | REVOKE master | Δ
----------+-----------+---------------+--------------+---------+----------------+---------------+---------
10,000 | 0 | 0.0924 | 0.0935 | −1.2 % |
0.1668 | 0.1696 | −1.6 %
20,000 | 109,825 | 0.2027 | 0.2069 | −2.0 % |
0.3381 | 0.3533 | −4.3 %
50,000 | 329,468 | 0.5555 | 0.5895 | −5.8 % |
0.8901 | 0.9371 | −5.0 %
100,000 | 879,311 | 1.1732 | 1.1968 | −2.0 % |
1.8808 | 1.9555 | −3.8 %
200,000 | 1,978,925 | 2.2188 | 2.3470 | −5.5 % |
3.7290 | 3.9064 | −4.5 %
500,000 | 4,178,604 | 6.0260 | 6.6663 | −9.6 % |
9.8162 | 10.2169 | −3.9 %
1,000,000 | 9,678,399 | 12.9241 | 14.7657 | −12.5 % |
24.8893 | 28.7566 | −13.4 %
### macOS (Apple Silicon), C=20 (C=10 at 1M), best of 3, seconds
ntables | dead_tup | GRANT patched | GRANT master | Δ | REVOKE
patched | REVOKE master | Δ
----------+-----------+---------------+--------------+--------+----------------+---------------+--------
20,000 | 299,960 | 0.168 | 0.163 | +3 % |
0.260 | 0.278 | −6 %
40,000 | 519,601 | 0.307 | 0.307 | 0 % |
0.552 | 0.564 | −2 %
100,000 | 959,268 | 0.784 | 0.934 | −16 % |
1.405 | 1.419 | ~0 %
200,000 | 2,058,886 | 1.787 | 1.878 | −5 % |
2.745 | 2.849 | −4 %
500,000 | 4,258,565 | 4.727 | 5.197 | −9 % |
7.126 | 7.908 | −10 %
1,000,000 | 9,758,364 | 10.977 | 11.126 | −1 % |
19.473 | 20.759 | −6 %
Negative Δ = patched faster. Under catalog bloat the patch produces a
consistent, reproducible improvement on both operating systems.
Happy to share the bench scripts and raw logs on request.
Thanks,
charsyam
2026년 4월 13일 (월) 오전 9:43, Michael Paquier <michael@paquier.xyz>님이 작성:
Show quoted text
On Sun, Apr 12, 2026 at 04:22:24PM +0900, CharSyam wrote:
Benchmark
---------
This is a targeted micro-optimization, not a dramatic speedup.
With 10,000 tables in a single schema (pg_class ~10,452 rows),
running GRANT/REVOKE SELECT ON ALL TABLES IN SCHEMA in a loop
(6 iterations, first dropped as warmup), I measured a consistent
~15% reduction in end-to-end time:baseline patched delta
GRANT (avg) 88.2 ms 75.9 ms -14%
REVOKE (avg) 134.9 ms 115.7 ms -14%I am pretty sure that there are users with millions of relations in a
single schema that could benefit from that. At least that would not
be surprising with partitioning these days, and foreign tables. What
kind of numbers do you get if you bump up the number of digits for
these tests. Let's say a comparison based on a few million relations
at least?The change you are proposing looks simple enough, quickly skimming
through the patch. There may be more optimizations doable here, I
have not looked at that, still I tend to like such micro-optimization
proposals as they provide a silent benefit.
--
Michael