Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

Started by alex workalmost 2 years ago35 messages
#1alex work
alexwork033@gmail.com

Hello,

we run multiple versions of PostgreSQL instances on production. Some time ago
we add new physical servers and decided to go with latest GA from pgdg APT
repository, that is PostgreSQL 16.

We encounter slow `GRANT ROLES` only on PostgreSQL 16 instances up to 42 seconds
in production, the client process at PostgresSQL would use 100% of the CPU.
Which is a surprise compared to other instances running older PostgreSQL
releases. On production we have a *LOT* of ROLEs, which unfortunately a case
that we did not test before switching the new servers into production mode.

The Application & ROLEs
-----------------------
Our application make use of ROLEs. We create group ROLEs for each tenant of our
application, these ROLEs are named with `d_` and `a_` prefix.

A special ROLE, called `acc`, it will be a member to each of these `d_` and
`a_` ROLEs.

The application have a concept of "session", which it would mantain and I think
outside the scope of this e-mail. In relation to PostgreSQL, the application
would create a PostgreSQL ROLE that corresponds to its own (application)
session. It would name these ROLEs with `s_` prefix, which CREATEd and
GRANTed its permission on every application's "session".

When an application "session" started, user with `acc` ROLE would grant
membersip of `d_` ROLE to `s_` ROLE (ie. GRANT ROLE `d_xxxx` TO `s_xxxx`;)

To make this clear, for example, we have (say) role `d_202402` already existing
and application would create a new ROLE `s_0000001` which corresponds to
application's "session". Application that connects with special ROLE `acc`
would GRANT ROLE `d_202402` to the ROLE `s_0000001`, like so:

GRANT d_202402 TO s_0000001;

In production we have up to 13 thousands of these ROLEs, each:

$ sudo -u postgres psql -p 5531
psql (16.2 (Debian 16.2-1.pgdg120+2))
Type "help" for help.

postgres=# select count(*) s_roles_count from pg_catalog.pg_authid
where rolname like 's_%';
s_roles_count
---------------
13299
(1 row)

postgres=# select count(*) a_roles_count from pg_catalog.pg_authid
where rolname like 'a_%';
a_roles_count
---------------
12776
(1 row)

postgres=# select count(*) d_roles_count from pg_catalog.pg_authid
where rolname like 'd_%';
d_roles_count
---------------
13984
(1 row)

The Setup
---------

Investigating this slow `GRANT ROLE` we start a VM running Debian 11,
and create a lot of roles.

create special `acc` role and write to some file:
$ echo -e "CREATE ROLE acc WITH LOGIN NOSUPERUSER INHERIT CREATEDB
CREATEROLE NOREPLICATION;\n\n" > create_acc.sql

create a lot of `a_` roles and make sure `acc` is member of each one of them:
$ for idx1 in $(seq -w 1 100); do for idx2 in $(seq -w 1 12); do for
idx3 in $(seq -w 1 10); do echo "CREATE ROLE a_${idx1}${idx2}${idx3}
WITH NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN;"; echo "GRANT
a_${idx1}${idx2}${idx3} TO acc WITH ADMIN OPTION;"; done; done; done >
create_a.sql

create a lot of `d_` roles and make sure `acc` is member of each one of them:
$ for idx1 in $(seq -w 1 100); do for idx2 in $(seq -w 1 12); do for
idx3 in $(seq -w 1 10); do echo "CREATE ROLE d_${idx1}${idx2}${idx3}
WITH NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN;"; echo "GRANT
d_${idx1}${idx2}${idx3} TO acc WITH ADMIN OPTION;"; done; done; done >
create_d.sql

create a lot of `s_` roles:
$ for idx1 in $(seq -w 1 100); do for idx2 in $(seq -w 1 12); do for
idx3 in $(seq -w 1 10); do echo "CREATE ROLE s_${idx1}${idx2}${idx3}
WITH NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN;"; done; done;
done > create_s.sql

merge ROLE creation into one file:
$ cat create_acc.sql create_a.sql create_d.sql create_s.sql >
/tmp/create-roles.sql

PostgreSQL 16
-------------

Install PostgreSQL 16:
--
$ sudo sh -c 'echo "deb https://apt.postgresql.org/pub/repos/apt
$(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
$ sudo apt install gnupg2
$ wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc
| sudo apt-key add -
$ sudo apt-get update
$ sudo apt-get -y install postgresql-16 postgresql-client-16

Create PostgreSQL 16 instance:
--
$ sudo pg_dropcluster --stop 16 main # drop default Debian cluster
$ sudo pg_createcluster 16 pg16
$ echo "local all acc trust" | sudo tee
/etc/postgresql/16/pg16/pg_hba.conf
$ echo "local all postgres peer" | sudo tee -a
/etc/postgresql/16/pg16/pg_hba.conf
$ sudo systemctl start postgresql@16-pg16.service

Import lots of roles:
--
$ sudo -u postgres /usr/lib/postgresql/16/bin/psql -f
/tmp/create-roles.sql -p 5432 -d postgres

Using ROLE `acc`, grant `d_` ROLE to a session ROLE:
--
$ time sudo -u postgres /usr/lib/postgresql/16/bin/psql -U acc
postgres -c 'GRANT d_0010109 TO s_0010109;'
GRANT ROLE

real 0m7.579s
user 0m0.054s
sys 0m0.020s

This is the surprising behavior for PostgreSQL 16. It seems there's a new logic
in PostgreSQL that checks against each role, and it took 100% of CPU.

At this point we know `acc` is just another ROLE that happens to have ADMIN
privilege that is a member of `d_0010109` group ROLE.

But what happens when `acc` is a SUPERUSER?

Alter role `acc` as SUPERUSER:
--
$ sudo -u postgres /usr/lib/postgresql/16/bin/psql -c 'ALTER ROLE acc
WITH SUPERUSER'
ALTER ROLE

This is a workaround to make GRANT ROLE bearable.

Using ROLE `acc`, grant `d_` ROLE to a session ROLE, again:
--
$ time sudo -u postgres /usr/lib/postgresql/16/bin/psql -U acc
postgres -c 'GRANT d_0010108 TO s_0010108;'
GRANT ROLE

real 0m0.079s
user 0m0.054s
sys 0m0.019s

OK this is fast.

But what hapens when `acc` is back being not a SUPERUSER?

Alter role `acc` to stop being SUPERUSER:
--
$ sudo -u postgres psql -c 'ALTER ROLE acc WITH NOSUPERUSER'
ALTER ROLE

Using ROLE `acc`, grant `d_` ROLE to a session ROLE with `acc` not a SUPERUSER:
--
$ time sudo -u postgres /usr/lib/postgresql/16/bin/psql -U acc
postgres -c 'GRANT d_0010107 TO s_0010107;'
GRANT ROLE

real 0m7.741s
user 0m0.055s
sys 0m0.021s

As expected, slow `GRANT ROLE` again.

At this point, we try with PostgreSQL 15 just to make sure that this
is new to PostgreSQL 16.

$ sudo systemctl stop postgresql@16-pg16

PostgreSQL 15
-------------

Install PostgreSQL 15:
--
$ sudo apt-get update
$ sudo apt-get -y install postgresql-15 postgresql-client-15

$ sudo pg_dropcluster --stop 15 main # drop default Debian cluster
$ sudo pg_createcluster 15 pg15
$ echo "local all acc trust" | sudo tee
/etc/postgresql/15/pg15/pg_hba.conf
$ echo "local all postgres peer" | sudo tee -a
/etc/postgresql/15/pg15/pg_hba.conf
$ sudo systemctl start postgresql@15-pg15.service

Import lots of roles:
--
$ sudo -u postgres /usr/lib/postgresql/15/bin/psql -f
/tmp/create-roles.sql -p 5433 -d postgres

Using ROLE `acc`, grant `d_` ROLE to a session ROLE:
--
$ time sudo -u postgres /usr/lib/postgresql/15/bin/psql -U acc -p 5433
postgres -c 'GRANT d_0010109 TO s_0010109;'
GRANT ROLE

real 0m0.077s
user 0m0.054s
sys 0m0.017s

Seems OK with the same amount of ROLEs. The `acc` ROLE is not a SUPERUSER here.

Alter role `acc` as SUPERUSER:
--
$ sudo -u postgres /usr/lib/postgresql/15/bin/psql -p 5433 -c 'ALTER
ROLE acc WITH SUPERUSER'
ALTER ROLE

Using ROLE `acc`, grant `d_` ROLE to a session ROLE, again:
--
$ time sudo -u postgres /usr/lib/postgresql/15/bin/psql -p 5433 -U acc
postgres -c 'GRANT d_0010108 TO s_0010108;'
GRANT ROLE

real 0m0.084s
user 0m0.057s
sys 0m0.021s

Doesn't matter, GRANT ROLE works still as fast.

Alter role `acc` to stop being a SUPERUSER:
--
$ sudo -u postgres /usr/lib/postgresql/15/bin/psql -p 5433 -c 'ALTER
ROLE acc WITH NOSUPERUSER'
ALTER ROLE

Using ROLE `acc`, grant `d_` ROLE to a session ROLE with `acc` not a SUPERUSER:
--
$ time sudo -u postgres /usr/lib/postgresql/15/bin/psql -p 5433 -U acc
postgres -c 'GRANT d_0010107 TO s_0010107;'
GRANT ROLE

real 0m0.077s
user 0m0.054s
sys 0m0.017s

Again, doesn't matter, GRANT ROLE works still as fast.

Looking At The Source Code
--------------------------

Looking at git diff of `REL_15_6` against `REL_16_0`, it seems the
`roles_is_member_of` function called by the new in PostgreSQL 16
`check_role_membership_authorization`
is expensive for our use case.

REL_16_0: src/backend/commands/user.c:1562
---8<------
(errcode(ERRCODE_INVALID_GRANT_OPERATION),
errmsg("column names cannot be included in GRANT/REVOKE ROLE")));

roleid = get_role_oid(rolename, false);
check_role_membership_authorization(currentUserId,
roleid, stmt->is_grant);
if (stmt->is_grant)
--->8------

While I can see the value in improvements on how ROLEs are being handled
PostgreSQL 16 onward, I'm curious what would help for setups that has thousands
of ROLEs like us outside of patching the source code?

#2Dominique Devienne
ddevienne@gmail.com
In reply to: alex work (#1)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

On Thu, Mar 21, 2024 at 8:10 AM alex work <alexwork033@gmail.com> wrote:

We encounter slow `GRANT ROLES` only on PostgreSQL 16 instances up to 42
seconds
in production, the client process at PostgresSQL would use 100% of the
CPU. [...]

Using ROLE `acc`, grant `d_` ROLE to a session ROLE:

real 0m7.579s [...]

PostgreSQL 15

Using ROLE `acc`, grant `d_` ROLE to a session ROLE:
real 0m0.077s

Ouch, that's a ~ 100x regression. Thanks for the write-up, that's worrying.
We don't have as many ROLEs, but we do have plenty, so this is worrying.

On top of the v16 ROLE changes breaking on ROLE logic, which was fine prior
(v12-v15).
We've paused for now our planned v16 upgrade, until we have more time to
adapt.

Like you, I welcome the changes. But it turns out more expensive to adapt
to them.
And your report certainly makes me wonder whether we should hold off until
that perf regression is addressed.

Thanks, --DD

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: alex work (#1)
1 attachment(s)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

[ redirecting to -hackers ]

alex work <alexwork033@gmail.com> writes:

We encounter slow `GRANT ROLES` only on PostgreSQL 16 instances up to 42 seconds
in production, the client process at PostgresSQL would use 100% of the CPU.
Which is a surprise compared to other instances running older PostgreSQL
releases. On production we have a *LOT* of ROLEs, which unfortunately a case
that we did not test before switching the new servers into production mode.

I poked into this a bit. It seems the problem is that as of v16, we
try to search for the "best" role membership path from the current
user to the target role, and that's done in a very brute-force way,
as a side effect of computing the set of *all* role memberships the
current role has. In the given case, we could have skipped all that
if we simply tested whether the current role is directly a member
of the target: it is, so there can't be any shorter path. But in
any case roles_is_member_of has horrid performance when the current
role is a member of a lot of roles.

It looks like part of the blame might be ascribable to catcache.c,
as if you look at the problem microscopically you find that
roles_is_member_of is causing catcache to make a ton of AUTHMEMMEMROLE
catcache lists, and SearchSysCacheList is just iterating linearly
through the cache's list-of-lists, so that search is where the O(N^2)
time is actually getting taken. Up to now that code has assumed that
any one catcache would not have very many catcache lists. Maybe it's
time to make that smarter; but since we've gotten away with this
implementation for decades, I can't help feeling that the real issue
is with roles_is_member_of's usage pattern.

For self-containedness, attached is a directly usable shell script
to reproduce the problem. The complaint is that the last GRANT
takes multiple seconds (about 5s on my machine), rather than
milliseconds.

regards, tom lane

Attachments:

grant_with_many_roles.shtext/x-shellscript; charset=us-ascii; name=grant_with_many_roles.shDownload
#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#3)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

I wrote:

I poked into this a bit. It seems the problem is that as of v16, we
try to search for the "best" role membership path from the current
user to the target role, and that's done in a very brute-force way,
as a side effect of computing the set of *all* role memberships the
current role has.

Actually, roles_is_member_of sucks before v16 too; the new thing
is only that it's being invoked during GRANT ROLE. Using the
roles created by the given test case, I see in v15:

$ psql
psql (15.6)
Type "help" for help.

regression=# drop table at;
DROP TABLE
regression=# set role a_0010308;
SET
regression=> create table at(f1 int);
CREATE TABLE
regression=> \timing
Timing is on.
regression=> set role acc;
SET
Time: 0.493 ms
regression=> insert into at values(1);
INSERT 0 1
Time: 3565.029 ms (00:03.565)
regression=> insert into at values(1);
INSERT 0 1
Time: 2.308 ms

So it takes ~3.5s to populate the roles_is_member_of cache for "acc"
given this membership set. This is actually considerably worse than
in v16 or HEAD, where the same test takes about 1.6s for me.

Apparently the OP has designed their use-case so that they dodge these
implementation problems in v15 and earlier, but that's a far cry from
saying that there were no problems with lots-o-roles before v16.

regards, tom lane

#5Noname
walther@technowledgy.de
In reply to: Tom Lane (#4)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

Tom Lane:

Actually, roles_is_member_of sucks before v16 too; the new thing
is only that it's being invoked during GRANT ROLE. Using the
roles created by the given test case, I see in v15:

[...]
So it takes ~3.5s to populate the roles_is_member_of cache for "acc"
given this membership set. This is actually considerably worse than
in v16 or HEAD, where the same test takes about 1.6s for me.

Ah, this reminds me that I hit the same problem about a year ago, but
haven't had the time to put together a test-case, yet. In my case, it's
like this:
- I have one role "authenticator" with which the application (PostgREST)
connects to the database.
- This role has been granted all of the actual user roles and will then
do a SET ROLE for each authenticated request it handles.
- In my case that's currently about 120k roles granted to
"authenticator", back then it was probably around 60k.
- The first request (SET ROLE) for each session took between 5 and 10
*minutes* to succeed - subsequent requests were instant.
- When the "authenticator" role is made SUPERUSER, the first request is
instant, too.

I guess this matches exactly what you are observing.

There is one more thing that is actually even worse, though: When you
try to cancel the query or terminate the backend while the SET ROLE is
still running, this will not work. It will not only not cancel the
query, but somehow leave the process for that backend in some kind of
zombie state that is impossible to recover from.

All of this was v15.

Best,

Wolfgang

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#3)
1 attachment(s)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

I wrote:

It looks like part of the blame might be ascribable to catcache.c,
as if you look at the problem microscopically you find that
roles_is_member_of is causing catcache to make a ton of AUTHMEMMEMROLE
catcache lists, and SearchSysCacheList is just iterating linearly
through the cache's list-of-lists, so that search is where the O(N^2)
time is actually getting taken. Up to now that code has assumed that
any one catcache would not have very many catcache lists. Maybe it's
time to make that smarter; but since we've gotten away with this
implementation for decades, I can't help feeling that the real issue
is with roles_is_member_of's usage pattern.

I wrote a quick finger exercise to make catcache.c use a hash table
instead of a single list for CatCLists, modeling it closely on the
existing hash logic for simple catcache entries. This helps a good
deal, but I still see the problematic GRANT taking ~250ms, compared
to 5ms in v15. roles_is_member_of is clearly on the hook for that.

regards, tom lane

Attachments:

v1-use-hash-table-for-CatCLists.patchtext/x-diff; charset=us-ascii; name=v1-use-hash-table-for-CatCLists.patchDownload
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index d5a3c1b591..569f51cb33 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -88,6 +88,8 @@ static void CatCachePrintStats(int code, Datum arg);
 #endif
 static void CatCacheRemoveCTup(CatCache *cache, CatCTup *ct);
 static void CatCacheRemoveCList(CatCache *cache, CatCList *cl);
+static void RehashCatCache(CatCache *cp);
+static void RehashCatCacheLists(CatCache *cp);
 static void CatalogCacheInitializeCache(CatCache *cache);
 static CatCTup *CatalogCacheCreateEntry(CatCache *cache,
 										HeapTuple ntp, SysScanDesc scandesc,
@@ -444,6 +446,7 @@ CatCachePrintStats(int code, Datum arg)
 	long		cc_neg_hits = 0;
 	long		cc_newloads = 0;
 	long		cc_invals = 0;
+	long		cc_nlists = 0;
 	long		cc_lsearches = 0;
 	long		cc_lhits = 0;
 
@@ -453,7 +456,7 @@ CatCachePrintStats(int code, Datum arg)
 
 		if (cache->cc_ntup == 0 && cache->cc_searches == 0)
 			continue;			/* don't print unused caches */
-		elog(DEBUG2, "catcache %s/%u: %d tup, %ld srch, %ld+%ld=%ld hits, %ld+%ld=%ld loads, %ld invals, %ld lsrch, %ld lhits",
+		elog(DEBUG2, "catcache %s/%u: %d tup, %ld srch, %ld+%ld=%ld hits, %ld+%ld=%ld loads, %ld invals, %d lists, %ld lsrch, %ld lhits",
 			 cache->cc_relname,
 			 cache->cc_indexoid,
 			 cache->cc_ntup,
@@ -465,6 +468,7 @@ CatCachePrintStats(int code, Datum arg)
 			 cache->cc_searches - cache->cc_hits - cache->cc_neg_hits - cache->cc_newloads,
 			 cache->cc_searches - cache->cc_hits - cache->cc_neg_hits,
 			 cache->cc_invals,
+			 cache->cc_nlist,
 			 cache->cc_lsearches,
 			 cache->cc_lhits);
 		cc_searches += cache->cc_searches;
@@ -472,10 +476,11 @@ CatCachePrintStats(int code, Datum arg)
 		cc_neg_hits += cache->cc_neg_hits;
 		cc_newloads += cache->cc_newloads;
 		cc_invals += cache->cc_invals;
+		cc_nlists += cache->cc_nlist;
 		cc_lsearches += cache->cc_lsearches;
 		cc_lhits += cache->cc_lhits;
 	}
-	elog(DEBUG2, "catcache totals: %d tup, %ld srch, %ld+%ld=%ld hits, %ld+%ld=%ld loads, %ld invals, %ld lsrch, %ld lhits",
+	elog(DEBUG2, "catcache totals: %d tup, %ld srch, %ld+%ld=%ld hits, %ld+%ld=%ld loads, %ld invals, %ld lists, %ld lsrch, %ld lhits",
 		 CacheHdr->ch_ntup,
 		 cc_searches,
 		 cc_hits,
@@ -485,6 +490,7 @@ CatCachePrintStats(int code, Datum arg)
 		 cc_searches - cc_hits - cc_neg_hits - cc_newloads,
 		 cc_searches - cc_hits - cc_neg_hits,
 		 cc_invals,
+		 cc_nlists,
 		 cc_lsearches,
 		 cc_lhits);
 }
@@ -573,6 +579,8 @@ CatCacheRemoveCList(CatCache *cache, CatCList *cl)
 					 cache->cc_keyno, cl->keys);
 
 	pfree(cl);
+
+	--cache->cc_nlist;
 }
 
 
@@ -611,14 +619,19 @@ CatCacheInvalidate(CatCache *cache, uint32 hashValue)
 	 * Invalidate *all* CatCLists in this cache; it's too hard to tell which
 	 * searches might still be correct, so just zap 'em all.
 	 */
-	dlist_foreach_modify(iter, &cache->cc_lists)
+	for (int i = 0; i < cache->cc_nlbuckets; i++)
 	{
-		CatCList   *cl = dlist_container(CatCList, cache_elem, iter.cur);
+		dlist_head *bucket = &cache->cc_lbucket[i];
 
-		if (cl->refcount > 0)
-			cl->dead = true;
-		else
-			CatCacheRemoveCList(cache, cl);
+		dlist_foreach_modify(iter, bucket)
+		{
+			CatCList   *cl = dlist_container(CatCList, cache_elem, iter.cur);
+
+			if (cl->refcount > 0)
+				cl->dead = true;
+			else
+				CatCacheRemoveCList(cache, cl);
+		}
 	}
 
 	/*
@@ -691,14 +704,19 @@ ResetCatalogCache(CatCache *cache)
 	int			i;
 
 	/* Remove each list in this cache, or at least mark it dead */
-	dlist_foreach_modify(iter, &cache->cc_lists)
+	for (i = 0; i < cache->cc_nlbuckets; i++)
 	{
-		CatCList   *cl = dlist_container(CatCList, cache_elem, iter.cur);
+		dlist_head *bucket = &cache->cc_lbucket[i];
 
-		if (cl->refcount > 0)
-			cl->dead = true;
-		else
-			CatCacheRemoveCList(cache, cl);
+		dlist_foreach_modify(iter, bucket)
+		{
+			CatCList   *cl = dlist_container(CatCList, cache_elem, iter.cur);
+
+			if (cl->refcount > 0)
+				cl->dead = true;
+			else
+				CatCacheRemoveCList(cache, cl);
+		}
 	}
 
 	/* Remove each tuple in this cache, or at least mark it dead */
@@ -862,6 +880,12 @@ InitCatCache(int id,
 									 MCXT_ALLOC_ZERO);
 	cp->cc_bucket = palloc0(nbuckets * sizeof(dlist_head));
 
+	/*
+	 * Many catcaches never receive any list searches.  Therefore, we don't
+	 * allocate the cc_lbuckets till we get a list search.
+	 */
+	cp->cc_lbucket = NULL;
+
 	/*
 	 * initialize the cache's relation information for the relation
 	 * corresponding to this cache, and initialize some of the new cache's
@@ -874,7 +898,9 @@ InitCatCache(int id,
 	cp->cc_relisshared = false; /* temporary */
 	cp->cc_tupdesc = (TupleDesc) NULL;
 	cp->cc_ntup = 0;
+	cp->cc_nlist = 0;
 	cp->cc_nbuckets = nbuckets;
+	cp->cc_nlbuckets = 0;
 	cp->cc_nkeys = nkeys;
 	for (i = 0; i < nkeys; ++i)
 	{
@@ -939,6 +965,44 @@ RehashCatCache(CatCache *cp)
 	cp->cc_bucket = newbucket;
 }
 
+/*
+ * Enlarge a catcache's list storage, doubling the number of buckets.
+ */
+static void
+RehashCatCacheLists(CatCache *cp)
+{
+	dlist_head *newbucket;
+	int			newnbuckets;
+	int			i;
+
+	elog(DEBUG1, "rehashing catalog cache id %d for %s; %d lists, %d buckets",
+		 cp->id, cp->cc_relname, cp->cc_nlist, cp->cc_nlbuckets);
+
+	/* Allocate a new, larger, hash table. */
+	newnbuckets = cp->cc_nlbuckets * 2;
+	newbucket = (dlist_head *) MemoryContextAllocZero(CacheMemoryContext, newnbuckets * sizeof(dlist_head));
+
+	/* Move all entries from old hash table to new. */
+	for (i = 0; i < cp->cc_nlbuckets; i++)
+	{
+		dlist_mutable_iter iter;
+
+		dlist_foreach_modify(iter, &cp->cc_lbucket[i])
+		{
+			CatCList   *cl = dlist_container(CatCList, cache_elem, iter.cur);
+			int			hashIndex = HASH_INDEX(cl->hash_value, newnbuckets);
+
+			dlist_delete(iter.cur);
+			dlist_push_head(&newbucket[hashIndex], &cl->cache_elem);
+		}
+	}
+
+	/* Switch to the new array. */
+	pfree(cp->cc_lbucket);
+	cp->cc_nlbuckets = newnbuckets;
+	cp->cc_lbucket = newbucket;
+}
+
 /*
  *		CatalogCacheInitializeCache
  *
@@ -1588,7 +1652,9 @@ SearchCatCacheList(CatCache *cache,
 	Datum		v4 = 0;			/* dummy last-column value */
 	Datum		arguments[CATCACHE_MAXKEYS];
 	uint32		lHashValue;
+	Index		lHashIndex;
 	dlist_iter	iter;
+	dlist_head *lbucket;
 	CatCList   *cl;
 	CatCTup    *ct;
 	List	   *volatile ctlist;
@@ -1602,7 +1668,7 @@ SearchCatCacheList(CatCache *cache,
 	/*
 	 * one-time startup overhead for each cache
 	 */
-	if (cache->cc_tupdesc == NULL)
+	if (unlikely(cache->cc_tupdesc == NULL))
 		CatalogCacheInitializeCache(cache);
 
 	Assert(nkeys > 0 && nkeys < cache->cc_nkeys);
@@ -1618,11 +1684,36 @@ SearchCatCacheList(CatCache *cache,
 	arguments[3] = v4;
 
 	/*
-	 * compute a hash value of the given keys for faster search.  We don't
-	 * presently divide the CatCList items into buckets, but this still lets
-	 * us skip non-matching items quickly most of the time.
+	 * If we haven't previously done a list search in this cache, create the
+	 * bucket header array; otherwise, consider whether it's time to enlarge
+	 * it.
+	 */
+	if (cache->cc_lbucket == NULL)
+	{
+		/* Arbitrary initial size --- must be a power of 2 */
+		int			nbuckets = 16;
+
+		cache->cc_lbucket = (dlist_head *)
+			MemoryContextAllocZero(CacheMemoryContext,
+								   nbuckets * sizeof(dlist_head));
+		/* Don't set cc_nlbuckets if we get OOM allocating cc_lbucket */
+		cache->cc_nlbuckets = nbuckets;
+	}
+	else
+	{
+		/*
+		 * If the hash table has become too full, enlarge the buckets array.
+		 * Quite arbitrarily, we enlarge when fill factor > 2.
+		 */
+		if (cache->cc_nlist > cache->cc_nlbuckets * 2)
+			RehashCatCacheLists(cache);
+	}
+
+	/*
+	 * Find the hash bucket in which to look for the CatCList.
 	 */
 	lHashValue = CatalogCacheComputeHashValue(cache, nkeys, v1, v2, v3, v4);
+	lHashIndex = HASH_INDEX(lHashValue, cache->cc_nlbuckets);
 
 	/*
 	 * scan the items until we find a match or exhaust our list
@@ -1630,7 +1721,8 @@ SearchCatCacheList(CatCache *cache,
 	 * Note: it's okay to use dlist_foreach here, even though we modify the
 	 * dlist within the loop, because we don't continue the loop afterwards.
 	 */
-	dlist_foreach(iter, &cache->cc_lists)
+	lbucket = &cache->cc_lbucket[lHashIndex];
+	dlist_foreach(iter, lbucket)
 	{
 		cl = dlist_container(CatCList, cache_elem, iter.cur);
 
@@ -1650,13 +1742,13 @@ SearchCatCacheList(CatCache *cache,
 			continue;
 
 		/*
-		 * We found a matching list.  Move the list to the front of the
-		 * cache's list-of-lists, to speed subsequent searches.  (We do not
+		 * We found a matching list.  Move the list to the front of the list
+		 * for its hashbucket, so as to speed subsequent searches.  (We do not
 		 * move the members to the fronts of their hashbucket lists, however,
 		 * since there's no point in that unless they are searched for
 		 * individually.)
 		 */
-		dlist_move_head(&cache->cc_lists, &cl->cache_elem);
+		dlist_move_head(lbucket, &cl->cache_elem);
 
 		/* Bump the list's refcount and return it */
 		ResourceOwnerEnlarge(CurrentResourceOwner);
@@ -1868,7 +1960,12 @@ SearchCatCacheList(CatCache *cache,
 	}
 	Assert(i == nmembers);
 
-	dlist_push_head(&cache->cc_lists, &cl->cache_elem);
+	/*
+	 * Add the CatCList to the appropriate bucket, and count it.
+	 */
+	dlist_push_head(lbucket, &cl->cache_elem);
+
+	cache->cc_nlist++;
 
 	/* Finally, bump the list's refcount and return it */
 	cl->refcount++;
diff --git a/src/include/utils/catcache.h b/src/include/utils/catcache.h
index a75a528de8..3fb9647b87 100644
--- a/src/include/utils/catcache.h
+++ b/src/include/utils/catcache.h
@@ -51,9 +51,11 @@ typedef struct catcache
 	CCFastEqualFN cc_fastequal[CATCACHE_MAXKEYS];	/* fast equal function for
 													 * each key */
 	int			cc_keyno[CATCACHE_MAXKEYS]; /* AttrNumber of each key */
-	dlist_head	cc_lists;		/* list of CatCList structs */
-	int			cc_ntup;		/* # of tuples currently in this cache */
 	int			cc_nkeys;		/* # of keys (1..CATCACHE_MAXKEYS) */
+	int			cc_ntup;		/* # of tuples currently in this cache */
+	int			cc_nlist;		/* # of CatCLists currently in this cache */
+	int			cc_nlbuckets;	/* # of CatCList hash buckets in this cache */
+	dlist_head *cc_lbucket;		/* hash buckets for CatCLists */
 	const char *cc_relname;		/* name of relation the tuples come from */
 	Oid			cc_reloid;		/* OID of relation the tuples come from */
 	Oid			cc_indexoid;	/* OID of index matching cache keys */
#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#6)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

I wrote:

... I still see the problematic GRANT taking ~250ms, compared
to 5ms in v15. roles_is_member_of is clearly on the hook for that.

Ah: looks like that is mainly the fault of the list_append_unique_oid
calls in roles_is_member_of. That's also an O(N^2) cost of course,
though with a much smaller constant factor.

I don't think we have any really cheap way to de-duplicate the role
OIDs, especially seeing that it has to be done on-the-fly within the
collection loop, and the order of roles_list is at least potentially
interesting. Not sure how to make further progress without a lot of
work.

regards, tom lane

#8Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#7)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

On Thu, Mar 21, 2024 at 04:31:45PM -0400, Tom Lane wrote:

I wrote:

... I still see the problematic GRANT taking ~250ms, compared
to 5ms in v15. roles_is_member_of is clearly on the hook for that.

Ah: looks like that is mainly the fault of the list_append_unique_oid
calls in roles_is_member_of. That's also an O(N^2) cost of course,
though with a much smaller constant factor.

I don't think we have any really cheap way to de-duplicate the role
OIDs, especially seeing that it has to be done on-the-fly within the
collection loop, and the order of roles_list is at least potentially
interesting. Not sure how to make further progress without a lot of
work.

Assuming these are larger lists, this might benefit from optimizations
involving SIMD intrinsics. I looked into that a while ago [0]/messages/by-id/20230308002502.GA3378449@nathanxps13, but the
effort was abandoned because we didn't have any concrete use-cases at the
time. (I'm looking into some additional optimizations in a separate thread
[1]: /messages/by-id/20240321183823.GA1800896@nathanxps13

[0]: /messages/by-id/20230308002502.GA3378449@nathanxps13
[1]: /messages/by-id/20240321183823.GA1800896@nathanxps13

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#9Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#8)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

On Thu, Mar 21, 2024 at 03:40:12PM -0500, Nathan Bossart wrote:

On Thu, Mar 21, 2024 at 04:31:45PM -0400, Tom Lane wrote:

I don't think we have any really cheap way to de-duplicate the role
OIDs, especially seeing that it has to be done on-the-fly within the
collection loop, and the order of roles_list is at least potentially
interesting. Not sure how to make further progress without a lot of
work.

Assuming these are larger lists, this might benefit from optimizations
involving SIMD intrinsics. I looked into that a while ago [0], but the
effort was abandoned because we didn't have any concrete use-cases at the
time. (I'm looking into some additional optimizations in a separate thread
[1] that would likely apply here, too.)

Never mind. With the reproduction script, I'm only seeing a ~2%
improvement with my patches.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#9)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

Nathan Bossart <nathandbossart@gmail.com> writes:

On Thu, Mar 21, 2024 at 03:40:12PM -0500, Nathan Bossart wrote:

On Thu, Mar 21, 2024 at 04:31:45PM -0400, Tom Lane wrote:

I don't think we have any really cheap way to de-duplicate the role
OIDs, especially seeing that it has to be done on-the-fly within the
collection loop, and the order of roles_list is at least potentially
interesting. Not sure how to make further progress without a lot of
work.

Assuming these are larger lists, this might benefit from optimizations
involving SIMD intrinsics.

Never mind. With the reproduction script, I'm only seeing a ~2%
improvement with my patches.

Yeah, you cannot beat an O(N^2) problem by throwing SIMD at it.

However ... I just remembered that we have a Bloom filter implementation
in core now (src/backend/lib/bloomfilter.c). How about using that
to quickly reject (hopefully) most role OIDs, and only do the
list_member_oid check if the filter passes?

regards, tom lane

#11Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#10)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

On Thu, Mar 21, 2024 at 08:59:54PM -0400, Tom Lane wrote:

However ... I just remembered that we have a Bloom filter implementation
in core now (src/backend/lib/bloomfilter.c). How about using that
to quickly reject (hopefully) most role OIDs, and only do the
list_member_oid check if the filter passes?

Seems worth a try. I've been looking for an excuse to use that...

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#12Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#11)
1 attachment(s)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

On Thu, Mar 21, 2024 at 08:03:32PM -0500, Nathan Bossart wrote:

On Thu, Mar 21, 2024 at 08:59:54PM -0400, Tom Lane wrote:

However ... I just remembered that we have a Bloom filter implementation
in core now (src/backend/lib/bloomfilter.c). How about using that
to quickly reject (hopefully) most role OIDs, and only do the
list_member_oid check if the filter passes?

Seems worth a try. I've been looking for an excuse to use that...

The Bloom filter appears to help a bit, although it regresses the
create-roles.sql portion of the test. I'm assuming that's thanks to all
the extra pallocs and pfrees, which are probably avoidable if we store the
filter in a long-lived context and just clear it at the beginning of each
call to roles_is_member_of().

HEAD hash hash+bloom
create 2.02 2.06 2.92
grant 4.63 0.27 0.08

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachments:

bloom.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index cf5d08576a..445a14646c 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -36,6 +36,7 @@
 #include "common/hashfn.h"
 #include "foreign/foreign.h"
 #include "funcapi.h"
+#include "lib/bloomfilter.h"
 #include "lib/qunique.h"
 #include "miscadmin.h"
 #include "utils/acl.h"
@@ -4946,6 +4947,7 @@ roles_is_member_of(Oid roleid, enum RoleRecurseType type,
 	ListCell   *l;
 	List	   *new_cached_roles;
 	MemoryContext oldctx;
+	bloom_filter *bf;
 
 	Assert(OidIsValid(admin_of) == PointerIsValid(admin_role));
 	if (admin_role != NULL)
@@ -4987,6 +4989,9 @@ roles_is_member_of(Oid roleid, enum RoleRecurseType type,
 	 */
 	roles_list = list_make1_oid(roleid);
 
+	bf = bloom_create(1024, work_mem, 0);
+	bloom_add_element(bf, (unsigned char *) &roleid, sizeof(Oid));
+
 	foreach(l, roles_list)
 	{
 		Oid			memberid = lfirst_oid(l);
@@ -5023,16 +5028,29 @@ roles_is_member_of(Oid roleid, enum RoleRecurseType type,
 			 * graph, we must test for having already seen this role. It is
 			 * legal for instance to have both A->B and A->C->B.
 			 */
-			roles_list = list_append_unique_oid(roles_list, otherid);
+			if (bloom_lacks_element(bf, (unsigned char *) &otherid, sizeof(Oid)))
+				roles_list = lappend_oid(roles_list, otherid);
+			else
+				roles_list = list_append_unique_oid(roles_list, otherid);
+			bloom_add_element(bf, (unsigned char *) &otherid, sizeof(Oid));
 		}
 		ReleaseSysCacheList(memlist);
 
 		/* implement pg_database_owner implicit membership */
 		if (memberid == dba && OidIsValid(dba))
-			roles_list = list_append_unique_oid(roles_list,
-												ROLE_PG_DATABASE_OWNER);
+		{
+			Oid dbowner = ROLE_PG_DATABASE_OWNER;
+
+			if (bloom_lacks_element(bf, (unsigned char *) &dbowner, sizeof(Oid)))
+				roles_list = lappend_oid(roles_list, ROLE_PG_DATABASE_OWNER);
+			else
+				roles_list = list_append_unique_oid(roles_list, ROLE_PG_DATABASE_OWNER);
+			bloom_add_element(bf, (unsigned char *) &dbowner, sizeof(Oid));
+		}
 	}
 
+	bloom_free(bf);
+
 	/*
 	 * Copy the completed list into TopMemoryContext so it will persist.
 	 */
#13alex work
alexwork033@gmail.com
In reply to: Tom Lane (#10)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

First of all thank you for looking into this.

At the moment we workaround the problem by altering `acc` ROLE into a SUPERUSER
in PostgreSQL 16 instances. It sidestep the problem and having the lowest cost
to implement for us. While at first we think this feels like opening a security
hole, it does not introduce side effects for **our use case** by the way our
application make use of this `acc` ROLE.

Of course we cannot recommend the workaround we took to others having similar
situation.

Show quoted text

On Fri, Mar 22, 2024 at 7:59 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Nathan Bossart <nathandbossart@gmail.com> writes:

On Thu, Mar 21, 2024 at 03:40:12PM -0500, Nathan Bossart wrote:

On Thu, Mar 21, 2024 at 04:31:45PM -0400, Tom Lane wrote:

I don't think we have any really cheap way to de-duplicate the role
OIDs, especially seeing that it has to be done on-the-fly within the
collection loop, and the order of roles_list is at least potentially
interesting. Not sure how to make further progress without a lot of
work.

Assuming these are larger lists, this might benefit from optimizations
involving SIMD intrinsics.

Never mind. With the reproduction script, I'm only seeing a ~2%
improvement with my patches.

Yeah, you cannot beat an O(N^2) problem by throwing SIMD at it.

However ... I just remembered that we have a Bloom filter implementation
in core now (src/backend/lib/bloomfilter.c). How about using that
to quickly reject (hopefully) most role OIDs, and only do the
list_member_oid check if the filter passes?

regards, tom lane

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#12)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

Nathan Bossart <nathandbossart@gmail.com> writes:

The Bloom filter appears to help a bit, although it regresses the
create-roles.sql portion of the test. I'm assuming that's thanks to all
the extra pallocs and pfrees, which are probably avoidable if we store the
filter in a long-lived context and just clear it at the beginning of each
call to roles_is_member_of().

The zero-fill to reinitialize the filter probably costs a good deal
all by itself, considering we're talking about at least a megabyte.
Maybe a better idea is to not enable the filter till we're dealing
with at least 1000 or so entries in roles_list, though obviously that
will complicate the logic.

+            if (bloom_lacks_element(bf, (unsigned char *) &otherid, sizeof(Oid)))
+                roles_list = lappend_oid(roles_list, otherid);
+            else
+                roles_list = list_append_unique_oid(roles_list, otherid);
+            bloom_add_element(bf, (unsigned char *) &otherid, sizeof(Oid));

Hmm, I was imagining something more like

if (bloom_lacks_element(bf, (unsigned char *) &otherid, sizeof(Oid)) ||
!list_member_oid(roles_list, otherid))
{
roles_list = lappend_oid(roles_list, otherid);
bloom_add_element(bf, (unsigned char *) &otherid, sizeof(Oid));
}

to avoid duplicate bloom_add_element calls. Probably wouldn't move
the needle much in this specific test case, but this formulation
would simplify letting the filter kick in later. Very roughly,

if ((bf && bloom_lacks_element(bf, (unsigned char *) &otherid, sizeof(Oid))) ||
!list_member_oid(roles_list, otherid))
{
if (bf == NULL && list_length(roles_list) > 1000)
{
... create bf and populate with existing list entries
}
roles_list = lappend_oid(roles_list, otherid);
if (bf)
bloom_add_element(bf, (unsigned char *) &otherid, sizeof(Oid));
}

regards, tom lane

#15Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#10)
1 attachment(s)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

On Thu, Mar 21, 2024 at 08:59:54PM -0400, Tom Lane wrote:

Nathan Bossart <nathandbossart@gmail.com> writes:

On Thu, Mar 21, 2024 at 03:40:12PM -0500, Nathan Bossart wrote:

On Thu, Mar 21, 2024 at 04:31:45PM -0400, Tom Lane wrote:

I don't think we have any really cheap way to de-duplicate the role
OIDs, especially seeing that it has to be done on-the-fly within the
collection loop, and the order of roles_list is at least potentially
interesting. Not sure how to make further progress without a lot of
work.

Assuming these are larger lists, this might benefit from optimizations
involving SIMD intrinsics.

Never mind. With the reproduction script, I'm only seeing a ~2%
improvement with my patches.

Yeah, you cannot beat an O(N^2) problem by throwing SIMD at it.

I apparently had some sort of major brain fade when I did this because I
didn't apply your hashing patch when I ran this SIMD test. With it
applied, I see a speedup of ~39%, which makes a whole lot more sense to me.
If I add the Bloom patch (updated with your suggestions), I get another
~73% improvement from there, and a much smaller regression in the role
creation portion.

hash hash+simd hash+simd+bloom
create 1.27 1.27 1.28
grant 0.18 0.11 0.03

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachments:

bloom_v2.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index cf5d08576a..4952dc9c1e 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -36,6 +36,7 @@
 #include "common/hashfn.h"
 #include "foreign/foreign.h"
 #include "funcapi.h"
+#include "lib/bloomfilter.h"
 #include "lib/qunique.h"
 #include "miscadmin.h"
 #include "utils/acl.h"
@@ -4946,6 +4947,7 @@ roles_is_member_of(Oid roleid, enum RoleRecurseType type,
 	ListCell   *l;
 	List	   *new_cached_roles;
 	MemoryContext oldctx;
+	bloom_filter *bf = NULL;
 
 	Assert(OidIsValid(admin_of) == PointerIsValid(admin_role));
 	if (admin_role != NULL)
@@ -5023,16 +5025,46 @@ roles_is_member_of(Oid roleid, enum RoleRecurseType type,
 			 * graph, we must test for having already seen this role. It is
 			 * legal for instance to have both A->B and A->C->B.
 			 */
-			roles_list = list_append_unique_oid(roles_list, otherid);
+			if ((bf && bloom_lacks_element(bf, (unsigned char *) &otherid, sizeof(Oid))) ||
+				!list_member_oid(roles_list, otherid))
+			{
+				if (bf == NULL && list_length(roles_list) > 1000)
+				{
+					bf = bloom_create(1024 * 1024, work_mem, 0);
+					foreach_oid(role, roles_list)
+						bloom_add_element(bf, (unsigned char *) &role, sizeof(Oid));
+				}
+				roles_list = lappend_oid(roles_list, otherid);
+				if (bf)
+					bloom_add_element(bf, (unsigned char *) &otherid, sizeof(Oid));
+			}
 		}
 		ReleaseSysCacheList(memlist);
 
 		/* implement pg_database_owner implicit membership */
 		if (memberid == dba && OidIsValid(dba))
-			roles_list = list_append_unique_oid(roles_list,
-												ROLE_PG_DATABASE_OWNER);
+		{
+			Oid			dbowner = ROLE_PG_DATABASE_OWNER;
+
+			if ((bf && bloom_lacks_element(bf, (unsigned char *) &dbowner, sizeof(Oid))) ||
+				!list_member_oid(roles_list, dbowner))
+			{
+				if (bf == NULL && list_length(roles_list) > 1000)
+				{
+					bf = bloom_create(1024 * 1024, work_mem, 0);
+					foreach_oid(role, roles_list)
+						bloom_add_element(bf, (unsigned char *) &role, sizeof(Oid));
+				}
+				roles_list = lappend_oid(roles_list, dbowner);
+				if (bf)
+					bloom_add_element(bf, (unsigned char *) &dbowner, sizeof(Oid));
+			}
+		}
 	}
 
+	if (bf)
+		bloom_free(bf);
+
 	/*
 	 * Copy the completed list into TopMemoryContext so it will persist.
 	 */
#16Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#15)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

On Fri, Mar 22, 2024 at 09:47:39AM -0500, Nathan Bossart wrote:

hash hash+simd hash+simd+bloom
create 1.27 1.27 1.28
grant 0.18 0.11 0.03

For just hash+bloom, I'm seeing 1.29 and 0.04.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#16)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

Nathan Bossart <nathandbossart@gmail.com> writes:

On Fri, Mar 22, 2024 at 09:47:39AM -0500, Nathan Bossart wrote:

hash hash+simd hash+simd+bloom
create 1.27 1.27 1.28
grant 0.18 0.11 0.03

For just hash+bloom, I'm seeing 1.29 and 0.04.

Yeah, that's about what I'd expect: hash+bloom ought to remove
most (not quite all) of the opportunity for simd to shine, because
the bloom filter should eliminate most of the list_member_oid calls.

Possibly we could fix that small regression in the create phase
with more careful tuning of the magic constants in the bloom
logic? Although I'd kind of expect that the create step doesn't
ever invoke the bloom filter, else it would have been showing a
performance problem before; so this might not be a good test case
for helping us tune those.

I think remaining questions are:

* Is there any case where the new hash catcache logic could lose
measurably? I kind of doubt it, because we were already computing
the hash value for list searches; so basically the only overhead
is one more palloc per cache during the first list search. (If
you accumulate enough lists to cause a rehash, you're almost
surely well into winning territory.)

* Same question for the bloom logic, but here I think it's mostly
a matter of tuning those constants.

* Do we want to risk back-patching any of this, to fix the performance
regression in v16? I think that the OP's situation is a pretty
narrow one, but maybe he's not the only person who managed to dodge
roles_is_member_of's performance issues in most other cases.

regards, tom lane

#18Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#17)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

On Fri, Mar 22, 2024 at 11:27:46AM -0400, Tom Lane wrote:

Yeah, that's about what I'd expect: hash+bloom ought to remove
most (not quite all) of the opportunity for simd to shine, because
the bloom filter should eliminate most of the list_member_oid calls.

Right. IMHO the SIMD work is still worth considering because there are
probably even more extreme cases where it'll make a decent amount of
difference. Plus, that stuff is pretty low overhead for what you get in
return. That being said, the hash table and Bloom filter should definitely
be the higher priority.

* Same question for the bloom logic, but here I think it's mostly
a matter of tuning those constants.

I suspect there might be some regressions just after the point where we
construct the filter, but I'd still expect that to be a reasonable
trade-off. We could probably pretty easily construct some benchmarks to
understand the impact with a given number of roles. (I'm not sure I'll be
able to get to that today.)

* Do we want to risk back-patching any of this, to fix the performance
regression in v16? I think that the OP's situation is a pretty
narrow one, but maybe he's not the only person who managed to dodge
roles_is_member_of's performance issues in most other cases.

I've heard complaints about performance with many roles before, so I
certainly think this area is worth optimizing. As far as back-patching
goes, my current feeling is that the hash table is probably pretty safe and
provides the majority of the benefit, but anything fancier should probably
be reserved for v17 or v18.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#18)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

Nathan Bossart <nathandbossart@gmail.com> writes:

On Fri, Mar 22, 2024 at 11:27:46AM -0400, Tom Lane wrote:

* Do we want to risk back-patching any of this, to fix the performance
regression in v16? I think that the OP's situation is a pretty
narrow one, but maybe he's not the only person who managed to dodge
roles_is_member_of's performance issues in most other cases.

I've heard complaints about performance with many roles before, so I
certainly think this area is worth optimizing. As far as back-patching
goes, my current feeling is that the hash table is probably pretty safe and
provides the majority of the benefit, but anything fancier should probably
be reserved for v17 or v18.

Yeah. Although both the catcache and list_append_unique_oid bits
are O(N^2), the catcache seems to have a much bigger constant
factor --- when I did a "perf" check on the unpatched code,
I saw catcache eating over 90% of the runtime and list_member_oid
about 2%. So let's fix that part in v16 and call it a day.
It should be safe to back-patch the catcache changes as long as
we put the new fields at the end of the struct and leave cc_lists
present but empty.

Would you like to review the catcache patch further, or do you
think it's good to go?

regards, tom lane

#20Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#19)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

On Fri, Mar 22, 2024 at 12:53:15PM -0400, Tom Lane wrote:

Would you like to review the catcache patch further, or do you
think it's good to go?

I'll take another look this afternoon.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#21Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#20)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

On Fri, Mar 22, 2024 at 11:54:48AM -0500, Nathan Bossart wrote:

On Fri, Mar 22, 2024 at 12:53:15PM -0400, Tom Lane wrote:

Would you like to review the catcache patch further, or do you
think it's good to go?

I'll take another look this afternoon.

LGTM

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#21)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

Nathan Bossart <nathandbossart@gmail.com> writes:

On Fri, Mar 22, 2024 at 11:54:48AM -0500, Nathan Bossart wrote:

On Fri, Mar 22, 2024 at 12:53:15PM -0400, Tom Lane wrote:

Would you like to review the catcache patch further, or do you
think it's good to go?

I'll take another look this afternoon.

LGTM

Thanks for looking, I'll push that shortly.

regards, tom lane

#23Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#22)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

On Fri, Mar 22, 2024 at 04:41:49PM -0400, Tom Lane wrote:

Nathan Bossart <nathandbossart@gmail.com> writes:

LGTM

Thanks for looking, I'll push that shortly.

Are there any changes you'd like to see for the Bloom patch [0]/messages/by-id/attachment/158079/bloom_v2.patch? I'd like
to see about getting that committed for v17. One thing that crossed my
mind is creating a combined list/filter that transparently created a filter
when necessary (for reuse elsewhere), but I'm not sure that's v17 material.

[0]: /messages/by-id/attachment/158079/bloom_v2.patch

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#23)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

Nathan Bossart <nathandbossart@gmail.com> writes:

Are there any changes you'd like to see for the Bloom patch [0]? I'd like
to see about getting that committed for v17. One thing that crossed my
mind is creating a combined list/filter that transparently created a filter
when necessary (for reuse elsewhere), but I'm not sure that's v17 material.

Yeah, that thought occurred to me too, but I think we ought to have a
few more use-cases in view before trying to write an API.

As for the patch, I agree it could go into v17, but I think there is
still a little bit of work to do:

* The magic constants (crossover list length and bloom filter size)
need some testing to see if there are better values. They should
probably be made into named #defines, too. I suspect, with little
proof, that the bloom filter size isn't particularly critical --- but
I know we pulled the crossover of 1000 out of thin air, and I have
no certainty that it's even within an order of magnitude of being a
good choice.

* Code needs more than zero comments.

* Is it worth trying to make a subroutine, or at least a macro,
so as not to have 2 copies of the code?

regards, tom lane

#25Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#24)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

On Mon, Mar 25, 2024 at 11:08:39AM -0400, Tom Lane wrote:

* The magic constants (crossover list length and bloom filter size)
need some testing to see if there are better values. They should
probably be made into named #defines, too. I suspect, with little
proof, that the bloom filter size isn't particularly critical --- but
I know we pulled the crossover of 1000 out of thin air, and I have
no certainty that it's even within an order of magnitude of being a
good choice.

I'll try to construct a couple of tests to see if we can determine a proper
order of magnitude.

* Code needs more than zero comments.

Yup.

* Is it worth trying to make a subroutine, or at least a macro,
so as not to have 2 copies of the code?

I think so. I'll try that in the next version.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#26Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#25)
1 attachment(s)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

Here is a new version of the patch that I feel is in decent shape.

On Mon, Mar 25, 2024 at 10:16:47AM -0500, Nathan Bossart wrote:

On Mon, Mar 25, 2024 at 11:08:39AM -0400, Tom Lane wrote:

* The magic constants (crossover list length and bloom filter size)
need some testing to see if there are better values. They should
probably be made into named #defines, too. I suspect, with little
proof, that the bloom filter size isn't particularly critical --- but
I know we pulled the crossover of 1000 out of thin air, and I have
no certainty that it's even within an order of magnitude of being a
good choice.

I'll try to construct a couple of tests to see if we can determine a proper
order of magnitude.

I spent some time trying to get some ballpark figures but have thus far
been unsuccessful. Even if I was able to get good numbers, I'm not sure
how much they'd help us, as we'll still need to decide how much overhead we
are willing to take in comparison to the linear search. I don't think
~1000 is an unreasonable starting point, as it seems generally more likely
that you will have many more roles to process at that point than if the
threshold was, say, 100. And if the threshold is too high (e.g., 10,000),
this optimization will only kick in for the most extreme cases, so we'd
likely be leaving a lot on the table. But, I will be the first to admit
that my reasoning here is pretty unscientific, and I'm open to suggestions
for how to make it less so.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachments:

v3-0001-Optimize-roles_is_member_of-with-a-Bloom-filter.patchtext/x-diff; charset=us-asciiDownload
From 95fe6e811990895acb9f79544f502408ac472203 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Mon, 25 Mar 2024 23:17:08 -0500
Subject: [PATCH v3 1/1] Optimize roles_is_member_of() with a Bloom filter.

When the list of roles gathered by roles_is_member_of() grows very
large, a Bloom filter is created to help avoid some linear searches
through the list.  The threshold for creating the Bloom filter is
set arbitrarily high and may require future adjustment.

Suggested-by: Tom Lane
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CAGvXd3OSMbJQwOSc-Tq-Ro1CAz%3DvggErdSG7pv2s6vmmTOLJSg%40mail.gmail.com
---
 src/backend/utils/adt/acl.c | 71 +++++++++++++++++++++++++++++++++++--
 1 file changed, 68 insertions(+), 3 deletions(-)

diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index cf5d08576a..7c4b642ebd 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -36,6 +36,7 @@
 #include "common/hashfn.h"
 #include "foreign/foreign.h"
 #include "funcapi.h"
+#include "lib/bloomfilter.h"
 #include "lib/qunique.h"
 #include "miscadmin.h"
 #include "utils/acl.h"
@@ -78,6 +79,17 @@ static Oid	cached_role[] = {InvalidOid, InvalidOid, InvalidOid};
 static List *cached_roles[] = {NIL, NIL, NIL};
 static uint32 cached_db_hash;
 
+/*
+ * If the list of roles gathered by roles_is_member_of() grows larger than the
+ * below threshold, a Bloom filter is created to speed up list membership
+ * checks.  This threshold is set arbitrarily high to avoid the overhead of
+ * creating the Bloom filter until it seems likely to provide a net benefit.
+ *
+ * XXX: The current threshold of 1024 is little more than a wild guess and may
+ * need to be adjusted in the future.
+ */
+#define ROLES_LIST_BLOOM_THRESHOLD 1024
+static bloom_filter *roles_list_bf = NULL;
 
 static const char *getid(const char *s, char *n, Node *escontext);
 static void putid(char *p, const char *s);
@@ -4918,6 +4930,54 @@ RoleMembershipCacheCallback(Datum arg, int cacheid, uint32 hashvalue)
 	cached_role[ROLERECURSE_SETROLE] = InvalidOid;
 }
 
+/*
+ * A helper function for roles_is_member_of() that provides an optimized
+ * implementation of list_append_unique_oid() via a Bloom filter.  The caller
+ * (i.e., roles_is_member_of()) is responsible for freeing roles_list_bf once
+ * it is done using this function.
+ */
+static inline List *
+roles_list_append(List *roles_list, Oid role)
+{
+	unsigned char *roleptr = (unsigned char *) &role;
+
+	/*
+	 * If there is a previously-created Bloom filter, use it to determine
+	 * whether the role is missing from the list.  Otherwise, do an ordinary
+	 * linear search through the existing role list.
+	 */
+	if ((roles_list_bf &&
+		 bloom_lacks_element(roles_list_bf, roleptr, sizeof(Oid))) ||
+		!list_member_oid(roles_list, role))
+	{
+		/*
+		 * If the list is large, we take on the overhead of creating and
+		 * populating a Bloom filter to speed up future calls to this
+		 * function.
+		 */
+		if (!roles_list_bf &&
+			list_length(roles_list) > ROLES_LIST_BLOOM_THRESHOLD)
+		{
+			roles_list_bf = bloom_create(ROLES_LIST_BLOOM_THRESHOLD * 10,
+										 work_mem, 0);
+			foreach_oid(roleid, roles_list)
+				bloom_add_element(roles_list_bf,
+								  (unsigned char *) &roleid,
+								  sizeof(Oid));
+		}
+
+		/*
+		 * Finally, add the role to the list and the Bloom filter, if it
+		 * exists.
+		 */
+		roles_list = lappend_oid(roles_list, role);
+		if (roles_list_bf)
+			bloom_add_element(roles_list_bf, roleptr, sizeof(Oid));
+	}
+
+	return roles_list;
+}
+
 /*
  * Get a list of roles that the specified roleid is a member of
  *
@@ -5023,16 +5083,21 @@ roles_is_member_of(Oid roleid, enum RoleRecurseType type,
 			 * graph, we must test for having already seen this role. It is
 			 * legal for instance to have both A->B and A->C->B.
 			 */
-			roles_list = list_append_unique_oid(roles_list, otherid);
+			roles_list = roles_list_append(roles_list, otherid);
 		}
 		ReleaseSysCacheList(memlist);
 
 		/* implement pg_database_owner implicit membership */
 		if (memberid == dba && OidIsValid(dba))
-			roles_list = list_append_unique_oid(roles_list,
-												ROLE_PG_DATABASE_OWNER);
+			roles_list = roles_list_append(roles_list, ROLE_PG_DATABASE_OWNER);
 	}
 
+	/*
+	 * Free the Bloom filter created by roles_list_append(), if there is one.
+	 */
+	if (roles_list_bf)
+		bloom_free(roles_list_bf);
+
 	/*
 	 * Copy the completed list into TopMemoryContext so it will persist.
 	 */
-- 
2.25.1

#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#26)
1 attachment(s)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

Nathan Bossart <nathandbossart@gmail.com> writes:

I spent some time trying to get some ballpark figures but have thus far
been unsuccessful. Even if I was able to get good numbers, I'm not sure
how much they'd help us, as we'll still need to decide how much overhead we
are willing to take in comparison to the linear search. I don't think
~1000 is an unreasonable starting point, as it seems generally more likely
that you will have many more roles to process at that point than if the
threshold was, say, 100. And if the threshold is too high (e.g., 10,000),
this optimization will only kick in for the most extreme cases, so we'd
likely be leaving a lot on the table. But, I will be the first to admit
that my reasoning here is pretty unscientific, and I'm open to suggestions
for how to make it less so.

I did a little experimentation using the attached quick-hack C
function, and came to the conclusion that setting up the bloom filter
costs more or less as much as inserting 1000 or so OIDs the dumb way.
So we definitely want a threshold that's not much less than that.
For example, with ROLES_LIST_BLOOM_THRESHOLD = 100 I saw:

regression=# select drive_bloom(100, 10, 100000);
drive_bloom
-------------

(1 row)

Time: 319.931 ms
regression=# select drive_bloom(101, 10, 100000);
drive_bloom
-------------

(1 row)

Time: 319.385 ms
regression=# select drive_bloom(102, 10, 100000);
drive_bloom
-------------

(1 row)

Time: 9904.786 ms (00:09.905)

That's a pretty big jump in context. With the threshold set to 1024,

regression=# select drive_bloom(1024, 10, 100000);
drive_bloom
-------------

(1 row)

Time: 14597.510 ms (00:14.598)
regression=# select drive_bloom(1025, 10, 100000);
drive_bloom
-------------

(1 row)

Time: 14589.197 ms (00:14.589)
regression=# select drive_bloom(1026, 10, 100000);
drive_bloom
-------------

(1 row)

Time: 25947.000 ms (00:25.947)
regression=# select drive_bloom(1027, 10, 100000);
drive_bloom
-------------

(1 row)

Time: 25399.718 ms (00:25.400)
regression=# select drive_bloom(2048, 10, 100000);
drive_bloom
-------------

(1 row)

Time: 33809.536 ms (00:33.810)

So I'm now content with choosing a threshold of 1000 or 1024 or so.

As for the bloom filter size, I see that bloom_create does

bitset_bytes = Min(bloom_work_mem * UINT64CONST(1024), total_elems * 2);
bitset_bytes = Max(1024 * 1024, bitset_bytes);

which means that any total_elems input less than 512K is disregarded
altogether. So I'm not sold on your "ROLES_LIST_BLOOM_THRESHOLD * 10"
value. Maybe it doesn't matter though.

I do not like, even a little bit, your use of a static variable to
hold the bloom filter pointer. That code will misbehave horribly
if we throw an error partway through the role-accumulation loop;
the next call will try to carry on using the old filter, which would
be wrong even if it still existed which it likely won't. It's not
that much worse notationally to keep it as a local variable, as I
did in the attached.

regards, tom lane

Attachments:

drive_bloom.ctext/x-c; charset=us-ascii; name=drive_bloom.cDownload
#28Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#27)
1 attachment(s)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

On Tue, Mar 26, 2024 at 02:16:03PM -0400, Tom Lane wrote:

I did a little experimentation using the attached quick-hack C
function, and came to the conclusion that setting up the bloom filter
costs more or less as much as inserting 1000 or so OIDs the dumb way.
So we definitely want a threshold that's not much less than that.

Thanks for doing this.

So I'm now content with choosing a threshold of 1000 or 1024 or so.

Cool.

As for the bloom filter size, I see that bloom_create does

bitset_bytes = Min(bloom_work_mem * UINT64CONST(1024), total_elems * 2);
bitset_bytes = Max(1024 * 1024, bitset_bytes);

which means that any total_elems input less than 512K is disregarded
altogether. So I'm not sold on your "ROLES_LIST_BLOOM_THRESHOLD * 10"
value. Maybe it doesn't matter though.

Yeah, I wasn't sure how much to worry about this. I figured that we might
as well set it to a reasonable estimate based on the description of the
parameter. This description claims that the filter should work well if
this is off by a factor of 5 or more, and 50x the threshold sounded like it
ought to be good enough for anyone, so that's how I landed on 10x. But as
you point out, this value will be disregarded altogether, and it will
continue to be ignored unless the filter implementation changes, which
seems unlikely. If you have a different value in mind that you would
rather use, I'm fine with changing it.

I do not like, even a little bit, your use of a static variable to
hold the bloom filter pointer. That code will misbehave horribly
if we throw an error partway through the role-accumulation loop;
the next call will try to carry on using the old filter, which would
be wrong even if it still existed which it likely won't. It's not
that much worse notationally to keep it as a local variable, as I
did in the attached.

Ah, yes, that's no good. I fixed this in the new version.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachments:

v4-0001-Optimize-roles_is_member_of-with-a-Bloom-filter.patchtext/x-diff; charset=us-asciiDownload
From e5618bb6f1d8b21d527bfda5b49f75d12bf31e9e Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Mon, 25 Mar 2024 23:17:08 -0500
Subject: [PATCH v4 1/1] Optimize roles_is_member_of() with a Bloom filter.

When the list of roles gathered by roles_is_member_of() grows very
large, a Bloom filter is created to help avoid some linear searches
through the list.  The threshold for creating the Bloom filter is
set arbitrarily high and may require future adjustment.

Suggested-by: Tom Lane
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CAGvXd3OSMbJQwOSc-Tq-Ro1CAz%3DvggErdSG7pv2s6vmmTOLJSg%40mail.gmail.com
---
 src/backend/utils/adt/acl.c | 65 +++++++++++++++++++++++++++++++++++--
 1 file changed, 62 insertions(+), 3 deletions(-)

diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index cf5d08576a..39197613c2 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -36,6 +36,7 @@
 #include "common/hashfn.h"
 #include "foreign/foreign.h"
 #include "funcapi.h"
+#include "lib/bloomfilter.h"
 #include "lib/qunique.h"
 #include "miscadmin.h"
 #include "utils/acl.h"
@@ -78,6 +79,13 @@ static Oid	cached_role[] = {InvalidOid, InvalidOid, InvalidOid};
 static List *cached_roles[] = {NIL, NIL, NIL};
 static uint32 cached_db_hash;
 
+/*
+ * If the list of roles gathered by roles_is_member_of() grows larger than the
+ * below threshold, a Bloom filter is created to speed up list membership
+ * checks.  This threshold is set arbitrarily high to avoid the overhead of
+ * creating the Bloom filter until it seems likely to provide a net benefit.
+ */
+#define ROLES_LIST_BLOOM_THRESHOLD 1024
 
 static const char *getid(const char *s, char *n, Node *escontext);
 static void putid(char *p, const char *s);
@@ -4918,6 +4926,50 @@ RoleMembershipCacheCallback(Datum arg, int cacheid, uint32 hashvalue)
 	cached_role[ROLERECURSE_SETROLE] = InvalidOid;
 }
 
+/*
+ * A helper function for roles_is_member_of() that provides an optimized
+ * implementation of list_append_unique_oid() via a Bloom filter.  The caller
+ * (i.e., roles_is_member_of()) is responsible for freeing bf once it is done
+ * using this function.
+ */
+static inline List *
+roles_list_append(List *roles_list, bloom_filter **bf, Oid role)
+{
+	unsigned char *roleptr = (unsigned char *) &role;
+
+	/*
+	 * If there is a previously-created Bloom filter, use it to determine
+	 * whether the role is missing from the list.  Otherwise, do an ordinary
+	 * linear search through the existing role list.
+	 */
+	if ((*bf && bloom_lacks_element(*bf, roleptr, sizeof(Oid))) ||
+		!list_member_oid(roles_list, role))
+	{
+		/*
+		 * If the list is large, we take on the overhead of creating and
+		 * populating a Bloom filter to speed up future calls to this
+		 * function.
+		 */
+		if (*bf == NULL &&
+			list_length(roles_list) > ROLES_LIST_BLOOM_THRESHOLD)
+		{
+			*bf = bloom_create(ROLES_LIST_BLOOM_THRESHOLD * 10, work_mem, 0);
+			foreach_oid(roleid, roles_list)
+				bloom_add_element(*bf, (unsigned char *) &roleid, sizeof(Oid));
+		}
+
+		/*
+		 * Finally, add the role to the list and the Bloom filter, if it
+		 * exists.
+		 */
+		roles_list = lappend_oid(roles_list, role);
+		if (*bf)
+			bloom_add_element(*bf, roleptr, sizeof(Oid));
+	}
+
+	return roles_list;
+}
+
 /*
  * Get a list of roles that the specified roleid is a member of
  *
@@ -4946,6 +4998,7 @@ roles_is_member_of(Oid roleid, enum RoleRecurseType type,
 	ListCell   *l;
 	List	   *new_cached_roles;
 	MemoryContext oldctx;
+	bloom_filter *bf = NULL;
 
 	Assert(OidIsValid(admin_of) == PointerIsValid(admin_role));
 	if (admin_role != NULL)
@@ -5023,16 +5076,22 @@ roles_is_member_of(Oid roleid, enum RoleRecurseType type,
 			 * graph, we must test for having already seen this role. It is
 			 * legal for instance to have both A->B and A->C->B.
 			 */
-			roles_list = list_append_unique_oid(roles_list, otherid);
+			roles_list = roles_list_append(roles_list, &bf, otherid);
 		}
 		ReleaseSysCacheList(memlist);
 
 		/* implement pg_database_owner implicit membership */
 		if (memberid == dba && OidIsValid(dba))
-			roles_list = list_append_unique_oid(roles_list,
-												ROLE_PG_DATABASE_OWNER);
+			roles_list = roles_list_append(roles_list, &bf,
+										   ROLE_PG_DATABASE_OWNER);
 	}
 
+	/*
+	 * Free the Bloom filter created by roles_list_append(), if there is one.
+	 */
+	if (bf)
+		bloom_free(bf);
+
 	/*
 	 * Copy the completed list into TopMemoryContext so it will persist.
 	 */
-- 
2.25.1

#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#28)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

Nathan Bossart <nathandbossart@gmail.com> writes:

On Tue, Mar 26, 2024 at 02:16:03PM -0400, Tom Lane wrote:

... I'm not sold on your "ROLES_LIST_BLOOM_THRESHOLD * 10"
value. Maybe it doesn't matter though.

Yeah, I wasn't sure how much to worry about this. I figured that we might
as well set it to a reasonable estimate based on the description of the
parameter. This description claims that the filter should work well if
this is off by a factor of 5 or more, and 50x the threshold sounded like it
ought to be good enough for anyone, so that's how I landed on 10x. But as
you point out, this value will be disregarded altogether, and it will
continue to be ignored unless the filter implementation changes, which
seems unlikely. If you have a different value in mind that you would
rather use, I'm fine with changing it.

No, I have no better idea. As you say, we should try to provide some
semi-reasonable number in case bloom_create ever starts paying
attention, and this one seems fine.

I do not like, even a little bit, your use of a static variable to
hold the bloom filter pointer.

Ah, yes, that's no good. I fixed this in the new version.

My one remaining suggestion is that this comment isn't very precise
about what's happening:

* If there is a previously-created Bloom filter, use it to determine
* whether the role is missing from the list. Otherwise, do an ordinary
* linear search through the existing role list.

Maybe more like

* If there is a previously-created Bloom filter, use it to try to
* determine whether the role is missing from the list. If it
* says yes, that's a hard fact and we can go ahead and add the
* role. If it says no, that's only probabilistic and we'd better
* search the list. Without a filter, we must always do an ordinary
* linear search through the existing list.

LGTM other than that nit.

regards, tom lane

#30Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#29)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

On Tue, Mar 26, 2024 at 03:08:00PM -0400, Tom Lane wrote:

My one remaining suggestion is that this comment isn't very precise
about what's happening:

* If there is a previously-created Bloom filter, use it to determine
* whether the role is missing from the list. Otherwise, do an ordinary
* linear search through the existing role list.

Maybe more like

* If there is a previously-created Bloom filter, use it to try to
* determine whether the role is missing from the list. If it
* says yes, that's a hard fact and we can go ahead and add the
* role. If it says no, that's only probabilistic and we'd better
* search the list. Without a filter, we must always do an ordinary
* linear search through the existing list.

LGTM other than that nit.

Committed with that change. Thanks for the guidance on this one.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#31Ranier Vilela
ranier.vf@gmail.com
In reply to: Nathan Bossart (#30)
1 attachment(s)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

Hi,

Nathan Bossart <nathandbossart(at)gmail(dot)com> writes:

Committed with that change. Thanks for the guidance on this one.

I think that left an oversight in a commit d365ae7
<https://github.com/postgres/postgres/commit/d365ae705409f5d9c81da4b668f59c3598feb512&gt;
If the admin_role is a NULL pointer, so, can be dereferenced
in the main loop of the function roles_is_member_of and
worst, IMO, can be destroying aleatory memory?

First, is a better shortcut test to check if admin_role is NOT NULL.
Second, !OidIsValid(*admin_role), It doesn't seem necessary anymore.

Or am I losing something?

best regards,
Ranier Vilela

Attachments:

avoid-dereference-a-null-pointer-acl.patchapplication/octet-stream; name=avoid-dereference-a-null-pointer-acl.patchDownload
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index dc10b4a483..3c9366c0aa 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -5062,8 +5062,8 @@ roles_is_member_of(Oid roleid, enum RoleRecurseType type,
 			 * While otherid==InvalidOid shouldn't appear in the catalog, the
 			 * OidIsValid() avoids crashing if that arises.
 			 */
-			if (otherid == admin_of && form->admin_option &&
-				OidIsValid(admin_of) && !OidIsValid(*admin_role))
+			if (admin_role != NULL && otherid == admin_of && form->admin_option &&
+				OidIsValid(admin_of))
 				*admin_role = memberid;
 
 			/* If we're supposed to ignore non-heritable grants, do so. */
#32Nathan Bossart
nathandbossart@gmail.com
In reply to: Ranier Vilela (#31)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

On Wed, Mar 27, 2024 at 01:21:23PM -0300, Ranier Vilela wrote:

Nathan Bossart <nathandbossart(at)gmail(dot)com> writes:

Committed with that change. Thanks for the guidance on this one.

I think that left an oversight in a commit d365ae7
<https://github.com/postgres/postgres/commit/d365ae705409f5d9c81da4b668f59c3598feb512&gt;
If the admin_role is a NULL pointer, so, can be dereferenced
in the main loop of the function roles_is_member_of and
worst, IMO, can be destroying aleatory memory?

First, is a better shortcut test to check if admin_role is NOT NULL.
Second, !OidIsValid(*admin_role), It doesn't seem necessary anymore.

Or am I losing something?

If admin_role is NULL, then admin_of is expected to be set to InvalidOid.
See the assertion at the top of the function. AFAICT the code that
dereferences admin_role short-circuits if admin_of is invalid.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#33Ranier Vilela
ranier.vf@gmail.com
In reply to: Nathan Bossart (#32)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

Em qua., 27 de mar. de 2024 às 13:41, Nathan Bossart <
nathandbossart@gmail.com> escreveu:

On Wed, Mar 27, 2024 at 01:21:23PM -0300, Ranier Vilela wrote:

Nathan Bossart <nathandbossart(at)gmail(dot)com> writes:

Committed with that change. Thanks for the guidance on this one.

I think that left an oversight in a commit d365ae7
<

https://github.com/postgres/postgres/commit/d365ae705409f5d9c81da4b668f59c3598feb512

If the admin_role is a NULL pointer, so, can be dereferenced
in the main loop of the function roles_is_member_of and
worst, IMO, can be destroying aleatory memory?

First, is a better shortcut test to check if admin_role is NOT NULL.
Second, !OidIsValid(*admin_role), It doesn't seem necessary anymore.

Or am I losing something?

If admin_role is NULL, then admin_of is expected to be set to InvalidOid.
See the assertion at the top of the function. AFAICT the code that
dereferences admin_role short-circuits if admin_of is invalid.

These conditions seem a little fragile and confusing to me.
When a simple test, it protects the pointer and avoids a series of tests,
which are unnecessary if the pointer is invalid.

best regards,
Ranier Vilela

#34Nathan Bossart
nathandbossart@gmail.com
In reply to: Ranier Vilela (#33)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

On Wed, Mar 27, 2024 at 01:47:38PM -0300, Ranier Vilela wrote:

Em qua., 27 de mar. de 2024 �s 13:41, Nathan Bossart <
nathandbossart@gmail.com> escreveu:

On Wed, Mar 27, 2024 at 01:21:23PM -0300, Ranier Vilela wrote:

I think that left an oversight in a commit d365ae7
<

https://github.com/postgres/postgres/commit/d365ae705409f5d9c81da4b668f59c3598feb512

If the admin_role is a NULL pointer, so, can be dereferenced
in the main loop of the function roles_is_member_of and
worst, IMO, can be destroying aleatory memory?

First, is a better shortcut test to check if admin_role is NOT NULL.
Second, !OidIsValid(*admin_role), It doesn't seem necessary anymore.

Or am I losing something?

If admin_role is NULL, then admin_of is expected to be set to InvalidOid.
See the assertion at the top of the function. AFAICT the code that
dereferences admin_role short-circuits if admin_of is invalid.

These conditions seem a little fragile and confusing to me.
When a simple test, it protects the pointer and avoids a series of tests,
which are unnecessary if the pointer is invalid.

Maybe. But that doesn't seem like an oversight in commit d365ae7.

-			if (otherid == admin_of && form->admin_option &&
-				OidIsValid(admin_of) && !OidIsValid(*admin_role))
+			if (admin_role != NULL && otherid == admin_of && form->admin_option &&
+				OidIsValid(admin_of))
 				*admin_role = memberid;

I'm not following why it's safe to remove the !OidIsValid(*admin_role)
check here. We don't want to overwrite a previously-set value of
*admin_role, as per the comment above roles_is_member_of():

* If admin_of is not InvalidOid, this function sets *admin_role, either
* to the OID of the first role in the result list that directly possesses
* ADMIN OPTION on the role corresponding to admin_of, or to InvalidOid if
* there is no such role.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#35Ranier Vilela
ranier.vf@gmail.com
In reply to: Nathan Bossart (#34)
1 attachment(s)
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

Em qua., 27 de mar. de 2024 às 14:35, Nathan Bossart <
nathandbossart@gmail.com> escreveu:

On Wed, Mar 27, 2024 at 01:47:38PM -0300, Ranier Vilela wrote:

Em qua., 27 de mar. de 2024 às 13:41, Nathan Bossart <
nathandbossart@gmail.com> escreveu:

On Wed, Mar 27, 2024 at 01:21:23PM -0300, Ranier Vilela wrote:

I think that left an oversight in a commit d365ae7
<

https://github.com/postgres/postgres/commit/d365ae705409f5d9c81da4b668f59c3598feb512

If the admin_role is a NULL pointer, so, can be dereferenced
in the main loop of the function roles_is_member_of and
worst, IMO, can be destroying aleatory memory?

First, is a better shortcut test to check if admin_role is NOT NULL.
Second, !OidIsValid(*admin_role), It doesn't seem necessary anymore.

Or am I losing something?

If admin_role is NULL, then admin_of is expected to be set to

InvalidOid.

See the assertion at the top of the function. AFAICT the code that
dereferences admin_role short-circuits if admin_of is invalid.

These conditions seem a little fragile and confusing to me.
When a simple test, it protects the pointer and avoids a series of tests,
which are unnecessary if the pointer is invalid.

Maybe. But that doesn't seem like an oversight in commit d365ae7.

Sorry for exceeding.

-                       if (otherid == admin_of && form->admin_option &&
-                               OidIsValid(admin_of) &&
!OidIsValid(*admin_role))
+                       if (admin_role != NULL && otherid == admin_of &&
form->admin_option &&
+                               OidIsValid(admin_of))
*admin_role = memberid;

I'm not following why it's safe to remove the !OidIsValid(*admin_role)
check here. We don't want to overwrite a previously-set value of
*admin_role, as per the comment above roles_is_member_of():

* If admin_of is not InvalidOid, this function sets *admin_role, either
* to the OID of the first role in the result list that directly possesses
* ADMIN OPTION on the role corresponding to admin_of, or to InvalidOid if
* there is no such role.

Ok. If admin_role is NOT NULL, so *admin_role is InvalidOid, by
initialization
in the head of function.

I think that a cheap test *admin_role == InvalidOid, is enough?
What do you think?

v1 attached.

best regards,
Ranier Vilela

Attachments:

v1-avoid-dereference-a-null-pointer-acl.patchapplication/octet-stream; name=v1-avoid-dereference-a-null-pointer-acl.patchDownload
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index dc10b4a483..0ed3ba866e 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -5062,8 +5062,8 @@ roles_is_member_of(Oid roleid, enum RoleRecurseType type,
 			 * While otherid==InvalidOid shouldn't appear in the catalog, the
 			 * OidIsValid() avoids crashing if that arises.
 			 */
-			if (otherid == admin_of && form->admin_option &&
-				OidIsValid(admin_of) && !OidIsValid(*admin_role))
+			if (admin_role != NULL && *admin_role == InvalidOid &&
+				otherid == admin_of && form->admin_option && OidIsValid(admin_of))
 				*admin_role = memberid;
 
 			/* If we're supposed to ignore non-heritable grants, do so. */