GUC parameter ACLs and physical walsender
Moving discussion from:
/messages/by-id/4524ed61a015d3496fc008644dcb999bb31916a7.camel@j-davis.com
because this is a separate issue. If you specify a SUSET GUC setting
when connecting as non-superuser for physical replication:
PGOPTIONS="-c wal_compression=on" \
pg_receivewal -D archive -U repl
you get:
FATAL: cannot read pg_class without having selected a database
but only if you connect immediately after the server starts. If you do
something else first, like an ordinary connection and "SELECT 1", and
then start the replication connection, you get (after commit
dbf217c1c7):
FATAL: permission denied to set parameter "wal_compression"
as expected. The problem goes back to a0ffa885e47.
It seems to be because pg_parameter_acl is not nailed in cache. I
attached a quick patch to do so (which turns it into the "expected
permission denied" error). But I'm not sure if that's the right fix, or
if it would be a complete fix. I also don't think that would be
backportable, but perhaps?
Regards,
Jeff Davis
Attachments:
v1-0001-Nail-pg_parameter_acl-in-relcache.patchtext/x-patch; charset=UTF-8; name=v1-0001-Nail-pg_parameter_acl-in-relcache.patchDownload+17-7
On Thu, Apr 23, 2026 at 2:19 AM Jeff Davis <pgsql@j-davis.com> wrote:
It seems to be because pg_parameter_acl is not nailed in cache. I
attached a quick patch to do so (which turns it into the "expected
permission denied" error). But I'm not sure if that's the right fix, or
if it would be a complete fix. I also don't think that would be
backportable, but perhaps?
I think in existing installations AddNewRelationType() would have
picked some oid already, so fixing rowtype_oid at initdb time would
only work in the master branch.
--
John Naylor
Amazon Web Services
On Wed, Apr 22, 2026 at 8:27 PM John Naylor <johncnaylorls@gmail.com> wrote:
On Thu, Apr 23, 2026 at 2:19 AM Jeff Davis <pgsql@j-davis.com> wrote:
It seems to be because pg_parameter_acl is not nailed in cache. I
attached a quick patch to do so (which turns it into the "expected
permission denied" error). But I'm not sure if that's the right fix, or
if it would be a complete fix. I also don't think that would be
backportable, but perhaps?I think in existing installations AddNewRelationType() would have
picked some oid already, so fixing rowtype_oid at initdb time would
only work in the master branch.
John is right that the hardcoded BKI_ROWTYPE_OID(2173) makes this
non-backportable as-is.
On existing installations (PG 15-18), AddNewRelationType() assigned rowtype
OID
10097 to pg_parameter_acl during initdb. Jeff's patch has formrdesc set
rd_att->tdtypeid = 2173, but that field is never corrected — Phase3
overwrites
rd_rel (including reltype) via memcpy, but rd_att->tdtypeid stays at the
formrdesc value. The comment in formrdesc says as much: "this data had
better
be right because it will never be replaced."
On assert-enabled builds, every backend crashes on the first connection:
TRAP: failed Assert("relation->rd_att->tdtypeid == relp->reltype"),
File: "relcache.c", Line: 4294
On non-assert builds, the mismatch has no observable effect. The wrong
typeId
gets stamped into pg_parameter_acl tuple headers by heap_form_tuple, but
nothing reads it back -- all consumers of HeapTupleHeaderGetTypeId are in
the
executor, PL/pgSQL, and record-type processing, not catalog operations.
For backports, formrdesc would need to use something other than the
hardcoded
OID -- either look up the real rowtype OID, or pass InvalidOid and teach the
Assert to tolerate it for late-nailed relations.
To demonstrate, using an existing cluster whose initdb was done without the
nailing patch (rowtype 10097), then started with patched binaries: after
applying
Jeff's v1-0001-Nail-pg_parameter_acl-in-relcache.patch to master, which has
dbf217c1c7 already applied:
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1965,6 +1965,12 @@ formrdesc(const char *relationName, Oid
relationReltype,
relation->rd_att->tdtypeid = relationReltype;
relation->rd_att->tdtypmod = -1; /* just to be sure */
+ elog(LOG, "formrdesc \"%s\": relid=%u reltype=%u tdtypeid=%u",
+ relationName,
+ TupleDescAttr(relation->rd_att, 0)->attrelid,
+ relationReltype,
+ relation->rd_att->tdtypeid);
+
/*
* initialize tuple desc info
*/
@@ -4283,6 +4289,14 @@ RelationCacheInitializePhase3(void)
* data correctly to start with, because it may already have
been
* copied into one or more catcache entries.)
*/
+ elog(LOG, "RelationCacheInitializePhase3 \"%s\": relid=%u "
+ "formrdesc tdtypeid=%u, pg_class reltype=%u%s",
+ RelationGetRelationName(relation),
+ RelationGetRelid(relation),
+ relation->rd_att->tdtypeid,
+ relp->reltype,
+ (relation->rd_att->tdtypeid != relp->reltype)
+ ? " MISMATCH" : "");
Assert(relation->rd_att->tdtypeid == relp->reltype);
Assert(relation->rd_att->tdtypmod == -1);
On an existing cluster with patched binaries it shows:
formrdesc "pg_parameter_acl": relid=0 reltype=2173 tdtypeid=2173
RelationCacheInitializePhase3 "pg_parameter_acl": relid=6243
formrdesc tdtypeid=2173, pg_class reltype=10097 MISMATCH
--
*Mark Dilger*
On Thu, 2026-04-23 at 10:57 -0700, Mark Dilger wrote:
John is right that the hardcoded BKI_ROWTYPE_OID(2173) makes this
non-backportable as-is.
Right, but that leaves the questions:
(a) Is this the right fix for master?
(b) Is there anything we can do in the back branches, or we just leave
it as fix going forward only?
Regards,
Jeff Davis
On Thu, Apr 23, 2026 at 2:06 PM Jeff Davis <pgsql@j-davis.com> wrote:
On Thu, 2026-04-23 at 10:57 -0700, Mark Dilger wrote:
John is right that the hardcoded BKI_ROWTYPE_OID(2173) makes this
non-backportable as-is.Right, but that leaves the questions:
(a) Is this the right fix for master?
Yes. This approach has no problem in master that I can see.
(b) Is there anything we can do in the back branches, or we just leave
it as fix going forward only?
I don't see a solution. We could try to replace the error message with
something better, but even that seems hard to phrase. Replacing
"cannot read pg_class without having selected a database" with, say,
"permission denied" would also be confusing for a role which does have
the privilege but just can't verify it.
--
*Mark Dilger*
On Thu, 2026-04-23 at 16:41 -0700, Mark Dilger wrote:
On Thu, Apr 23, 2026 at 2:06 PM Jeff Davis <pgsql@j-davis.com> wrote:
(a) Is this the right fix for master?
Yes. This approach has no problem in master that I can see.
(b) Is there anything we can do in the back branches, or we just
leave
it as fix going forward only?I don't see a solution.
OK. I'll wait a couple days to see if others have input on these two
points, and then I can commit this to master only. Ideally before
19beta1, because it would bump the catversion.
That would leave a bug in the back branches, but not a very serious one
as far as I can tell.
Regards,
Jeff Davis
On Fri, 2026-04-24 at 12:30 -0700, Jeff Davis wrote:
On Thu, 2026-04-23 at 16:41 -0700, Mark Dilger wrote:
On Thu, Apr 23, 2026 at 2:06 PM Jeff Davis <pgsql@j-davis.com>
wrote:(a) Is this the right fix for master?
Yes. This approach has no problem in master that I can see.
(b) Is there anything we can do in the back branches, or we just
leave
it as fix going forward only?I don't see a solution.
Attached a rebased version, will commit soon. This will be a catalog
bump in beta2.
This fixes an issue introduced in a0ffa885e4, which was released in
v15. Unfortunately, we didn't find a reasonable back-patchable
solution, so I will commit only to master.
For backbranches, we could try to produce a better error message, but
that's about all we can do.
Regards,
Jeff Davis
Attachments:
v2-0001-Nail-pg_parameter_acl-in-relcache.patchtext/x-patch; charset=UTF-8; name=v2-0001-Nail-pg_parameter_acl-in-relcache.patchDownload+18-8
Jeff Davis <pgsql@j-davis.com> writes:
This fixes an issue introduced in a0ffa885e4, which was released in
v15. Unfortunately, we didn't find a reasonable back-patchable
solution, so I will commit only to master.
It looks to me like the existing branches all assign OID 10097
to type pg_parameter_acl. Could we get away with writing
BKI_ROWTYPE_OID(10097,ParameterAclRelation_Rowtype_Id)
in the back branches? It'd depend on whether there are any
build options that change how many OIDs get assigned before
this one, but if there aren't ...
It might not be worth taking any risk for, though, given the
lack of field reports.
regards, tom lane
On Thu, 2026-06-18 at 13:56 -0400, Tom Lane wrote:
It looks to me like the existing branches all assign OID 10097
to type pg_parameter_acl. Could we get away with writing
BKI_ROWTYPE_OID(10097,ParameterAclRelation_Rowtype_Id)
in the back branches? It'd depend on whether there are any
build options that change how many OIDs get assigned before
this one, but if there aren't ...
Interesting idea.
It might not be worth taking any risk for, though, given the
lack of field reports.
That's what I was thinking. If someone is really stuck, there's an easy
workaround, just do a non-replication connection and "SELECT 1" first.
Regards,
Jeff Davis