pg_restore 14 skips ACL COLUMN when --schema is used
pg_restore --schema in our script used to work well against pg_restore 10. Now that we are on PostgreSQL 14. pg_restore 14 skips ACL COLUMN, but not ACL TABLE, when --schema is used.
If there has been a known bug, please help direct me there for any possible workaround.
Thanks,
-Kong
pg_restore --list $dumpfile >$tocfile
pg_restore --use-list=<(grep "ACL public COLUMN event.id" $tocfile) -f - $dumpfile
--
-- PostgreSQL database dump
--
-- Dumped from database version 14.5 (Ubuntu 14.5-1.pgdg20.04+1)
-- Dumped by pg_dump version 14.5 (Ubuntu 14.5-1.pgdg20.04+1)
SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET xmloption = content;
SET client_min_messages = warning;
SET row_security = off;
--
-- Name: COLUMN event.id; Type: ACL; Schema: public; Owner: admin
--
GRANT SELECT(id) ON TABLE public.event TO warranty_creation_verification;
GRANT SELECT(id) ON TABLE public.event TO data_reader;
GRANT SELECT(id) ON TABLE public.event TO internal_reader;
--
-- PostgreSQL database dump complete
--
pg_restore --schema=public --use-list=<(grep "ACL public COLUMN event.id" $tocfile) -f - $dumpfile
--
-- PostgreSQL database dump
--
-- Dumped from database version 14.5 (Ubuntu 14.5-1.pgdg20.04+1)
-- Dumped by pg_dump version 14.5 (Ubuntu 14.5-1.pgdg20.04+1)
SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET xmloption = content;
SET client_min_messages = warning;
SET row_security = off;
--
-- PostgreSQL database dump complete
--
Kong Man <kong_mansatiansin@hotmail.com> writes:
pg_restore --schema in our script used to work well against pg_restore 10. Now that we are on PostgreSQL 14. pg_restore 14 skips ACL COLUMN, but not ACL TABLE, when --schema is used.
That's not the behavior I'm seeing. Would you mind providing a
*complete* reproducer, not some fragments?
The behavior I'm seeing is that neither a TABLE nor a COLUMN ACL
will be restored, because (since v11) ACLs are restored only
if their table is restored, and the --use-list option you are
using excludes the table.
We could perhaps imagine special-casing that, but I think it would be
a wart, because for every other kind of object --use-list can only
filter stuff out, not filter it in. (That is, if you are using
--use-list along with other selectivity options, an object must pass
both restrictions to be output. I don't want to make --use-list
override other rules just for ACLs.)
It would be interesting to see your actual use-case, because
I suspect you may be doing something that there's a better way
to do now. What set of objects are you trying to extract?
regards, tom lane
Tom,
You are right. 'Neither a TABLE nor a COLUMN ACL' are pg_restore-d, when I use -n or --schema without the TABLE toc entry.
Not knowing the behavioral change to pg_restore since PG11, I struggled to search for the root cause and made a wrong assumption based on the end result from my script.
What I do in my dbsnapshot script (to build DB baselines at each release cycle for DevOps) is (1) to pg_restore using only relevant SCHEMA toc entries, then (2) with pg_restore -n to scope out only relevant schemas it restores everything from $dumpfile. It has been working well on PG10 for several years until we upgraded a few months ago.
Step 2 there should have included the 'public.event' table (that I showed in the previous message), because the table actually got restored with the table privileges. I still struggle to find out why the column privileges are still missing.
Here are some simplified snippets from my script.
tocfile=/tmp/DBA-710.toc
dumpfile=/tmp/DBA-710.dump
pg_restore --list $dumpfile >$tocfile
DB=appdb
SCHEMAS=(public appschema1 appschema2)
exclusion='londiste|POLICY|PUBLICATION TABLE'
restopts=(-d "service=target dbname=$DB")
Log "[$DB]" " restore SCHEMAs: "$(IFS='|'; cat <<<"${SCHEMAS[*]}")
pg_restore "${restopts[@]}" \
--use-list=<(grep -A1 SCHEMA $tocfile| # only relevant SCHEMA & ACL SCHEMA entries
grep -E "\s("$(IFS='|'; cat <<<"${SCHEMAS[*]}")")\s") \
$dumpfile
restopts+=($(printf " -n %s" ${SCHEMAS[@]})) # Scope out for only relevant schemas
Log "[$DB]" " restore data definitions"
pg_restore "${restopts[@]}" \
--use-list=<(grep -Ev "$exclusion" $tocfile) \
$dumpfile
Thanks,
-Kong
________________________________
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Friday, July 28, 2023 8:24 PM
To: Kong Man <kong_mansatiansin@hotmail.com>
Cc: pgsql-bugs@lists.postgresql.org <pgsql-bugs@lists.postgresql.org>
Subject: Re: pg_restore 14 skips ACL COLUMN when --schema is used
Kong Man <kong_mansatiansin@hotmail.com> writes:
pg_restore --schema in our script used to work well against pg_restore 10. Now that we are on PostgreSQL 14. pg_restore 14 skips ACL COLUMN, but not ACL TABLE, when --schema is used.
That's not the behavior I'm seeing. Would you mind providing a
*complete* reproducer, not some fragments?
The behavior I'm seeing is that neither a TABLE nor a COLUMN ACL
will be restored, because (since v11) ACLs are restored only
if their table is restored, and the --use-list option you are
using excludes the table.
We could perhaps imagine special-casing that, but I think it would be
a wart, because for every other kind of object --use-list can only
filter stuff out, not filter it in. (That is, if you are using
--use-list along with other selectivity options, an object must pass
both restrictions to be output. I don't want to make --use-list
override other rules just for ACLs.)
It would be interesting to see your actual use-case, because
I suspect you may be doing something that there's a better way
to do now. What set of objects are you trying to extract?
regards, tom lane
In this example, I generate the public.event table's ToC entries for TABLE, ACL TABLE, and ACL COLUMN, then use the result on --use-list. pg_restore generates only the CREATE TABLE and GRANTs statements at the table, but not the column, level.
grep -P "TABLE public event\b|ACL public (COLUMN|TABLE) event\b" $tocfile
1640; 1259 23161 TABLE public event admin
10023; 0 0 ACL public TABLE event admin
10024; 0 0 ACL public COLUMN event.id admin
10025; 0 0 ACL public COLUMN event.event_notification_id admin
10026; 0 0 ACL public COLUMN event.type admin
10027; 0 0 ACL public COLUMN event.version admin
10028; 0 0 ACL public COLUMN event.partner_id admin
10029; 0 0 ACL public COLUMN event.object_id admin
10030; 0 0 ACL public COLUMN event.data admin
10031; 0 0 ACL public COLUMN event.reference_id1 admin
10032; 0 0 ACL public COLUMN event.reference_id2 admin
10033; 0 0 ACL public COLUMN event.reference_id3 admin
10034; 0 0 ACL public COLUMN event.updated admin
10035; 0 0 ACL public COLUMN event.created admin
10036; 0 0 ACL public COLUMN event.actor_id admin
10037; 0 0 ACL public COLUMN event.sla_type admin
10038; 0 0 ACL public COLUMN event.due admin
10039; 0 0 ACL public COLUMN event.completed admin
10040; 0 0 ACL public COLUMN event.instance_id admin
10041; 0 0 ACL public COLUMN event.pii_data admin
pg_restore -n public --use-list=<(grep -P "TABLE public event\b|ACL public (COLUMN|TABLE) event\b" $tocfile) -f - $dumpfile
--
-- PostgreSQL database dump
--
-- Dumped from database version 14.5 (Ubuntu 14.5-1.pgdg20.04+1)
-- Dumped by pg_dump version 14.5 (Ubuntu 14.5-1.pgdg20.04+1)
SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET xmloption = content;
SET client_min_messages = warning;
SET row_security = off;
SET default_tablespace = '';
SET default_table_access_method = heap;
--
-- Name: event; Type: TABLE; Schema: public; Owner: admin
--
CREATE TABLE public.event (
id text NOT NULL,
event_notification_id text,
type text NOT NULL,
version text NOT NULL,
partner_id text,
object_id text,
data text NOT NULL,
reference_id1 text,
reference_id2 text,
reference_id3 text,
updated timestamp with time zone DEFAULT now() NOT NULL,
created timestamp with time zone DEFAULT now() NOT NULL,
actor_id text,
sla_type text,
due timestamp with time zone,
completed timestamp with time zone,
instance_id text,
pii_data json
);
ALTER TABLE public.event OWNER TO admin;
--
-- Name: TABLE event; Type: ACL; Schema: public; Owner: admin
--
GRANT SELECT,INSERT,DELETE,UPDATE ON TABLE public.event TO pii_writer;
GRANT SELECT ON TABLE public.event TO pii_reader;
GRANT SELECT,INSERT,DELETE,UPDATE ON TABLE public.event TO covereditemadm;
GRANT SELECT,INSERT,DELETE,UPDATE ON TABLE public.event TO claimhistoryadm;
--
-- PostgreSQL database dump complete
--
________________________________
From: Kong Man <kong_mansatiansin@hotmail.com>
Sent: Monday, July 31, 2023 3:50 PM
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: pgsql-bugs@lists.postgresql.org <pgsql-bugs@lists.postgresql.org>
Subject: Re: pg_restore 14 skips ACL COLUMN when --schema is used
Tom,
You are right. 'Neither a TABLE nor a COLUMN ACL' are pg_restore-d, when I use -n or --schema without the TABLE toc entry.
Not knowing the behavioral change to pg_restore since PG11, I struggled to search for the root cause and made a wrong assumption based on the end result from my script.
What I do in my dbsnapshot script (to build DB baselines at each release cycle for DevOps) is (1) to pg_restore using only relevant SCHEMA toc entries, then (2) with pg_restore -n to scope out only relevant schemas it restores everything from $dumpfile. It has been working well on PG10 for several years until we upgraded a few months ago.
Step 2 there should have included the 'public.event' table (that I showed in the previous message), because the table actually got restored with the table privileges. I still struggle to find out why the column privileges are still missing.
Here are some simplified snippets from my script.
tocfile=/tmp/DBA-710.toc
dumpfile=/tmp/DBA-710.dump
pg_restore --list $dumpfile >$tocfile
DB=appdb
SCHEMAS=(public appschema1 appschema2)
exclusion='londiste|POLICY|PUBLICATION TABLE'
restopts=(-d "service=target dbname=$DB")
Log "[$DB]" " restore SCHEMAs: "$(IFS='|'; cat <<<"${SCHEMAS[*]}")
pg_restore "${restopts[@]}" \
--use-list=<(grep -A1 SCHEMA $tocfile| # only relevant SCHEMA & ACL SCHEMA entries
grep -E "\s("$(IFS='|'; cat <<<"${SCHEMAS[*]}")")\s") \
$dumpfile
restopts+=($(printf " -n %s" ${SCHEMAS[@]})) # Scope out for only relevant schemas
Log "[$DB]" " restore data definitions"
pg_restore "${restopts[@]}" \
--use-list=<(grep -Ev "$exclusion" $tocfile) \
$dumpfile
Thanks,
-Kong
________________________________
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Friday, July 28, 2023 8:24 PM
To: Kong Man <kong_mansatiansin@hotmail.com>
Cc: pgsql-bugs@lists.postgresql.org <pgsql-bugs@lists.postgresql.org>
Subject: Re: pg_restore 14 skips ACL COLUMN when --schema is used
Kong Man <kong_mansatiansin@hotmail.com> writes:
pg_restore --schema in our script used to work well against pg_restore 10. Now that we are on PostgreSQL 14. pg_restore 14 skips ACL COLUMN, but not ACL TABLE, when --schema is used.
That's not the behavior I'm seeing. Would you mind providing a
*complete* reproducer, not some fragments?
The behavior I'm seeing is that neither a TABLE nor a COLUMN ACL
will be restored, because (since v11) ACLs are restored only
if their table is restored, and the --use-list option you are
using excludes the table.
We could perhaps imagine special-casing that, but I think it would be
a wart, because for every other kind of object --use-list can only
filter stuff out, not filter it in. (That is, if you are using
--use-list along with other selectivity options, an object must pass
both restrictions to be output. I don't want to make --use-list
override other rules just for ACLs.)
It would be interesting to see your actual use-case, because
I suspect you may be doing something that there's a better way
to do now. What set of objects are you trying to extract?
regards, tom lane
On Mon, Jul 31, 2023, at 8:00 PM, Kong Man wrote:
In this example, I generate the public.event table's ToC entries for TABLE, ACL TABLE, and ACL COLUMN, then use the result on --use-list. pg_restore generates only the CREATE TABLE and GRANTs statements at the table, but not the column, level.
You didn't provide a test case as requested but I investigated this issue
according to your description. The following commands is sufficient to produce
the analysis below.
CREATE ROLE role1;
CREATE ROLE role2;
CREATE TABLE event (
col1 text,
col2 text,
col3 text
);
GRANT SELECT ON TABLE event TO role1;
GRANT SELECT, INSERT, UPDATE, DELETE ON TABLE event TO role2;
GRANT SELECT (col1, col2) ON TABLE event TO role2;
GRANT SELECT, INSERT, UPDATE (col3) ON TABLE event TO role2;
I created these objects and ran pg_restore using a debugger.
pg_dump -Fc -f /tmp/a.dump -d test
pg_restore -l /tmp/a.dump > /tmp/a.toc
gdb --args pg_restore --use-list=/tmp/a.toc -f - /tmp/a.dump
I set a breakpoint to _tocEntryRequired function to inspect the TocEntry values.
(gdb) b _tocEntryRequired
After a few 'continue' and 'p *te' commands I got:
(gdb) c
Continuing.
Breakpoint 1, _tocEntryRequired (te=0x5555555a6910, curSection=SECTION_PRE_DATA, AH=0x5555555a05b0) at pg_backup_archiver.c:2773
2773 }
(gdb) p *te
$8 = {prev = 0x5555555a67a0, next = 0x5555555a6be0, catalogId = {tableoid = 0, oid = 0}, dumpId = 3327, section = SECTION_NONE, hadDumper = false,
tag = 0x555555585d30 "COLUMN event.col1", namespace = 0x5555555858b0 "public", tablespace = 0x0, tableam = 0x0, owner = 0x555555585340 "euler", desc = 0x555555585df0 "ACL",
defn = 0x5555555a6a00 "GRANT SELECT(col1) ON TABLE public.event TO role2;\n", dropStmt = 0x0, copyStmt = 0x0, dependencies = 0x5555555a6a40, nDeps = 2, dataDumper = 0x0,
dataDumperArg = 0x0, formatData = 0x555555585c70, dataLength = 0, reqs = 0, created = false, pending_prev = 0x0, pending_next = 0x0, depCount = 0, revDeps = 0x0, nRevDeps = 0,
lockDeps = 0x0, nLockDeps = 0}
(gdb) p te->dependencies[0]
$18 = 214
(gdb) p *AH->tocsByDumpId[214]
$17 = {prev = 0x5555555a64e0, next = 0x5555555a67a0, catalogId = {tableoid = 1259, oid = 183415}, dumpId = 214, section = SECTION_PRE_DATA, hadDumper = false,
tag = 0x555555584e00 "event", namespace = 0x555555584f80 "public", tablespace = 0x555555585040 "", tableam = 0x555555584380 "heap", owner = 0x555555584440 "euler",
desc = 0x555555584ec0 "TABLE", defn = 0x5555555a6740 "CREATE TABLE public.event (\n col1 text,\n col2 text,\n col3 text\n);\n",
dropStmt = 0x5555555a61f0 "DROP TABLE public.event;\n", copyStmt = 0x0, dependencies = 0x0, nDeps = 0, dataDumper = 0x0, dataDumperArg = 0x0, formatData = 0x555555584d40,
dataLength = 0, reqs = 1, created = false, pending_prev = 0x0, pending_next = 0x0, depCount = 0, revDeps = 0x0, nRevDeps = 0, lockDeps = 0x0, nLockDeps = 0}
(gdb) p te->dependencies[1]/messages/by-id/32668.1516848577@sss.pgh.pa.us
$21 = 3326
(gdb) p *AH->tocsByDumpId[3326]
$22 = {prev = 0x5555555a6650, next = 0x5555555a6910, catalogId = {tableoid = 0, oid = 0}, dumpId = 3326, section = SECTION_NONE, hadDumper = false,
tag = 0x5555555854c0 "TABLE event", namespace = 0x555555585670 "public", tablespace = 0x0, tableam = 0x0, owner = 0x555555585730 "euler", desc = 0x555555585580 "ACL",
defn = 0x5555555a6890 "GRANT SELECT ON TABLE public.event TO role1;\nGRANT SELECT,INSERT,DELETE,UPDATE ON TABLE public.event TO role2;\n", dropStmt = 0x0, copyStmt = 0x0,
dependencies = 0x5555555a6050, nDeps = 1, dataDumper = 0x0, dataDumperArg = 0x0, formatData = 0x555555585400, dataLength = 0, reqs = 1, created = false, pending_prev = 0x0,
pending_next = 0x0, depCount = 0, revDeps = 0x0, nRevDeps = 0, lockDeps = 0x0, nLockDeps = 0}
It means that an ACL for columns has 2 dependencies (nDeps = 2) the function
_tocEntryRequired() returns 0 (see code below).
else if (ropt->schemaNames.head != NULL ||
ropt->schemaExcludeNames.head != NULL ||
ropt->selTypes)
{
/*
* In a selective dump/restore, we want to restore these dependent
* TOC entry types only if their parent object is being restored.
* Without selectivity options, we let through everything in the
* archive. Note there may be such entries with no parent, eg
* non-default ACLs for built-in objects.
*
* This code depends on the parent having been marked already,
* which should be the case; if it isn't, perhaps due to
* SortTocFromFile rearrangement, skipping the dependent entry
* seems prudent anyway.
*
* Ideally we'd handle, eg, table CHECK constraints this way too.
* But it's hard to tell which of their dependencies is the one to
* consult.
*/
if (te->nDeps != 1 ||
TocIDRequired(AH, te->dependencies[0]) == 0)
return 0;
}
Hence, ProcessArchiveRestoreOptions() sets te->reqs to 0 that avoids the ACL
for columns to be restored.
te->reqs = _tocEntryRequired(te, curSection, AH);
The discussion [1]/messages/by-id/32668.1516848577@sss.pgh.pa.us in the commit 0d4e6ed3085 does not explain if the 'if' logic
covers all cases. It certainly doesn't for the OP case. The only (hackish)
suggestion I have ATM is to detect ACL for columns and returns 1.
[1]: /messages/by-id/32668.1516848577@sss.pgh.pa.us
--
Euler Taveira
EDB https://www.enterprisedb.com/
"Euler Taveira" <euler@eulerto.com> writes:
You didn't provide a test case as requested but I investigated this issue
according to your description. The following commands is sufficient to produce
the analysis below.
Thanks for looking into it! The ingredient I missed while studying this
earlier is that the table must have *both* table-level and column-level
ACLs to provoke the issue.
It means that an ACL for columns has 2 dependencies (nDeps = 2) the function
_tocEntryRequired() returns 0 (see code below).
D'oh. I think this code was correct when written, but we added an extra
dependency from column ACL to table ACL later because restoring them in
the wrong order causes problems.
The discussion [1] in the commit 0d4e6ed3085 does not explain if the 'if' logic
covers all cases. It certainly doesn't for the OP case. The only (hackish)
suggestion I have ATM is to detect ACL for columns and returns 1.
Yeah, ignoring dependencies on ACLs in this logic seems like the best
way forward. Do you want to write a patch?
regards, tom lane
On Wed, Aug 2, 2023, at 5:06 PM, Tom Lane wrote:
Yeah, ignoring dependencies on ACLs in this logic seems like the best
way forward. Do you want to write a patch?
I like your suggestion. Let me give it a try.
--
Euler Taveira
EDB https://www.enterprisedb.com/
You didn't provide a test case as requested
I apologize for not having a complete test case. It's my first reporting a bug. I started to get the hang of it now.
________________________________
From: Euler Taveira <euler@eulerto.com>
Sent: Wednesday, August 2, 2023 4:53 PM
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Kong Man <kong_mansatiansin@hotmail.com>; pgsql-bugs@lists.postgresql.org <pgsql-bugs@lists.postgresql.org>
Subject: Re: pg_restore 14 skips ACL COLUMN when --schema is used
On Wed, Aug 2, 2023, at 5:06 PM, Tom Lane wrote:
Yeah, ignoring dependencies on ACLs in this logic seems like the best
way forward. Do you want to write a patch?
I like your suggestion. Let me give it a try.
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Wed, Aug 2, 2023, at 5:53 PM, Euler Taveira wrote:
On Wed, Aug 2, 2023, at 5:06 PM, Tom Lane wrote:
Yeah, ignoring dependencies on ACLs in this logic seems like the best
way forward. Do you want to write a patch?I like your suggestion. Let me give it a try.
This patch should apply on all supported back branches (11-16).
In my previous email I forgot the --schema option in the pg_restore. The
correct steps are:
pg_dump -Fc -f /u/d.dump -d test
pg_restore -l /u/d.dump > /u/d.toc
pg_restore --schema=public --use-list=<(grep "ACL" /u/d.toc) -f - /u/d.dump
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
v1-0001-Fix-column-level-ACL-for-selective-restore.patchtext/x-patch; name="=?UTF-8?Q?v1-0001-Fix-column-level-ACL-for-selective-restore.patch?="Download+4-1
"Euler Taveira" <euler@eulerto.com> writes:
On Wed, Aug 2, 2023, at 5:53 PM, Euler Taveira wrote:
On Wed, Aug 2, 2023, at 5:06 PM, Tom Lane wrote:
Yeah, ignoring dependencies on ACLs in this logic seems like the best
way forward. Do you want to write a patch?
I like your suggestion. Let me give it a try.
Hmm, changing it like that seems likely to have a lot of unexpected
side-effects. What I had in mind was to change the
if (te->nDeps != 1 ||
TocIDRequired(AH, te->dependencies[0]) == 0)
return 0;
bit so that it would loop through the TE's dependencies to see
if any one of them is a required non-ACL TE. We have to relax
the "te->nDeps != 1" restriction, but we still want to check
that there is a parent object that is being restored.
Some work on the associated comment block seems appropriate too.
And maybe add a test case?
regards, tom lane
I wrote:
Hmm, changing it like that seems likely to have a lot of unexpected
side-effects. What I had in mind was to change the
if (te->nDeps != 1 ||
TocIDRequired(AH, te->dependencies[0]) == 0)
return 0;
bit so that it would loop through the TE's dependencies to see
if any one of them is a required non-ACL TE. We have to relax
the "te->nDeps != 1" restriction, but we still want to check
that there is a parent object that is being restored.
I improved the patch along those lines and pushed it.
regards, tom lane