pg_get_expr locking
The function pg_get_expr(), which is used in various system views and
information schema views, does not appear to lock the table passed as
the second argument, and so appears to be liable to fail if there is a
concurrent drop of the table. There is a (probable) case of this being
discussed at [0]/messages/by-id/9ec24d7b-633d-463a-84c6-7acff769c9e8@eisentraut.org. I also see various mentions of this issue in the
commit logs, mostly related to pg_dump.
Is there a reason there is no locking? Performance?
What workaround should we use if there are conflicts created by
concurrent regression tests? Just move the tests around a bit until the
issue goes away?
[0]: /messages/by-id/9ec24d7b-633d-463a-84c6-7acff769c9e8@eisentraut.org
/messages/by-id/9ec24d7b-633d-463a-84c6-7acff769c9e8@eisentraut.org
Peter Eisentraut <peter@eisentraut.org> writes:
The function pg_get_expr(), which is used in various system views and
information schema views, does not appear to lock the table passed as
the second argument, and so appears to be liable to fail if there is a
concurrent drop of the table. There is a (probable) case of this being
discussed at [0]. I also see various mentions of this issue in the
commit logs, mostly related to pg_dump.
Is there a reason there is no locking? Performance?
I think we have a general rule that you shouldn't be able to take
locks on relations you have no privileges for, so pg_get_expr would
need to verify privileges if it locked the rel. At least for
pg_dump's purposes, that cure would be about as bad as the disease.
What workaround should we use if there are conflicts created by
concurrent regression tests? Just move the tests around a bit until the
issue goes away?
Why would a test be applying pg_get_expr to a table it doesn't
control?
regards, tom lane
On 07.02.24 16:26, Tom Lane wrote:
What workaround should we use if there are conflicts created by
concurrent regression tests? Just move the tests around a bit until the
issue goes away?Why would a test be applying pg_get_expr to a table it doesn't
control?
I think the situation is that one test (domain) runs pg_get_expr as part
of an information_schema view, while at the same time another test
(alter_table) drops a table that the pg_get_expr is just processing.
Peter Eisentraut <peter@eisentraut.org> writes:
On 07.02.24 16:26, Tom Lane wrote:
Why would a test be applying pg_get_expr to a table it doesn't
control?
I think the situation is that one test (domain) runs pg_get_expr as part
of an information_schema view, while at the same time another test
(alter_table) drops a table that the pg_get_expr is just processing.
The test case that's failing is, IIUC,
+SELECT * FROM information_schema.domain_constraints
+ WHERE domain_name IN ('con', 'dom', 'pos_int', 'things')
+ ORDER BY constraint_name;
I see no use of pg_get_expr() in the domain_constraints view:
CREATE VIEW domain_constraints AS
SELECT CAST(current_database() AS sql_identifier) AS constraint_catalog,
CAST(rs.nspname AS sql_identifier) AS constraint_schema,
CAST(con.conname AS sql_identifier) AS constraint_name,
CAST(current_database() AS sql_identifier) AS domain_catalog,
CAST(n.nspname AS sql_identifier) AS domain_schema,
CAST(t.typname AS sql_identifier) AS domain_name,
CAST(CASE WHEN condeferrable THEN 'YES' ELSE 'NO' END
AS yes_or_no) AS is_deferrable,
CAST(CASE WHEN condeferred THEN 'YES' ELSE 'NO' END
AS yes_or_no) AS initially_deferred
FROM pg_namespace rs, pg_namespace n, pg_constraint con, pg_type t
WHERE rs.oid = con.connamespace
AND n.oid = t.typnamespace
AND t.oid = con.contypid
AND (pg_has_role(t.typowner, 'USAGE')
OR has_type_privilege(t.oid, 'USAGE'));
I'm a little suspicious that the failure is actually coming from
somewhere down inside has_type_privilege(), but I traced through
that quickly and don't see how it could reach such an error.
In any case I thought we'd hardened all those functions in 403ac226d.
So I'm still pretty mystified. Have you had any luck in making
the failure reproducible?
regards, tom lane
I wrote:
Peter Eisentraut <peter@eisentraut.org> writes:
I think the situation is that one test (domain) runs pg_get_expr as part
of an information_schema view, while at the same time another test
(alter_table) drops a table that the pg_get_expr is just processing.
The test case that's failing is, IIUC,
+SELECT * FROM information_schema.domain_constraints + WHERE domain_name IN ('con', 'dom', 'pos_int', 'things') + ORDER BY constraint_name;
Oh, scratch that: there are two confusingly lookalike queries
in the patch. The one that is failing is
SELECT * FROM information_schema.check_constraints
WHERE (constraint_schema, constraint_name)
IN (SELECT constraint_schema, constraint_name
FROM information_schema.domain_constraints
WHERE domain_name IN ('con', 'dom', 'pos_int', 'things'))
ORDER BY constraint_name;
and we have trouble because the evaluation of pg_get_expr in
check_constraints is done before the semijoin that would restrict
it to just the desired objects.
After looking at the code I'm less worried about the
permissions-checking angle than I was before, because I see
that pg_get_expr already takes a transient AccessShareLock
on the rel, down inside set_relation_column_names. This is
not ideal from a permissions standpoint perhaps, but it's
probably OK considering we've done that for a long time.
We just need to hold that lock a little while longer.
I propose the attached as a reasonably localized fix.
We could imagine a more aggressive refactoring that would
allow passing down the Relation instead of re-opening it
in set_relation_column_names, but I doubt it's worth the
trouble.
regards, tom lane