A comment in DropRole() contradicts the actual behavior

Started by Alexander Lakhinalmost 2 years ago3 messages
#1Alexander Lakhin
exclusion@gmail.com

Hello,

Please look at errors, which produced by the following script, starting
from 6566133c5:
for i in `seq 100`; do (echo "CREATE USER u; DROP USER u;"); done | psql >psql-1.log 2>&1 &
for i in `seq 100`; do (echo "CREATE USER u; DROP USER u;"); done | psql >psql-2.log 2>&1 &
wait

ERROR:  could not find tuple for role 16387
ERROR:  could not find tuple for role 16390
ERROR:  could not find tuple for role 16394
...

Maybe these errors are expected, but then I'm confused by the comment in
DropRole():
        /*
         * Re-find the pg_authid tuple.
         *
         * Since we've taken a lock on the role OID, it shouldn't be possible
         * for the tuple to have been deleted -- or for that matter updated --
         * unless the user is manually modifying the system catalogs.
         */
        tuple = SearchSysCache1(AUTHOID, ObjectIdGetDatum(roleid));
        if (!HeapTupleIsValid(tuple))
            elog(ERROR, "could not find tuple for role %u", roleid);

Best regards,
Alexander

#2Michael Paquier
michael@paquier.xyz
In reply to: Alexander Lakhin (#1)
Re: A comment in DropRole() contradicts the actual behavior

On Thu, Feb 08, 2024 at 09:00:01AM +0300, Alexander Lakhin wrote:

Hello,

Please look at errors, which produced by the following script, starting
from 6566133c5:
for i in `seq 100`; do (echo "CREATE USER u; DROP USER u;"); done | psql >psql-1.log 2>&1 &
for i in `seq 100`; do (echo "CREATE USER u; DROP USER u;"); done | psql >psql-2.log 2>&1 &
wait

ERROR:  could not find tuple for role 16387
ERROR:  could not find tuple for role 16390
ERROR:  could not find tuple for role 16394
...

Maybe these errors are expected, but then I'm confused by the comment in
DropRole():

Well, these errors should never happen, IMHO, so it seems to me that
the comment is right and that the code lacks locking, contrary to what
the comment tells.
--
Michael

#3Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Michael Paquier (#2)
Re: A comment in DropRole() contradicts the actual behavior

At Thu, 8 Feb 2024 16:39:23 +0900, Michael Paquier <michael@paquier.xyz> wrote in

On Thu, Feb 08, 2024 at 09:00:01AM +0300, Alexander Lakhin wrote:

Hello,

Please look at errors, which produced by the following script, starting
from 6566133c5:
for i in `seq 100`; do (echo "CREATE USER u; DROP USER u;"); done | psql >psql-1.log 2>&1 &
for i in `seq 100`; do (echo "CREATE USER u; DROP USER u;"); done | psql >psql-2.log 2>&1 &
wait

ERROR:  could not find tuple for role 16387
ERROR:  could not find tuple for role 16390
ERROR:  could not find tuple for role 16394
...

Maybe these errors are expected, but then I'm confused by the comment in
DropRole():

Well, these errors should never happen, IMHO, so it seems to me that
the comment is right and that the code lacks locking, contrary to what
the comment tells.

The function acquires a lock, but it does not perform an existence
check until it first attempts to fetch the tuple, believing in its
presence. However, I doubt performing an additional existence check
right after acquiring the lock is worthwhile.

By the way, I saw the following error with the provided script:

ERROR: duplicate key value violates unique constraint "pg_authid_rolname_index"
DETAIL: Key (rolname)=(u) already exists.
STATEMENT: CREATE USER u;

This seems to be another instance of a similar thinko.

I vaguely think that we should just regard the absence as a concurrent
drop and either adjust or remove the message, then fix the
comment. The situation is slightly different for the duplication
case. We shouldn't ignore it but rather need to adjust the error
message. As long as these behaviors don't lead to inconsistencies.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center