Thread: A comment in DropRole() contradicts the actual behavior

A comment in DropRole() contradicts the actual behavior

From
Alexander Lakhin
Date:
Hello,

Please look at errors, which produced by the following script, starting
from 6566133c5:
for i in `seq 100`; do (echo "CREATE USER u; DROP USER u;"); done | psql >psql-1.log 2>&1 &
for i in `seq 100`; do (echo "CREATE USER u; DROP USER u;"); done | psql >psql-2.log 2>&1 &
wait

ERROR:  could not find tuple for role 16387
ERROR:  could not find tuple for role 16390
ERROR:  could not find tuple for role 16394
...

Maybe these errors are expected, but then I'm confused by the comment in
DropRole():
         /*
          * Re-find the pg_authid tuple.
          *
          * Since we've taken a lock on the role OID, it shouldn't be possible
          * for the tuple to have been deleted -- or for that matter updated --
          * unless the user is manually modifying the system catalogs.
          */
         tuple = SearchSysCache1(AUTHOID, ObjectIdGetDatum(roleid));
         if (!HeapTupleIsValid(tuple))
             elog(ERROR, "could not find tuple for role %u", roleid);

Best regards,
Alexander



Re: A comment in DropRole() contradicts the actual behavior

From
Michael Paquier
Date:
On Thu, Feb 08, 2024 at 09:00:01AM +0300, Alexander Lakhin wrote:
> Hello,
>
> Please look at errors, which produced by the following script, starting
> from 6566133c5:
> for i in `seq 100`; do (echo "CREATE USER u; DROP USER u;"); done | psql >psql-1.log 2>&1 &
> for i in `seq 100`; do (echo "CREATE USER u; DROP USER u;"); done | psql >psql-2.log 2>&1 &
> wait
>
> ERROR:  could not find tuple for role 16387
> ERROR:  could not find tuple for role 16390
> ERROR:  could not find tuple for role 16394
> ...
>
> Maybe these errors are expected, but then I'm confused by the comment in
> DropRole():

Well, these errors should never happen, IMHO, so it seems to me that
the comment is right and that the code lacks locking, contrary to what
the comment tells.
--
Michael

Attachment

Re: A comment in DropRole() contradicts the actual behavior

From
Kyotaro Horiguchi
Date:
At Thu, 8 Feb 2024 16:39:23 +0900, Michael Paquier <michael@paquier.xyz> wrote in
> On Thu, Feb 08, 2024 at 09:00:01AM +0300, Alexander Lakhin wrote:
> > Hello,
> >
> > Please look at errors, which produced by the following script, starting
> > from 6566133c5:
> > for i in `seq 100`; do (echo "CREATE USER u; DROP USER u;"); done | psql >psql-1.log 2>&1 &
> > for i in `seq 100`; do (echo "CREATE USER u; DROP USER u;"); done | psql >psql-2.log 2>&1 &
> > wait
> >
> > ERROR:  could not find tuple for role 16387
> > ERROR:  could not find tuple for role 16390
> > ERROR:  could not find tuple for role 16394
> > ...
> >
> > Maybe these errors are expected, but then I'm confused by the comment in
> > DropRole():
>
> Well, these errors should never happen, IMHO, so it seems to me that
> the comment is right and that the code lacks locking, contrary to what
> the comment tells.

The function acquires a lock, but it does not perform an existence
check until it first attempts to fetch the tuple, believing in its
presence. However, I doubt performing an additional existence check
right after acquiring the lock is worthwhile.

By the way, I saw the following error with the provided script:

> ERROR:  duplicate key value violates unique constraint "pg_authid_rolname_index"
> DETAIL:  Key (rolname)=(u) already exists.
> STATEMENT:  CREATE USER u;

This seems to be another instance of a similar thinko.

I vaguely think that we should just regard the absence as a concurrent
drop and either adjust or remove the message, then fix the
comment. The situation is slightly different for the duplication
case. We shouldn't ignore it but rather need to adjust the error
message.  As long as these behaviors don't lead to inconsistencies.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center