Thread: Postgre7.0.2 drop user bug

Postgre7.0.2 drop user bug

From
Matthew
Date:
> I think we have found a bug in postgresql 7.0.2.  If I'm right then a fix
> for this should probably be added to 7.0.3 also.  Anyway without further
> adieu:
> 
> I have attached detail of my session at the end of this email, but the
> summary is as follows.
> 
> If I run the following commands my pg_group file will be corrupted and up
> to 2Gig in size. (FYI - I am starting this after a fresh initdb, and the
> pg_group file
> starts at 0 bytes)
> 
> create user foo1;
> create user foo2;
> create user foo3;
> create group test1;
> alter group test1 add user foo1,foo2,foo3;
>     (pg_group is 8192 bytes)
> drop user foo1,foo2;
>     (now my pg_group file is 24567 bytes)
> drop user foo3;
> 
> (now my psql is hanging, control-c does nothing to kill the query.  Also
> my pg_group file is now growing rapidly till it fills the disk or I kill
> -9 postmaster, which ever comes first.  Not good!)
> 
> I think the problem is with the drop user command (obviously).  After the
> alter group command all three user ids are in the pg_group table.  After I
> drop two of the users, both users are gone from the pg_user table, but the
> second user is still in pg_group.  So we now have a reference to a user
> that does not exist.  At this point I think the system is in trouble, and
> the last drop user command just seals the coffin.  If I do the same script
> but drop the users one at a time, everything is fine.  So apparently while
> the drop user command correctly drops all users from pg_user, it only
> checks pg_group for the first user in the list.
> 
> We found this problem on one of our soon to be production database
> servers, fortunately prior to install.  The only way to get the server
> back
> in to a fully functional state was to perform an initdb (I stopped
> postgre, then deleted the data directory then restarted using
> /etc/init.d/postgresql start  which performs initdb)
> 
> Anyway, any comments?  Can anyone else repeat this? I hope this is easy to
> fix.  I guess the quick fix is to disallow multiple users to be specified 
> in the drop user command.
> 
> Thanks much,
> 
> Matt O'Connor
> 
> 
> p.s..  as promised here is some detail from my session.
> I have tested this on both RH7 with pg7.0.2 as supplied by RH, and with
> RH6.2 with 7.0.2 rpms from Lamar.
> 
> 
> Red Hat Linux release 7.0 (Guinness)
> select version();
>                            version
> -------------------------------------------------------------
>  PostgreSQL 7.0.2 on i686-pc-linux-gnu, compiled by gcc 2.96
> (1 row)
> 
> ls -l ./data/pg_group
> -rw-------    1 postgres postgres        0 Oct 17 20:26 ./data/pg_group
> create user foo1;
> CREATE USER
> create user foo2;
> CREATE USER
> create user foo3;
> CREATE USER
> select * from pg_user;
>  usename  | usesysid | usecreatedb | usetrace | usesuper | usecatupd |
> passwd  | valuntil
> ----------+----------+-------------+----------+----------+-----------+----
> ------+----------
>  postgres |       26 | t           | t        | t        | t         |
> ******** |
>  foo1     |       27 | f           | f        | f        | f         |
> ******** |
>  foo2     |       28 | f           | f        | f        | f         |
> ******** |
>  foo3     |       29 | f           | f        | f        | f         |
> ******** |
> (4 rows)
> 
> create group test1;
> CREATE GROUP
> select * from pg_group;
>  groname | grosysid | grolist
> ---------+----------+---------
>  test1   |        1 |
> (1 row)
>  
> alter group test1 add user foo1,foo2,foo3;
> ALTER GROUP
> select * from pg_user;
>  usename  | usesysid | usecreatedb | usetrace | usesuper | usecatupd |
> passwd  | valuntil
> ----------+----------+-------------+----------+----------+-----------+----
> ------+----------
>  postgres |       26 | t           | t        | t        | t         |
> ******** |
>  foo1     |       27 | f           | f        | f        | f         |
> ******** |
>  foo2     |       28 | f           | f        | f        | f         |
> ******** |
>  foo3     |       29 | f           | f        | f        | f         |
> ******** |
> (4 rows)
>  
> select * from pg_group;
>  groname | grosysid |  grolist
> ---------+----------+------------
>  test1   |        1 | {27,28,29}
> (1 row)
>  
> ls -l ./data/pg_group
> -rw-------    1 postgres postgres     8192 Oct 17 20:27 ./data/pg_group
> drop user foo1,foo2;
> DROP USER
> ls -l ./data/pg_group
> -rw-------    1 postgres postgres    24576 Oct 17 20:27 ./data/pg_group
> select * from pg_user;
>  usename  | usesysid | usecreatedb | usetrace | usesuper | usecatupd |
> passwd  | valuntil
> ----------+----------+-------------+----------+----------+-----------+----
> ------+----------
>  postgres |       26 | t           | t        | t        | t         |
> ******** |
>  foo3     |       29 | f           | f        | f        | f         |
> ******** |
> (2 rows)
>  
> select * from pg_group;
>  groname | grosysid | grolist
> ---------+----------+---------
>  test1   |        1 | {29,27}
> (1 row)
>  
> drop user foo3;
> Cancel request sent
> Terminated
> bash-2.04$  ls -l ./data/pg_group
-rw-------    1 postgres postgres 438386688 Oct 17 20:27 ./data/pg_group
bash-2.04$

(as you can see, psql was not responding after the last drop user, I hit
control-c and is said Cancel request sent, but psql never returned to a
prompt, so I had to kill -9 the postmaster.


Re: Postgre7.0.2 drop user bug

From
Tom Lane
Date:
Matthew <matt@ctlno.com> writes:
>> I think we have found a bug in postgresql 7.0.2.

Bug confirmed --- on a compilation with Asserts enabled, this sequence
causes an assert failure during update of pg_group_name_index in the
DROP USER command, in both 7.0.* and current sources.  I ran out of
steam before finding the cause, but something's pretty busted here...
        regards, tom lane


Re: Postgre7.0.2 drop user bug

From
Tom Lane
Date:
Matthew <matt@ctlno.com> writes:
>> Anyway, any comments?  Can anyone else repeat this? I hope this is easy to
>> fix.  I guess the quick fix is to disallow multiple users to be specified 
>> in the drop user command.

The correct fix is CommandCounterIncrement() in the DROP USER loop,
so that later iterations can see the changes made by prior iterations.
Without, death and destruction ensue if any of the users are in the
same groups, because the later AlterGroup calls fail.

Fixed in current and back-patched for 7.0.3.
        regards, tom lane


RE: Postgre7.0.2 drop user bug

From
Matthew
Date:
>From: Tom Lane
>Fixed in current and back-patched for 7.0.3.
>
>            regards, tom lane

Should I check out the current pre 7.0.3 CVS and test?  If so I think you
gave the CVS information in a few previous emails on the hackers list so I
will look there for it.

Thanks for the quick response.

Matt O'Connor


RE: Postgre7.0.2 drop user bug

From
Matthew
Date:
>-----Original Message-----
>From: Tom Lane

>The correct fix is CommandCounterIncrement() in the DROP USER loop,
>so that later iterations can see the changes made by prior iterations.
>
>            regards, tom lane

Since postgre now suppport referential integrity and cascading deletes,
wouldn't it make more sense to use that code to manage the relationship
between pg_user and pg_group (and probably a wealth of other system tables),
rather then having to write specific code to manage every relationship
between system tables, or are those types of constraints just not applicable
to system tables?


Re: Postgre7.0.2 drop user bug

From
Tom Lane
Date:
Matthew <matt@ctlno.com> writes:
> Should I check out the current pre 7.0.3 CVS and test?

Sure, the more the merrier ...
        regards, tom lane


Re: Postgre7.0.2 drop user bug

From
Tom Lane
Date:
Matthew <matt@ctlno.com> writes:
>> The correct fix is CommandCounterIncrement() in the DROP USER loop,
>> so that later iterations can see the changes made by prior iterations.

> Since postgre now suppport referential integrity and cascading deletes,
> wouldn't it make more sense to use that code to manage the relationship
> between pg_user and pg_group (and probably a wealth of other system tables),

Dunno if it's worth the trouble to change.  Certainly this particular
bug would still be a bug, whether the cascaded deletion is done "by hand"
or not. 
        regards, tom lane