Thread: Fixing flat user/group files at database startup

From
Tom Lane
Date:
Michael Klatt reported here:
http://archives.postgresql.org/pgsql-admin/2005-02/msg00031.php
that we have problems because the flat files global/pg_pwd
and global/pg_group aren't rebuilt following WAL recovery.
This has in fact been a bug since we created WAL, although it's
certainly far worse in the context of PITR because the window in
which the files can get out of sync is far wider.

I've been investigating fixing this and it seems like a bit of a mess,
because reading pg_shadow and pg_group in the normal way pretty much
requires having the full backend environment, which is most definitely
not available in the bootstrap-like context that WAL recovery runs in.
Short of writing another kluge like GetRawDatabaseInfo(), it seems like
the only clean solution is to have the postmaster launch a special
backend just after the startup process completes.  Even that isn't super
clean, because of the problem of choosing what database the special
backend should attach to.  I don't much like the idea that the database
will fail to start up if template1 isn't there :-(  Maybe we could hack
it to connect to template0 instead, but that's only marginally better.
(Yes, Virginia, you can drop template0 too.)

One idea I'm toying with is to try to make something like
GetRawDatabaseInfo but not as klugy.  The principal reason that
GetRawDatabaseInfo is an intolerable hack is that it can't verify the
commit states of transactions.  Now that limitation was written into it
back when pg_log was an ordinary relation and we didn't have any special
infrastructure for getting at it (so you needed most of the backend up
before you could look at pg_log).  I think that the clog/subtrans/slru
mechanisms might work well enough in the startup environment to be used
to examine transaction commit results.  But going in this direction
would require writing a fair amount of new code, probably too much to
consider backpatching into 8.0.*.

Anybody see a cleaner solution?  Or at least other ideas to consider?
        regards, tom lane
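
A minimal sketch of the clog-style commit check floated above, assuming a
simplified in-memory status array with two bits per transaction.  The names
xact_status, xact_set_status, RawTupleHeader and raw_tuple_is_live are
hypothetical illustrations, not backend functions; the real clog/slru code
also deals with paging, locking and subtransactions, none of which is
modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;
#define INVALID_XID 0u

/* Hypothetical status codes, two bits per transaction (four per byte). */
typedef enum
{
    XACT_IN_PROGRESS = 0,
    XACT_COMMITTED = 1,
    XACT_ABORTED = 2
} XactStatus;

/* Record the outcome of transaction xid in the packed status array. */
void
xact_set_status(uint8_t *clog, size_t clog_bytes, Xid xid, XactStatus st)
{
    size_t      byteno = xid / 4;
    unsigned    shift = (xid % 4) * 2;

    assert(byteno < clog_bytes);
    clog[byteno] = (uint8_t) ((clog[byteno] & ~(3u << shift)) |
                              ((unsigned) st << shift));
}

/* Look up the outcome of xid; anything out of range counts as unknown. */
XactStatus
xact_status(const uint8_t *clog, size_t clog_bytes, Xid xid)
{
    size_t      byteno = xid / 4;
    unsigned    shift = (xid % 4) * 2;

    if (byteno >= clog_bytes)
        return XACT_IN_PROGRESS;
    return (XactStatus) ((clog[byteno] >> shift) & 0x3);
}

/* Only the header fields this sketch needs from a raw heap tuple. */
typedef struct
{
    Xid         xmin;           /* inserting transaction */
    Xid         xmax;           /* deleting transaction, or INVALID_XID */
} RawTupleHeader;

/*
 * A raw tuple counts as live if its inserter committed and no committed
 * transaction has deleted it.
 */
bool
raw_tuple_is_live(const uint8_t *clog, size_t clog_bytes,
                  const RawTupleHeader *tup)
{
    if (xact_status(clog, clog_bytes, tup->xmin) != XACT_COMMITTED)
        return false;
    if (tup->xmax != INVALID_XID &&
        xact_status(clog, clog_bytes, tup->xmax) == XACT_COMMITTED)
        return false;
    return true;
}
```

The sketch treats anything not marked committed as dead, on the assumption
that no transaction can still be running once recovery has finished.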


Re: Fixing flat user/group files at database startup

From
Alvaro Herrera
Date:
Sorry, hit the wrong key.

----- Forwarded message from Alvaro Herrera <alvherre@dcc.uchile.cl> -----

Date: Fri, 4 Feb 2005 22:39:11 -0300
From: Alvaro Herrera <alvherre@dcc.uchile.cl>
To: Tom Lane <tgl@sss.pgh.pa.us>
Subject: Re: [HACKERS] Fixing flat user/group files at database startup

On Fri, Feb 04, 2005 at 03:16:33PM -0500, Tom Lane wrote:
> Michael Klatt reported here:
> http://archives.postgresql.org/pgsql-admin/2005-02/msg00031.php
> that we have problems because the flat files global/pg_pwd
> and global/pg_group aren't rebuilt following WAL recovery.
> This has in fact been a bug since we created WAL, although it's
> certainly far worse in the context of PITR because the window in
> which the files can get out of sync is far wider.

I thought that maybe we could reconstruct the file using the previous
file and the WAL entry, but then I noticed that we don't emit a special
WAL entry for user create/delete/update, so this would also require a
lot of new code.

> One idea I'm toying with is to try to make something like
> GetRawDatabaseInfo but not as klugy.  The principal reason that
> GetRawDatabaseInfo is an intolerable hack is that it can't verify the
> commit states of transactions.  Now that limitation was written into it
> back when pg_log was an ordinary relation and we didn't have any special
> infrastructure for getting at it (so you needed most of the backend up
> before you could look at pg_log).  I think that the clog/subtrans/slru
> mechanisms might work well enough in the startup environment to be used
> to examine transaction commit results.

If you make something similar to GetRawDatabaseInfo, then you don't need
the plain files at all, do you?  By doing that, a lot of code could go
away.

> But going in this direction would require writing a fair amount of new
> code, probably too much to consider backpatching into 8.0.*.

If you don't backport this into 8.0, then 8.0 will be broken forever?
The special-backend solution does not really seem a lot better, and it
doesn't seem less code either.

-- 
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"To have more, one must desire less"

----- End forwarded message -----
-- 
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
The web brings people together because no matter what kind of sexual mutant
you are, you have millions of potential partners. Type "find people who have
sex with deer that are on fire", and the computer will say "specify the type
of deer"
(Jason Alexander)


Re: Fixing flat user/group files at database startup

From
Tom Lane
Date:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> On Fri, Feb 04, 2005 at 03:16:33PM -0500, Tom Lane wrote:
>> Michael Klatt reported here:
>> http://archives.postgresql.org/pgsql-admin/2005-02/msg00031.php
>> that we have problems because the flat files global/pg_pwd
>> and global/pg_group aren't rebuilt following WAL recovery.

> I thought that maybe we could reconstruct the file using the previous
> file and the WAL entry, but then I noticed that we don't emit a special
> WAL entry for user create/delete/update, so this would also require a
> lot of new code.

What of mods that later get rolled back?  Seems pretty messy to do it
that way, even if there were adequate support in the WAL mechanisms.

>> One idea I'm toying with is to try to make something like
>> GetRawDatabaseInfo but not as klugy.

> If you make something similar to GetRawDatabaseInfo, then you don't need
> the plain files at all, do you?  By doing that, a lot of code could go
> away.

Not unless we're willing to let the postmaster execute the
GetRawDatabaseInfo substitute, which I think is a nonstarter on
reliability grounds.

I'm actually thinking more in the other direction: get rid of
GetRawDatabaseInfo in its present form, instead relying on a flat-file
representation of pg_database to let new backends find out the OID and
tablespace of their target database.  That would let us fix many of the
weird corner cases in which GetRawDatabaseInfo fails (see the knee-jerk
recommendation to "VACUUM pg_database and CHECKPOINT" that we make any
time someone reports they can't log into one specific database).  The
trick is to have infrastructure that makes the flat files more reliable
than they are now.
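
A rough sketch of what such a flat-file lookup could look like, purely for
illustration.  The line format ("dbname" followed by the database OID and
the tablespace OID) and the function name flat_db_lookup are assumptions
made for this example, not an existing file format or backend routine.  It
scans an in-memory copy of the file contents so that it needs no catalog
access at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

typedef unsigned int Oid;

/*
 * Search flat-file contents for dbname.  Each line is assumed to look like
 *     "dbname" <database oid> <tablespace oid>
 * On a match, store both OIDs and return true; otherwise return false.
 */
bool
flat_db_lookup(const char *contents, const char *dbname,
               Oid *db_oid, Oid *ts_oid)
{
    const char *line = contents;

    assert(contents != NULL && dbname != NULL);

    while (line != NULL && *line != '\0')
    {
        char        name[64];
        unsigned int oid;
        unsigned int ts;

        /* %63[^"] reads the quoted name without overrunning the buffer */
        if (sscanf(line, " \"%63[^\"]\" %u %u", name, &oid, &ts) == 3 &&
            strcmp(name, dbname) == 0)
        {
            *db_oid = oid;
            *ts_oid = ts;
            return true;
        }

        /* advance to the start of the next line, if there is one */
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return false;
}
```

A new backend would call something like this before it has any relcache or
transaction machinery available, which is the whole attraction; the cost is
that the file has to be kept reliably in sync with pg_database.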

>> But going in this direction would require writing a fair amount of new
>> code, probably too much to consider backpatching into 8.0.*.

> If you don't backport this into 8.0, then 8.0 will be broken forever?

[ shrug... ]  This bug has been there since 7.1; it just has a greater
chance of biting someone in 8.0 than it did before.  If we have to
document "do something to force a pg_shadow update" as part of the PITR
recovery process in 8.0.*, well, it's ugly but IMHO it beats putting a
large amount of poorly tested code into a dot-release.

I've been fooling with trying to make a small kluge that would be
reasonable to back-port, but not having a lot of luck so far.  What I've
got at the moment is this bit of code to be called at the end of the 
BS_XLOG_STARTUP case in bootstrap.c:

/*
 * This routine is called once during database startup, after completing
 * WAL replay if needed.  Its purpose is to sync the flat files with the
 * current state of the pg_shadow and pg_group tables.  This is particularly
 * important during PITR operation, since the flat files will come from the
 * base backup which may be far out of sync with the current state.
 *
 * In theory we could skip this if no WAL replay occurred, but it seems
 * safest to just do it always.
 */
void
BuildUserGroupFiles(void)
{
    RelFileNode rnode;
    Relation    rel;

    /* use a fake relcache similar to WAL recovery */
    XLogInitRelationCache();

    /* need to have a resource owner too to keep heapscan machinery happy */
    CurrentResourceOwner = ResourceOwnerCreate(NULL, "BuildUserGroupFiles");

    /* hard-wired path to pg_shadow */
    rnode.spcNode = GLOBALTABLESPACE_OID;
    rnode.dbNode = 0;
    rnode.relNode = RelOid_pg_shadow;

    rel = XLogOpenRelation(true, 0, rnode);

    write_user_file(rel);

    /* hard-wired path to pg_group */
    rnode.spcNode = GLOBALTABLESPACE_OID;
    rnode.dbNode = 0;
    rnode.relNode = RelOid_pg_group;

    rel = XLogOpenRelation(true, 0, rnode);

    write_group_file(rel);

    XLogCloseRelationCache();
}

However this still dumps core because write_user_file and
write_group_file expect to be able to use heap_getattr, which needs to
have a tupdesc, which the phony relcache entries made by
XLogOpenRelation haven't got.  We could fix that by consing up tupdescs
from fixed data in pg_attribute.h (a la the formrdesc'd reldescs in
relcache.c) but that's pretty messy.  I gave up after thinking ahead
and realizing that it will still fail miserably on out-of-line toasted
datums, which write_group_file at least has got to be able to cope with
--- the grolist array might be large enough to require toasting.  The
WAL recovery environment has definitely not got the infrastructure
needed by the tuple toaster.

What I am thinking is that we have to punt on this problem until the
proposed pg_role catalog is in place.  With the grolist array
representation of group membership replaced by a fixed-width
pg_role_members catalog, there would be no need to deal with any
potentially-toasted columns while extracting the data needed for the
flat files.
        regards, tom lane
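
To illustrate why a fixed-width layout would sidestep the toast problem,
here is a sketch that pulls group membership straight out of a raw buffer of
fixed-size records.  The struct RoleMemberRow, its two fields and the
function collect_members are hypothetical; the pg_role_members catalog is
only a proposal at this point and its real definition may look nothing like
this.  Real heap pages also carry page and tuple headers, which are left out
here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-width membership row: no varlena fields, so no toast. */
typedef struct
{
    uint32_t    roleid;         /* the group-like role */
    uint32_t    member;         /* a role that belongs to it */
} RoleMemberRow;

/*
 * Copy up to outmax members of roleid from a raw buffer of back-to-back
 * rows into out[].  Returns the number of members stored.
 */
size_t
collect_members(const unsigned char *buf, size_t buflen, uint32_t roleid,
                uint32_t *out, size_t outmax)
{
    size_t      n = 0;

    assert(buf != NULL || buflen == 0);

    for (size_t off = 0; off + sizeof(RoleMemberRow) <= buflen;
         off += sizeof(RoleMemberRow))
    {
        RoleMemberRow row;

        /* memcpy avoids any alignment assumptions about buf */
        memcpy(&row, buf + off, sizeof row);
        if (row.roleid == roleid && n < outmax)
            out[n++] = row.member;
    }
    return n;
}
```

Every row has the same size and is stored inline, so the scan needs neither
a tuple descriptor nor the tuple toaster, which is exactly what the grolist
array cannot promise.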


Re: Fixing flat user/group files at database startup

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> We can't build the files very easily during WAL recovery, but
> what about if we compare the files to the database after the normal
> backend startup?  If they're different, regenerate the files.

This assumes that you can get in in the first place, which is not a good
assumption if the flat password file is missing all your current users
and/or passwords.
        regards, tom lane


Re: Fixing flat user/group files at database startup

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> What I am thinking is that we have to punt on this problem until the
> proposed pg_role catalog is in place.  With the grolist array
> representation of group membership replaced by a fixed-width
> pg_role_members catalog, there would be no need to deal with any
> potentially-toasted columns while extracting the data needed for the
> flat files.

While that sounds alright to me, I was wondering if it'd be possible to
take a slightly different approach to it so we could solve it for
8.0.x...  We can't build the files very easily during WAL recovery, but
what about if we compare the files to the database after the normal
backend startup?  If they're different, regenerate the files.  Maybe I'm
missing something here, but that seems pretty straightforward to me...
Stephen

Re: Fixing flat user/group files at database startup

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > We can't build the files very easily during WAL recovery, but
> > what about if we compare the files to the database after the normal
> > backend startup?  If they're different, regenerate the files.
>
> This assumes that you can get in in the first place, which is not a good
> assumption if the flat password file is missing all your current users
> and/or passwords.

I was thinking of this as a general 'check that the world is sane'
part of backend startup that would be automated and happen every
time, not something an admin would have to kick off or anything.  The
backend looks through the WAL files and whatnot during startup too, to
check if there was a crash or something...

I guess I'm confused by 'who' needs to 'get in' to have a bit of code
run at the very end of the backend startup.  Apparently I'm somewhat
naive in that area.
Stephen

Re: Fixing flat user/group files at database startup

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> I guess I'm confused by 'who' needs to 'get in' to have a bit of code
> run at the very end of the backend startup.  Apparently I'm somewhat
> naive in that area.

The point is that the postmaster is going to use the flat files to
check whether you're allowed to log in in the first place.  It does
you little good to have a fixup routine in place in backend startup
if you're not allowed to get that far.

(Indeed, one of the points on my personal TODO list entry for this
problem is to make sure the postmaster doesn't try to use the flat
files before we know they've been brought up to date.  Right at the
moment it loads them before even launching the WAL replay process...)
        regards, tom lane