Re: Use of 'cp -r' in CREATE DATABASE - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Use of 'cp -r' in CREATE DATABASE
Date
Msg-id 15401.1072715936@sss.pgh.pa.us
Whole thread Raw
In response to Re: Use of 'cp -r' in CREATE DATABASE  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: Use of 'cp -r' in CREATE DATABASE
List pgsql-hackers
Andrew Dunstan <andrew@dunslane.net> writes:
> How about (even quicker and dirtier) putting a limited loop (say 5 
> iterations?) with a small delay in it around the check for whether or 
> not we are the only connection? Createdb is hardly a time critical 
> operation, and taking a few seconds extra in the worst case wouldn't 
> matter that much, would it?

Hmm, that might help, though as you say it's only a 90% solution.

> my mtime suggestion could take care of the case where the template db is 
> changed after the connection check is performed.

I don't really trust the mtime idea.  In the first place, mtime is only
good to a second or two on most platforms, which may not be enough
resolution.  In the second place, mtime isn't necessarily updated on
every single write.  If you are running on NFS (not that we've ever
recommended that, but people do it) you also have to worry about time
skew between your clock and the file server's clock.

It occurs to me that it wouldn't really be that hard to develop a
locking-based solution, if we consider that the purpose of the lock
is to keep new connections out of the template database.  Let's assume
we define a lock tag that represents a whole database (dbId = db's OID,
relId = zero would probably do).  Modify the backend so that every
session takes a share-lock on this tag during its startup sequence
(early in ReverifyMyDatabase() would likely be a good place).  The lock
does not need to be held over the life of the session, just acquired
before any database write can possibly occur in the session.  That means
we don't need any complicated new support in the lock manager.

Now, we can make CREATE DATABASE (and DROP DATABASE) acquire exclusive
lock on this lock before they look for extant sessions attached to the
target database.  That prevents any new sessions from getting in.  In
the CREATE case, incoming sessions for the template database are simply
delayed until the CREATE completes.  In the DROP case, incoming sessions
for the doomed database are guaranteed to notice that their database is
gone, because they won't be able to look for it in pg_database until
after the DROP completes.  (I think we currently guarantee the latter
by means of DROP taking an exclusive lock on pg_database, but a lock
that didn't lock out unrelated operations would surely be nicer.)

We could combine this with your idea of delaying: after acquiring the
exclusive lock, CREATE can delay to see if existing sessions connected
to the template database quit soon.  Since the exclusive lock is
preventing new sessions from coming in, this delay does not need to be
very long (probably the average length of a CREATE DATABASE operation
would be enough).

Thoughts?
        regards, tom lane


pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: Use of 'cp -r' in CREATE DATABASE
Next
From: Andrew Dunstan
Date:
Subject: Re: Use of 'cp -r' in CREATE DATABASE