Thread: Use of 'cp -r' in CREATE DATABASE

Use of 'cp -r' in CREATE DATABASE

From
Bruce Momjian
Date:
Our dbcommands.c has for create database:
   snprintf(buf, sizeof(buf), "cp -r '%s' '%s'", src_loc, target_dir);

but my BSD/OS manual only documents 'cp -R' and mentions:
    Historic versions of the cp utility had a -r option.  This implementation    supports that option, however, its use
isstrongly discouraged, as it    does not correctly copy special files, symbolic links or fifo's.
 

I think we should switch to -R in our code.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Use of 'cp -r' in CREATE DATABASE

From
Alvaro Herrera
Date:
On Thu, Dec 11, 2003 at 06:36:05PM -0500, Bruce Momjian wrote:
> Our dbcommands.c has for create database:
> 
>     snprintf(buf, sizeof(buf), "cp -r '%s' '%s'", src_loc, target_dir);
> 
[...]
> 
> I think we should switch to -R in our code.

But you will have to write special code for Win32, won't you?
Maybe it would be better to avoid using system commands
altogether and copy the whole thing using syscalls ...

-- 
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"Use it up, wear it out, make it do, or do without"


Re: Use of 'cp -r' in CREATE DATABASE

From
Philip Yarra
Date:
On Fri, 12 Dec 2003 10:36 am, Bruce Momjian wrote:
> I think we should switch to -R in our code.

Both -r and -R are supported on Linux (fileutils 4.1), Tru64  v4.0 and Solaris
(at least as far back as 2.6) so no complaints here.

Regards, Philip.


Re: Use of 'cp -r' in CREATE DATABASE

From
"Nigel J. Andrews"
Date:
On Thu, 11 Dec 2003, Alvaro Herrera wrote:

> On Thu, Dec 11, 2003 at 06:36:05PM -0500, Bruce Momjian wrote:
> > Our dbcommands.c has for create database:
> > 
> >     snprintf(buf, sizeof(buf), "cp -r '%s' '%s'", src_loc, target_dir);
> > 
> [...]
> > 
> > I think we should switch to -R in our code.
> 
> But you will have to write special code for Win32, won't you?
> Maybe it would be better to avoid using system commands
> altogether and copy the whole thing using syscalls ...

That was my immediate thought. Unfortunately that means reinventing the
wheel; or grabbing it from BSD or somewhere and distributing it with
postgresql.


-- 
Nigel J. Andrews



Re: Use of 'cp -r' in CREATE DATABASE

From
Bruce Momjian
Date:
Nigel J. Andrews wrote:
> On Thu, 11 Dec 2003, Alvaro Herrera wrote:
> 
> > On Thu, Dec 11, 2003 at 06:36:05PM -0500, Bruce Momjian wrote:
> > > Our dbcommands.c has for create database:
> > > 
> > >     snprintf(buf, sizeof(buf), "cp -r '%s' '%s'", src_loc, target_dir);
> > > 
> > [...]
> > > 
> > > I think we should switch to -R in our code.
> > 
> > But you will have to write special code for Win32, won't you?
> > Maybe it would be better to avoid using system commands
> > altogether and copy the whole thing using syscalls ...
> 
> That was my immediate thought. Unfortunately that means reinventing the
> wheel; or grabbing it from BSD or somewhere and distributing it with
> postgresql.

We already have in dbcommands.c:#ifndef WIN32    snprintf(buf, sizeof(buf), "cp -r '%s' '%s'", src_loc, target_dir);
if(system(buf) != 0)    {        if (remove_dbdirs(nominal_loc, alt_loc))            ereport(ERROR,
(errmsg("couldnot initialize database directory"),                     errdetail("Failing system command was: %s",
buf),                    errhint("Look in the postmaster's stderr log for more information.")));        else
ereport(ERROR,                   (errmsg("could not initialize database directory; delete failed as well"),
       errdetail("Failing system command was: %s", buf),                     errhint("Look in the postmaster's stderr
logfor more information.")));    }#else   /* WIN32 */    if (copydir(src_loc, target_dir) != 0)    {        /* copydir
shouldalready have given details of its troubles */        if (remove_dbdirs(nominal_loc, alt_loc))
ereport(ERROR,                   (errmsg("could not initialize database directory")));        else
ereport(ERROR,                   (errmsg("could not initialize database directory; delete failed as well")));
}#endif /* WIN32 */
 

so Win32 is already handled by copydir.  One reason we didn't implement
this in C is because there might be some OS-specific handling for copy. 
copydir() is in port/copydir.c and is for Win32 alone.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Use of 'cp -r' in CREATE DATABASE

From
"Andrew Dunstan"
Date:
Nigel J. Andrews said:
> On Thu, 11 Dec 2003, Alvaro Herrera wrote:
>
>> On Thu, Dec 11, 2003 at 06:36:05PM -0500, Bruce Momjian wrote:
>> > Our dbcommands.c has for create database:
>> >
>> >     snprintf(buf, sizeof(buf), "cp -r '%s' '%s'", src_loc,
>> >     target_dir);
>> >
>> [...]
>> >
>> > I think we should switch to -R in our code.
>>
>> But you will have to write special code for Win32, won't you?
>> Maybe it would be better to avoid using system commands
>> altogether and copy the whole thing using syscalls ...
>
> That was my immediate thought. Unfortunately that means reinventing the
> wheel; or grabbing it from BSD or somewhere and distributing it with
> postgresql.
>

We need a consistent policy about it, I think. I grabbed some code for a
recursive mkdir from NetBSD, and it is in initdb.c. But I also wrote a
recursive rm, and Bruce replaced it with calls to the native utilities
using system(), for understandable reasons. Maybe we need a small,
portable, utility library. Or maybe just relying on system utilities is
acceptable.

cheers

andrew




Re: Use of 'cp -r' in CREATE DATABASE

From
Bruce Momjian
Date:
Andrew Dunstan wrote:
> Nigel J. Andrews said:
> > On Thu, 11 Dec 2003, Alvaro Herrera wrote:
> >
> >> On Thu, Dec 11, 2003 at 06:36:05PM -0500, Bruce Momjian wrote:
> >> > Our dbcommands.c has for create database:
> >> >
> >> >     snprintf(buf, sizeof(buf), "cp -r '%s' '%s'", src_loc,
> >> >     target_dir);
> >> >
> >> [...]
> >> >
> >> > I think we should switch to -R in our code.
> >>
> >> But you will have to write special code for Win32, won't you?
> >> Maybe it would be better to avoid using system commands
> >> altogether and copy the whole thing using syscalls ...
> >
> > That was my immediate thought. Unfortunately that means reinventing the
> > wheel; or grabbing it from BSD or somewhere and distributing it with
> > postgresql.
> >
> 
> We need a consistent policy about it, I think. I grabbed some code for a
> recursive mkdir from NetBSD, and it is in initdb.c. But I also wrote a
> recursive rm, and Bruce replaced it with calls to the native utilities
> using system(), for understandable reasons. Maybe we need a small,
> portable, utility library. Or maybe just relying on system utilities is
> acceptable.

From my prespective, the native utilities are better if we can invoke
them portabily because they can handle any OS-specific issues.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Use of 'cp -r' in CREATE DATABASE

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> but my BSD/OS manual only documents 'cp -R' and mentions:
> I think we should switch to -R in our code.

And break the code on who knows how many other systems?  No thanks.

If we want to do anything at all with this code, we should eliminate the
use of system("cp") entirely in favor of doing the recursive copy logic
ourselves.  We already have the beginnings of same in the Windows port,
and I think we'll be forced down that path anyway for tablespaces.
        regards, tom lane


Re: Use of 'cp -r' in CREATE DATABASE

From
Andrew Dunstan
Date:
Tom Lane wrote:

>Bruce Momjian <pgman@candle.pha.pa.us> writes:
>  
>
>>but my BSD/OS manual only documents 'cp -R' and mentions:
>>I think we should switch to -R in our code.
>>    
>>
>
>And break the code on who knows how many other systems?  No thanks.
>
>If we want to do anything at all with this code, we should eliminate the
>use of system("cp") entirely in favor of doing the recursive copy logic
>ourselves.  We already have the beginnings of same in the Windows port,
>and I think we'll be forced down that path anyway for tablespaces.
>
>  
>

That seems cleaner to me.

IIRC we don't copy anything but plain files and directories - no special 
files, symlinks or fifos, so the -R/-r differences shouldn't affect us 
anyway, should they? Also, that should make the implementation of an 
internal recursive copy much simpler - far fewer cases to consider.

cheers

andrew



Re: Use of 'cp -r' in CREATE DATABASE

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> IIRC we don't copy anything but plain files and directories - no special 
> files, symlinks or fifos, so the -R/-r differences shouldn't affect us 
> anyway, should they? Also, that should make the implementation of an 
> internal recursive copy much simpler - far fewer cases to consider.

In the ordinary case, yes.  There could perhaps be hand-created symlinks
in the source directory, but I think we would actually prefer that the
copy be stupid about such things (copy the referenced file rather than
duplicating the symlink).  Special files would be a reason to error out.

Also, I'm not sure that there's any good reason to recurse into
subdirectories.  The only subdirectory a database dir could have at
present is the temp-file one, and we'd really prefer that that *not* be
copied at all.

A final point is that implementing CREATE DATABASE via "cp -r" is and
always has been fundamentally broken anyway, because of the lack of
interlocking against other backends changing the source database.
We have a very half-baked defense against that (erroring out if anyone
else is connected to the source DB at the start of the copy) which
I would dearly love to get rid of.  With a file-by-file copy, it might
be possible to do better.  (I'm wondering if there's any way to take
ShareLocks on the individual tables we are copying --- if we could
figure out their OIDs we could do this, but relfilenode is not OID.)

These considerations might change somewhat when we get around to
implementing tablespaces, but I think it's likely that that will
make the need for a custom copy implementation greater, not less.
        regards, tom lane


Re: Use of 'cp -r' in CREATE DATABASE

From
Bruce Momjian
Date:
I added this comment to the code in case we don't fix cp -r:
   /* We might need to use cp -R one day for portability */

---------------------------------------------------------------------------

Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
> > IIRC we don't copy anything but plain files and directories - no special 
> > files, symlinks or fifos, so the -R/-r differences shouldn't affect us 
> > anyway, should they? Also, that should make the implementation of an 
> > internal recursive copy much simpler - far fewer cases to consider.
> 
> In the ordinary case, yes.  There could perhaps be hand-created symlinks
> in the source directory, but I think we would actually prefer that the
> copy be stupid about such things (copy the referenced file rather than
> duplicating the symlink).  Special files would be a reason to error out.
> 
> Also, I'm not sure that there's any good reason to recurse into
> subdirectories.  The only subdirectory a database dir could have at
> present is the temp-file one, and we'd really prefer that that *not* be
> copied at all.
> 
> A final point is that implementing CREATE DATABASE via "cp -r" is and
> always has been fundamentally broken anyway, because of the lack of
> interlocking against other backends changing the source database.
> We have a very half-baked defense against that (erroring out if anyone
> else is connected to the source DB at the start of the copy) which
> I would dearly love to get rid of.  With a file-by-file copy, it might
> be possible to do better.  (I'm wondering if there's any way to take
> ShareLocks on the individual tables we are copying --- if we could
> figure out their OIDs we could do this, but relfilenode is not OID.)
> 
> These considerations might change somewhat when we get around to
> implementing tablespaces, but I think it's likely that that will
> make the need for a custom copy implementation greater, not less.
> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Use of 'cp -r' in CREATE DATABASE

From
Andrew Dunstan
Date:

Tom Lane wrote:

>A final point is that implementing CREATE DATABASE via "cp -r" is and
>always has been fundamentally broken anyway, because of the lack of
>interlocking against other backends changing the source database.
>We have a very half-baked defense against that (erroring out if anyone
>else is connected to the source DB at the start of the copy) which
>I would dearly love to get rid of.  
>

It struck me this morning that we could strengthen that defense 
considerable with the addition of some checks after the copy on the 
mtimes of the copied files. The additional code could be quite small and 
fast.

Longer term, a robust mechanism for DB level locks would probably be 
preferable, I guess, so I'm not sure if my idea is worth doing in the 
mean time. Presumably it hasn't caused much of a problem up to now, 
since most people are not likely to monkey with their template dbs at 
the same time as trying to create dbs based on them.

cheers

andrew



Re: Use of 'cp -r' in CREATE DATABASE

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Longer term, a robust mechanism for DB level locks would probably be 
> preferable, I guess, so I'm not sure if my idea is worth doing in the 
> mean time. Presumably it hasn't caused much of a problem up to now, 
> since most people are not likely to monkey with their template dbs at 
> the same time as trying to create dbs based on them.

Actually, we do get regular complaints about CREATE DATABASE failing
because template1 is busy, and that's not because the users are actually
*doing* anything to template1, it's just because template1 is the
default database-to-connect-to for a whole lot of operations like
createuser, createdb itself, psql -l, yadda yadda, and so other backends
tend to be transiently connected to template1 even though they have no
intention of doing anything to that database in particular.

A quick-and-dirty solution would be to make another template database,
so that the default-connection-target could be different from the
default-copy-source, but that would cost a fair amount of disk space
(compared to a minimal installation) and create lots of backwards
compatibility issues too.

I'd like to see a locking-type solution myself, but I'm not sure what
the semantics look like.  For things like createuser we'd need to invent
a lock type that says "I'm connected to this database but I only plan to
modify global tables", or something like that.  Messy.
        regards, tom lane


Re: Use of 'cp -r' in CREATE DATABASE

From
Andrew Dunstan
Date:
Tom Lane wrote:

>Andrew Dunstan <andrew@dunslane.net> writes:
>  
>
>>Longer term, a robust mechanism for DB level locks would probably be 
>>preferable, I guess, so I'm not sure if my idea is worth doing in the 
>>mean time. Presumably it hasn't caused much of a problem up to now, 
>>since most people are not likely to monkey with their template dbs at 
>>the same time as trying to create dbs based on them.
>>    
>>
>
>Actually, we do get regular complaints about CREATE DATABASE failing
>because template1 is busy, and that's not because the users are actually
>*doing* anything to template1, it's just because template1 is the
>default database-to-connect-to for a whole lot of operations like
>createuser, createdb itself, psql -l, yadda yadda, and so other backends
>tend to be transiently connected to template1 even though they have no
>intention of doing anything to that database in particular.
>
>A quick-and-dirty solution would be to make another template database,
>so that the default-connection-target could be different from the
>default-copy-source, but that would cost a fair amount of disk space
>(compared to a minimal installation) and create lots of backwards
>compatibility issues too.
>
>I'd like to see a locking-type solution myself, but I'm not sure what
>the semantics look like.  For things like createuser we'd need to invent
>a lock type that says "I'm connected to this database but I only plan to
>modify global tables", or something like that.  Messy.
>
>  
>

Right. So we have 2 problems - we catch too many cases, and we don't 
catch all the cases that matter.

How about (even quicker and dirtier) putting a limited loop (say 5 
iterations?) with a small delay in it around the check for whether or 
not we are the only connection? Createdb is hardly a time critical 
operation, and taking a few seconds extra in the worst case wouldn't 
matter that much, would it? This wouldn't be a 100% solution but I 
suspect it would be a 99% solution for the spurious failures, and my 
mtime suggestion could take care of the case where the template db is 
changed after the connection check is performed.

cheers

andrew



Re: Use of 'cp -r' in CREATE DATABASE

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> How about (even quicker and dirtier) putting a limited loop (say 5 
> iterations?) with a small delay in it around the check for whether or 
> not we are the only connection? Createdb is hardly a time critical 
> operation, and taking a few seconds extra in the worst case wouldn't 
> matter that much, would it?

Hmm, that might help, though as you say it's only a 90% solution.

> my mtime suggestion could take care of the case where the template db is 
> changed after the connection check is performed.

I don't really trust the mtime idea.  In the first place, mtime is only
good to a second or two on most platforms, which may not be enough
resolution.  In the second place, mtime isn't necessarily updated on
every single write.  If you are running on NFS (not that we've ever
recommended that, but people do it) you also have to worry about time
skew between your clock and the file server's clock.

It occurs to me that it wouldn't really be that hard to develop a
locking-based solution, if we consider that the purpose of the lock
is to keep new connections out of the template database.  Let's assume
we define a lock tag that represents a whole database (dbId = db's OID,
relId = zero would probably do).  Modify the backend so that every
session takes a share-lock on this tag during its startup sequence
(early in ReverifyMyDatabase() would likely be a good place).  The lock
does not need to be held over the life of the session, just acquired
before any database write can possibly occur in the session.  That means
we don't need any complicated new support in the lock manager.

Now, we can make CREATE DATABASE (and DROP DATABASE) acquire exclusive
lock on this lock before they look for extant sessions attached to the
target database.  That prevents any new sessions from getting in.  In
the CREATE case, incoming sessions for the template database are simply
delayed until the CREATE completes.  In the DROP case, incoming sessions
for the doomed database are guaranteed to notice that their database is
gone, because they won't be able to look for it in pg_database until
after the DROP completes.  (I think we currently guarantee the latter
by means of DROP taking an exclusive lock on pg_database, but a lock
that didn't lock out unrelated operations would surely be nicer.)

We could combine this with your idea of delaying: after acquiring the
exclusive lock, CREATE can delay to see if existing sessions connected
to the template database quit soon.  Since the exclusive lock is
preventing new sessions from coming in, this delay does not need to be
very long (probably the average length of a CREATE DATABASE operation
would be enough).

Thoughts?
        regards, tom lane


Re: Use of 'cp -r' in CREATE DATABASE

From
Andrew Dunstan
Date:
Tom Lane wrote:

>Andrew Dunstan <andrew@dunslane.net> writes:
>  
>
>>How about (even quicker and dirtier) putting a limited loop (say 5 
>>iterations?) with a small delay in it around the check for whether or 
>>not we are the only connection? Createdb is hardly a time critical 
>>operation, and taking a few seconds extra in the worst case wouldn't 
>>matter that much, would it?
>>    
>>
>
>Hmm, that might help, though as you say it's only a 90% solution.
>
>It occurs to me that it wouldn't really be that hard to develop a
>locking-based solution, if we consider that the purpose of the lock
>is to keep new connections out of the template database.  Let's assume
>we define a lock tag that represents a whole database (dbId = db's OID,
>relId = zero would probably do).  Modify the backend so that every
>session takes a share-lock on this tag during its startup sequence
>(early in ReverifyMyDatabase() would likely be a good place).  The lock
>does not need to be held over the life of the session, just acquired
>before any database write can possibly occur in the session.  That means
>we don't need any complicated new support in the lock manager.
>
>Now, we can make CREATE DATABASE (and DROP DATABASE) acquire exclusive
>lock on this lock before they look for extant sessions attached to the
>target database.  That prevents any new sessions from getting in.  In
>the CREATE case, incoming sessions for the template database are simply
>delayed until the CREATE completes.  In the DROP case, incoming sessions
>for the doomed database are guaranteed to notice that their database is
>gone, because they won't be able to look for it in pg_database until
>after the DROP completes.  (I think we currently guarantee the latter
>by means of DROP taking an exclusive lock on pg_database, but a lock
>that didn't lock out unrelated operations would surely be nicer.)
>
>We could combine this with your idea of delaying: after acquiring the
>exclusive lock, CREATE can delay to see if existing sessions connected
>to the template database quit soon.  Since the exclusive lock is
>preventing new sessions from coming in, this delay does not need to be
>very long (probably the average length of a CREATE DATABASE operation
>would be enough).
>
>Thoughts?
>
>  
>

Neat. Took me a few minutes to get my head around how it would work. 
Seems like any success would be guaranteed to be correct, which is a 
definite advance, and only long-lived connections to the template (e.g. 
"psql template1") would cause failures. A delay of around 2 secs seems 
right on my system.

(I played around in my head with other ideas like a read-only connection 
flag and things based on checksumming, but they either seemed likely to 
be error-prone or too expensive.)

cheers

andrew