Thread: pgsql: Speed up CREATE DATABASE by deferring the fsyncs until after

pgsql: Speed up CREATE DATABASE by deferring the fsyncs until after

From
stark@postgresql.org (Greg Stark)
Date:
Log Message:
-----------
Speed up CREATE DATABASE by deferring the fsyncs until after copying
all the data and using posix_fadvise to nudge the OS into flushing it
earlier. This also hopefully makes CREATE DATABASE avoid spamming the
cache.

Tests show a big speedup on Linux at least on some filesystems.

Idea and patch from Andres Freund.

Modified Files:
--------------
    pgsql/src/backend/storage/file:
        fd.c (r1.153 -> r1.154)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/storage/file/fd.c?r1=1.153&r2=1.154)
    pgsql/src/include/storage:
        fd.h (r1.66 -> r1.67)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/storage/fd.h?r1=1.66&r2=1.67)
    pgsql/src/port:
        copydir.c (r1.25 -> r1.26)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/port/copydir.c?r1=1.25&r2=1.26)
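
Note: the approach in the log message (copy everything first, hint the OS to start writing, fsync later) can be sketched roughly as follows. This is only an illustration using plain POSIX calls, not the committed code. The function names copy_file_no_sync and fsync_path are hypothetical, and the choice of POSIX_FADV_DONTNEED as the hint is an assumption, since the thread only says "posix_fadvise".

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Pass 1 (hypothetical helper): copy src to dst without calling fsync.
 * After each chunk, posix_fadvise() is used as a hint so the kernel may
 * begin writeback early. The hint is advisory and may do nothing.
 */
static int
copy_file_no_sync(const char *src, const char *dst)
{
    char    buf[8192];
    ssize_t nread;
    off_t   offset = 0;
    int     in;
    int     out;

    assert(src != NULL && dst != NULL);

    in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out < 0)
    {
        close(in);
        return -1;
    }

    while ((nread = read(in, buf, sizeof(buf))) > 0)
    {
        /* Treat a short write as an error to keep the sketch small. */
        if (write(out, buf, (size_t) nread) != nread)
        {
            nread = -1;
            break;
        }
#ifdef POSIX_FADV_DONTNEED
        /* Assumed hint flag; the return value is deliberately ignored. */
        (void) posix_fadvise(out, offset, (off_t) nread, POSIX_FADV_DONTNEED);
#endif
        offset += nread;
    }

    close(in);
    if (close(out) != 0)
        return -1;
    return (nread < 0) ? -1 : 0;
}

/*
 * Pass 2 (hypothetical helper): force one already-copied file to stable
 * storage. Called only after every file has been copied.
 */
static int
fsync_path(const char *path)
{
    int fd;
    int rc;

    assert(path != NULL);

    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;
    rc = fsync(fd);
    close(fd);
    return rc;
}
```

A caller would run copy_file_no_sync() over every file first and only then call fsync_path() on each destination file. By then the kernel has probably written much of the data already, so each fsync has less left to wait for.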

Re: pgsql: Speed up CREATE DATABASE by deferring the fsyncs until after

From
Andres Freund
Date:
On Monday 15 February 2010 01:50:57 Greg Stark wrote:
> Log Message:
> -----------
> Speed up CREATE DATABASE by deferring the fsyncs until after copying
> all the data and using posix_fadvise to nudge the OS into flushing it
> earlier. This also hopefully makes CREATE DATABASE avoid spamming the
> cache.
>
> Tests show a big speedup on Linux at least on some filesystems.
>
> Idea and patch from Andres Freund.

I just found a relatively big problem with one of your modifications to the
patch: you removed the
FreeDir(xldir);
xldir = AllocateDir(fromdir);
pair. Unfortunately it's crucial, because otherwise the DIR does not get
rewound. That resulted in *no* files getting fsync()ed (otherwise the loop
above wouldn't have finished yet...).
I think that was also causing the problems I pointed out in "Directory fsync
and other fun"...

You removed it because you didn't want to open the directory twice? I think
doing that is simpler than using rewinddir; I have no idea how usable that
one is on Windows, for example.

Could you add it back?

Andres
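
Note: the rewind problem described in this message can be reproduced with plain POSIX directory calls. The sketch below uses opendir/readdir/closedir as stand-ins for PostgreSQL's AllocateDir/ReadDir/FreeDir wrappers, which is an assumption made for illustration. The function names count_remaining and second_pass_count are hypothetical.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>

/* Count the entries that have not yet been read from an open stream. */
static int
count_remaining(DIR *dir)
{
    int n = 0;

    while (readdir(dir) != NULL)
        n++;
    return n;
}

/*
 * Make two passes over the directory at path and return how many entries
 * the second pass saw, or -1 if the directory could not be opened.
 *
 * With reopen == 0 the same exhausted stream is reused, mirroring the
 * code with the FreeDir/AllocateDir pair removed. With reopen != 0 the
 * stream is closed and opened again between the passes.
 */
static int
second_pass_count(const char *path, int reopen)
{
    DIR *dir;
    int  n;

    assert(path != NULL);

    dir = opendir(path);
    if (dir == NULL)
        return -1;

    (void) count_remaining(dir);    /* pass 1: stands in for the copy loop */

    if (reopen)
    {
        closedir(dir);
        dir = opendir(path);
        if (dir == NULL)
            return -1;
    }

    n = count_remaining(dir);       /* pass 2: stands in for the fsync loop */
    closedir(dir);
    return n;
}
```

Without the reopen, the second loop starts on a stream that is already at its end, so its body never runs and nothing would be fsynced. Calling rewinddir() between the passes would be the alternative mentioned above.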

Re: pgsql: Speed up CREATE DATABASE by deferring the fsyncs until after

From
Greg Stark
Date:
On Sun, Feb 21, 2010 at 11:43 PM, Andres Freund <andres@anarazel.de> wrote:
> Could you add it back?
>

Oops, sorry. Sigh. Done.

--
greg

Re: pgsql: Speed up CREATE DATABASE by deferring the fsyncs until after

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> I just found a relatively big problem with one of your modifications on the
> patch - you removed the
> FreeDir(xldir);
> xldir = AllocateDir(fromdir);
> pair - unfortunately it's crucial because otherwise the DIR does not get
> rewound - that resulted in *no* files getting fsync()ed (otherwise the loop
> above wouldn't have finished yet...).
> I think that was also causing the problems I pointed out in "Directory fsync
> and other fun"...

Actually, that code had *multiple* problems including stat'ing the wrong
file entirely, not to mention that this last commit failed to even
compile.  I also think it should scan the todir not the fromdir, just on
general principles to avoid any possibility of race conditions.

            regards, tom lane

Re: pgsql: Speed up CREATE DATABASE by deferring the fsyncs until after

From
Greg Stark
Date:
On Mon, Feb 22, 2010 at 2:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Actually, that code had *multiple* problems including stat'ing the wrong
> file entirely, not to mention that this last commit failed to even
> compile.  I also think it should scan the todir not the fromdir, just on
> general principles to avoid any possibility of race conditions.

Argh. I'll be less careless in the future, I promise.

I had concluded that scanning the original directory was odd but
better because it served to double-check that all the original files
actually made it and also because if there were any unrelated files
present there was no need to fsync them. But I agree it's odd and not
very general for copydir if we decide to use it elsewhere other than
create database.
--
greg

Re: Re: pgsql: Speed up CREATE DATABASE by deferring the fsyncs until after

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> On Mon, Feb 22, 2010 at 2:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I also think it should scan the todir not the fromdir, just on
>> general principles to avoid any possibility of race conditions.

> I had concluded that scanning the original directory was odd but
> better because it served to double-check that all the original files
> actually made it and also because if there were any unrelated files
> present there was no need to fsync them.

Well, just for the record: if that was actually intentional then both of
you erred seriously by not including a comment that explained that the
coding was intentional (and giving the reasoning).  Any reader of the
code would have assumed that it was a copy-and-paste error, as I did.

> But I agree it's odd and not
> very general for copydir if we decide to use it elsewhere other than
> create database.

Yeah, to me it seems more likely to cause problems down the road than
to catch anything.  If the system is missing directory entries during
ReadDir then we have problems far beyond what copydir can deal with.
The point of the fsync loop is just to force the copy results down to
the platter, not to cross-check that the source directory isn't
changing.  (And, what's more, I don't believe that the source directory
can't change during CREATE DATABASE.  Consider delayed cleanup of
deleted relations during checkpoints.)

            regards, tom lane
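
Note: a minimal sketch of the fsync loop as discussed in this thread, scanning the destination directory and building each path from it so that the file being checked is the file being synced. It uses plain POSIX calls rather than PostgreSQL's own wrappers. The function name fsync_copied_files, the fixed path buffer size, and the use of lstat are assumptions for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * fsync every regular file found directly in todir (hypothetical helper).
 * Returns the number of files synced, or -1 on the first error.
 */
static int
fsync_copied_files(const char *todir)
{
    DIR           *dir;
    struct dirent *de;
    int            synced = 0;

    assert(todir != NULL);

    dir = opendir(todir);
    if (dir == NULL)
        return -1;

    while ((de = readdir(dir)) != NULL)
    {
        char        path[4096];
        struct stat st;
        int         len;
        int         fd;
        int         rc;

        /* Build the path from the destination directory, not the source. */
        len = snprintf(path, sizeof(path), "%s/%s", todir, de->d_name);
        if (len < 0 || (size_t) len >= sizeof(path))
        {
            closedir(dir);
            return -1;
        }

        /* Check the same path that will be opened and synced. */
        if (lstat(path, &st) != 0)
        {
            closedir(dir);
            return -1;
        }

        /* Skip ".", "..", subdirectories, symlinks, and other non-files. */
        if (!S_ISREG(st.st_mode))
            continue;

        fd = open(path, O_RDWR);
        if (fd < 0)
        {
            closedir(dir);
            return -1;
        }
        rc = fsync(fd);
        close(fd);
        if (rc != 0)
        {
            closedir(dir);
            return -1;
        }
        synced++;
    }

    closedir(dir);
    return synced;
}
```

Because the loop reads only the destination, changes to the source directory during the copy cannot affect which files get synced.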