Re: [pgsql-hackers-win32] SRA Win32 sync() code - Mailing list pgsql-patches

From Bruce Momjian
Subject Re: [pgsql-hackers-win32] SRA Win32 sync() code
Msg-id 200311170158.hAH1wmc06667@candle.pha.pa.us
In response to Re: SRA Win32 sync() code  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [pgsql-hackers-win32] SRA Win32 sync() code
Re: [pgsql-hackers-win32] SRA Win32 sync() code
List pgsql-patches
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Tom Lane wrote:
> >> Seriously though, if we can move the bulk of the writing work into
> >> background processes then I don't believe that there will be any
> >> significant penalty for regular backends.
>
> > If the background writer starts using fsync(), we can have normal
> > backends that do a write() set a shared memory boolean.  We can then
> > test that boolean and do sync() only if other backends had to do their
> > own writes.
>
> That seems like the worst of both worlds --- you still are depending on
> sync() for correctness.
>
> Also, as long as backends only *seldom* do writes, making them fsync a
> write when they do make one will be less of an impact on overall system
> performance than having a sync() ensue shortly afterwards.  I think you
> are focusing too narrowly on the idea that backends shouldn't ever wait
> for writes, and failing to see the bigger picture.  What we need to
> optimize is overall system performance, not an arbitrary restriction
> that certain processes never wait for certain things.

OK, let me give you my logic and you can tell me where I am wrong.

First, how many backends can a single writer process support if all the
backends are doing insert/update/deletes?  5?  10?  Let's assume 10.
Second, once we change write to write/fsync, how much slower will that
be?  100x, 1000x?  Let's say 10x.

So, by my logic, if we have 100 backends all doing updates, we will need
100 / 10 * 10, or 100, writer processes or threads to keep up with that
load.  That seems quite excessive to me from a context switching and
process overhead perspective.
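
The estimate above can be worked through in a few lines.  This is only a
sketch of the arithmetic, assuming one writer keeps up with 10 backends
and that adding fsync divides that capacity by 10.  Both figures are the
guesses from this message, not measurements, and writers_needed() is a
hypothetical helper name.

```c
#include <assert.h>

/*
 * Hypothetical helper: number of writer processes needed if one writer
 * can serve `backends_per_writer` backends with plain write(), and
 * write+fsync is `fsync_slowdown` times slower than write() alone.
 */
int writers_needed(int backends, int backends_per_writer, int fsync_slowdown)
{
    assert(backends >= 0 && backends_per_writer > 0 && fsync_slowdown > 0);

    /* Total writer work, in units of "one backend served by plain write()". */
    long work = (long) backends * fsync_slowdown;

    /* Divide by per-writer capacity, rounding up. */
    return (int) ((work + backends_per_writer - 1) / backends_per_writer);
}
```

With 100 backends, 10 backends per writer, and a 10x slowdown, this comes
to 100 writers; with no slowdown it would be 10.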

Where am I wrong?

Also, if we fsync only at checkpoint, we are doing fsyncs once every
minute (at checkpoint time) rather than potentially several times a
second.

Do we know that having the background writer fsync a file that was
written by a backend causes all of that file's data to be flushed to
disk?  I think I could write a program to test this by timing each of
these tests:

    create an empty file
    open file
    time fsync
    close

    open file
    write 2MB into the file
    time fsync
    close

    open file
    write 2MB into the file
    close
    open file
    time fsync
    close

If I do the write via system(), I am doing it in a separate process so
the test should work.  Should I try this?
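
One possible shape for such a test program, as a rough sketch rather than
a finished tool: it assumes a POSIX system, uses fork() in place of
system() to do the third write in a separate process, and all function
names and the file path are hypothetical.  If the third fsync takes about
as long as the second, that would suggest fsync flushes data written by
another process; a result near the first would suggest it does not.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define WRITE_BYTES (2 * 1024 * 1024)   /* the 2MB from the test plan */

/* Write WRITE_BYTES of filler to fd; returns 0 on success, -1 on error. */
static int write_2mb(int fd)
{
    static char buf[8192];
    size_t left = WRITE_BYTES;

    memset(buf, 'x', sizeof buf);
    while (left > 0) {
        size_t chunk = left < sizeof buf ? left : sizeof buf;
        ssize_t n = write(fd, buf, chunk);
        if (n <= 0)
            return -1;
        left -= (size_t) n;
    }
    return 0;
}

/* Time a single fsync(fd); returns elapsed seconds, or -1.0 on error. */
static double time_fsync(int fd)
{
    struct timespec start, stop;

    clock_gettime(CLOCK_MONOTONIC, &start);
    if (fsync(fd) != 0)
        return -1.0;
    clock_gettime(CLOCK_MONOTONIC, &stop);
    return (double) (stop.tv_sec - start.tv_sec)
        + (double) (stop.tv_nsec - start.tv_nsec) / 1e9;
}

/* Open (and truncate) path, optionally write 2MB, time fsync, close. */
static double case_same_process(const char *path, int do_write)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    double t;

    if (fd < 0)
        return -1.0;
    t = (do_write && write_2mb(fd) != 0) ? -1.0 : time_fsync(fd);
    close(fd);
    return t;
}

/* A child process writes 2MB and closes; the parent reopens and times fsync. */
static double case_other_process(const char *path)
{
    int status, fd;
    double t;
    pid_t pid = fork();

    if (pid < 0)
        return -1.0;
    if (pid == 0) {
        fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
        _exit(fd >= 0 && write_2mb(fd) == 0 && close(fd) == 0 ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)
        || WEXITSTATUS(status) != 0)
        return -1.0;
    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1.0;
    t = time_fsync(fd);
    close(fd);
    return t;
}

/* Run the three cases on path and print timings; returns 0 if all worked. */
int report_cases(const char *path)
{
    double empty = case_same_process(path, 0);
    double same = case_same_process(path, 1);
    double other = case_other_process(path);

    printf("empty file:             %.6f s\n", empty);
    printf("write + fsync, same fd: %.6f s\n", same);
    printf("write elsewhere, fsync: %.6f s\n", other);
    unlink(path);
    return (empty < 0 || same < 0 || other < 0) ? -1 : 0;
}
```

A main() that calls report_cases("fsync_test.dat") is enough to run it.
Timings depend heavily on the filesystem and on drive write caching, so
several runs would be needed before drawing any conclusion.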

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
