Re: Group commit and commit delay/siblings - Mailing list pgsql-performance

From Greg Smith
Subject Re: Group commit and commit delay/siblings
Date
Msg-id 4CFC6814.9010002@2ndquadrant.com
In response to Re: Group commit and commit delay/siblings  (Jignesh Shah <jkshah@gmail.com>)
Responses Re: Group commit and commit delay/siblings  (Jignesh Shah <jkshah@gmail.com>)
Re: Group commit and commit delay/siblings  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Jignesh Shah wrote:
> The commit_siblings = 5 basically checks that it sleeps only when that
> many backends are active. This I think is a very expensive check and I
> would rather make commit_siblings=0 (which the current code does not
> support.. it only supports minimum of 1)

I just posted a message to the Facebook group sorting out the confusion
in terminology there.

The code Jignesh is alluding to does this:

        if (CommitDelay > 0 && enableFsync &&
            CountActiveBackends() >= CommitSiblings)
            pg_usleep(CommitDelay);

And the expensive part of the overhead beyond the delay itself is
CountActiveBackends(), which iterates over the entire procArray
structure.  Note that it doesn't bother acquiring ProcArrayLock for
that, as some small inaccuracy isn't really a problem for what it's
using the number for.  And it ignores backends waiting on a lock too, as
unlikely to commit in the near future.
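
To make the cost of that check concrete, here is a rough standalone sketch of
its shape. This is not the PostgreSQL source: SketchBackend and the sketch_*
functions are hypothetical stand-ins I made up for illustration, and the rule
that a backend skips itself when counting is an assumption of the sketch. It
shows only that the scan walks every slot, skips backends waiting on a lock,
and runs only when the delay is enabled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one procArray entry. */
typedef struct {
    bool in_transaction;   /* backend has an open transaction */
    bool waiting_on_lock;  /* backend is blocked waiting on a lock */
} SketchBackend;

/*
 * Count other backends that look likely to commit soon.
 * Walks every slot, so the cost grows with the number of backends.
 * No lock is taken here; a slightly stale answer is acceptable.
 */
int sketch_count_active(const SketchBackend *procs, size_t n, size_t self)
{
    int count = 0;

    for (size_t i = 0; i < n; i++) {
        if (i == self)
            continue;               /* assumption: do not count the caller */
        if (!procs[i].in_transaction)
            continue;               /* idle backends will not commit */
        if (procs[i].waiting_on_lock)
            continue;               /* unlikely to commit in the near future */
        count++;
    }
    return count;
}

/*
 * Mirrors the shape of the quoted condition. Because && short-circuits,
 * the scan is skipped entirely when the delay is 0 or fsync is off.
 */
bool sketch_should_delay(int commit_delay_usec, bool fsync_enabled,
                         int commit_siblings,
                         const SketchBackend *procs, size_t n, size_t self)
{
    return commit_delay_usec > 0 && fsync_enabled &&
           sketch_count_active(procs, n, self) >= commit_siblings;
}
```

With four slots where only one other backend is in a transaction and not
waiting on a lock, a siblings setting of 1 would trigger the sleep and a
setting of 5 would not.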

The siblings count is the only thing that keeps this delay from kicking
in on every single commit when the feature is turned on, which it is by
default.  I fear that a reworking in the direction Jignesh is suggesting
here, where that check was removed, would cripple situations where only
a single process was trying to get commits accomplished.

As for why this somewhat weird feature hasn't been removed yet, it's
mainly because we have some benchmarks from Jignesh proving its value in
the hands of an expert.  If you have a system with a really high
transaction rate, where you can expect that the server is
constantly busy and commits are being cached (and subsequently written
to physical disk asynchronously), a brief pause after each commit helps
chunk commits into the write cache as more efficient blocks.  It seems a
little counter-intuitive, but it does seem to work.
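
As a back-of-the-envelope illustration of why the pause can pay off, here is a
toy model, not a measurement. It assumes that commits from other backends
arrive at a steady average rate and that every commit arriving during the
pause ends up in the same write; both are simplifying assumptions, and
sketch_commits_per_flush is a made-up name.

```c
/*
 * Toy model: expected number of commits sharing one write when the
 * committing backend pauses for commit_delay_usec microseconds.
 * The caller's own commit always counts as 1; commits from other
 * backends are assumed to arrive at commits_per_sec on average.
 */
double sketch_commits_per_flush(double commits_per_sec, int commit_delay_usec)
{
    if (commits_per_sec < 0.0 || commit_delay_usec <= 0)
        return 1.0;   /* no pause: each commit is written on its own */

    return 1.0 + commits_per_sec * (double) commit_delay_usec / 1e6;
}
```

Under those assumptions, 10000 commits per second with a 100 microsecond
pause gives about two commits per write, while a mostly idle system stays
close to one and simply pays the delay on every commit.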

The number of people who are actually in that position is very small,
though, so for the most part this parameter is just a magnet for people
to set incorrectly because they don't understand it.  With this
additional insight from Jignesh clearing up some of the questions I had
about this, I'm tempted to pull commit_siblings altogether, make
commit_delay default to 0, and update the docs to say something
suggesting "this will slow down every commit you make; only increase it
if you have a high commit rate system where that's necessary to get
better commit chunking".

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services and Support        www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
