Thread: COMMIT NOWAIT Performance Option (patch)

COMMIT NOWAIT Performance Option (patch)

From
"Simon Riggs"
Date:
A prototype patch is posted to -patches, which is WORK IN PROGRESS.
[This patch matches discussion thread on -hackers.]

The following TODO items remain

1. discuss which process will issue regular XLogFlush(). If agreed,
implement WALWriter process to perform this task. (Yes, the patch isn't
fully implemented, yet).
2. remove fsync parameter
3. Prevent COMMIT NOWAIT when commit_delay = 0
4. Discuss whether commit_delay is OK to usurp; twas just an earlier
suggestion from someone else, can go either way.
5. docs

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com


Attachment

Re: COMMIT NOWAIT Performance Option (patch)

From
Tom Lane
Date:
"Simon Riggs" <simon@2ndquadrant.com> writes:
> A prototype patch is posted to -patches, which is WORK IN PROGRESS.
> [This patch matches discussion thread on -hackers.]

What does this accomplish other than adding syntactic sugar over a
feature that really doesn't work well anyway?  I don't see any point
in encouraging people to use commit_delay in its present form.  If we
had a portable solution for millisecond-or-so waits then maybe it would
work ...

            regards, tom lane

Re: COMMIT NOWAIT Performance Option (patch)

From
"Simon Riggs"
Date:
On Mon, 2007-02-26 at 18:14 -0500, Tom Lane wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
> > A prototype patch is posted to -patches, which is WORK IN PROGRESS.
> > [This patch matches discussion thread on -hackers.]
>
> What does this accomplish other than adding syntactic sugar over a
> feature that really doesn't work well anyway?  I don't see any point
> in encouraging people to use commit_delay in its present form.  If we
> had a portable solution for millisecond-or-so waits then maybe it would
> work ...

This patch doesn't intend to implement group commit. I've changed the
meaning of commit_delay, sorry if that confuses.

You and I discussed this in Toronto actually, IIRC. The best way to
describe this proposal is deferred fsync, so perhaps a different
parameter commit_fsync_delay would be more appropriate.

Bruce has requested this feature many times from me, so I thought it
about time to publish.

The key point is that COMMIT NOWAIT doesn't wait for group commit to
return, it just doesn't wait at all - leaving someone else to flush WAL.
It's much better than fsync=off.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] COMMIT NOWAIT Performance Option (patch)

From
Tom Lane
Date:
"Simon Riggs" <simon@2ndquadrant.com> writes:
> On Mon, 2007-02-26 at 18:14 -0500, Tom Lane wrote:
>> What does this accomplish other than adding syntactic sugar over a
>> feature that really doesn't work well anyway?

> This patch doesn't intend to implement group commit. I've changed the
> meaning of commit_delay, sorry if that confuses.

Ah.  The patch was pretty much unintelligible without the discussion
(which got here considerably later :-().  I've still got misgivings
about how safe it really is, but at least this is better than what
commit_delay wishes it could do.

            regards, tom lane

Re: [HACKERS] COMMIT NOWAIT Performance Option (patch)

From
"Simon Riggs"
Date:
On Mon, 2007-02-26 at 23:07 -0500, Tom Lane wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
> > On Mon, 2007-02-26 at 18:14 -0500, Tom Lane wrote:
> >> What does this accomplish other than adding syntactic sugar over a
> >> feature that really doesn't work well anyway?
>
> > This patch doesn't intend to implement group commit. I've changed the
> > meaning of commit_delay, sorry if that confuses.
>
> Ah.  The patch was pretty much unintelligible without the discussion
> (which got here considerably later :-().  I've still got misgivings
> about how safe it really is, but at least this is better than what
> commit_delay wishes it could do.

Latest WIP version of patch now ready for performance testing.

Applies cleanly to CVS HEAD, with two additional files:
src/backend/postmaster/walwriter.c
src/include/postmaster/walwriter.h

Patch passes make installcheck in these cases
- no options set
- wal_writer_delay = 100000
- wal_writer_delay = 100000 and transaction_guarantee = off for all
transactions by default in postgresql.conf.
Normal checkpoints and restarts work without problem after these runs.

What this patch does
--------------------

Implements unguaranteed transactions, which skip the XLogFlush step when
they commit. The flush point is updated in shared memory so that a
separate WAL writer process will perform the flush each time it cycles.

These parameters control this behaviour

transaction_guarantee = on (default) | off       USERSET
wal_writer_delay = 0 (default, ==off)            SIGHUP
log_transaction_guarantee = on (default) | off   SIGHUP
(the default for this would be off in later production version)

WAL writer will start/stop when wal_writer_delay is non-zero/zero.

Unguaranteed transactions are only allowed for
- Execute message
- Fastpath message
- Sync message
- simple query implicit-commit-at-end and explicit COMMITs

All other transaction commits will always use guaranteed commit path.
These include things like VACUUM, various DDL and about a dozen other
places that execute commits. The abort path is never fast in any case.

In addition, any transaction that is deleting files follows guaranteed
commit path, however it was requested.

The interlock between commits and checkpoints is maintained. After the
CheckpointStartLock has been gained by bgwriter, all unguaranteed
transactions are flushed.

(In addition the fsync GUC has been removed from postgresql.conf.sample,
but not actually removed. If this patch goes ahead, I suggest we
deprecate it for one release then remove it next...)

What this patch doesn't do yet
------------------------------

Crash recovery does not yet work, but can be made to do so with TODO
items (1) and (2) below.

1. The interlock between buffer manager and WAL is maintained, but not
sufficiently to avoid problems in all cases. Specifically, commit hint
bits must not be written to disk ahead of a transaction commit.

Two approaches are possible
1. avoid setting the hint bits for unguaranteed transactions
2. set the hint bits *and* update the LSN of the page to be the LSN of
the unguaranteed transaction for which we are setting the hint bits.

Either way, we need to maintain a list of unguaranteed transactions in
shared memory that can be accessed when hint bits are set. The list
would need to contain the Xid and the LSN of each unguaranteed
transaction. This would necessitate keeping the list of unguaranteed
transactions fairly small, so some care is required to ensure this. That
can be achieved by keeping commit_fsync_delay small or putting in a
trigger point at which an wannabe unguaranteed transaction is forced to
flush WAL instead. Some testing has shown that committing every 8
transactions has a considerable leap in performance in many cases.

2. As originally discussed, during crash recovery any in-flight
transactions would need to be explicitly aborted in clog, to override
the possibility that an unguaranteed transaction would have been marked
committed. An alternative would be to flush all unguaranteed
transactions prior to flushing dirty clog and multitrans pages. That
could be achieved by keeping the LSN of the last write to those pages
and performing XLogFlush up to that LSN when we write dirty pages. I'm
leaning towards the new alternative version now, since its cleaner and
it fits better with the way the rest of the server works.

3. WAL Writer could be used for various additional tasks, such as doing
the WAL cache-half-filled check. Those options have been ignored until
now, to avoid complicating discussion and review.

4. We probably need more padding in XLogCtlData to ensure that data
protected by WALInsertLock, WALWriteLock and infolck are in separate
cache lines to avoid CPU false sharing. That should be done whether or
not this patch goes ahead.

Tests, reviews and comments please?

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com


Attachment