Home > mailing lists

Re: XLogInsert - Mailing list pgsql-hackers

From	Jeff Janes
Subject	Re: XLogInsert
Date	September 14, 2009 03:42:24
Msg-id	f67928030909132042v432f7194le9e0eab2d1156b21@mail.gmail.com Whole thread Raw
In response to	Re: XLogInsert (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Patch LWlocks instrumentation Re: XLogInsert
List	pgsql-hackers

Tree view

On Wed, Aug 19, 2009 at 9:49 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Jeff Janes <jeff.janes@gmail.com> writes:
> If I read the code correctly, the only thing that is irrevocable is
> that it writes into
> rdt->next, and if it saved an old copy of rdt first, then it could
> revoke the changes just
> by doing rdt_old->next=NULL. If that were done, then I think this
> code could be
> moved out of the section holding the WALInsertLock.

Hmm, I recall that the changes are ... or were ... more complex.
The tricky case I think is where we have to go back and redo the
block-backup decisions after discovering that the checkpoint REDO
pointer has just moved.

If you can get the work out of the WALInsertLock section for just a
few more instructions, it would definitely be worth doing.

I've attached a patch which removes the iteration over the blocks to be backed-up from the critical section of XLogInsert. Now those blocks are only looped over in one piece of code which both computes the CRC and builds the linked list, rather than having parallel loops.

I've used an elog statement (not shown in patch) to demonstrate that the "goto begin;" after detecting REDO race actually does get executed under a standard workload, (pgbench -c10). Two to 4 out of 10 the backends execute that code path for each checkpoint on my single CPU machine. By doing a kill -9 on a process, to simulate a crash, during the period after the goto begin is execercised but before the precipitating heckpoint completes, I can force it to use the written WAL records in recovery. The database automatically recovers and the results are self-consistent.

I cannot imagine any other races, rare events, or action at a distance that could come into play with this code change, so I cannot think of anything else to test at the moment.

I could not detect a speed difference with pgbench, but as I cannot get pgbench to be XLogInsert bound, that is not surprising. Using the only XLogInsert-bound test case I know of, parallel COPY into a skinny, unindexed table, using 8 parallel copies on a 4 x dual-core x86_64 and with fsync turned off (to approxiamately simulate SSD, which I do not have), I get a speed improvement of 2-4% with the patch over unpatched head. Maybe with more CPUs the benefit would be greater.

That small improvement is probably not very attractive, however I think the patch makes the overall code a bit cleaner, so it may be warranted on that ground. Indeed, my motivation for working on this is that I kept beating my head against the complexity of the old code, and thought that simplifying it would make future work easier.

Cheers,

Jeff

Attachment

XLogInsert_rdt2_v1.patch

pgsql-hackers by date:

From: Jeff Janes
Date: 14 September 2009, 03:09:17
Subject: test_fsync file overrun

From: Alvaro Herrera
Date: 14 September 2009, 03:46:58
Subject: Re: Rough draft: easier translation of psql help

Re: XLogInsert - Mailing list pgsql-hackers

Attachment

Previous

Next