On Mon, Jan 30, 2012 at 9:55 AM, Heikki Linnakangas
<heikki.linnakangas@iki.fi> wrote:
> Make group commit more effective.
>
> When a backend needs to flush the WAL, and someone else is already flushing
> the WAL, wait until it releases the WALInsertLock and check if we still need
> to do the flush or if the other backend already did the work for us, before
> acquiring WALInsertLock. This helps group commit, because when the WAL flush
> finishes, all the backends that were waiting for it can be woken up in one
> go, and they can all concurrently observe that they're done, rather than
> waking them up one by one in a cascading fashion.
>
> This is based on a new LWLock function, LWLockWaitUntilFree(), which has
> peculiar semantics. If the lock is immediately free, it grabs the lock and
> returns true. If it's not free, it waits until it is released, but then
> returns false without grabbing the lock. This is used in XLogFlush(), so
> that when the lock is acquired, the backend flushes the WAL, but if it's
> not, the backend first checks the current flush location before retrying.
>
> Original patch and benchmarking by Peter Geoghegan and Simon Riggs, although
> this patch as committed ended up being very different from that.
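
For anyone following along, the semantics described in the commit message
can be sketched roughly as below. This is a toy built on pthreads, not the
code in lwlock.c or xlog.c: ToyLock, ToyWal, toy_wait_until_free(),
toy_release(), toy_get_flushed() and toy_flush() are all hypothetical names
made up for illustration, and the "flush" just advances a counter.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Toy stand-in for an LWLock; NOT PostgreSQL's implementation. */
typedef struct ToyLock {
    pthread_mutex_t mutex;      /* protects every field below */
    pthread_cond_t  released;   /* signalled on each release */
    bool            held;
    unsigned long   releases;   /* bumped on each release */
} ToyLock;

typedef struct ToyWal {
    ToyLock lock;               /* plays the role of the WAL flush lock */
    long    flushed_upto;       /* protected by lock.mutex */
} ToyWal;

static void toy_lock_init(ToyLock *lock)
{
    pthread_mutex_init(&lock->mutex, NULL);
    pthread_cond_init(&lock->released, NULL);
    lock->held = false;
    lock->releases = 0;
}

/*
 * If the lock is free, take it and return true.  Otherwise sleep until the
 * current holder releases it, then return false WITHOUT taking it.
 */
static bool toy_wait_until_free(ToyLock *lock)
{
    bool acquired = false;

    pthread_mutex_lock(&lock->mutex);
    if (!lock->held) {
        lock->held = true;
        acquired = true;
    } else {
        unsigned long seen = lock->releases;

        /* The loop guards against spurious wakeups. */
        while (lock->releases == seen)
            pthread_cond_wait(&lock->released, &lock->mutex);
    }
    pthread_mutex_unlock(&lock->mutex);
    return acquired;
}

static void toy_release(ToyLock *lock)
{
    pthread_mutex_lock(&lock->mutex);
    assert(lock->held);
    lock->held = false;
    lock->releases++;
    pthread_cond_broadcast(&lock->released);    /* wake all waiters at once */
    pthread_mutex_unlock(&lock->mutex);
}

static long toy_get_flushed(ToyWal *wal)
{
    long v;

    pthread_mutex_lock(&wal->lock.mutex);
    v = wal->flushed_upto;
    pthread_mutex_unlock(&wal->lock.mutex);
    return v;
}

/* Returns 1 if this caller did the flush itself, 0 if none was needed. */
static int toy_flush(ToyWal *wal, long target)
{
    while (toy_get_flushed(wal) < target) {
        if (toy_wait_until_free(&wal->lock)) {
            /* We hold the lock: pretend to write and fsync up to target. */
            pthread_mutex_lock(&wal->lock.mutex);
            if (wal->flushed_upto < target)
                wal->flushed_upto = target;
            pthread_mutex_unlock(&wal->lock.mutex);
            toy_release(&wal->lock);
            return 1;
        }
        /* Lock was busy and has since been released: recheck the position. */
    }
    return 0;
}
```

In this sketch a backend that found the lock busy never queues up to take
it; it only rechecks the flush position once the holder is done, which is
the behaviour the commit message is after.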
Either this patch, or something else committed this morning, is
causing "make check" to hang or run extremely slowly for me. I think
it's this patch, because I attached to a backend and stopped it a few
times, and all the backtraces look like this:
#0 0x00007fff8a545b22 in semop ()
#1 0x00000001001ff8df in PGSemaphoreLock (sema=0x103d7de70, interruptOK=0 '\0') at pg_sema.c:418
#2 0x000000010024d7dd in LWLockWaitUntilFree (lockid=<value temporarily unavailable, due to optimizations>, mode=<value temporarily unavailable, due to optimizations>) at lwlock.c:666
#3 0x000000010005d3b3 in XLogFlush (record=<value temporarily unavailable, due to optimizations>) at xlog.c:2148
#4 0x00000001000506bb in CommitTransaction () at xact.c:1113
#5 0x0000000100050b35 in CommitTransactionCommand () at xact.c:2613
#6 0x000000010025a403 in finish_xact_command () at postgres.c:2388
#7 0x000000010025d525 in exec_simple_query (query_string=0x101055638 "CREATE INDEX wowidx ON test_tsvector USING gin (a);") at postgres.c:1052
#8 0x000000010025dfc1 in PostgresMain (argc=2, argv=<value temporarily unavailable, due to optimizations>, username=<value temporarily unavailable, due to optimizations>) at postgres.c:3881
#9 0x000000010020c258 in ServerLoop () at postmaster.c:3587
#10 0x000000010020d167 in PostmasterMain (argc=6, argv=0x100d08f40) at postmaster.c:1110
#11 0x000000010019e745 in main (argc=6, argv=0x100d08f40) at main.c:199
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company