Re: Moving more work outside WALInsertLock - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: Moving more work outside WALInsertLock |
Date | |
Msg-id | 4EF43837.8040306@enterprisedb.com Whole thread Raw |
In response to | Re: Moving more work outside WALInsertLock (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>) |
Responses |
Re: Moving more work outside WALInsertLock
|
List | pgsql-hackers |
On 16.12.2011 15:42, Heikki Linnakangas wrote:
> On 16.12.2011 15:03, Simon Riggs wrote:
>> On Fri, Dec 16, 2011 at 12:50 PM, Heikki Linnakangas
>> <heikki.linnakangas@enterprisedb.com> wrote:
>>> On 16.12.2011 14:37, Simon Riggs wrote:
>>>>
>>>> I already proposed a design for that using page-level share locks any
>>>> reason not to go with that?
>>>
>>> Sorry, I must've missed that. Got a link?
>>
>> From nearly 4 years ago.
>>
>> http://grokbase.com/t/postgresql.org/pgsql-hackers/2008/02/reworking-wal-locking/145qrhllcqeqlfzntvn7kjefijey
>>
>
> Ah, thanks. That is similar to what I'm experimenting, but a second
> lwlock is still fairly heavy-weight. I think with many backends, you
> will be beaten badly by contention on the spinlocks alone.
>
> I'll polish up and post what I've been experimenting with, so we can
> discuss that.

So, here's a WIP patch of what I've been working on. WAL insertion is split into two stages:

1. Reserve the space from the WAL stream. This is done while holding a spinlock. The page holding the reserved space doesn't necessarily need to be in cache yet; the reservation can run ahead of the WAL buffer cache. (Quick testing suggests that a lwlock is too heavy-weight for this.)

2. Ensure the page is in the WAL buffer cache. If not, initialize it, evicting old pages if needed. Then finish the CRC calculation of the header and memcpy the record in place. (If the record spans multiple pages, it operates on one page at a time, to avoid problems with running out of WAL buffers.)

As long as wal_buffers is high enough, and the I/O can keep up, stage 2 can happen in parallel in many backends. The WAL writer process pre-initializes new pages ahead of the insertions, so regular backends rarely need to do that.

When a page is written out with XLogWrite(), you need to wait for any in-progress insertions to the pages you're about to write out to finish. For that, every backend has a slot with an XLogRecPtr in shared memory.
It's set to the position where that backend is currently inserting. If there's no insertion in progress, it's invalid, but when it's valid it acts like a barrier, so that no-one is allowed to XLogWrite() beyond that position. That's very lightweight for the backends, but I'm using busy-waiting to wait on an insertion to finish ATM. That should be replaced with something smarter; that's the biggest missing part of the patch.

One simple way to test the performance impact of this is:

psql -c "DROP TABLE IF EXISTS foo; CREATE TABLE foo (id int4); CHECKPOINT" postgres
echo "BEGIN; INSERT INTO foo SELECT i FROM generate_series(1, 10000) i; ROLLBACK" > parallel-insert-test.sql
pgbench -n -T 10 -c4 -f parallel-insert-test.sql postgres

On my dual-core laptop, this patch increases the tps on that from about 60 to 110.

-- 
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
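To make the two-stage scheme and the per-backend slots described above easier to discuss, here is a rough standalone sketch. It is not taken from the patch: all names (WalPos, wal_reserve, wal_finish_insert, wal_safe_write_limit, NUM_BACKENDS) are hypothetical, positions are plain 64-bit byte offsets rather than a real XLogRecPtr, a C11 atomic_flag stands in for the spinlock, and the page initialization, CRC and memcpy of stage 2 are left out. Setting the slot while still holding the spinlock is an assumption of the sketch, as is returning a write limit instead of busy-waiting.

```c
#include <stdatomic.h>
#include <stdint.h>

#define NUM_BACKENDS 4            /* hypothetical: fixed number of backend slots */
#define INVALID_POS  UINT64_MAX   /* hypothetical marker: no insertion in progress */

typedef uint64_t WalPos;          /* simplified stand-in for XLogRecPtr */

static atomic_flag reserve_lock = ATOMIC_FLAG_INIT;  /* stand-in for the spinlock */
static WalPos next_free;                              /* next unreserved byte of the WAL stream */
static _Atomic WalPos insert_slot[NUM_BACKENDS];      /* one slot per backend */

/* Reset the stream and mark every slot as "not inserting". */
void wal_init(void)
{
    next_free = 0;
    for (int i = 0; i < NUM_BACKENDS; i++)
        atomic_store(&insert_slot[i], INVALID_POS);
}

/* Stage 1: reserve len bytes. Only this part runs under the spinlock. */
WalPos wal_reserve(int backend, uint64_t len)
{
    while (atomic_flag_test_and_set_explicit(&reserve_lock, memory_order_acquire))
        ;                         /* spin until the lock is free */

    WalPos start = next_free;
    next_free += len;

    /* Assumption: advertise the insert position before releasing the lock,
     * so the reserved space is never visible without its barrier. */
    atomic_store(&insert_slot[backend], start);

    atomic_flag_clear_explicit(&reserve_lock, memory_order_release);
    return start;
}

/* End of stage 2: the record has been copied in place, clear the barrier. */
void wal_finish_insert(int backend)
{
    atomic_store(&insert_slot[backend], INVALID_POS);
}

/* Writer side: the furthest position that may be written out right now,
 * i.e. the requested position capped by the oldest in-progress insertion. */
WalPos wal_safe_write_limit(WalPos requested)
{
    WalPos limit = requested;

    for (int i = 0; i < NUM_BACKENDS; i++)
    {
        WalPos p = atomic_load(&insert_slot[i]);

        if (p != INVALID_POS && p < limit)
            limit = p;
    }
    return limit;
}
```

In this sketch a writer that wants to flush up to some position would call wal_safe_write_limit() and either write only up to the returned limit or wait and retry, which is the place where the busy-waiting mentioned above would sit.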