Thread: lock contention on parallel COPY?
I'm currently playing with Andrew's parallel restore patch and it seems that pg is far from taking advantage of the hardware I have for testing (dual quad-core Xeon attached to a Netapp with 68 spindles). With a concurrency of 4 I see iowait hovering around 1%, CPU load at 20% peak, and around 150,000 context switches/s. The load seems to progress at around the same rate as a single-backend restore does on the same box.

A profile during the load (a fair number of tables sized ~10-30M rows each) looks fairly similar to:

samples  %        symbol name
1933314  21.8884  LWLockAcquire
1677808  18.9957  XLogInsert
848227    9.6034  LWLockRelease
414179    4.6892  DoCopy
332633    3.7660  CopyReadLine
266580    3.0181  UnpinBuffer
221693    2.5099  heap_formtuple
176939    2.0033  .plt
171842    1.9455  PinBuffer
160470    1.8168  GetNewObjectId
154095    1.7446  heap_insert
151813    1.7188  s_lock
117849    1.3343  LockBuffer
109530    1.2401  hash_search_with_hash_value
102169    1.1567  PageAddItem
91151     1.0320  pg_verify_mbstr_len
82538     0.9345  CopyGetData

Using --truncate-before-load seems to help a bit, but it still seems to be only barely utilizing the available resources.

Stefan
Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
> samples  %        symbol name
> 1933314  21.8884  LWLockAcquire
> 1677808  18.9957  XLogInsert
> 848227    9.6034  LWLockRelease
> 414179    4.6892  DoCopy
> 332633    3.7660  CopyReadLine
> 266580    3.0181  UnpinBuffer
> 221693    2.5099  heap_formtuple

I suppose Andrew didn't yet put in the hack to avoid WAL logging during the COPY commands. The LWLockAcquires are presumably blocking on WALInsertLock, given that XLogInsert is also right up there ...

                        regards, tom lane
Tom Lane wrote:
> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
>
>> samples  %        symbol name
>> 1933314  21.8884  LWLockAcquire
>> 1677808  18.9957  XLogInsert
>> 848227    9.6034  LWLockRelease
>> 414179    4.6892  DoCopy
>> 332633    3.7660  CopyReadLine
>> 266580    3.0181  UnpinBuffer
>> 221693    2.5099  heap_formtuple
>
> I suppose Andrew didn't yet put in the hack to avoid WAL logging
> during the COPY commands. The LWLockAcquires are presumably
> blocking on WALInsertLock, given that XLogInsert is also right
> up there ...

Yes I did. That's what the --truncate-before-load switch does (or should do).

cheers

andrew
Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> I suppose Andrew didn't yet put in the hack to avoid WAL logging

> Yes I did. That's what the --truncate-before-load switch does (or should
> do).

Well, it doesn't seem to be having any effect. Maybe Stefan is testing a configuration with xlog archiving enabled?

                        regards, tom lane
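For reference, the WAL-skip hack in question here is presumably the existing optimization that lets COPY skip WAL logging when the target table was created or truncated in the same transaction and WAL archiving is off. A minimal hand-rolled sketch of the equivalent sequence (table and file names are made up):

BEGIN;
TRUNCATE my_table;                         -- truncated in the same transaction as the load
COPY my_table FROM '/path/to/data.copy';   -- with archiving off, this COPY can skip WAL
COMMIT;                                    -- the new heap is synced to disk instead of being WAL-logged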
Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Tom Lane wrote:
>>> I suppose Andrew didn't yet put in the hack to avoid WAL logging
>
>> Yes I did. That's what the --truncate-before-load switch does (or should
>> do).
>
> Well, it doesn't seem to be having any effect. Maybe Stefan is testing
> a configuration with xlog archiving enabled?

Heh, no log archiving - I actually said that I'm now playing with --truncate-before-load, which seems to cause a noticeable performance increase (as in I/O generated), but I still see over 130,000 context switches/s and a profile that looks like:

samples  %        symbol name
55526    16.5614  LWLockAcquire
29721     8.8647  DoCopy
26581     7.9281  CopyReadLine
25105     7.4879  LWLockRelease
15743     4.6956  PinBuffer
14725     4.3919  heap_formtuple
9748      2.9075  GetNewObjectId
8059      2.4037  pg_verify_mbstr_len
6825      2.0356  hash_search_with_hash_value
6386      1.9047  s_lock
5645      1.6837  heap_insert
5631      1.6795  PageAddItem
4723      1.4087  pg_atoi

Stefan
Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
> Heh, no log archiving - I actually said that I'm now playing with
> --truncate-before-load, which seems to cause a noticeable performance
> increase (as in I/O generated), but I still see over 130,000 context
> switches/s and a profile that looks like:

> samples  %        symbol name
> 55526    16.5614  LWLockAcquire
> 29721     8.8647  DoCopy
> 26581     7.9281  CopyReadLine
> 25105     7.4879  LWLockRelease
> 15743     4.6956  PinBuffer
> 14725     4.3919  heap_formtuple

Still a lot of contention for something, then. You might try turning on LWLOCK_STATS (this only requires recompiling storage/lmgr/lwlock.c) to get some evidence about what.

                        regards, tom lane
Tom Lane wrote:
> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
>> Heh, no log archiving - I actually said that I'm now playing with
>> --truncate-before-load, which seems to cause a noticeable performance
>> increase (as in I/O generated), but I still see over 130,000 context
>> switches/s and a profile that looks like:
>
>> samples  %        symbol name
>> 55526    16.5614  LWLockAcquire
>> 29721     8.8647  DoCopy
>> 26581     7.9281  CopyReadLine
>> 25105     7.4879  LWLockRelease
>> 15743     4.6956  PinBuffer
>> 14725     4.3919  heap_formtuple
>
> Still a lot of contention for something, then. You might try turning on
> LWLOCK_STATS (this only requires recompiling storage/lmgr/lwlock.c) to
> get some evidence about what.

That one generates a huge amount of logs - output for ~60s into the load is available here:

http://www.kaltenbrunner.cc/files/lwstats.txt (21MB!)

Stefan
Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
> That one generates a huge amount of logs - output for ~60s into the load
> is available here:
> http://www.kaltenbrunner.cc/files/lwstats.txt (21MB!)

Huh ... essentially all the contention is for OidGenLock. I take it you're loading into a table that has OIDs?

                        regards, tom lane
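A quick way to check that, assuming an 8.x-era catalog where pg_class.relhasoids is available, is a query along these lines:

SELECT n.nspname, c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relhasoids
  AND c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema');  -- skip system catalogs, which also carry OIDs here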
Tom Lane wrote:
> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
>> That one generates a huge amount of logs - output for ~60s into the load
>> is available here:
>> http://www.kaltenbrunner.cc/files/lwstats.txt (21MB!)
>
> Huh ... essentially all the contention is for OidGenLock. I take it
> you're loading into a table that has OIDs?

Hmm, that particular database has its ancient roots in 7.3 (or 7.4), and indeed it seems that some tables in the dump are preceded by a

SET default_with_oids = true;

Will go and eliminate those and try again.

Stefan
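If editing the dump is awkward, an alternative sketch is to drop the OIDs on the already-created tables before loading their data (the table name here is hypothetical):

ALTER TABLE my_table SET WITHOUT OIDS;  -- stop assigning an OID per inserted row
SET default_with_oids = false;          -- and keep newly created tables OID-free in this session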
On Fri, 2008-09-26 at 12:38 -0400, Tom Lane wrote:
> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
> > Heh, no log archiving - I actually said that I'm now playing with
> > --truncate-before-load, which seems to cause a noticeable performance
> > increase (as in I/O generated), but I still see over 130,000 context
> > switches/s and a profile that looks like:
>
> > samples  %        symbol name
> > 55526    16.5614  LWLockAcquire
> > 29721     8.8647  DoCopy
> > 26581     7.9281  CopyReadLine
> > 25105     7.4879  LWLockRelease
> > 15743     4.6956  PinBuffer
> > 14725     4.3919  heap_formtuple
>
> Still a lot of contention for something, then. You might try turning on
> LWLOCK_STATS (this only requires recompiling storage/lmgr/lwlock.c) to
> get some evidence about what.

Probably loading a table with a generated PK or loading data in ascending sequence, so it's contending heavily for the rightmost edge of the index. We need to load data a block at a time and buffer the inserts into the index also, so we don't need to lock/unlock per row.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
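If the target tables really did have their primary keys in place during the load, the usual workaround until such buffered index inserts exist is to build the indexes only after the data is in - a rough sketch with hypothetical table and column names:

ALTER TABLE my_table DROP CONSTRAINT my_table_pkey;  -- drop the PK (and its index) before the load
COPY my_table FROM '/path/to/data.copy';
ALTER TABLE my_table ADD PRIMARY KEY (id);           -- rebuild the index in one pass afterwards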
Simon Riggs <simon@2ndQuadrant.com> writes:
> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
>> samples  %        symbol name
>> 55526    16.5614  LWLockAcquire
>> 29721     8.8647  DoCopy
>> 26581     7.9281  CopyReadLine
>> 25105     7.4879  LWLockRelease
>> 15743     4.6956  PinBuffer
>> 14725     4.3919  heap_formtuple

> Probably loading a table with a generated PK or loading data in
> ascending sequence, so it's contending heavily for the rightmost edge of
> the index.

No, given that DoCopy and CopyReadLine are right up there, I think we're still looking at the COPY phase, not index building.

The profile will probably change completely once index building starts...

                        regards, tom lane
Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
>> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
>>> samples  %        symbol name
>>> 55526    16.5614  LWLockAcquire
>>> 29721     8.8647  DoCopy
>>> 26581     7.9281  CopyReadLine
>>> 25105     7.4879  LWLockRelease
>>> 15743     4.6956  PinBuffer
>>> 14725     4.3919  heap_formtuple
>
>> Probably loading a table with a generated PK or loading data in
>> ascending sequence, so it's contending heavily for the rightmost edge of
>> the index.
>
> No, given that DoCopy and CopyReadLine are right up there, I think we're
> still looking at the COPY phase, not index building.
>
> The profile will probably change completely once index building
> starts...

Yeah, this profile is only showing the COPY phase ...

Stefan
On Fri, 2008-09-26 at 14:00 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
> >> samples  %        symbol name
> >> 55526    16.5614  LWLockAcquire
> >> 29721     8.8647  DoCopy
> >> 26581     7.9281  CopyReadLine
> >> 25105     7.4879  LWLockRelease
> >> 15743     4.6956  PinBuffer
> >> 14725     4.3919  heap_formtuple
>
> > Probably loading a table with a generated PK or loading data in
> > ascending sequence, so it's contending heavily for the rightmost edge of
> > the index.
>
> No, given that DoCopy and CopyReadLine are right up there, I think we're
> still looking at the COPY phase, not index building.
>
> The profile will probably change completely once index building
> starts...

Sorry, was assuming we were loading with indexes on, which is wrong. Agree the profile looks odd.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
On Fri, 2008-09-26 at 20:07 +0200, Stefan Kaltenbrunner wrote:
> Tom Lane wrote:
> > Simon Riggs <simon@2ndQuadrant.com> writes:
> >> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
> >>> samples  %        symbol name
> >>> 55526    16.5614  LWLockAcquire
> >>> 29721     8.8647  DoCopy
> >>> 26581     7.9281  CopyReadLine
> >>> 25105     7.4879  LWLockRelease
> >>> 15743     4.6956  PinBuffer
> >>> 14725     4.3919  heap_formtuple
> >
> >> Probably loading a table with a generated PK or loading data in
> >> ascending sequence, so it's contending heavily for the rightmost edge of
> >> the index.
> >
> > No, given that DoCopy and CopyReadLine are right up there, I think we're
> > still looking at the COPY phase, not index building.
> >
> > The profile will probably change completely once index building
> > starts...
>
> Yeah, this profile is only showing the COPY phase ...

Try using this Postgres core patch. It's an updated version of my fast_copy.v4.patch from the -patches thread "Bulk Insert tuning", 20 Mar 2008.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support