Re: issue with gininsert under very high load - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: issue with gininsert under very high load
Date
Msg-id 52FCE762.9000305@dunslane.net
In response to Re: issue with gininsert under very high load  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: issue with gininsert under very high load  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
On 02/12/2014 04:04 PM, Heikki Linnakangas wrote:
> On 02/12/2014 10:50 PM, Andres Freund wrote:
>> On February 12, 2014 9:33:38 PM CET, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Andres Freund <andres@2ndquadrant.com> writes:
>>>> On 2014-02-12 14:39:37 -0500, Andrew Dunstan wrote:
>>>>> On investigation I found that a number of processes were locked waiting for
>>>>> one wedged process to end its transaction, which never happened (this
>>>>> transaction should normally take milliseconds). oprofile revealed that
>>>>> postgres was spending 87% of its time in s_lock(), and strace on the wedged
>>>>> process revealed that it was in a tight loop constantly calling select(). It
>>>>> did not respond to a SIGTERM.
>>>
>>>> That's a deficiency of the gin fastupdate cache: a) it bases its size
>>>> on work_mem which usually makes it *far* too big b) it doesn't perform the
>>>> cleanup in one go if it can get a suitable lock, but does independent
>>>> locking for each entry. That usually leads to absolutely horrific
>>>> performance under concurrency.
>>>
>>> I'm not sure that what Andrew is describing can fairly be called a
>>> concurrent-performance problem.  It sounds closer to a stuck lock.
>>> Are you sure you've diagnosed it correctly?
>>
>> No. But I've several times seen similar backtraces where it wasn't
>> actually stuck, just livelocked. I'm just on my mobile right now, but
>> afair Andrew described a loop involving lots of semaphores and
>> spinlocks, which shouldn't be the case if it were actually stuck.
>> If there are dozens of processes waiting on the same lock, cleaning up a
>> large amount of items one by one, it's not surprising if it's
>> dramatically slow.
>
> Perhaps we should use a lock to enforce that only one process tries to 
> clean up the pending list at a time.
>

Is that going to serialize all these inserts?

cheers

andrew
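
On the serialization question above: one way a single-cleaner lock can avoid
serializing inserts is to take it with a non-blocking try-lock, so that a
process which finds cleanup already in progress skips it and carries on. The
following is only a toy sketch of that idea in Python threads. It is not
PostgreSQL code, and every name in it (PendingList, try_cleanup,
cleanup_threshold) is hypothetical and chosen for illustration.

```python
import threading


class PendingList:
    """Toy stand-in for a pending list of index entries (not PostgreSQL code)."""

    def __init__(self, cleanup_threshold):
        self.cleanup_threshold = cleanup_threshold  # hypothetical size trigger
        self._items = []                       # entries not yet merged
        self._items_lock = threading.Lock()    # short-lived, protects _items
        self._cleanup_lock = threading.Lock()  # held for a whole cleanup pass
        self.merged = []                       # stands in for the main index
        self.cleanup_runs = 0
        self.cleanup_skips = 0

    def insert(self, entry):
        # Appending only needs the short lock, so inserts do not wait
        # for a cleanup pass that another thread is running.
        with self._items_lock:
            self._items.append(entry)
            needs_cleanup = len(self._items) >= self.cleanup_threshold
        if needs_cleanup:
            self.try_cleanup()

    def try_cleanup(self):
        # Non-blocking acquire: if another thread is already cleaning,
        # return at once instead of queueing behind it.
        if not self._cleanup_lock.acquire(blocking=False):
            with self._items_lock:
                self.cleanup_skips += 1
            return False
        try:
            # Detach the whole batch in one step, then merge it without
            # holding the lock that inserters need.
            with self._items_lock:
                batch, self._items = self._items, []
            self.merged.extend(batch)
            self.cleanup_runs += 1
            return True
        finally:
            self._cleanup_lock.release()
```

The trade-off in this sketch is that the pending list can grow past the
threshold while a cleanup pass is running, because inserters that lose the
try-lock simply keep appending.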



