Hi,
On 2018-06-04 16:47:29 +0300, Konstantin Knizhnik wrote:
> We in PostgresProc were faced with lock extension contention problem at two
> more customers and tried to use this patch (v13) to address this issue.
> Unfortunately replacing heavy lock with lwlock couldn't completely eliminate
> contention, now most of backends are blocked on conditional variable:
>
> 0x00007fb03a318903 in __epoll_wait_nocancel () from /lib64/libc.so.6
> #0 0x00007fb03a318903 in __epoll_wait_nocancel () from /lib64/libc.so.6
> #1 0x00000000007024ee in WaitEventSetWait ()
> #2 0x0000000000718fa6 in ConditionVariableSleep ()
> #3 0x000000000071954d in RelExtLockAcquire ()
That doesn't necessarily mean that the postgres code is at fault
here. It's entirely possible that the filesystem or storage is the
bottleneck. Could you briefly describe the workload and hardware?
> Second problem we observed was even more critical: if backend is granted
> relation extension lock and then got some error before releasing this lock,
> then abort of the current transaction doesn't release this lock (unlike
> heavy weight lock) and the relation is kept locked.
> So database is actually stalled and server has to be restarted.
That obviously needs to be fixed...
Greetings,
Andres Freund