On 18/12/2024 08:17, Richard Guo wrote:
> Recently, I've encountered a core dump several times on master, with a
> backtrace like the one below. This one happened on 0f23dedc9. I was
> running some fuzz testing and had started around 20 sessions
> concurrently.
>
> (gdb) bt
> #0 in GrantLockLocal at lock.c:1758
> #1 in GrantAwaitedLock at lock.c:1840
> #2 in LockErrorCleanup at proc.c:809
> #3 in AbortTransaction at xact.c:2846
> #4 in AbortCurrentTransactionInternal at xact.c:3520
> #5 in AbortCurrentTransaction at xact.c:3449
> #6 in PostgresMain at postgres.c:4535
> #7 in BackendMain at backend_startup.c:107
> #8 in postmaster_child_launch at launch_backend.c:274
> #9 in BackendStartup at postmaster.c:3391
> #10 in ServerLoop at postmaster.c:1678
> #11 in PostmasterMain at postmaster.c:1376
> #12 in main at main.c:224
>
> It seems that the lock request is not granted as expected, since
> locallock->lockOwners is a NULL pointer.
>
> (gdb) p locallock->lockOwners
> $4 = (LOCALLOCKOWNER *) 0x0
> (gdb) p locallock->numLockOwners
> $5 = 0
> (gdb) p locallock->maxLockOwners
> $6 = 8
>
> Unfortunately, I don't have a reliable way to trigger this issue. I'm
> wondering if anyone has any insights into what might be happening.

I don't know how that can happen, but I suspect commit 3c0fd64fec
because it changed things in that area. If you can find a way to
reproduce that even sporadically, that would be very helpful!
--
Heikki Linnakangas
Neon (https://neon.tech)