Thread: Circular-freelist bug is still there

From
Tom Lane
Date:
I just saw the parallel regression tests hang up again.  Inspection
revealed that StrategyInvalidateBuffer() was stuck in an infinite loop
because the freelist was circular.

(gdb) p StrategyControl->listFreeBuffers
$5 = 579
(gdb) p BufferDescriptors[579]
$6 = {bufNext = 106, data = 4991904, tag = {rnode = {tblNode = 17142,
      relNode = 143947}, blockNum = 0}, buf_id = 579, flags = 14,
  refcount = 0, io_in_progress_lock = 1179, cntx_lock = 1180,
  cntxDirty = 0 '\000', wait_backend_id = 0}
(gdb) p BufferDescriptors[106]
$7 = {bufNext = 684, data = 1117088, tag = {rnode = {tblNode = 17142,
      relNode = 143989}, blockNum = 0}, buf_id = 106, flags = 14,
  refcount = 0, io_in_progress_lock = 233, cntx_lock = 234,
  cntxDirty = 0 '\000', wait_backend_id = 0}
(gdb) p BufferDescriptors[684]
$8 = {bufNext = 579, data = 5852064, tag = {rnode = {tblNode = 17142,
      relNode = 143929}, blockNum = 0}, buf_id = 684, flags = 14,
  refcount = 0, io_in_progress_lock = 1389, cntx_lock = 1390,
  cntxDirty = 0 '\000', wait_backend_id = 0}
(gdb)

Don't have time to chase it right now, but you should know that there's
still a low-probability bug in there.
        regards, tom lane
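
The gdb session above shows the chain 579 -> 106 -> 684 -> 579, so any loop that walks bufNext until it hits an end marker never terminates. As a minimal sketch of how such a loop could be detected, the function below runs Floyd's tortoise-and-hare check over an index-linked list. The struct, the FREELIST_END marker, and the function name are assumptions made for illustration; only the bufNext field name is taken from the debugger output, and none of this is the actual buffer-manager code.

```c
#include <assert.h>
#include <stdbool.h>

#define FREELIST_END (-1)   /* hypothetical end-of-list marker */

/* Simplified stand-in for a buffer descriptor: only the free-list link. */
typedef struct
{
    int bufNext;            /* index of next free buffer, or FREELIST_END */
} SketchBufferDesc;

/*
 * Return true if the chain starting at 'head' loops back on itself.
 * 'slow' advances one link per iteration and 'fast' two; on a list that
 * ends, 'fast' reaches FREELIST_END first, and on a circular list the
 * two indexes eventually coincide.
 */
static bool
freelist_is_circular(const SketchBufferDesc *bufs, int nbufs, int head)
{
    int slow = head;
    int fast = head;

    while (fast != FREELIST_END)
    {
        assert(fast >= 0 && fast < nbufs);
        fast = bufs[fast].bufNext;
        if (fast == FREELIST_END)
            break;

        assert(fast >= 0 && fast < nbufs);
        fast = bufs[fast].bufNext;
        slow = bufs[slow].bufNext;

        if (fast == slow)
            return true;
    }
    return false;
}
```

Fed the three links from the gdb output, with head 579, the check reports a cycle. A debugging assertion of this kind would turn the silent hang into an immediate failure, at the cost of an extra list walk.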


Re: Circular-freelist bug is still there

From
Jan Wieck
Date:
Tom Lane wrote:
> I just saw the parallel regression tests hang up again.  Inspection
> revealed that StrategyInvalidateBuffer() was stuck in an infinite loop
> because the freelist was circular.
> 
> (gdb) p StrategyControl->listFreeBuffers
> $5 = 579
> (gdb) p BufferDescriptors[579]
> $6 = {bufNext = 106, data = 4991904, tag = {rnode = {tblNode = 17142,
>       relNode = 143947}, blockNum = 0}, buf_id = 579, flags = 14,
>   refcount = 0, io_in_progress_lock = 1179, cntx_lock = 1180,
>   cntxDirty = 0 '\000', wait_backend_id = 0}
> (gdb) p BufferDescriptors[106]
> $7 = {bufNext = 684, data = 1117088, tag = {rnode = {tblNode = 17142,
>       relNode = 143989}, blockNum = 0}, buf_id = 106, flags = 14,
>   refcount = 0, io_in_progress_lock = 233, cntx_lock = 234,
>   cntxDirty = 0 '\000', wait_backend_id = 0}
> (gdb) p BufferDescriptors[684]
> $8 = {bufNext = 579, data = 5852064, tag = {rnode = {tblNode = 17142,
>       relNode = 143929}, blockNum = 0}, buf_id = 684, flags = 14,
>   refcount = 0, io_in_progress_lock = 1389, cntx_lock = 1390,
>   cntxDirty = 0 '\000', wait_backend_id = 0}
> (gdb)
> 
> Don't have time to chase it right now, but you should know that there's
> still a low-probability bug in there.

I was under the assumption Neil was still working on this. Don't recall 
exactly why.

Anyhow, according to our discussion in early January I have changed the 
code in StrategyInvalidateBuffer() so that it clears out the buffer tag 
and the CDB's buffer tag. Also it will error out if the CDB is not found 
at all.
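
To make the described change concrete, here is a rough sketch of an invalidate routine that clears the tag in both the buffer and its CDB, and reports failure when no CDB is found. All types and function names are hypothetical, the lookup by buf_id is a simplification, and the failure is signalled by a return value only because the thread does not show how the real code raises the error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; field names loosely follow the gdb output. */
typedef struct
{
    unsigned int tblNode;
    unsigned int relNode;
    unsigned int blockNum;
} SketchTag;

typedef struct
{
    SketchTag tag;      /* page this directory entry describes */
    int       buf_id;   /* buffer holding that page */
} SketchCDB;

typedef struct
{
    SketchTag tag;
    int       buf_id;
} SketchBuf;

/* Linear search standing in for whatever lookup the real code uses. */
static SketchCDB *
sketch_find_cdb(SketchCDB *cdbs, size_t ncdbs, int buf_id)
{
    for (size_t i = 0; i < ncdbs; i++)
        if (cdbs[i].buf_id == buf_id)
            return &cdbs[i];
    return NULL;
}

/*
 * Clear the tag in both the buffer and its CDB.  Returns false, leaving
 * the buffer untouched, when no CDB exists for the buffer; the caller is
 * expected to treat that as an error.
 */
static bool
sketch_invalidate_buffer(SketchBuf *buf, SketchCDB *cdbs, size_t ncdbs)
{
    SketchCDB *cdb;

    assert(buf != NULL);
    cdb = sketch_find_cdb(cdbs, ncdbs, buf->buf_id);
    if (cdb == NULL)
        return false;

    memset(&cdb->tag, 0, sizeof(cdb->tag));
    memset(&buf->tag, 0, sizeof(buf->tag));
    return true;
}
```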

The BM_FREE flag (meaning BM_UNPINNED effectively) is gone and replaced 
with direct checks against the refcount.
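
A small sketch of the idea, under the assumption that "unpinned" simply means refcount == 0: once the test reads the counter directly, there is no separate flag bit that could drift out of step with it. The struct and function names are hypothetical and not taken from the committed change.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical buffer state: the pin count is the only source of truth. */
typedef struct
{
    unsigned int refcount;  /* number of pins currently held */
} SketchBufState;

static void
sketch_pin(SketchBufState *buf)
{
    buf->refcount++;
}

static void
sketch_unpin(SketchBufState *buf)
{
    assert(buf->refcount > 0);  /* unpinning an unpinned buffer is a bug */
    buf->refcount--;
}

/* Replaces a flag test such as (flags & BM_FREE) with a direct check. */
static bool
sketch_is_unpinned(const SketchBufState *buf)
{
    return buf->refcount == 0;
}
```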


Thanks for reminding,
Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



Re: Circular-freelist bug is still there

From
Tom Lane
Date:
> Oh, okay.  So when's that fix going to be committed?

Never mind, I see you just did ...
        regards, tom lane


Re: Circular-freelist bug is still there

From
Tom Lane
Date:
Jan Wieck <JanWieck@Yahoo.com> writes:
> Tom Lane wrote:
>> I just saw the parallel regression tests hang up again.

> Anyhow, according to our discussion in early January I have changed the 
> code in StrategyInvalidateBuffer() so that it clears out the buffer tag 
> and the CDB's buffer tag. Also it will error out if the CDB is not found 
> at all.

Oh, okay.  So when's that fix going to be committed?
        regards, tom lane