Thread: Bug in new buffer freelist code
I just had the parallel regression tests hang up due to what appears to
be a bug in the new ARC code.  The CLUSTER test gets into an infinite
loop trying to do "CLUSTER clstr_1;".  The loop is in
StrategyInvalidateBuffer's check that the buffer is already in the
freelist; it isn't, and the freelist is circular.

(gdb) bt
#0  0x1fe8a8 in StrategyInvalidateBuffer (buf=0xc3a56f60) at freelist.c:733
#1  0x1fbf08 in FlushRelationBuffers (rel=0x400fa298, firstDelBlock=0)
    at bufmgr.c:1596
#2  0x1479fc in swap_relfilenodes (r1=143786, r2=143915) at cluster.c:736
#3  0x147458 in rebuild_relation (OldHeap=0x2322b, indexOid=143788)
    at cluster.c:455
#4  0x1473b0 in cluster_rel (rvtc=0x7b03bed8, recheck=0 '\000')
    at cluster.c:395
#5  0x146ff4 in cluster (stmt=0x400b88a8) at cluster.c:232
#6  0x21c60c in ProcessUtility (parsetree=0x400b88a8, dest=0x400b88e8,
    completionTag=0x7b03bbe8 "") at utility.c:1033
... etc ...

(gdb) p *buf
$5 = {bufNext = -1, data = 7211904, tag = {rnode = {tblNode = 17142,
    relNode = 143906}, blockNum = 0}, buf_id = 850, flags = 14,
  refcount = 0, io_in_progress_lock = 1721, cntx_lock = 1722,
  cntxDirty = 0 '\000', wait_backend_id = 0}
(gdb) p *StrategyControl
$1 = {target_T1_size = 423, listUnusedCDB = 249, listHead = {464, 967, 1692,
    1227}, listTail = {968, 645, 1528, 1694}, listSize = {364, 413, 584, 636},
  listFreeBuffers = 839, num_lookup = 546939, num_hit = {1378, 246896, 282639,
    3935}, stat_report = 0, cdb = {{prev = 386, next = 23, list = 3,
      buf_tag = {rnode = {tblNode = 17142, relNode = 19080}, blockNum = 30},
      buf_id = -1, t1_xid = 3402}}}
(gdb) p BufferDescriptors[839]
$2 = {bufNext = 839, data = 7121792, tag = {rnode = {tblNode = 17142,
    relNode = 143906}, blockNum = 0}, buf_id = 839, flags = 14,
  refcount = 0, io_in_progress_lock = 1699, cntx_lock = 1700,
  cntxDirty = 0 '\000', wait_backend_id = 0}

So we've got a couple of problems here: buffers 839 and 850 both claim
to contain block 0 of rel 143906 (which is clstr_1), and the freelist
is circular.

This doesn't seem to be super reproducible, but there's definitely a
problem in there somewhere.

			regards, tom lane
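For illustration, a self-contained toy (simplified names, not the actual
freelist.c code) of the freelist walk that hangs once the list is circular
--- note bufNext = 839 in BufferDescriptors[839] above, i.e. the entry
links back to itself:

    #include <stdio.h>

    /* Minimal stand-in for a buffer descriptor: only the freelist link. */
    typedef struct
    {
        int bufNext;            /* next buffer id on the freelist, -1 = end */
    } ToyBufDesc;

    int main(void)
    {
        /* Slot 2 links to itself, like BufferDescriptors[839] above. */
        ToyBufDesc bufs[3] = {{1}, {2}, {2}};
        int     target = 7;     /* a buffer id that is NOT on the list */
        int     steps = 0;

        /*
         * Walk the list looking for `target`, as the "is the buffer
         * already in the freelist?" check does.  With a circular list
         * this would spin forever; capped here so the demo terminates.
         */
        for (int id = 0; id >= 0; id = bufs[id].bufNext)
        {
            if (id == target)
                return 0;       /* found it: nothing to do */
            if (++steps > 10)
            {
                printf("circular freelist: end never reached\n");
                break;
            }
        }
        return 0;
    }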
Tom Lane wrote:
> I just had the parallel regression tests hang up due to what appears to
> be a bug in the new ARC code.  The CLUSTER test gets into an infinite
> loop trying to do "CLUSTER clstr_1;".  The loop is in
> StrategyInvalidateBuffer's check that the buffer is already in the
> freelist; it isn't, and the freelist is circular.

It seems to me that buffers that are thrown away via
StrategyInvalidateBuffer() do not get their relnode and blocknum cleaned
out.  That causes FlushRelationBuffers(), while doing a full scan of the
whole buffer pool, to find buffers that once contained the block again.

If buffer 839 once contained that block and was given up that way, and
later on buffer 850 contains it, there is a CDB for it.  If
FlushRelationBuffers() now scans the buffer pool, it will find buffer
839 first and call StrategyInvalidateBuffer() for it.  That finds the
CDB for buffer 850, and adds buffer 839 to the freelist again.  Later on
FlushRB() calls StrategyIB() for buffer 850 and we have the situation
at hand.

Does that make sense?


Jan

> (gdb) bt
> #0  0x1fe8a8 in StrategyInvalidateBuffer (buf=0xc3a56f60) at freelist.c:733
> #1  0x1fbf08 in FlushRelationBuffers (rel=0x400fa298, firstDelBlock=0)
>     at bufmgr.c:1596
> #2  0x1479fc in swap_relfilenodes (r1=143786, r2=143915) at cluster.c:736
> #3  0x147458 in rebuild_relation (OldHeap=0x2322b, indexOid=143788)
>     at cluster.c:455
> #4  0x1473b0 in cluster_rel (rvtc=0x7b03bed8, recheck=0 '\000')
>     at cluster.c:395
> #5  0x146ff4 in cluster (stmt=0x400b88a8) at cluster.c:232
> #6  0x21c60c in ProcessUtility (parsetree=0x400b88a8, dest=0x400b88e8,
>     completionTag=0x7b03bbe8 "") at utility.c:1033
> ... etc ...
>
> (gdb) p *buf
> $5 = {bufNext = -1, data = 7211904, tag = {rnode = {tblNode = 17142,
>     relNode = 143906}, blockNum = 0}, buf_id = 850, flags = 14,
>   refcount = 0, io_in_progress_lock = 1721, cntx_lock = 1722,
>   cntxDirty = 0 '\000', wait_backend_id = 0}
> (gdb) p *StrategyControl
> $1 = {target_T1_size = 423, listUnusedCDB = 249, listHead = {464, 967, 1692,
>     1227}, listTail = {968, 645, 1528, 1694}, listSize = {364, 413, 584, 636},
>   listFreeBuffers = 839, num_lookup = 546939, num_hit = {1378, 246896, 282639,
>     3935}, stat_report = 0, cdb = {{prev = 386, next = 23, list = 3,
>       buf_tag = {rnode = {tblNode = 17142, relNode = 19080}, blockNum = 30},
>       buf_id = -1, t1_xid = 3402}}}
> (gdb) p BufferDescriptors[839]
> $2 = {bufNext = 839, data = 7121792, tag = {rnode = {tblNode = 17142,
>     relNode = 143906}, blockNum = 0}, buf_id = 839, flags = 14,
>   refcount = 0, io_in_progress_lock = 1699, cntx_lock = 1700,
>   cntxDirty = 0 '\000', wait_backend_id = 0}
>
> So we've got a couple of problems here: buffers 839 and 850 both claim
> to contain block 0 of rel 143906 (which is clstr_1), and the freelist
> is circular.
>
> This doesn't seem to be super reproducible, but there's definitely a
> problem in there somewhere.
>
> 			regards, tom lane

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
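For what it's worth, the scan effect Jan describes can be pictured with
another self-contained toy (simplified fields, not the actual bufmgr.c
scan): a dead buffer whose tag was never cleared still matches a linear
tag scan of the pool, right alongside the live copy of the same block.

    #include <stdio.h>

    /* Toy descriptor: just the tag fields the scan compares, plus a
     * "dead" marker standing in for an invalidated buffer. */
    typedef struct
    {
        unsigned relNode;
        unsigned blockNum;
        int      dead;
    } ToyBufDesc;

    int main(void)
    {
        /* Buffer at slot 0 once held rel 143906 block 0 and was
         * invalidated, but its tag was never cleared; slot 1 holds
         * the live copy of the same block now. */
        ToyBufDesc pool[2] = {
            {143906, 0, 1},     /* stale: tag left behind */
            {143906, 0, 0}      /* current, live copy     */
        };

        /* A FlushRelationBuffers-style full scan matches BOTH entries,
         * so the already-dead buffer gets "invalidated" a second time. */
        for (int i = 0; i < 2; i++)
            if (pool[i].relNode == 143906 && pool[i].blockNum == 0)
                printf("scan matches buffer %d%s\n", i,
                       pool[i].dead ? " (stale tag!)" : "");
        return 0;
    }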
Jan Wieck <JanWieck@Yahoo.com> writes:
> It seems to me that buffers that are thrown away via
> StrategyInvalidateBuffer() do not get their relnode and blocknum cleaned
> out.

Mmmm.  They definitely should be; if you look at the prior version of
buf_table.c, BufTableDelete did this:

    /*
     * Clear the buffer's tag.  This doesn't matter for the hash table,
     * since the buffer is already removed from it, but it ensures that
     * sequential searches through the buffer table won't think the buffer
     * is still valid for its old page.
     */
    buf->tag.rnode.relNode = InvalidOid;
    buf->tag.rnode.tblNode = InvalidOid;

I see you removed that from the current version, but the equivalent
thing needs to be done someplace.  Is StrategyInvalidateBuffer the
right place?

BTW, it kinda looks like the BM_DELETED flag bit is useless now?

			regards, tom lane
BTW, I just managed to reproduce the hang, after a whole afternoon of
trying ... only it was with a non-debug build.  Sigh.  Anyway, it seems
my HP machine has a significantly higher probability of showing the
problem than my Linux machine --- I have been unable to see the problem
in thirty or forty "make checks" on that one.

Let me know if there's any test I could run to confirm your theory,
assuming I can make it happen again after I finish rebuilding.

			regards, tom lane
Tom Lane wrote:
> BTW, I just managed to reproduce the hang, after a whole afternoon of
> trying ... only it was with a non-debug build.  Sigh.  Anyway, it seems
> my HP machine has a significantly higher probability of showing the
> problem than my Linux machine --- I have been unable to see the problem
> in thirty or forty "make checks" on that one.
>
> Let me know if there's any test I could run to confirm your theory,
> assuming I can make it happen again after I finish rebuilding.

You could add another assertion that checks that the CDB found is
actually pointing to the buffer that is being invalidated.

Looking at the bufmgr.c code for FlushRelationBuffers() ... it does
exactly what I described ... leaving the relnode and blocknum for an
invalidated buffer just where they are.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
Jan Wieck <JanWieck@Yahoo.com> writes:
>> Let me know if there's any test I could run to confirm your theory,
>> assuming I can make it happen again after I finish rebuilding.

> You could add another assertion that checks that the CDB found is
> actually pointing to the buffer that is being invalidated.

Good thought.  What would that look like exactly?

> Looking at the bufmgr.c code for FlushRelationBuffers() ... it does
> exactly what I described ... leaving the relnode and blocknum for an
> invalidated buffer just where they are.

Yeah.  I don't think FlushRelationBuffers() is the place to fix it
though; if you try to fix it at that level there will be several places
to do it (DropBuffers etc).  It's unclear to me what the division of
labor is now between BufTableDelete and freelist.c, so I'm not sure
where you *should* fix it ... but it should be as close as possible to
where the buffer is removed from the hashtable, IMHO.

			regards, tom lane
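One plausible shape for the assertion Jan suggests --- a sketch only,
assuming `cdb` is the entry StrategyInvalidateBuffer has just looked up
by buf->tag (both buf_id fields are visible in the gdb dumps above):

    /*
     * Hypothetical sketch, not actual source: after looking up the CDB
     * for buf->tag, verify it points back at the buffer being
     * invalidated.  With a stale, uncleared tag the lookup can land on
     * the CDB of a *different* buffer now holding that page, and this
     * assertion would fire instead of the freelist getting corrupted.
     */
    Assert(cdb->buf_id == buf->buf_id);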
Tom Lane wrote:
> Jan Wieck <JanWieck@Yahoo.com> writes:
>> It seems to me that buffers that are thrown away via
>> StrategyInvalidateBuffer() do not get their relnode and blocknum cleaned
>> out.
>
> Mmmm.  They definitely should be; if you look at the prior version of
> buf_table.c, BufTableDelete did this:
>
>     /*
>      * Clear the buffer's tag.  This doesn't matter for the hash table,
>      * since the buffer is already removed from it, but it ensures that
>      * sequential searches through the buffer table won't think the buffer
>      * is still valid for its old page.
>      */
>     buf->tag.rnode.relNode = InvalidOid;
>     buf->tag.rnode.tblNode = InvalidOid;
>
> I see you removed that from the current version, but the equivalent
> thing needs to be done someplace.  Is StrategyInvalidateBuffer the
> right place?

Yeah, looks like ... but I actually want to have a clean reproduction
of the error before I attempt to fix it.  Will look at it deeper.

> BTW, it kinda looks like the BM_DELETED flag bit is useless now?

I think so, together with BM_FREE, which was redundant anyway.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
Jan Wieck <JanWieck@Yahoo.com> writes:
> Yeah, looks like ... but I actually want to have a clean reproduction
> of the error before I attempt to fix it.

Well, I can tell you that just running "make check" over and over isn't
a real efficient way to reproduce the problem.  It might work to do
something like stopping a FlushRelationBuffers call in its tracks with
the debugger, meanwhile accessing the same rel with another backend.

			regards, tom lane
Tom Lane wrote:
> Jan Wieck <JanWieck@Yahoo.com> writes:
>> Yeah, looks like ... but I actually want to have a clean reproduction
>> of the error before I attempt to fix it.
>
> Well, I can tell you that just running "make check" over and over isn't
> a real efficient way to reproduce the problem.  It might work to do
> something like stopping a FlushRelationBuffers call in its tracks with
> the debugger, meanwhile accessing the same rel with another backend.

So far I was not able to reproduce the problem.  If it is what we think,
then it should be possible to reproduce it with just one single DB
connection, no?


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
Jan Wieck <JanWieck@Yahoo.com> writes:
> So far I was not able to reproduce the problem.  If it is what we think,
> then it should be possible to reproduce it with just one single DB
> connection, no?

The reason it is difficult to reproduce is that when
StrategyInvalidateBuffer puts a dead buffer on the freelist, it puts it
at the head; therefore, the very next call of StrategyGetBuffer will
cause that buffer to be recycled for some other purpose.  The only way
the dead buffer will survive long enough for another
FlushRelationBuffers scan to find it again is if some other backend does
some other StrategyInvalidateBuffer calls very soon.

Hmm ... actually, just clustering a large relation might be enough, but
I didn't try that.  With the small relations that the regression tests
try to cluster, you don't see the failure because the dead buffers are
holding some other page by the time the bug could occur.

What I did to make it reproducible was to insert the attached hack that
makes StrategyInvalidateBuffer put the dead buffer at the end of the
freelist, not the start.  Then, with a freshly started postmaster, it
blows up immediately:

regression=# CREATE TABLE clstr_1 (a INT PRIMARY KEY);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "clstr_1_pkey" for table "clstr_1"
CREATE TABLE
regression=# INSERT INTO clstr_1 VALUES (2);
INSERT 335205 1
regression=# INSERT INTO clstr_1 VALUES (1);
INSERT 335206 1
regression=# CLUSTER clstr_1_pkey ON clstr_1;
CLUSTER
regression=# select * from clstr_1;
 a
---
 1
 2
(2 rows)

regression=# CLUSTER clstr_1;
ERROR:  StrategyInvalidateBuffer() buffer 174 not in directory or freelist
CONTEXT:  writing block 0 of relation 154625/335218

With this mod we don't see the infinite loop, and the reason is that
inserting a buffer into the freelist when it's already there doesn't
make a circular freelist (it may lose some of the freelist, instead).
I'm not recommending the mod for anything except exhibiting this bug,
however.
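Why head insertion hides the bug, as a self-contained toy of the LIFO
freelist behavior (simplified, not the real code): the buffer freed last
is the very next one handed out, so its stale tag gets overwritten
almost immediately.

    #include <stdio.h>

    /* Toy LIFO freelist over an array of links; -1 terminates the list. */
    static int bufNext[4];
    static int head = -1;

    static void put_on_freelist(int id)     /* StrategyInvalidateBuffer */
    {
        bufNext[id] = head;
        head = id;                          /* push at the HEAD */
    }

    static int get_buffer(void)             /* StrategyGetBuffer */
    {
        int id = head;
        head = bufNext[id];                 /* pop from the HEAD */
        return id;
    }

    int main(void)
    {
        put_on_freelist(2);                 /* dead buffer goes first in line */
        /* The just-invalidated buffer is handed out immediately and its
         * stale tag is overwritten, so a later FlushRelationBuffers scan
         * rarely finds it --- hence the poor reproducibility. */
        printf("next buffer handed out: %d\n", get_buffer());
        return 0;
    }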
I had a breakpoint in StrategyInvalidateBuffer while doing the above,
and saw the following sequence of passed-in buffers:

Breakpoint 3, StrategyInvalidateBuffer (buf=0xc1fa4e00) at freelist.c:711

$1 = {bufNext = -1, data = 1502080, tag = {rnode = {tblNode = 154625,
    relNode = 335212}, blockNum = 0}, buf_id = 153, flags = 14,
  refcount = 0, io_in_progress_lock = 327, cntx_lock = 328,
  cntxDirty = 0 '\000', wait_backend_id = 0}

$2 = {bufNext = -1, data = 1559424, tag = {rnode = {tblNode = 154625,
    relNode = 335218}, blockNum = 0}, buf_id = 160, flags = 14,
  refcount = 0, io_in_progress_lock = 341, cntx_lock = 342,
  cntxDirty = 0 '\000', wait_backend_id = 0}

-- this call is redundant (it's from DropRelationBuffers after a FlushRB)
$3 = {bufNext = 160, data = 1502080, tag = {rnode = {tblNode = 154625,
    relNode = 335212}, blockNum = 0}, buf_id = 153, flags = 14,
  refcount = 0, io_in_progress_lock = 327, cntx_lock = 328,
  cntxDirty = 0 '\000', wait_backend_id = 0}

$4 = {bufNext = -1, data = 1493888, tag = {rnode = {tblNode = 154625,
    relNode = 335214}, blockNum = 0}, buf_id = 152, flags = 14,
  refcount = 0, io_in_progress_lock = 325, cntx_lock = 326,
  cntxDirty = 0 '\000', wait_backend_id = 0}

$5 = {bufNext = -1, data = 1510272, tag = {rnode = {tblNode = 154625,
    relNode = 335214}, blockNum = 1}, buf_id = 154, flags = 14,
  refcount = 0, io_in_progress_lock = 329, cntx_lock = 330,
  cntxDirty = 0 '\000', wait_backend_id = 0}

$6 = {bufNext = -1, data = 1665920, tag = {rnode = {tblNode = 154625,
    relNode = 335220}, blockNum = 0}, buf_id = 173, flags = 14,
  refcount = 0, io_in_progress_lock = 367, cntx_lock = 368,
  cntxDirty = 0 '\000', wait_backend_id = 0}

$7 = {bufNext = -1, data = 1682304, tag = {rnode = {tblNode = 154625,
    relNode = 335220}, blockNum = 1}, buf_id = 175, flags = 14,
  refcount = 0, io_in_progress_lock = 371, cntx_lock = 372,
  cntxDirty = 0 '\000', wait_backend_id = 0}

-- this call is redundant
$8 = {bufNext = 154, data = 1493888, tag = {rnode = {tblNode = 154625,
    relNode = 335214}, blockNum = 0}, buf_id = 152, flags = 14,
  refcount = 0, io_in_progress_lock = 325, cntx_lock = 326,
  cntxDirty = 0 '\000', wait_backend_id = 0}

-- this call is redundant
$9 = {bufNext = 173, data = 1510272, tag = {rnode = {tblNode = 154625,
    relNode = 335214}, blockNum = 1}, buf_id = 154, flags = 14,
  refcount = 0, io_in_progress_lock = 329, cntx_lock = 330,
  cntxDirty = 0 '\000', wait_backend_id = 0}

-- this call is redundant
$10 = {bufNext = 154, data = 1493888, tag = {rnode = {tblNode = 154625,
    relNode = 335214}, blockNum = 0}, buf_id = 152, flags = 14,
  refcount = 0, io_in_progress_lock = 325, cntx_lock = 326,
  cntxDirty = 0 '\000', wait_backend_id = 0}

-- this call is redundant
$11 = {bufNext = 173, data = 1510272, tag = {rnode = {tblNode = 154625,
    relNode = 335214}, blockNum = 1}, buf_id = 154, flags = 14,
  refcount = 0, io_in_progress_lock = 329, cntx_lock = 330,
  cntxDirty = 0 '\000', wait_backend_id = 0}

-- this call is redundant
$12 = {bufNext = 160, data = 1502080, tag = {rnode = {tblNode = 154625,
    relNode = 335212}, blockNum = 0}, buf_id = 153, flags = 14,
  refcount = 0, io_in_progress_lock = 327, cntx_lock = 328,
  cntxDirty = 0 '\000', wait_backend_id = 0}

second CLUSTER starts here:

-- this call is redundant
$13 = {bufNext = 152, data = 1559424, tag = {rnode = {tblNode = 154625,
    relNode = 335218}, blockNum = 0}, buf_id = 160, flags = 14,
  refcount = 0, io_in_progress_lock = 341, cntx_lock = 342,
  cntxDirty = 0 '\000', wait_backend_id = 0}

$14 = {bufNext = -1, data = 1674112, tag = {rnode = {tblNode = 154625,
    relNode = 335218}, blockNum = 0}, buf_id = 174, flags = 14,
  refcount = 0, io_in_progress_lock = 369, cntx_lock = 370,
  cntxDirty = 0 '\000', wait_backend_id = 0}

Breakpoint 1, errfinish (dummy=0) at elog.c:320
320             int             elevel = edata->elevel;
(gdb) bt
#0  errfinish (dummy=0) at elog.c:320
#1  0x29675c in elog_finish (elevel=1074079512,
    fmt=0xa5da4 "StrategyInvalidateBuffer() buffer %d not in directory or freelist")
    at elog.c:853
#2  0x1fedb0 in StrategyInvalidateBuffer (buf=0xc1fa50a0) at freelist.c:751
#3  0x1fc360 in FlushRelationBuffers (rel=0x400cb350, firstDelBlock=0)
    at bufmgr.c:1596
#4  0x147c14 in swap_relfilenodes (r1=335212, r2=335221) at cluster.c:736
#5  0x147670 in rebuild_relation (OldHeap=0x51d75, indexOid=335214)
    at cluster.c:455
#6  0x1475c8 in cluster_rel (rvtc=0x7b03b8b8, recheck=0 '\000')
    at cluster.c:395
#7  0x14720c in cluster (stmt=0x400b8de0) at cluster.c:232
#8  0x21cc64 in ProcessUtility (parsetree=0x400b8de0, dest=0x400b8e20,
    completionTag=0x7b03b5c8 "") at utility.c:1033

Note that buffer 160 gets cleared twice in this sequence.  The second
time, the CDB for 174 gets found, leading to failure when 174 is
cleared.

I believe the correct fix is to do CLEAR_BUFFERTAG on the buffer (and
maybe the CDB too?) when returning a buffer to the freelist in
StrategyInvalidateBuffer.  It might also be worthwhile to add another
BM_FLAG bit that specifically indicates a buffer is on the freelist,
and set/clear/test that at appropriate spots.

I am actually of the opinion that the assertion code at the bottom of
StrategyInvalidateBuffer is all wrong: it should *never* be possible
to call StrategyInvalidateBuffer on something that's already in the
freelist, and the only reason you had that in there was because of
this mistaken failure to clear the buffertag, leading to wrong calls
from linear searches of the buffer array.  I think you should just
elog(ERROR) any time you fail to find a CDB in this routine.

			regards, tom lane

*** src/backend/storage/buffer/freelist.c.orig	Sat Nov 29 14:51:56 2003
--- src/backend/storage/buffer/freelist.c	Wed Jan  7 12:44:33 2004
***************
*** 718,725 ****
  	cdb->next = StrategyControl->listUnusedCDB;
  	StrategyControl->listUnusedCDB = cdb_id;
  
! 	buf->bufNext = StrategyControl->listFreeBuffers;
! 	StrategyControl->listFreeBuffers = buf->buf_id;
  
  	return;
  }
--- 718,737 ----
  	cdb->next = StrategyControl->listUnusedCDB;
  	StrategyControl->listUnusedCDB = cdb_id;
  
! 	if (StrategyControl->listFreeBuffers < 0)
! 	{
! 		buf->bufNext = StrategyControl->listFreeBuffers;
! 		StrategyControl->listFreeBuffers = buf->buf_id;
! 	}
! 	else
! 	{
! 		/* stash it at the bottom of the freelist */
! 		buf_id = StrategyControl->listFreeBuffers;
! 		while (BufferDescriptors[buf_id].bufNext >= 0)
! 			buf_id = BufferDescriptors[buf_id].bufNext;
! 		buf->bufNext = -1;
! 		BufferDescriptors[buf_id].bufNext = buf->buf_id;
! 	}
  
  	return;
  }
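For concreteness, a sketch of the fix being proposed --- illustrative
only, grafted onto the code shape visible in the diff above; the three
tag assignments could equally be the CLEAR_BUFFERTAG macro Tom mentions:

    /*
     * Sketch, not a tested patch: inside StrategyInvalidateBuffer,
     * around where the buffer is returned to the freelist.  Field names
     * follow the gdb dumps and the old BufTableDelete code quoted
     * earlier in the thread; the exact placement of cdb_id is assumed.
     */
    if (cdb_id < 0)                     /* no CDB found: caller bug now */
        elog(ERROR, "StrategyInvalidateBuffer: buffer %d not in directory",
             buf->buf_id);

    /* Wipe the tag so linear scans (FlushRelationBuffers, DropBuffers,
     * ...) can't mistake this dead buffer for its old page again. */
    buf->tag.rnode.relNode = InvalidOid;
    buf->tag.rnode.tblNode = InvalidOid;
    buf->tag.blockNum = InvalidBlockNumber;

    buf->bufNext = StrategyControl->listFreeBuffers;
    StrategyControl->listFreeBuffers = buf->buf_id;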
Tom Lane wrote:
> Note that buffer 160 gets cleared twice in this sequence.  The second
> time, the CDB for 174 gets found, leading to failure when 174 is
> cleared.
>
> I believe the correct fix is to do CLEAR_BUFFERTAG on the buffer (and
> maybe the CDB too?) when returning a buffer to the freelist in
> StrategyInvalidateBuffer.  It might also be worthwhile to add another
> BM_FLAG bit that specifically indicates a buffer is on the freelist,
> and set/clear/test that at appropriate spots.

That was the place I intended to do it.

> I am actually of the opinion that the assertion code at the bottom of
> StrategyInvalidateBuffer is all wrong: it should *never* be possible
> to call StrategyInvalidateBuffer on something that's already in the
> freelist, and the only reason you had that in there was because of
> this mistaken failure to clear the buffertag, leading to wrong calls
> from linear searches of the buffer array.  I think you should just
> elog(ERROR) any time you fail to find a CDB in this routine.

That is correct.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
Tom Lane <tgl@sss.pgh.pa.us> writes:
> It might also be worthwhile to add another BM_FLAG bit that
> specifically indicates a buffer is on the freelist, and
> set/clear/test that at appropriate spots.

ISTM that BM_FREE should indicate this, or am I misunderstanding you?

-Neil
Neil Conway <neilc@samurai.com> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> It might also be worthwhile to add another BM_FLAG bit that
>> specifically indicates a buffer is on the freelist, and
>> set/clear/test that at appropriate spots.

> ISTM that BM_FREE should indicate this, or am I misunderstanding you?

No.  BM_FREE means refcount == 0.  In the old implementation this meant
it was "on the freelist", but in the ARC code there are several
"freelists".  I was speaking specifically of the freelist that's used
to hold completely-dead buffers (buffers not containing any valid page).

It might be a good idea to rename BM_FREE to something else, perhaps
BM_UNPINNED, since I can recall being confused about what it meant too.
Also, if Jan likes the idea of adding a flag bit for this purpose, maybe
there should be a flag bit associated with each of the ARC freelists, so
you can tell positively where a "free" buffer is supposed to be.  The
dead-buffer freelist may not need a flag bit per se, since the condition
of the buffertag being "clear" should be equivalent.

			regards, tom lane
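A hypothetical sketch of the flag layout under discussion --- the names
and bit positions here are invented for illustration, not taken from
buf_internals.h:

    /*
     * Illustrative only.  One bit per ARC buffer queue, so assertions
     * can verify where an unpinned buffer is supposed to live.  The
     * dead-buffer freelist may not need its own bit, since a cleared
     * buffertag already identifies those buffers.
     */
    #define BM_UNPINNED     (1 << 5)    /* refcount == 0 (was BM_FREE) */
    #define BM_ON_T1        (1 << 6)    /* on ARC queue T1 */
    #define BM_ON_T2        (1 << 7)    /* on ARC queue T2 */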
Tom Lane <tgl@sss.pgh.pa.us> writes:
> It might be a good idea to rename BM_FREE to something else, perhaps
> BM_UNPINNED, since I can recall being confused about what it meant
> too.

If all it indicates is refcount == 0, ISTM we can just get rid of it
altogether, and just check the shared refcount directly.

> Also, if Jan likes the idea of adding a flag bit for this purpose,
> maybe there should be a flag bit associated with each of the ARC
> freelists, so you can tell positively where a "free" buffer is
> supposed to be.

Seems like a good idea to me...

-Neil
Neil Conway <neilc@samurai.com> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> It might be a good idea to rename BM_FREE to something else, perhaps
>> BM_UNPINNED, since I can recall being confused about what it meant
>> too.

> If all it indicates is refcount == 0, ISTM we can just get rid of it
> altogether, and just check the shared refcount directly.

Good point.

			regards, tom lane