Re: [HACKERS] strange behaviour on pooled alloc - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: [HACKERS] strange behaviour on pooled alloc
Msg-id 199902042155.QAA19391@candle.pha.pa.us
In response to strange behaviour on pooled alloc  (jwieck@debis.com (Jan Wieck))
List pgsql-hackers
> Hi,
> 
>     I'm  continuing  with  the pooled palloc() stuff and am stuck
>     with a very  strange  thing.  I've  reverted  my  changes  to
>     palloc()  and am doing all the memory block pool handling now
>     in aset.c.
> 
>     The benefit from this will be that I later  can  easily  make
>     palloc() etc. macros.

Sounds good.

>     The  new  version of the AllocSet...() functions does not use
>     ordered set.  It manages the block pools itself. Has the same
>     10%  speedup and I expect some more from the macro version of
>     palloc(). It aligns small  allocations  to  power  of  2  for
>     better  reusability  of  free'd  chunks  which  are held in 8
>     different free lists per alloc set depending on  their  size.
>     It lost the ability of AllocSetDump() - who needs that?

No one.
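
For anyone reading along, here is a rough sketch of the size-class idea
described above: round a small request up to a power of 2 and pick one
of 8 free lists from the result. The names freelist_index(),
chunk_size() and MIN_CHUNK_SHIFT are hypothetical, and the 16-byte
minimum chunk is an assumption (it happens to make the 8th list hold
2K chunks); none of this is taken from the actual aset.c changes.

```c
#include <stddef.h>

#define NUM_FREELISTS   8   /* from the post: 8 free lists per alloc set */
#define MIN_CHUNK_SHIFT 4   /* assumption: smallest chunk is 16 bytes */

/*
 * Map a request size to a free-list index by rounding it up to the
 * next power of 2.  Returns -1 when the request is too large for any
 * list and would need a separate (single) allocation.
 */
static int
freelist_index(size_t size)
{
    size_t chunk = (size_t) 1 << MIN_CHUNK_SHIFT;
    int    idx = 0;

    /* Stop at NUM_FREELISTS so huge sizes cannot overflow the shift. */
    while (idx < NUM_FREELISTS && chunk < size)
    {
        chunk <<= 1;
        idx++;
    }
    return (idx < NUM_FREELISTS) ? idx : -1;
}

/* Size of the chunks kept on free list idx (0 .. NUM_FREELISTS - 1). */
static size_t
chunk_size(int idx)
{
    return (size_t) 1 << (MIN_CHUNK_SHIFT + idx);
}
```

Under these assumptions a 17-byte request lands on list 1 (32-byte
chunks), so any later request of 17 to 32 bytes can reuse the same
chunk once it has been freed.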

> 
>     First  I  found some bad places where memory is used after it
>     has been free'd. One was in the portal manager with a  portal
>     memory  context  struct!  I'm  pretty  sure  that I found all
>     because  I  tested   by   memset()   'ing   all   memory   on
>     AllocSetFree() and AllocSetReset() with different values.


Good.
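
The memset() trick above can be sketched roughly as follows: overwrite
a chunk with a recognizable byte pattern when it is freed or when the
set is reset, so that any code still reading it sees obvious garbage
instead of stale but plausible data. DemoChunk, demo_free(),
demo_reset(), is_poisoned() and both poison values are hypothetical
stand-ins, not the real AllocSetFree()/AllocSetReset() code.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_CHUNK_BYTES 32
#define POISON_FREE  0x7F   /* assumption: pattern written on free */
#define POISON_RESET 0x7E   /* assumption: different pattern on reset */

/* Hypothetical chunk with a fixed payload, kept alive by its owner. */
typedef struct DemoChunk
{
    int           in_use;
    unsigned char data[DEMO_CHUNK_BYTES];
} DemoChunk;

/* Free one chunk: poison the payload, then mark it reusable. */
static void
demo_free(DemoChunk *c)
{
    memset(c->data, POISON_FREE, sizeof(c->data));
    c->in_use = 0;
}

/* Reset a whole set: poison every chunk with the reset pattern. */
static void
demo_reset(DemoChunk *chunks, size_t nchunks)
{
    size_t i;

    for (i = 0; i < nchunks; i++)
    {
        memset(chunks[i].data, POISON_RESET, sizeof(chunks[i].data));
        chunks[i].in_use = 0;
    }
}

/* Return 1 if every payload byte equals the given poison pattern. */
static int
is_poisoned(const DemoChunk *c, unsigned char pattern)
{
    size_t i;

    for (i = 0; i < sizeof(c->data); i++)
        if (c->data[i] != pattern)
            return 0;
    return 1;
}
```

Using a different value for free and for reset, as the original post
does, tells you which of the two paths released the memory when the
pattern later shows up in a debugger.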

>     The  strange behaviour now is that depending on the blocksize
>     and the limit for block/single allocation I use for the pools,
>     the  portals_p2  regression test fails or not. The failure is
>     that the cursor foo24 does  not  return  data  if  the  pools
>     blocksize  is  greater/equal  16K and the smallchunk limit is
>     2K. It returns the correct row if one of them is  less.  More
>     irritating  is  that  it  only  fails  if  run  inside  'make
>     runtest'. If I put multiple portals_p2 tests into  the  tests
>     list, all fail the same. But if the test is run manually with
>     the same psql switches, it succeeds.
> 
>     All  this  behaviour  is  identical  on  two   Linux   2.1.88
>     installations.  One has gcc-2.8.1 and glibc-2.0.13, the other
>     gcc-2.7.2.1 and libc.5.
> 
>     I have absolutely no clue what's going  on  here.  Anyone  an
>     idea how to track this down?

My recommendation is to apply the fix and let others debug it.  Someone
will find the cause.  Just give them a reproducible test case.  In many
cases, more eyes or another OS shows the error much more clearly.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 

