Re: tuplesort memory usage: grow_memtuples - Mailing list pgsql-hackers

From Tom Lane
Subject Re: tuplesort memory usage: grow_memtuples
Date
Msg-id 28535.1358402415@sss.pgh.pa.us
Whole thread Raw
In response to Re: tuplesort memory usage: grow_memtuples  (Peter Geoghegan <peter@2ndquadrant.com>)
List pgsql-hackers
Peter Geoghegan <peter@2ndquadrant.com> writes:
> I took another look at this.

Since Greg S. seems to have lost interest in committing this, I am
picking it up.

> My strategy for preventing overflow is to use a uint64, and to use
> Min()/Max() as appropriate. As Robert mentioned, even a 64-bit integer
> could overflow here, and I account for that.

It seems to me that there's a much more robust way to do this, namely
use float8 arithmetic.  Anybody have a problem with this version of
the last-increment branch?
       /*        * This will be the last increment of memtupsize.  Abandon doubling        * strategy and instead
increaseas much as we safely can.        *        * To stay within allowedMem, we can't increase memtupsize by more
  * than availMem / sizeof(SortTuple) elements.  In practice, we want        * to increase it by considerably less,
becausewe need to leave some        * space for the tuples to which the new array slots will refer.  We        * assume
thenew tuples will be about the same size as the tuples        * we've already seen, and thus we can extrapolate from
thespace        * consumption so far to estimate an appropriate new size for the        * memtuples array.  The optimal
valuemight be higher or lower than        * this estimate, but it's hard to know that in advance.        *        *
Thiscalculation is definitely safe against enlarging the array so        * much that LACKMEM becomes true, because the
memorycurrently used        * includes the present array; thus, there would be enough allowedMem        * for the new
arrayelements even if no other memory were currently        * used.        *        * We do the arithmetic in float8,
becauseotherwise the product of        * memtupsize and allowedMem would be quite likely to overflow.  Any        *
inaccuracyin the result should be insignificant, but just for        * paranoia's sake, we bound the result to be 1 to
2times the        * existing value.  (A little algebra shows that grow_ratio must be        * less than 2 here, so
exceptfor roundoff error the Min/Max bounds        * should never do anything.)        *        * Note: it might seem
thatwe need to worry about memtupsize * 2        * overflowing an int, but the MaxAllocSize bound applied below will
   * ensure that can't happen.        */       double        grow_ratio;
 
       grow_ratio = (double) state->allowedMem / (double) memNowUsed;       newmemtupsize = (int) (memtupsize *
grow_ratio);
       newmemtupsize = Max(newmemtupsize, memtupsize);       newmemtupsize = Min(newmemtupsize, memtupsize * 2);
       /* We won't make any further enlargement attempts */       state->growmemtuples = false;


> I also added a brief note within tuplestore.c to the effect that the
> two buffer sizing strategies are not in sync.

My inclination is to just copy the whole grow_memtuples function into
tuplestore.c too.  There's no very good reason why tuplestore should
be stupider than tuplesort about this.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Pavan Deolasee
Date:
Subject: Re: Hot Standby conflict resolution handling
Next
From: Tom Lane
Date:
Subject: Re: CF3+4