Home > mailing lists

Re: Performance on inserts - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Performance on inserts
Date	August 25, 2000 22:00:48
Msg-id	9612.967244422@sss.pgh.pa.us Whole thread Raw
In response to	Re: Performance on inserts (Alfred Perlstein <bright@wintelcom.net>)
Responses	Re: Performance on inserts (Jules Bean <jules@jellybean.co.uk>)
List	pgsql-hackers

Tree view

Alfred Perlstein <bright@wintelcom.net> writes:
> A slightly incorrect startscan point is better than starting
> from the beginning every time.

Not for lookups, it isn't ;-).

I did think about making the btree-descending code act differently
for inserts than lookups, but AFAICS that wouldn't buy anything for
this problem, since until you get down to the leaf level you can't
see where there's free space anyway. You'd simply be exchanging the
problem of missing free space that's too far to the right for the
problem of missing free space that's to the left of where you chose
to start looking.

The algorithm does seem to work quite nicely just the way I described
it, although it turns out I was off base about a good probability
setting. I find that something up around 0.99 seems to be good.
Using the same (perhaps overly simplistic) test case:

# tuples inserted 6.5 current+random hack @ 0.99 Time index size Time index size
1536 <1sec 90112 <1sec 106496
3072 1.56 163840 <1sec 188416
6144 3.70 286720 1.40 376832
12288 9.73 532480 2.65 688128
24576 93.26 1024000 5.22 1368064
49152 363.23 2007040 10.34 2727936
98304 22.07 5545984
196608 45.60 11141120
393216 92.53 22290432

I tried probabilities from 0.67 to 0.999 and found that runtimes didn't
vary a whole lot (though this is near the minimum), while index size
consistently got larger as the probability of moving right decreased.
The runtime is nicely linear throughout the range.

The index size increase might look like a bit of a jump, but actually
this is right where we want it to be. The old code was effectively
packing each page as full as it could be under these conditions. That's
not what you want for a btree. Steady-state load factor for btrees is
usually quoted as somewhere around 70%, and this method manages to
approximate that pretty well with a move-right probability of 0.99 or
a bit less.
regards, tom lane

pgsql-hackers by date:

From: Vince Vielhaber
Date: 25 August 2000, 21:50:48
Subject: Pure ODBMS (fwd)

From: Thomas Lockhart
Date: 26 August 2000, 00:34:15
Subject: Re: Performance on inserts

Re: Performance on inserts - Mailing list pgsql-hackers

Previous

Next