Home > mailing lists

Re: [HACKERS] strange behavior of UPDATE - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: [HACKERS] strange behavior of UPDATE
Date	September 21, 1999 21:17:44
Msg-id	24700.937963003@sss.pgh.pa.us Whole thread Raw
In response to	Re: [HACKERS] strange behavior of UPDATE (Bruce Momjian <maillist@candle.pha.pa.us>)
List	pgsql-hackers

Tree view

Bruce Momjian <maillist@candle.pha.pa.us> writes:
> Tom, I assume this is done, right?

Only partly.  I changed _bt_binsrch to use a pure binary search,
but didn't yet get around to fixing _bt_findsplitloc (it's more
complicated :-().

Vadim seemed to think that a better solution would be to fix the
comparison rules so that no two entries are ever considered to have
equal keys anyway.  So I put my change on the back burner ... but if
his change doesn't happen, mine ought to get done.


>> I wrote:
>>>> then a possible explanation for the problem would be that our
>>>> btree index code isn't very fast when there are large numbers of
>>>> identical keys.
>> 
>> Ah-hah, a lucky guess!  I went back and looked at the profile stats
>> I'd extracted from Edmund's "update" example.  This Linux box has
>> the same semi-functional gprof as someone else was using a while
>> ago --- the timings are bogus, but the call counts seem right.
>> And what I find are entries like this:
>> 
>> 0.00    0.00  284977/284977      _bt_binsrch [3174]
>> [3177]   0.0    0.00    0.00  284977         _bt_firsteq [3177]
>> 0.00    0.00 21784948/24713758     _bt_compare [3169]
>> 
>> 0.00    0.00     426/35632       _bt_split [53]
>> 0.00    0.00   35206/35632       _bt_insertonpg [45]
>> [3185]   0.0    0.00    0.00   35632         _bt_findsplitloc [3185]
>> 0.00    0.00 5093972/8907411     _bt_itemcmp [3171]
>> 
>> In other words, _bt_firsteq is averaging almost 100 comparisons per
>> call, _bt_findsplitloc well over that.  Both of these routines are
>> evidently designed on the assumption that there will be relatively
>> few duplicate keys --- they reduce to linear scans when there are
>> many duplicates.
>> 
>> _bt_firsteq shouldn't exist at all; the only routine that calls it
>> is _bt_binsrch, which does a fast binary search of the index page.
>> _bt_binsrch should be fixed so that the binary search logic does the
>> right thing for equal-key cases, rather than degrading to a linear
>> search.  I am less confident that I understand _bt_findsplitloc's place
>> in the great scheme of things, but it could certainly be revised to use
>> a binary search rather than linear scan.
>> 
>> This benchmark is clearly overemphasizing the equal-key case, but
>> I think it ought to be fixed independently of whether we want to
>> look good on a dubious benchmark ... equal keys are not uncommon in
>> real-world scenarios after all.
>> 
>> Next question is do we want to risk twiddling this code so soon before
>> 6.5, or wait till after?

pgsql-hackers by date:

From: Tom Lane
Date: 21 September 1999, 21:12:40
Subject: Re: [HACKERS] Partial fix for INSERT...SELECT problems

From: Tom Lane
Date: 21 September 1999, 21:20:47
Subject: Re: [HACKERS] inherited GROUP BY is busted ... I need some help here

Re: [HACKERS] strange behavior of UPDATE - Mailing list pgsql-hackers

Previous

Next