Re: [HACKERS] strange behavior of UPDATE - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: [HACKERS] strange behavior of UPDATE
Date
Msg-id 199909211942.PAA15704@candle.pha.pa.us
Whole thread Raw
In response to Re: [HACKERS] strange behavior of UPDATE  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] strange behavior of UPDATE
List pgsql-hackers
Tom, I assume this is done, right?


> I wrote:
> > then a possible explanation for the problem would be that our
> > btree index code isn't very fast when there are large numbers of
> > identical keys.
> 
> Ah-hah, a lucky guess!  I went back and looked at the profile stats
> I'd extracted from Edmund's "update" example.  This Linux box has
> the same semi-functional gprof as someone else was using a while
> ago --- the timings are bogus, but the call counts seem right.
> And what I find are entries like this:
> 
>                 0.00    0.00  284977/284977      _bt_binsrch [3174]
> [3177]   0.0    0.00    0.00  284977         _bt_firsteq [3177]
>                 0.00    0.00 21784948/24713758     _bt_compare [3169]
> 
>                 0.00    0.00     426/35632       _bt_split [53]
>                 0.00    0.00   35206/35632       _bt_insertonpg [45]
> [3185]   0.0    0.00    0.00   35632         _bt_findsplitloc [3185]
>                 0.00    0.00 5093972/8907411     _bt_itemcmp [3171]
> 
> In other words, _bt_firsteq is averaging almost 100 comparisons per
> call, _bt_findsplitloc well over that.  Both of these routines are
> evidently designed on the assumption that there will be relatively
> few duplicate keys --- they reduce to linear scans when there are
> many duplicates.
> 
> _bt_firsteq shouldn't exist at all; the only routine that calls it
> is _bt_binsrch, which does a fast binary search of the index page.
> _bt_binsrch should be fixed so that the binary search logic does the
> right thing for equal-key cases, rather than degrading to a linear
> search.  I am less confident that I understand _bt_findsplitloc's place
> in the great scheme of things, but it could certainly be revised to use
> a binary search rather than linear scan.
> 
> This benchmark is clearly overemphasizing the equal-key case, but
> I think it ought to be fixed independently of whether we want to
> look good on a dubious benchmark ... equal keys are not uncommon in
> real-world scenarios after all.
> 
> Next question is do we want to risk twiddling this code so soon before
> 6.5, or wait till after?
> 
>             regards, tom lane
> 
> 


--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: [HACKERS] Partial fix for INSERT...SELECT problems
Next
From: Bruce Momjian
Date:
Subject: Re: [HACKERS] INSERT INTO view means what exactly?