Re: gistchoose vs. bloat - Mailing list pgsql-hackers

From Tom Lane
Subject Re: gistchoose vs. bloat
Msg-id 23619.1359056699@sss.pgh.pa.us
In response to Re: gistchoose vs. bloat  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> I did some experimenting with that. I used the same test case Alexander
> did, with geonames data, and compared the unpatched version, the original
> patch, and the attached patch, which is biased towards the first "best"
> tuple found but still sometimes chooses one of the other equally good ones.

>     testname    | initsize | finalsize | idx_blks_read | idx_blks_hit
> ----------------+----------+-----------+---------------+--------------
>  patched-10-4mb | 75497472 |  90202112 |       5853604 |     10178331
>  unpatched-4mb  | 75145216 |  94863360 |       5880676 |     10185647
>  unpatched-4mb  | 75587584 |  97165312 |       5903107 |     10183759
>  patched-2-4mb  | 74768384 |  81403904 |       5768124 |     10193738
>  origpatch-4mb  | 74883072 |  82182144 |       5783412 |     10185373

> I think the conclusion is that all of these patches are effective. The
> 1/10 variant is less effective, as expected, since its behavior is closer
> to the unpatched code than the others. The 1/2 variant seems as good
> as the original patch.

At least on this example, it seems a tad better, if you look at index
size.
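
To put a number on "a tad better": growth from initsize to finalsize works
out to about 8.9% for patched-2 versus about 9.7% for origpatch. A small
sketch of that arithmetic follows, using only the sizes from the table quoted
above. The struct and function names are made up for illustration; the two
unpatched rows are kept as separate entries, as they appear in the table.

```c
#include <assert.h>
#include <stdio.h>

/* One row of the quoted results table; sizes are as reported there. */
typedef struct
{
    const char *testname;
    long long   initsize;
    long long   finalsize;
} IndexRun;

static const IndexRun runs[] = {
    {"patched-10-4mb", 75497472LL, 90202112LL},
    {"unpatched-4mb",  75145216LL, 94863360LL},
    {"unpatched-4mb",  75587584LL, 97165312LL},
    {"patched-2-4mb",  74768384LL, 81403904LL},
    {"origpatch-4mb",  74883072LL, 82182144LL},
};

/* Index growth over the test run, as a percentage of the initial size. */
double
growth_pct(const IndexRun *r)
{
    assert(r->initsize > 0);
    return 100.0 * (double) (r->finalsize - r->initsize) / (double) r->initsize;
}

/* Print one line per run: test name and percentage growth. */
void
print_growth_table(void)
{
    size_t      i;

    for (i = 0; i < sizeof(runs) / sizeof(runs[0]); i++)
        printf("%-15s %6.2f%%\n", runs[i].testname, growth_pct(&runs[i]));
}
```

Calling print_growth_table() from a main() lists the growth for each run; a
smaller percentage means less bloat accumulated during the test.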

> A table full of duplicates isn't very realistic, but overall, I'm
> leaning towards my version of this patch (gistchoose-2.patch). It has
> less potential for causing a regression in existing applications, but is
> just as effective in the original scenario of repeated delete+insert.

+1 for this patch, but I think the comments could use more work.  I was
convinced it was wrong on first examination, mainly because it's hard to
follow the underdocumented look_further_on_equal logic.  I propose the
attached, which is the same logic with better comments (I also chose to
rename and invert the sense of the state variable, because it seemed
easier to follow this way ... YMMV on that though).
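
For readers without the attachment, the general idea of "prefer the first
best tuple, but sometimes pick an equally good later one" can be sketched as
below. This is not the patch's code: it is a simplified illustration with a
single penalty value per entry, and the function name, the
switch_denominator parameter, and the use of rand() are all assumptions made
for the example. A denominator of 2 or 10 corresponds loosely to the 1/2 and
1/10 variants discussed above.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical sketch: return the index of an entry with the lowest
 * penalty.  A strictly better entry always replaces the current choice.
 * An entry that merely ties with the current choice replaces it with
 * probability of roughly 1/switch_denominator, so earlier entries are
 * favored but later equally good ones are still chosen some of the time.
 */
int
choose_with_random_ties(const double *penalty, int n, int switch_denominator)
{
    int         best = 0;
    int         i;

    assert(n > 0 && switch_denominator > 0);

    for (i = 1; i < n; i++)
    {
        if (penalty[i] < penalty[best])
            best = i;           /* strictly better: always take it */
        else if (penalty[i] == penalty[best] &&
                 rand() % switch_denominator == 0)
            best = i;           /* tie: switch only some of the time */
    }

    return best;
}
```

With no ties the result is deterministic. With many duplicates, as in the
delete+insert scenario, the insertions get spread over the equally good
entries instead of always landing on the first one.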

            regards, tom lane


