Re: BRIN index and aborted transaction - Mailing list pgsql-hackers

From Robert Haas
Subject Re: BRIN index and aborted transaction
Date
Msg-id CA+TgmobkcA8U1aFQR7+X4aWRnNoCyBmwf9h_wXZeCUXS_r7sBQ@mail.gmail.com
In response to Re: BRIN index and aborted transaction  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
On Wed, Jul 22, 2015 at 3:20 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Hm, well, I am not sure that we want to pay the overhead of
> re-summarization every time we prune a single tuple from a block range.
> That's going to make vacuum much slower, I assume (without measuring);
> many page ranges are going to be re-summarized without this actually
> changing the range.
>
> For minmax, it would work well to be able to tell whether the deleted
> tuple had a value that was either the min or the max; if so it is
> possible that the range can be decreased, otherwise not.  I'm not sure
> that this would work for inclusion, though.  For geometric types it
> means you check whether the value in the deleted tuple overlaps one of
> the borders of the bounding box.  I don't know whether this actually
> makes sense.  (The obvious thing, which is whether the value overlaps
> the bounding box, is also obviously useless because all values overlap
> the bounding box by definition.)
>
> I think this would require a new support procedure for opclasses.

Yeah, you could have something that basically says "If SUMMARY didn't
need to cover VALUE, could that change the result?".  A stupid opclass
could always return true.  A minmax opclass could return true if the
value is the min or max, and false otherwise.  Other opclasses would
apply their own test.
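
As a rough illustration of the support procedure described above, here is
a minimal standalone sketch in C.  It is not PostgreSQL code: the
MinmaxSummary struct and both function names are hypothetical, and a
plain int32 column is assumed so the comparison stays trivial.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical minmax summary for one block range of an int32 column. */
typedef struct MinmaxSummary
{
    int32_t min;
    int32_t max;
} MinmaxSummary;

/*
 * Hypothetical support procedure: "if the summary no longer had to cover
 * value, could the summary change?"  For minmax, only a value sitting on
 * one of the two bounds can possibly allow the range to shrink.
 */
bool
minmax_removal_may_shrink(const MinmaxSummary *summary, int32_t value)
{
    assert(summary->min <= summary->max);
    return value == summary->min || value == summary->max;
}

/*
 * The "stupid opclass" fallback: always answer true, which forces
 * re-summarization whenever any tuple is removed from the range.
 */
bool
always_may_shrink(const MinmaxSummary *summary, int32_t value)
{
    (void) summary;
    (void) value;
    return true;
}
```

A true result only means the range might shrink: another live tuple may
hold the same min or max value, so re-summarization can still leave the
summary unchanged.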

>> We know during phase one of vacuum whether we saw any dead tuples in
>> page range X-Y; if yes, re-summarize.  The only reason not to do this
>> is if it causes us to do a lot of resummarization that frequently
>> fails to produce a smaller range. Do you have any experimental data
>> suggesting that this is or is not a problem?
>
> Well, the other issue is that vacuum is at arm's length from a BRIN
> index.  Vacuum doesn't provide the deleted-tuples array in a format
> convenient for brin to access it; currently the only way we provide
> access is a callback function that the index AM can call for every
> single indexed TID to indicate whether it is to be removed or not.  BRIN
> doesn't have TIDs, so it cannot call it usefully.  (We could make it
> call once for every possible TID in a page, but that would be very
> wasteful.)
>
> I guess we could provide a different callback that provides per-block
> information rather than per-tuple; or perhaps something completely
> different like simply the pointer to the deleted-TIDs array.
> I vaguely recall somebody mentioned the current setup isn't great for
> GIN either, so maybe we can find something that solves both cases?
>
> I think this requires that BRIN calls heap_fetch() for each deleted
> tuple as it is pruned.  This seems terrible from a performance point of
> view.
>
> There has to be a better way.  I'll give it a spin.

Cool.  I'm not sure exactly what the right solution is either, but it
seems like the current situation could very well lead to degrading
index performance over time, with no way to put that right except to
rebuild the index completely.  So it seems worth trying to improve
things.
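
To make the per-block idea from the quoted text a bit more concrete, here
is a small standalone sketch in C.  It assumes vacuum could hand over the
block numbers of the dead tuples it saw; the function name, its
parameters, and the flag array are all hypothetical and do not correspond
to any existing PostgreSQL interface.  pages_per_range stands for the
number of heap pages covered by one BRIN summary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: given the heap block numbers on which dead tuples
 * were seen, flag every block range that contains at least one of them.
 * Blocks beyond the last range are ignored.  Returns the number of
 * ranges flagged for re-summarization.
 */
size_t
mark_ranges_with_dead_tuples(const uint32_t *dead_blocks, size_t ndead,
                             uint32_t pages_per_range,
                             bool *needs_resummarize, size_t nranges)
{
    size_t marked = 0;

    assert(pages_per_range > 0);

    /* Start with no range flagged. */
    for (size_t i = 0; i < nranges; i++)
        needs_resummarize[i] = false;

    /* Map each dead tuple's block to its range and flag it once. */
    for (size_t i = 0; i < ndead; i++)
    {
        size_t range = dead_blocks[i] / pages_per_range;

        if (range < nranges && !needs_resummarize[range])
        {
            needs_resummarize[range] = true;
            marked++;
        }
    }
    return marked;
}
```

This only answers "did range X-Y lose any tuple"; it says nothing about
whether re-summarizing would actually produce a smaller range, which is
the open question in this thread.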

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


