Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM) - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)
Date
Msg-id CABOikdPdtHcgmCxC2KUvoe30eyzcPygg22Bs5zWGWEnftFGgPA@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers


On Tue, Mar 14, 2017 at 7:16 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Pavan Deolasee wrote:
> On Tue, Mar 14, 2017 at 7:17 AM, Alvaro Herrera <alvherre@2ndquadrant.com>
> wrote:

> > I have already commented about the executor involvement in btrecheck();
> > that doesn't seem good.  I previously suggested to pass the EState down
> > from caller, but that's not a great idea either since you still need to
> > do the actual FormIndexDatum.  I now think that a workable option would
> > be to compute the values/isnulls arrays so that btrecheck gets them
> > already computed.
>
> I agree with your complaint about modularity violation. What I am unclear
> is how passing values/isnulls array will fix that. The way code is
> structured currently, recheck routines are called by index_fetch_heap(). So
> if we try to compute values/isnulls in that function, we'll still need
> access EState, which AFAIU will lead to similar violation. Or am I
> mis-reading your idea?

You're right, it's still a problem.  (Honestly, I think the whole idea
of trying to compute a fake index tuple starting from a just-read heap
tuple is a problem in itself;

Why do you think so?
 
I just wonder if there's a way to do the
recheck that doesn't involve such a thing.)

I couldn't find a better way without a lot of complex infrastructure. Even though we now have ability to mark index pointers and we know that a given pointer either points to the pre-WARM chain or post-WARM chain, this does not solve the case when an index does not receive a new entry. In that case, both pre-WARM and post-WARM tuples are reachable via the same old index pointer. The only way we could deal with this is to mark index pointers as "common", "pre-warm" and "post-warm". But that would require us to update the old pointer's state from "common" to "pre-warm" for the index whose keys are being updated. May be it's doable, but might be more complex than the current approach.
 

> I wonder if we should instead invent something similar to IndexRecheck(),
> but instead of running ExecQual(), this new routine will compare the index
> values by the given HeapTuple against given IndexTuple. ISTM that for this
> to work we'll need to modify all callers of index_getnext() and teach them
> to invoke the AM specific recheck method if xs_tuple_recheck flag is set to
> true by index_getnext().

Yeah, grumble, that idea does sound intrusive, but perhaps it's
workable.  What about bitmap indexscans?  AFAICS we already have a
recheck there natively, so we only need to mark the page as lossy, which
we're already doing anyway.

Yeah, bitmap indexscans should be ok. We need recheck logic only to avoid duplicate scans and since a TID can only occur once in the bitmap, there is no risk for duplicate results.

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [HACKERS] WIP: Faster Expression Processing v4
Next
From: "Joshua D. Drake"
Date:
Subject: [HACKERS] Odd listen_addresses behavior