Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM) - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)
Date
Msg-id 20170308193038.rl4oe27hb4u4zhx4@alvherre.pgsql
In response to Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Robert Haas wrote:
> On Wed, Mar 8, 2017 at 12:14 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
> > Alvaro Herrera wrote:
> >> Here's a rebased set of patches.  This is the same set Pavan posted; I only
> >> fixed some whitespace and a trivial conflict in indexam.c, per 9b88f27cb42f.
> >
> > Jaime noted that I forgot the attachments.  Here they are
> 
> If I recall correctly, the main concern about 0001 was whether it
> might negatively affect performance, and testing showed that, if
> anything, it was a little better. Does that sound right?

Not really -- it's actually a bit slower in a synthetic benchmark built
to measure exactly the slowed-down case.  See
https://www.postgresql.org/message-id/CAD__OugK12ZqMWWjZiM-YyuD1y8JmMy6x9YEctNiF3rPp6hy0g@mail.gmail.com
I bet in normal cases it's unnoticeable.  If WARM flies, then it's going
to provide a larger improvement than is lost to this.

> Regarding 0002, I think this could use some documentation someplace
> explaining the overall theory of operation.  README.HOT, maybe?

Hmm.  Yeah, we should have something to that effect.  0005 includes
README.WARM, but I think there should be one unified place that
explains the whole thing.

> +     * Most often and unless we are dealing with a pg-upgraded cluster, the
> +     * root offset information should be cached. So there should not be too
> +     * much overhead of fetching this information. Also, once a tuple is
> +     * updated, the information will be copied to the new version. So it's not
> +     * as if we're going to pay this price forever.
> 
> What if a tuple is updated -- presumably clearing the
> HEAP_LATEST_TUPLE on the tuple at the end of the chain -- and then the
> update aborts?  Then we must be back to not having this information.

I will leave this question until I have grokked how this actually works.

> One overall question about this patch series is how we feel about
> using up this many bits.  0002 uses a bit from infomask, and 0005 uses
> a bit from infomask2.  I'm not sure if that's everything, and then I
> think we're stealing some bits from the item pointers, too.  While the
> performance benefits of the patch sound pretty good based on the test
> results so far, this is definitely the very last time we'll be able to
> implement a feature that requires this many bits.

Yeah, this patch series uses a lot of bits.  At some point we should
really add the "last full-scanned by version X" we discussed a long time
ago, and free the MOVED_IN / MOVED_OFF bits that have been unused for so
long.  Sadly, once we add that, we need to wait one more release before
we can use the bits anyway.
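
To make the bit accounting concrete, here is a rough sketch rather than
anything from the patch series.  The HEAP_MOVED_* values are the ones
defined in src/include/access/htup_details.h; the helper functions, and
in particular the "last full-scanned by version X" marker passed in as
last_full_scan_version, are hypothetical, since no such marker exists
today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in src/include/access/htup_details.h; t_infomask is 16 bits. */
#define HEAP_MOVED_OFF 0x4000	/* set by pre-9.0 VACUUM FULL */
#define HEAP_MOVED_IN  0x8000	/* set by pre-9.0 VACUUM FULL */
#define HEAP_MOVED     (HEAP_MOVED_OFF | HEAP_MOVED_IN)

/* True if a tuple header still carries one of the legacy MOVED bits. */
static bool
tuple_has_moved_bits(uint16_t infomask)
{
	return (infomask & HEAP_MOVED) != 0;
}

/*
 * HYPOTHETICAL helper: last_full_scan_version stands in for the proposed
 * per-table "last full-scanned by version X" marker.  cleanup_version is
 * the first release whose full scan is assumed to clear any leftover
 * MOVED bits.  The bits may be reused only for tables that such a
 * release (or a later one) has scanned completely.
 */
static bool
moved_bits_reusable(int last_full_scan_version, int cleanup_version)
{
	assert(cleanup_version > 0);
	return last_full_scan_version >= cleanup_version;
}
```

Under these assumptions a table last fully scanned by an older release
fails the check, which is why the bits only become usable one release
after the marker is introduced.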

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


