Re: HOT patch - version 15 - Mailing list pgsql-patches

From Heikki Linnakangas
Subject Re: HOT patch - version 15
Msg-id 46E0072A.1040506@enterprisedb.com
In response to Re: HOT patch - version 15  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: HOT patch - version 15  (Florian Pflug <fgp.phlo.org@gmail.com>)
Re: HOT patch - version 15  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-patches
Tom Lane wrote:
> "Heikki Linnakangas" <heikki@enterprisedb.com> writes:
>> When I suggested that we get rid of the LP_DELETE flag for heap tuples,
>> the tuple-level fragmentation and all that, and just take the vacuum
>> lock and call PageRepairFragmentation, I was thinking that we'd do it in
>> heap_update and only when we run out of space on the page. But as Greg
>> said, it doesn't work because you're already holding a reference to at
>> least one tuple on the page, the one you're updating, by the time you
>> get to heap_update. That's why I put the pruning code in heap_fetch
>> instead. Yes, although the amortized cost is the same, it does push the
>> pruning work to the foreground query path.
>
> The amortized cost is only "the same" if every heap_fetch is associated
> with a heap update.  I feel pretty urgently unhappy about this choice.
> Have you tested the impact of the patch on read-mostly workloads?

I haven't. Someone should. We have a tester working on a test suite with
many small CPU-bound performance test cases; hopefully we'll get those
test cases and results out soon.

Assuming the rule for when to prune would be the same whether we do it
in heap_fetch or heap_update, I don't see how the total cost would be
different. (That's a bad assumption, though; see below.)

>>> Another real problem with doing pruning only in the UPDATE path is that
>>> we may end up with long HOT chains if the page does not receive an
>>> UPDATE after many consecutive HOT updates.
>
> How is that, if the same number of prune attempts would occur?

The number of prune attempts wouldn't be the same. To avoid the long HOT
chains, we want to prune more often than is needed just to make room for
updates. I'm not sure what the exact rules are in the current patch.

That's a pretty sensitive tradeoff: we want to prune often to cut the
long HOT chains, but not too often, because it's pretty expensive to
acquire the vacuum lock and move tuples around. I don't think we've
found the optimal solution yet. Separating the pruning and defragmenting
might help.
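
To make the tradeoff above concrete, here is a toy model of a single page
holding one HOT chain. It is only a sketch, not the patch's logic: the
function name, page_capacity, chain_limit, and the rule "prune on fetch when
the chain exceeds chain_limit" are all made up for illustration, and it
assumes every old tuple version is already dead, so a prune always collapses
the chain to one tuple.

```python
def simulate(ops, page_capacity=8, prune_on_fetch=False, chain_limit=4):
    """Toy model of one heap page holding a single HOT chain.

    ops is a string of 'U' (HOT update) and 'F' (fetch).
    Returns (number of prunes, longest chain any fetch had to walk).
    All parameters are hypothetical; this is not the rule used by the patch.
    """
    chain = 1            # tuple versions currently in the chain
    prunes = 0
    longest_walk = 0
    for op in ops:
        if op == "U":
            if chain == page_capacity:
                # Page is full: prune to make room for the new version.
                chain = 1
                prunes += 1
            chain += 1   # the update appends a new version to the chain
        elif op == "F":
            if prune_on_fetch and chain > chain_limit:
                # Prune in the read path to shorten the chain.
                chain = 1
                prunes += 1
            longest_walk = max(longest_walk, chain)
    return prunes, longest_walk


# A burst of HOT updates followed by a read-only period.
workload = "U" * 7 + "F" * 100
update_only = simulate(workload)
with_fetch_prune = simulate(workload, prune_on_fetch=True)
```

With pruning only in the update path, the page never fills in this workload,
so nothing is pruned and every one of the 100 fetches walks an 8-tuple chain.
With pruning allowed in the fetch path, the first fetch pays for one prune
and the rest walk a single tuple. The model leaves out the cost of the check
itself on every fetch, which is the read-mostly overhead Tom is asking about.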

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
