Re: HOT patch - version 15 - Mailing list pgsql-patches

From Pavan Deolasee
Subject Re: HOT patch - version 15
Date
Msg-id 2e78013d0709100257s345821e4w4315d2342fab9d67@mail.gmail.com
Whole thread Raw
In response to Re: HOT patch - version 15  ("Heikki Linnakangas" <heikki@enterprisedb.com>)
List pgsql-patches


On 9/10/07, Heikki Linnakangas <heikki@enterprisedb.com> wrote:


I'm repeating myself, but I would split that second operation into two
parts while we think about this: pruning the entire page, and
defragmenting (PageRepairFragmentation). Tom wondered why they're
separated in the patch. As the patch stands, there is no reason, but I
feel that separating them and doing them at different times might be an
important piece in the puzzle.



I think we all agree that defragmentation should be done less aggressively
and ideally when we are going to insert a tuple in the page and running
low on free space in the page. If we could really do it in heap_update(), after
we figure out that there is not enough free space available in the page,
that will be great. We already have exclusive lock on the page and hopefully
we also have the vacuum-lock. (This is how earlier version of the patch used to work - before we took out all the complexity associated with tracking LP_DELETEd tuples,
tracking fragmented tuples and replaced them with simple PageRepairFragmentation
Since in the earlier version, we never used to call PageRepairFragmentation,
we could easily do pruning in heap_update())

If we can't find a way to defragment inside heap_update(), then we have three
choices:

1. Pass a hint to heap_fetch that a UPDATE may follow. If we are running low
on free space, we try to prune and defragment at that point.

2. Move some of the work to bgwriter.

3. Use some heuristic to decide whether to defragment or not.

I am not sure how easy it is to know apriori that we are fetching a tuple
for UPDATE. Assuming we can, this seems like a good idea.

Eventually we may want to move some work to bgwriter or some other
background process. But my take would be to save that for 8.4

As a starting point, we are doing 3 in the patch. We always try to
keep just one tuple worth of space free in the page. So if an UPDATE
takes up  remaining free space in the page, the page will be
pruned/defraged in the subsequent lookup (index or seq). In read-mostly scenario,
only the first lookup would trigger prune/defrag, but next many lookups
would be quick. In update-mostly scenario, most of the heap_fetches
would anyways end up doing update, so doing pruning/defragmentation
in heap_fetch shouldn't be too bad. For a balanced scenario, we might
be moving cost of pruning/defragmentation to the SELECT path, but
UPDATEs would be quick.


The other issue is when to prune the HOT chain. I tend to agree that
the chances of having long HOT chains are less and the cost of following the
chain won't be too significant. At the same we probably don't want to live
with very long chains forever. An example would be: a tuple gets HOT updated
several times in a transaction. If the page is never updated again, the long HOT
chain would never be pruned. I like Simon's idea of pruning the chain if it
goes beyond a limit. Instead of adding any significant complexity to
track length of each HOT chain, we can just mark the page with a flag
if heap_hot_fetch() detects a long chain. The next lookup (index or seq)
of the page will try to prune the page and clear the flag if successful.

I also agree that we should avoid taking exclusive lock before checking
free space in heap_page_prune_defrag(). That might be unsafe, but we
don't care if you occasionally skip the maintenance work (or do it a
little early)

Thanks,
Pavan

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com

pgsql-patches by date:

Previous
From: Simon Riggs
Date:
Subject: Re: HOT patch - version 15
Next
From: "Heikki Linnakangas"
Date:
Subject: Re: HOT patch - version 15