Re: HOT patch, missing things - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: HOT patch, missing things
Date
Msg-id 46B9850B.7080704@enterprisedb.com
In response to Re: HOT patch, missing things  ("Simon Riggs" <simon@2ndquadrant.com>)
Responses Re: HOT patch, missing things  (Gregory Stark <stark@enterprisedb.com>)
Re: HOT patch, missing things  ("Simon Riggs" <simon@2ndquadrant.com>)
List pgsql-hackers
Simon Riggs wrote:
> On Tue, 2007-08-07 at 19:01 +0100, Heikki Linnakangas wrote:
>> There's three things clearly missing in the patch:
>>
>> 1. HOT updates on tables with expression or partial indexes. Hasn't been
>> done yet because it should be pretty straightforward and we've had more
>> important things to do. Though not critical, should be finished before
>> release in my opinion.
> 
> Only if we really are pretty much finished.
> 
>> 2. Pointer swinging. At the moment, after a row is HOT updated, the only
>> way to get rid of the redirecting line pointer is to run VACUUM FULL or
>> CLUSTER (or delete or cold update the row and vacuum). If we want to
>> implement pointer swinging before release, we have to get started now.
>> If we're happy to release without it and address the issue in later
>> releases if it seems important, we need to make a conscious decision on
>> that, now. I personally think we can release without it.
> 
> I think so too.
> 
>> 3. Statistics and autovacuum integration. How should HOT updates be
>> taken into account when deciding when to autovacuum and autoanalyze?
>> There's a FIXME comment in analyze.c related to this as well. What
>> additional statio counters do we need? The patch adds counters for HOT
>> updates and for following a HOT chain. Should we have counters for
>> pruning and defragging a page as well?
> 
> ISTM we should have stat counters (not statio counters) that describe
> the number of row versions defragged. Statio counters refer to block
> accesses, so HOT has nothing at all to do with that. Not sure we need to
> know about pruning, any more than we need to know about hint bits
> setting.
> 
> We should then perform a vacuum if
> ( number of cold updates
> + number of hot updates 
> + number of deletes
> - number of row versions removed by defragging)
>   > (autovacuum threshold * size of table)
> 
> Defragging could remove deletes and cold updates as well as hot updates,
> so we must take that into account.

It seems you're confusing pruning and defragging. I should probably
update the glossary I wrote earlier...

In pruning, any HOT updated tuples that are no longer visible to anyone
are removed by marking the line pointer as unused, or turning it into a
redirecting line pointer if there isn't one already. Because no tuples
are moved, you only need a regular exclusive lock on the page to prune.

Defragmenting a page means calling the good old PageRepairFragmentation
function. We need a vacuum-strength lock to do that, because it moves
existing tuples around in the page to squeeze out any empty space
between tuples.

When defragmenting, we can't count the number of row versions removed,
because they've already been removed, and all that's left is empty space.
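
To make the distinction concrete, here is a toy model in Python. It is
only a sketch, not PostgreSQL code: the Page and LinePointer classes, the
state names and the functions prune_chain() and defragment() are all made
up for illustration, the page header and line pointer array are ignored,
and it assumes at least one live version remains in the chain.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Toy line pointer states (illustrative names, not PostgreSQL's).
UNUSED, NORMAL, REDIRECT = "unused", "normal", "redirect"


@dataclass
class LinePointer:
    state: str = UNUSED
    offset: Optional[int] = None   # where the tuple sits on the page (NORMAL only)
    length: int = 0
    target: Optional[int] = None   # line pointer we redirect to (REDIRECT only)


@dataclass
class Page:
    size: int
    pointers: List[LinePointer] = field(default_factory=list)


def prune_chain(page: Page, chain: List[int], first_live: int) -> int:
    """Prune one HOT chain; no tuple is moved.

    `chain` lists line pointer indexes from the root to the newest version.
    Members before position `first_live` are no longer visible to anyone.
    Returns the number of row versions removed.
    """
    if not 0 < first_live < len(chain):
        return 0
    removed = 0
    for idx in chain[:first_live]:
        lp = page.pointers[idx]
        if lp.state == NORMAL:
            removed += 1
        page.pointers[idx] = LinePointer(UNUSED)
    # The root must stay reachable from the indexes, so it becomes (or
    # remains) a redirecting line pointer to the first live version.
    page.pointers[chain[0]] = LinePointer(REDIRECT, target=chain[first_live])
    return removed


def defragment(page: Page) -> int:
    """Slide surviving tuples to the end of the page.

    Returns the contiguous free space. Only empty space is visible here;
    the removed row versions were already accounted for by pruning.
    """
    upper = page.size
    live = [lp for lp in page.pointers if lp.state == NORMAL]
    for lp in sorted(live, key=lambda p: p.offset, reverse=True):
        upper -= lp.length
        lp.offset = upper
    return upper


# Three versions of one row, 20 bytes each; only the newest is still visible.
page = Page(size=100, pointers=[
    LinePointer(NORMAL, offset=80, length=20),
    LinePointer(NORMAL, offset=60, length=20),
    LinePointer(NORMAL, offset=40, length=20),
])
removed = prune_chain(page, chain=[0, 1, 2], first_live=2)
offset_after_prune = page.pointers[2].offset   # unchanged by pruning
free_space = defragment(page)                  # now the tuple is moved
```

Pruning is the only step that knows how many row versions went away;
defragment() just finds holes and closes them.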

I think the formula for triggering autovacuum should be left unchanged:
(# of dead tuples > autovacuum threshold).

# of dead tuples should be increased by cold updates and deletes, as
before. A hot update shouldn't increase it, because the old version can
be removed by pruning.
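
Roughly, the accounting I have in mind looks like the sketch below. It is
Python for brevity, not the actual stats collector code; the class and
function names are invented, and the base threshold and scale factor
values are only example numbers, not proposed defaults.

```python
from dataclasses import dataclass


@dataclass
class VacuumCounters:
    """Hypothetical per-table counters kept since the last vacuum."""
    dead_tuples: int = 0
    hot_updates: int = 0

    def record_delete(self, n: int = 1) -> None:
        self.dead_tuples += n

    def record_cold_update(self, n: int = 1) -> None:
        # The old version stays behind until VACUUM removes it.
        self.dead_tuples += n

    def record_hot_update(self, n: int = 1) -> None:
        # Pruning can remove the old version, so this is tracked
        # separately and does not count as a dead tuple.
        self.hot_updates += n


def vacuum_threshold(base: int, scale_factor: float, reltuples: int) -> float:
    """Autovacuum threshold: a base value plus a fraction of the table size."""
    return base + scale_factor * reltuples


def needs_autovacuum(counters: VacuumCounters, threshold: float) -> bool:
    """The unchanged trigger: # of dead tuples > autovacuum threshold."""
    return counters.dead_tuples > threshold


# Example values only.
threshold = vacuum_threshold(base=50, scale_factor=0.2, reltuples=1000)

counters = VacuumCounters()
counters.record_hot_update(300)
after_hot_only = needs_autovacuum(counters, threshold)

counters.record_cold_update(200)
counters.record_delete(100)
after_cold_and_deletes = needs_autovacuum(counters, threshold)
```

With these numbers, 300 HOT updates alone don't trigger a vacuum, but the
following 200 cold updates and 100 deletes do.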

Because we can truncate dead tuples, even from cold updates and deletes,
to redirected dead line pointers, which take much less space than dead
tuples, maybe we should increase the default autovacuum threshold?

The analyze threshold is trickier. HOT updates should probably be taken
into account, but with a smaller weight than other updates.
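
As a sketch of what "smaller weight" could mean (Python again, with
made-up names; the weight of 0.25 is picked out of thin air for the
example, I don't have a number to propose):

```python
def weighted_changes(inserts: int, cold_updates: int, deletes: int,
                     hot_updates: int, hot_weight: float = 0.25) -> float:
    """Changes since the last ANALYZE, discounting HOT updates.

    hot_weight is a hypothetical tuning knob: 1.0 would treat HOT updates
    like any other update, 0.0 would ignore them entirely.
    """
    return inserts + cold_updates + deletes + hot_weight * hot_updates


def needs_autoanalyze(changes: float, analyze_threshold: float) -> bool:
    return changes > analyze_threshold


# Example values only.
analyze_threshold = 100

hot_only = weighted_changes(inserts=0, cold_updates=0, deletes=0,
                            hot_updates=200)
mixed = weighted_changes(inserts=0, cold_updates=80, deletes=0,
                         hot_updates=200)

analyze_after_hot_only = needs_autoanalyze(hot_only, analyze_threshold)
analyze_after_mixed = needs_autoanalyze(mixed, analyze_threshold)
```

Here 200 HOT updates count as 50 changes and stay under the threshold on
their own, but push the table over it once 80 cold updates are added.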

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com

