Thread: HOT updates in index-less tables

HOT updates in index-less tables

From
Tom Lane
Date:
If a table has no indexes, we will always decide that any same-page
update operation is a HOT update, since obviously it isn't modifying
any indexed columns.  But is there any benefit to doing so?  I don't
see one offhand, and it has a downside: we're very likely to
encounter broken HOT chains if an index is created later.  That leads
to the sort of unexpected behavior exhibited here:
http://archives.postgresql.org/pgsql-performance/2010-11/msg00216.php

I'm thinking maybe HeapSatisfiesHOTUpdate should be changed so that it
always returns false if the relation has no indexes, which could be
checked cheaply via relation->rd_rel->relhasindex.
        regards, tom lane


Re: HOT updates in index-less tables

From
Hannu Krosing
Date:
On Sat, 2010-11-13 at 10:51 -0500, Tom Lane wrote:
> If a table has no indexes, we will always decide that any same-page
> update operation is a HOT update, since obviously it isn't modifying
> any indexed columns.  But is there any benefit to doing so? 

If we do the in-page "mini vacuum" even without HOT, then there should
be no benefit from index-less HOT updates. 

If we don't try the mini vacuum in this case, then some current
behaviuors could become much worse, say a table with one row and load of
updates.

> I don't
> see one offhand, and it has a downside: we're very likely to
> encounter broken HOT chains if an index is created later.  That leads
> to the sort of unexpected behavior exhibited here:
> http://archives.postgresql.org/pgsql-performance/2010-11/msg00216.php
> 
> I'm thinking maybe HeapSatisfiesHOTUpdate should be changed so that it
> always returns false if the relation has no indexes, which could be
> checked cheaply via relation->rd_rel->relhasindex.
> 
>             regards, tom lane
> 

-- 
-------
Hannu Krosing
PostgreSQL Infinite Scalability and Performance Consultant
PG Admin Book: http://www.2ndQuadrant.com/books/



Re: HOT updates in index-less tables

From
Tom Lane
Date:
Hannu Krosing <hannu@2ndQuadrant.com> writes:
> On Sat, 2010-11-13 at 10:51 -0500, Tom Lane wrote:
>> If a table has no indexes, we will always decide that any same-page
>> update operation is a HOT update, since obviously it isn't modifying
>> any indexed columns.  But is there any benefit to doing so? 

> If we do the in-page "mini vacuum" even without HOT, then there should
> be no benefit from index-less HOT updates. 

AFAICS we do: heap_update marks the page as prunable whether it's a HOT
update or not.  The only difference between treating the update as HOT vs
not-HOT is that if there was more than one HOT update, the intermediate
tuples could be completely reclaimed by page pruning (ie, their line
pointers go away too).  With not-HOT updates, the intermediate line
pointers would have to remain in DEAD state until vacuum, since page
pruning wouldn't know if there were index entries pointing at them.
But that seems like a pretty tiny penalty.
        regards, tom lane


Re: HOT updates in index-less tables

From
Hannu Krosing
Date:
On Sat, 2010-11-13 at 12:13 -0500, Tom Lane wrote:
> Hannu Krosing <hannu@2ndQuadrant.com> writes:
> > On Sat, 2010-11-13 at 10:51 -0500, Tom Lane wrote:
> >> If a table has no indexes, we will always decide that any same-page
> >> update operation is a HOT update, since obviously it isn't modifying
> >> any indexed columns.  But is there any benefit to doing so? 
> 
> > If we do the in-page "mini vacuum" even without HOT, then there should
> > be no benefit from index-less HOT updates. 
> 
> AFAICS we do: heap_update marks the page as prunable whether it's a HOT
> update or not.  The only difference between treating the update as HOT vs
> not-HOT is that if there was more than one HOT update, the intermediate
> tuples could be completely reclaimed by page pruning (ie, their line
> pointers go away too).  With not-HOT updates, the intermediate line
> pointers would have to remain in DEAD state until vacuum, 

How hard would it be to make the pruning logic be aware of there being
no indexes and thus no possibility of  index entries pointing at any
tuple ?

> since page
> pruning wouldn't know if there were index entries pointing at them.
> But that seems like a pretty tiny penalty.
> 
>             regards, tom lane

-- 
-------
Hannu Krosing
PostgreSQL Infinite Scalability and Performance Consultant
PG Admin Book: http://www.2ndQuadrant.com/books/



Re: HOT updates in index-less tables

From
Tom Lane
Date:
Hannu Krosing <hannu@2ndQuadrant.com> writes:
> On Sat, 2010-11-13 at 12:13 -0500, Tom Lane wrote:
>> AFAICS we do: heap_update marks the page as prunable whether it's a HOT
>> update or not.  The only difference between treating the update as HOT vs
>> not-HOT is that if there was more than one HOT update, the intermediate
>> tuples could be completely reclaimed by page pruning (ie, their line
>> pointers go away too).  With not-HOT updates, the intermediate line
>> pointers would have to remain in DEAD state until vacuum, 

> How hard would it be to make the pruning logic be aware of there being
> no indexes and thus no possibility of  index entries pointing at any
> tuple ?

I think it's problematic, because heap_page_prune can be executed with
only AccessShareLock on the table, which means there's a race condition
against concurrent CREATE INDEX.  You could look at relhasindex easily
enough, but that doesn't prove there's not a CREATE INDEX in progress.
        regards, tom lane


Re: HOT updates in index-less tables

From
Robert Haas
Date:
On Sat, Nov 13, 2010 at 12:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hannu Krosing <hannu@2ndQuadrant.com> writes:
>> On Sat, 2010-11-13 at 10:51 -0500, Tom Lane wrote:
>>> If a table has no indexes, we will always decide that any same-page
>>> update operation is a HOT update, since obviously it isn't modifying
>>> any indexed columns.  But is there any benefit to doing so?
>
>> If we do the in-page "mini vacuum" even without HOT, then there should
>> be no benefit from index-less HOT updates.
>
> AFAICS we do: heap_update marks the page as prunable whether it's a HOT
> update or not.  The only difference between treating the update as HOT vs
> not-HOT is that if there was more than one HOT update, the intermediate
> tuples could be completely reclaimed by page pruning (ie, their line
> pointers go away too).  With not-HOT updates, the intermediate line
> pointers would have to remain in DEAD state until vacuum, since page
> pruning wouldn't know if there were index entries pointing at them.
> But that seems like a pretty tiny penalty.

I'm not at all convinced that's a tiny penalty.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: HOT updates in index-less tables

From
Heikki Linnakangas
Date:
On 14.11.2010 00:29, Robert Haas wrote:
> On Sat, Nov 13, 2010 at 12:13 PM, Tom Lane<tgl@sss.pgh.pa.us>  wrote:
>> Hannu Krosing<hannu@2ndQuadrant.com>  writes:
>>> On Sat, 2010-11-13 at 10:51 -0500, Tom Lane wrote:
>>>> If a table has no indexes, we will always decide that any same-page
>>>> update operation is a HOT update, since obviously it isn't modifying
>>>> any indexed columns.  But is there any benefit to doing so?
>>
>>> If we do the in-page "mini vacuum" even without HOT, then there should
>>> be no benefit from index-less HOT updates.
>>
>> AFAICS we do: heap_update marks the page as prunable whether it's a HOT
>> update or not.  The only difference between treating the update as HOT vs
>> not-HOT is that if there was more than one HOT update, the intermediate
>> tuples could be completely reclaimed by page pruning (ie, their line
>> pointers go away too).  With not-HOT updates, the intermediate line
>> pointers would have to remain in DEAD state until vacuum, since page
>> pruning wouldn't know if there were index entries pointing at them.
>> But that seems like a pretty tiny penalty.
>
> I'm not at all convinced that's a tiny penalty.

Me neither. It's a tiny penalty when you consider one update, but if you 
repeatedly update the same tuple, you accumulate dead line pointers 
until the next real vacuum runs. With HOT updates, you reach a steady 
state where page pruning is all you need. Then again, if you're 
repeatedly updating a row in a table with no indexes, presumably it's a 
very small table or you would create an index on it. And frequently 
autovacuuming a small index is quite cheap too.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: HOT updates in index-less tables

From
Merlin Moncure
Date:
On Sun, Nov 14, 2010 at 1:12 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> On 14.11.2010 00:29, Robert Haas wrote:
>>
>> On Sat, Nov 13, 2010 at 12:13 PM, Tom Lane<tgl@sss.pgh.pa.us>  wrote:
>>>
>>> Hannu Krosing<hannu@2ndQuadrant.com>  writes:
>>>>
>>>> On Sat, 2010-11-13 at 10:51 -0500, Tom Lane wrote:
>>>>>
>>>>> If a table has no indexes, we will always decide that any same-page
>>>>> update operation is a HOT update, since obviously it isn't modifying
>>>>> any indexed columns.  But is there any benefit to doing so?
>>>
>>>> If we do the in-page "mini vacuum" even without HOT, then there should
>>>> be no benefit from index-less HOT updates.
>>>
>>> AFAICS we do: heap_update marks the page as prunable whether it's a HOT
>>> update or not.  The only difference between treating the update as HOT vs
>>> not-HOT is that if there was more than one HOT update, the intermediate
>>> tuples could be completely reclaimed by page pruning (ie, their line
>>> pointers go away too).  With not-HOT updates, the intermediate line
>>> pointers would have to remain in DEAD state until vacuum, since page
>>> pruning wouldn't know if there were index entries pointing at them.
>>> But that seems like a pretty tiny penalty.
>>
>> I'm not at all convinced that's a tiny penalty.
>
> Me neither. It's a tiny penalty when you consider one update, but if you
> repeatedly update the same tuple, you accumulate dead line pointers until
> the next real vacuum runs. With HOT updates, you reach a steady state where
> page pruning is all you need. Then again, if you're repeatedly updating a
> row in a table with no indexes, presumably it's a very small table or you
> would create an index on it. And frequently autovacuuming a small index is
> quite cheap too.

The case here is when you have say a control table that is managing a
gapless sequence, or a materialization table with a very small number
of records.  These type of tables get updated very frequently, perhaps
in every transaction.  People without detailed implementation
knowledge of postgresql might assume that leaving an index off the
table is faster in these situations.

The danger here is that if autovacuum is stalled for whatever reason,
you get exponentially bad behavior as the table gets stuffed with bad
records.  index-less hot was put in intentionally.  As autovacuum gets
smarter and smarter, the reasoning to do this get weaker.

merlin