Re: CREATE INDEX and HOT - revised design - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: CREATE INDEX and HOT - revised design
Date
Msg-id 4601565A.1080705@enterprisedb.com
Whole thread Raw
In response to Re: CREATE INDEX and HOT - revised design  (Heikki Linnakangas <heikki@enterprisedb.com>)
Responses Re: CREATE INDEX and HOT - revised design  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
Sorry, I was a bit too quick to respond. I didn't understand at first 
how this differs from Pavan's/Simon's proposals.

Let me answer my own questions.

Heikki Linnakangas wrote:
> Bruce Momjian wrote:
>> A different idea is to flag the _index_ as using HOT for the table or
>> not, using a boolean in pg_index.  The idea is that when a new index is
>> created, it has its HOT boolean set to false and indexes all tuples and
>> ignores HOT chains.  Then doing lookups using that index, the new index
>> does not follow HOT chains.  We also add a boolean to pg_class to
>> indicate no new HOT chains should be created and set that to false once
>> the new index is created.  Then, at some later time when all HOT chains
>> are dead, we can enable HOT chain following for the new index and allow
>> new HOT chains to be created.
> 
> When exactly would all HOT chains be dead? AFAICS, that would be after 
> the xid of CREATE INDEX gets older than oldest xmin, and VACUUM is run 
> to prune and pointer-swing all HOT chains.

I still think that's true.

> Would we have to wait after setting the new forbid_hot_updates-flag in 
> pg_class, to make sure everyone sees the change? What if CREATE INDEX 
> crashes, would we need a vacuum to reset the flag?

You wouldn't need to do any extra waits to set the forbid_hot_updates 
flag, CREATE INDEX locks the table and already sends a relcache 
invalidations to make the new index visible. CREATE INDEX CONCURRENTLY 
waits already.

>> A more sophisticated idea would be to place an xid, rather than a
>> boolean, in pg_index to indicate which chains were created after the
>> index was created to control whether the index should follow that HOT
>> chain, or ignore it.  The xmax of the head of the HOT chain can be used
>> as an indicator of when the chain was created.  Transactions started
>> before the pg_index xid could continue following the old rules and
>> insert into the _new_ index for HOT chain additions, and new
>> transactions would create HOT chains that could skip adding to the new
>> index.  Cleanup of the hybrid HOT chains (some indexes take part, some
>> do not) would be more complex.  
> 
> What xid would you place in pg_index? Xid of the transaction running 
> CREATE INDEX, ReadNewTransactionId() or what?

Apparently ReadNewTransactionId to make sure there's no existing tuples 
with an xmax smaller than that.

> How does that work if you have a transaction that begins before CREATE 
> INDEX, and updates something after CREATE INDEX?

You actually explained that above...

The HOT_UPDATED flag on a tuple would basically mean that all indexes 
with xid < xmax doesn't contain an index pointer for the tuple, and all 
others do. When inserting new updated tuples, we'd also need to maintain 
that invariant.

>> I know we have xid wrap-around, but I think the VACUUM FREEZE could
>> handle it by freezing the pg_index xid column value when it does the
>> table.
> 
> I don't think you can freeze the xid-column, we went through a similar 
> discussion on pg_class.relfrozenxid. But you can move it forward to 
> oldest xmin.

You could actually "freeze" the column, because unlike relfrozenid we 
never need to unfreeze it.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


pgsql-hackers by date:

Previous
From: Shane Ambler
Date:
Subject: Re: Money type todos?
Next
From: "Simon Riggs"
Date:
Subject: Re: HOT WIP Patch - Version 5.0