Re: CREATE INDEX and HOT - revised design - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: CREATE INDEX and HOT - revised design
Date
Msg-id 1174636236.3826.20.camel@localhost.localdomain
Whole thread Raw
In response to Re: CREATE INDEX and HOT - revised design  ("Pavan Deolasee" <pavan.deolasee@gmail.com>)
Responses Re: CREATE INDEX and HOT - revised design
Re: CREATE INDEX and HOT - revised design
List pgsql-hackers
Ühel kenal päeval, N, 2007-03-22 kell 23:30, kirjutas Pavan Deolasee:
> 
> 
> On 3/22/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>         "Pavan Deolasee" <pavan.deolasee@gmail.com> writes:
>         > When CREATE INDEX starts, it acquires ShareLock on the
>         table.
>         > At this point we may have one or more HOT-update chains in
>         the 
>         > table. Tuples in this chain may be visible to one or more
>         running
>         > transactions. The fact that we have ShareLock on the table
>         means
>         > that all tuples in the chain except the one at the head
>         either
>         > RECENTLY_DEAD or were UPDATEd by the same transaction
>         > that is now running CREATE INDEX.
>         
>         This logic seems pretty questionable to me in view of the
>         recent VACUUM
>         FULL bug fixes.  We now know that an update chain can contain
>         tuples that 
>         appear DEAD but are later than ones that are
>         RECENTLY_DEAD.  How are
>         you defining a HOT chain exactly --- will it be just a part of
>         an
>         update chain that consists only of contiguous tuples that
>         appear live or
>         RECENTLY_DEAD to the CREATE INDEX transaction?
> 
> 
> HOT-chain is something which contains contiguous tuples that share
> the same index keys for all the existing indexes and share a single
> index entry. It does not really matter if there are RECENTLY_DEAD 
> tuples that appear before a DEAD tuple because from VACUUM FULL
> bug fix we know that they are DEAD too.
> 
> My argument is that its enough to index only the  LIVE tuple which
> is at the end of the chain if we don't use the new index for queries 
> in transactions which were started before CREATE INDEX.

You mean, which were started before CREATE INDEX completes ?

Just wait for all concurrent transactions to complete before marking the
index as usable in plans.

>  I am
> proposing to do that by storing an xid in the pg_index row. 

I don't think it is a good idea to store xid's anywhere but in xmin/xmax
columns, as doing so would cause nasty xid wraparound problems.

Instead you should wait, after completeing the index , for all
concurrent transactions to end before you mark the index as "usable for
queries", similar to the way CREATE INDEX CONCURRENTLY does.

> A
> special case is where a tuple is UPDATEd multiple times by
> the same transaction which is also creating the index, in which case 
> there are more than one LIVE versions of the tuple. But again
> we are safe by indexing only the latest version because all other
> versions would be invisible (even to us) once CREATE INDEX commits.
> 
> 
>         > In fact, the serializable transactions started before CREATE
>         INDEX
>         > can not anyway see the index so all this is done to handle
>         > read-committed transactions.
>         
>         You are laboring under an illusion that system catalog
>         accesses are MVCC. 
>         SnapshotNow does not behave that way: the system can see the
>         new index
>         as soon as it's committed.  (It had better, since it has to
>         start
>         updating the index immediately, whether it's safe to scan it
>         or not.) 
>         I'm not sure whether that's fundamental to your argument or
>         not, but
>         it's certainly wrong.
> 
> 
> Oh, thanks for pointing that out. But thats certainly not fundamental
> to the argument as you probably already guessed. The xid still
> controls the usage of index for query planning, somewhat similar to
> "isindvalid" flag for CREATE INDEX CONCURRENTLY.

Xids are unstable and will come back to bite you after 2G transactions.

Why not just use the "isindvalid" flag ?


-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com




pgsql-hackers by date:

Previous
From: Andrew - Supernews
Date:
Subject: Re: LIKE optimization in UTF-8 and locale-C
Next
From: Hannu Krosing
Date:
Subject: Re: CREATE INDEX and HOT - revised design