Re: CREATE INDEX and HOT - revised design - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: CREATE INDEX and HOT - revised design
Date
Msg-id 2e78013d0703220941y31aba7f2i442740159db1cf4e@mail.gmail.com
Whole thread Raw
In response to Re: CREATE INDEX and HOT - revised design  (Bruce Momjian <bruce@momjian.us>)
Responses Re: CREATE INDEX and HOT - revised design  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: CREATE INDEX and HOT - revised design  ("Simon Riggs" <simon@2ndquadrant.com>)
List pgsql-hackers


On 3/21/07, Bruce Momjian <bruce@momjian.us> wrote:


A different idea is to flag the _index_ as using HOT for the table or
not, using a boolean in pg_index.  The idea is that when a new index is
created, it has its HOT boolean set to false and indexes all tuples and
ignores HOT chains.  Then doing lookups using that index, the new index
does not follow HOT chains.  We also add a boolean to pg_class to
indicate no new HOT chains should be created and set that to false once
the new index is created.  Then, at some later time when all HOT chains
are dead, we can enable HOT chain following for the new index and allow
new HOT chains to be created.

A more sophisticated idea would be to place an xid, rather than a
boolean, in pg_index to indicate which chains were created after the
index was created to control whether the index should follow that HOT
chain, or ignore it.  The xmax of the head of the HOT chain can be used
as an indicator of when the chain was created.  Transactions started
before the pg_index xid could continue following the old rules and
insert into the _new_ index for HOT chain additions, and new
transactions would create HOT chains that could skip adding to the new
index.  Cleanup of the hybrid HOT chains (some indexes take part, some
do not) would be more complex.



Bruce, thanks for bringing up this idea.

As I think more about this idea, I think I am able to extend this further
to solve the problems we discussed around it. One of my conerns
were that the change the basic structure of heap
with HOT-chains so that it should be possible to just look at the
heap tuple and say whether it has any index pointer or not.

The way I propose to extend/modify the idea is to use pg_index xid
as suggested by Bruce to mark the index. This xid would guide the
visibility of the index. As we all know, CREATE INDEX locks out
UPDATEs on the table and further UPDATEs are possible only after
the transaction creating the new index commits.

When CREATE INDEX starts, it acquires ShareLock on the table.
At this point we may have one or more HOT-update chains in the
table. Tuples in this chain may be visible to one or more running
transactions. The fact that we have ShareLock on the table means
that all tuples in the chain except the one at the head either
RECENTLY_DEAD or were UPDATEd by the same transaction
that is now running CREATE INDEX.

With this background, I propose to index ONLY the head of the
HOT-chain. The TID of the root tuple is used instead of the actual
TID of the tuple being indexed. This index will not be available to
the transactions which are started before the CREATE INDEX
transaction. Just like we use "indisvalid" flag to avoid including
an invalid index in the plan, we use the pg_index "xid" to decide
whether to use the index in the plan or not. Only transactions with
txid > pg_index:xid can see the index and use it.

In fact, the serializable transactions started before CREATE INDEX
can not anyway see the index so all this is done to handle
read-committed transactions.

In this proposal we indexed only the latest version. But none of the
transactions started after CREATE INDEX can anyway see the
older tuples and hence we should be fine even if we don't index
them in the new index. And none of the older transaction can see
the index, so again we are safe. The design also helps us to
preserve the heap HOT semantics and chain pruning and does not
need VACUUM or any special handling.

Can anyone spot a hole in this logic ? Comments ?

Thanks,
Pavan


--

EnterpriseDB     http://www.enterprisedb.com

pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: Regression failure in PL/Tcl
Next
From: Andreas Pflug
Date:
Subject: Re: TOASTing smaller things