Thread: HOT WIP Patch - version 6.3

HOT WIP Patch - version 6.3

From
"Pavan Deolasee"
Date:
Please see the HOT version 6.3 patch posted on pgsql-patches.<br />I've implemented support for CREATE INDEX and CREATE
INDEX<br/>CONCURRENTLY based on the recent discussions. The implementation<br />is not yet complete and needs some more
testing/work/discussion<br />before we can start considering it for review.<br /><br />One of the regression test case
failsbecause CIC now works in<br />three phases. In the first phase, we just create the catalog entry<br />for the
indexand commit the transaction. If the index_build fails <br />because of any error (say, unique key constraint) the
indexcreation<br />fails, but the catalog entry remains.<br /><br />CREATE INDEX:<br />-----------------<br /><br />The
implementationis based on having an extra attribute in pg_index <br />to track the transaction xid which created the
indexand then use<br />that information to decide whether the newly created index should<br />be used in a query or
not.Here are couple of TODO items:<br /><br />Plan Invalidation: <br /><br />We decided to store transaction id of the
toplevel transaction in<br />the cached plan if one or more potentially useful indexes are<br />not available while
planninga query. And then replan if the<br />current transaction id is different that the one stored with the <br
/>plan.I'm not very well familiar with this code, so any suggestions<br />how to do it in a clean way ?<br /><br
/>Makingindex available in the creating transaction:<br /><br />This is an important TODO item. We would like to make
the<br />index immediately available to the transaction which created it,<br />if the transaction is running in
read-committedmode. If the <br />transaction is running in SERIALIZABLE mode, then we can't do<br />much because we
mighthave skipped one or more RECENTLY_DEAD <br />tuples while building the index and hence index can not be used.<br
/><br/>The way we build index now is that we only index the tuple at the head<br />of the HOT-chain. So there could be
DELETE_IN_PROGRESS<br/>tuples (updated/deleted by the transaction which is creating the <br />index) which we skipped
whilebuilding the index. My question<br />is, is there a case where this transaction may use the new index<br />and
stillsee those tuples ? I know that the DELETE_IN_PROGRESS<br />tuples are visible if there are any open cursors. But
thenplans for <br />these open cursors can not be changed until they are closed<br />and reopened, isn't it ? Tom
mentionedabout recursive plpgsql<br />functions where the outer instance can use an older snapshot.<br />I tried that
butcould not produce a scenario where the outer instance <br />could see the DELETE_IN_PROGRESS tuple if the tuple is
updated<br/>in the inner instance. Can someone help me with an example where<br />a read-committed transaction would
usethe newly created index<br />and still see the DELETE_IN_PROGRESS tuple ? <br /><br /><br />CREATE INDEX
CONCURRENTLY:<br/>------------------------------<br /><br />One of the item which needs review and discussion is the
handling<br/>on unique key checks while creating the index concurrently. We build<br /> the index in three phases. In
thefirst phase, we just create the catalog<br />entry and mark index invalid for inserts. This ensures that
transactions<br/>started after that won't create HOT-chains that break the HOT property <br />for the new index. In the
secondphase, we build the index by applying<br />the reference snapshot to the heap tuples. In the third phase, we<br
/>validatethe index and insert any missing entries.<br /><br />In this phase, we only insert if index entry for the
roottuple is missing. <br />So there is just one insert operation which covers all the tuples in the<br />HOT-chain. In
orderto check unique key violations, inside<br />_bt_check_unique() function when a duplicate key is found, we
follow<br/>the entire HOT-chain and check if any tuple in the chain is live. If so, <br />unique key violation
constraintis raised. IOW if any two HOT-chains<br />share the same key and have one live tuple, unique key
constraint<br/>is considered violated. Can anyone spot a hole in this logic ?<br /><br /><br />Thanks, <br />Pavan<br
clear="all"/><br />-- <br /><br />EnterpriseDB     <a
href="http://www.enterprisedb.com">http://www.enterprisedb.com</a>