HOT WIP Patch - version 6.3 - Mailing list pgsql-hackers
From | Pavan Deolasee |
---|---|
Subject | HOT WIP Patch - version 6.3 |
Date | |
Msg-id | 2e78013d0704020521k1e6d0adeg98347fcdb4a9a4c7@mail.gmail.com Whole thread Raw |
List | pgsql-hackers |
Please see the HOT version 6.3 patch posted on pgsql-patches.<br />I've implemented support for CREATE INDEX and CREATE INDEX<br/>CONCURRENTLY based on the recent discussions. The implementation<br />is not yet complete and needs some more testing/work/discussion<br />before we can start considering it for review.<br /><br />One of the regression test case failsbecause CIC now works in<br />three phases. In the first phase, we just create the catalog entry<br />for the indexand commit the transaction. If the index_build fails <br />because of any error (say, unique key constraint) the indexcreation<br />fails, but the catalog entry remains.<br /><br />CREATE INDEX:<br />-----------------<br /><br />The implementationis based on having an extra attribute in pg_index <br />to track the transaction xid which created the indexand then use<br />that information to decide whether the newly created index should<br />be used in a query or not.Here are couple of TODO items:<br /><br />Plan Invalidation: <br /><br />We decided to store transaction id of the toplevel transaction in<br />the cached plan if one or more potentially useful indexes are<br />not available while planninga query. And then replan if the<br />current transaction id is different that the one stored with the <br />plan.I'm not very well familiar with this code, so any suggestions<br />how to do it in a clean way ?<br /><br />Makingindex available in the creating transaction:<br /><br />This is an important TODO item. We would like to make the<br />index immediately available to the transaction which created it,<br />if the transaction is running in read-committedmode. If the <br />transaction is running in SERIALIZABLE mode, then we can't do<br />much because we mighthave skipped one or more RECENTLY_DEAD <br />tuples while building the index and hence index can not be used.<br /><br/>The way we build index now is that we only index the tuple at the head<br />of the HOT-chain. So there could be DELETE_IN_PROGRESS<br/>tuples (updated/deleted by the transaction which is creating the <br />index) which we skipped whilebuilding the index. My question<br />is, is there a case where this transaction may use the new index<br />and stillsee those tuples ? I know that the DELETE_IN_PROGRESS<br />tuples are visible if there are any open cursors. But thenplans for <br />these open cursors can not be changed until they are closed<br />and reopened, isn't it ? Tom mentionedabout recursive plpgsql<br />functions where the outer instance can use an older snapshot.<br />I tried that butcould not produce a scenario where the outer instance <br />could see the DELETE_IN_PROGRESS tuple if the tuple is updated<br/>in the inner instance. Can someone help me with an example where<br />a read-committed transaction would usethe newly created index<br />and still see the DELETE_IN_PROGRESS tuple ? <br /><br /><br />CREATE INDEX CONCURRENTLY:<br/>------------------------------<br /><br />One of the item which needs review and discussion is the handling<br/>on unique key checks while creating the index concurrently. We build<br /> the index in three phases. In thefirst phase, we just create the catalog<br />entry and mark index invalid for inserts. This ensures that transactions<br/>started after that won't create HOT-chains that break the HOT property <br />for the new index. In the secondphase, we build the index by applying<br />the reference snapshot to the heap tuples. In the third phase, we<br />validatethe index and insert any missing entries.<br /><br />In this phase, we only insert if index entry for the roottuple is missing. <br />So there is just one insert operation which covers all the tuples in the<br />HOT-chain. In orderto check unique key violations, inside<br />_bt_check_unique() function when a duplicate key is found, we follow<br/>the entire HOT-chain and check if any tuple in the chain is live. If so, <br />unique key violation constraintis raised. IOW if any two HOT-chains<br />share the same key and have one live tuple, unique key constraint<br/>is considered violated. Can anyone spot a hole in this logic ?<br /><br /><br />Thanks, <br />Pavan<br clear="all"/><br />-- <br /><br />EnterpriseDB <a href="http://www.enterprisedb.com">http://www.enterprisedb.com</a>
pgsql-hackers by date: