Maintaining cluster order on insert - Mailing list pgsql-patches
From | Heikki Linnakangas |
---|---|
Subject | Maintaining cluster order on insert |
Date | |
Msg-id | 44DA31B1.3090700@enterprisedb.com |
Responses |
Re: Maintaining cluster order on insert
Re: Maintaining cluster order on insert Re: Maintaining cluster order on insert |
List | pgsql-patches |
While thinking about index-organized tables and similar ideas, it occurred to me that there's some low-hanging fruit: maintaining cluster order on inserts by trying to place new heap tuples close to other similar tuples. That involves asking the index am where on the heap the new tuple should go, and trying to insert it there before using the FSM. Using the new fillfactor parameter makes it more likely that there's room on the page. We don't worry about the order within the page.

The API I'm thinking of introduces a new optional index am function, amsuggestblock (suggestions for a better name are welcome). It gets the same parameters as aminsert, and returns the heap block number that would be the optimal place to put the new tuple. It would be called from ExecInsert before inserting the heap tuple, and the suggestion is passed on to heap_insert and RelationGetBufferForTuple.

I wrote a little patch to implement this for btree, attached.

This could be optimized by changing the existing aminsert API, because as it is, an insert will have to descend the btree twice: once in amsuggestblock and then again in aminsert. amsuggestblock could keep the right index page pinned so aminsert could locate it quicker. But I wanted to keep this simple for now.

Another improvement might be to allow amsuggestblock to return a list of suggestions, but that makes it more expensive to insert if there isn't room in the suggested pages, since heap_insert will have to try them all before giving up.

Comments regarding the general idea or the patch? There should probably be an index option to turn the feature on and off. You'll want to turn it off when you first load a table, and turn it on after CLUSTER to keep it clustered.

Since there's been discussion on keeping the TODO list more up-to-date, I hereby officially claim the "Automatically maintain clustering on a table" TODO item :). Feel free to bombard me with requests for status reports.
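As a rough illustration of the placement order described above (suggested block first, then the cached insertion target, then the FSM, otherwise extend the relation), here is a toy model in C. It is only a sketch under simplifying assumptions and not code from the patch: ToyHeap, toy_fits, toy_choose_block and TOY_INVALID_BLOCK are made-up names, the "FSM" is a plain linear scan, and the fillfactor is reduced to a fixed number of reserved bytes per page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for InvalidBlockNumber. */
#define TOY_INVALID_BLOCK UINT32_MAX

/* Toy heap: we only track how many bytes are free on each page. */
typedef struct ToyHeap
{
    const size_t *free_bytes;    /* free space per page */
    uint32_t      npages;        /* number of pages in the heap */
    uint32_t      cached_target; /* page of the last insert, or TOY_INVALID_BLOCK */
    size_t        reserve;       /* bytes kept free per page (models fillfactor) */
} ToyHeap;

/* Does page blk exist and have at least `need` free bytes? */
static int
toy_fits(const ToyHeap *heap, uint32_t blk, size_t need)
{
    return blk != TOY_INVALID_BLOCK &&
           blk < heap->npages &&
           heap->free_bytes[blk] >= need;
}

/*
 * Pick a page for a tuple of `len` bytes.  Returns TOY_INVALID_BLOCK when
 * no existing page qualifies, meaning the caller would extend the heap.
 */
uint32_t
toy_choose_block(const ToyHeap *heap, size_t len, uint32_t suggested, int use_fsm)
{
    uint32_t blk;

    assert(heap != NULL);

    /* 1. The suggested page may dip into the reserved space. */
    if (toy_fits(heap, suggested, len))
        return suggested;

    /* 2. The cached target page must leave the reserve untouched. */
    if (toy_fits(heap, heap->cached_target, len + heap->reserve))
        return heap->cached_target;

    /* 3. Linear scan standing in for a free space map lookup. */
    if (use_fsm)
    {
        for (blk = 0; blk < heap->npages; blk++)
        {
            if (toy_fits(heap, blk, len + heap->reserve))
                return blk;
        }
    }

    return TOY_INVALID_BLOCK;
}
```

The point of step 1 ignoring the reserve is that the space left free by fillfactor is exactly what a clustered insert is supposed to use; ordinary inserts in steps 2 and 3 leave it alone.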
And just to be clear, I'm not trying to sneak this into 8.2 anymore, this is 8.3 stuff.

I won't be implementing a background daemon described on the TODO item, since that would essentially be an online version of CLUSTER. Which sure would be nice, but that's a different story.

- Heikki

Index: doc/src/sgml/catalogs.sgml
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/doc/src/sgml/catalogs.sgml,v
retrieving revision 2.129
diff -c -r2.129 catalogs.sgml
*** doc/src/sgml/catalogs.sgml 31 Jul 2006 20:08:55 -0000 2.129
--- doc/src/sgml/catalogs.sgml 8 Aug 2006 16:17:21 -0000
***************
*** 499,504 ****
--- 499,511 ----
        <entry>Function to parse and validate reloptions for an index</entry>
       </row>
  
+      <row>
+       <entry><structfield>amsuggestblock</structfield></entry>
+       <entry><type>regproc</type></entry>
+       <entry><literal><link linkend="catalog-pg-proc"><structname>pg_proc</structname></link>.oid</literal></entry>
+       <entry>Get the best place in the heap to put a new tuple</entry>
+      </row>
+ 
      </tbody>
     </tgroup>
    </table>
Index: doc/src/sgml/indexam.sgml
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/doc/src/sgml/indexam.sgml,v
retrieving revision 2.16
diff -c -r2.16 indexam.sgml
*** doc/src/sgml/indexam.sgml 31 Jul 2006 20:08:59 -0000 2.16
--- doc/src/sgml/indexam.sgml 8 Aug 2006 17:15:25 -0000
***************
*** 391,396 ****
--- 391,414 ----
     <function>amoptions</> to test validity of options settings.
    </para>
  
+   <para>
+ <programlisting>
+ BlockNumber
+ amsuggestblock (Relation indexRelation,
+                 Datum *values,
+                 bool *isnull,
+                 Relation heapRelation);
+ </programlisting>
+    Gets the optimal place in the heap for a new tuple. The parameters
+    correspond the parameters for <literal>aminsert</literal>.
+    This function is called on the clustered index before a new tuple
+    is inserted to the heap, and it should choose the optimal insertion
+    target page on the heap in such manner that the heap stays as close
+    as possible to the index order.
+    <literal>amsuggestblock</literal> can return InvalidBlockNumber if
+    the index am doesn't have a suggestion.
+   </para>
+ 
   </sect1>
  
   <sect1 id="index-scanning">
Index: src/backend/access/heap/heapam.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/heap/heapam.c,v
retrieving revision 1.218
diff -c -r1.218 heapam.c
*** src/backend/access/heap/heapam.c 31 Jul 2006 20:08:59 -0000 1.218
--- src/backend/access/heap/heapam.c 8 Aug 2006 16:17:21 -0000
***************
*** 1325,1330 ****
--- 1325,1335 ----
   * use_fsm is passed directly to RelationGetBufferForTuple, which see for
   * more info.
   *
+  * suggested_blk can be set by the caller to hint heap_insert which
+  * block would be the best place to put the new tuple in. heap_insert can
+  * ignore the suggestion, if there's not enough room on that block.
+  * InvalidBlockNumber means no preference.
+  *
   * The return value is the OID assigned to the tuple (either here or by the
   * caller), or InvalidOid if no OID. The header fields of *tup are updated
   * to match the stored tuple; in particular tup->t_self receives the actual
***************
*** 1333,1339 ****
   */
  Oid
  heap_insert(Relation relation, HeapTuple tup, CommandId cid,
!             bool use_wal, bool use_fsm)
  {
      TransactionId xid = GetCurrentTransactionId();
      HeapTuple heaptup;
--- 1338,1344 ----
   */
  Oid
  heap_insert(Relation relation, HeapTuple tup, CommandId cid,
!             bool use_wal, bool use_fsm, BlockNumber suggested_blk)
  {
      TransactionId xid = GetCurrentTransactionId();
      HeapTuple heaptup;
***************
*** 1386,1392 ****
  
      /* Find buffer to insert this tuple into */
      buffer = RelationGetBufferForTuple(relation, heaptup->t_len,
!                                        InvalidBuffer, use_fsm);
  
      /* NO EREPORT(ERROR) from here till changes are logged */
      START_CRIT_SECTION();
--- 1391,1397 ----
  
      /* Find buffer to insert this tuple into */
      buffer = RelationGetBufferForTuple(relation, heaptup->t_len,
!                                        InvalidBuffer, use_fsm, suggested_blk);
  
      /* NO EREPORT(ERROR) from here till changes are logged */
      START_CRIT_SECTION();
***************
*** 1494,1500 ****
  Oid
  simple_heap_insert(Relation relation, HeapTuple tup)
  {
!     return heap_insert(relation, tup, GetCurrentCommandId(), true, true);
  }
  
  /*
--- 1499,1506 ----
  Oid
  simple_heap_insert(Relation relation, HeapTuple tup)
  {
!     return heap_insert(relation, tup, GetCurrentCommandId(), true,
!                        true, InvalidBlockNumber);
  }
  
  /*
***************
*** 2079,2085 ****
          {
              /* Assume there's no chance to put heaptup on same page. */
              newbuf = RelationGetBufferForTuple(relation, heaptup->t_len,
!                                                buffer, true);
          }
          else
          {
--- 2085,2092 ----
          {
              /* Assume there's no chance to put heaptup on same page. */
              newbuf = RelationGetBufferForTuple(relation, heaptup->t_len,
!                                                buffer, true,
!                                                InvalidBlockNumber);
          }
          else
          {
***************
*** 2096,2102 ****
                   */
                  LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
                  newbuf = RelationGetBufferForTuple(relation, heaptup->t_len,
!                                                    buffer, true);
              }
              else
              {
--- 2103,2110 ----
                   */
                  LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
                  newbuf = RelationGetBufferForTuple(relation, heaptup->t_len,
!                                                    buffer, true,
!                                                    InvalidBlockNumber);
              }
              else
              {
Index: src/backend/access/heap/hio.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/heap/hio.c,v
retrieving revision 1.63
diff -c -r1.63 hio.c
*** src/backend/access/heap/hio.c 3 Jul 2006 22:45:37 -0000 1.63
--- src/backend/access/heap/hio.c 9 Aug 2006 18:03:01 -0000
***************
*** 93,98 ****
--- 93,100 ----
   * any committed data of other transactions. (See heap_insert's comments
   * for additional constraints needed for safe usage of this behavior.)
   *
+  * If the caller has a suggestion, it's passed in suggestedBlock.
+  *
   * We always try to avoid filling existing pages further than the fillfactor.
   * This is OK since this routine is not consulted when updating a tuple and
   * keeping it on the same page, which is the scenario fillfactor is meant
***************
*** 103,109 ****
   */
  Buffer
  RelationGetBufferForTuple(Relation relation, Size len,
!                           Buffer otherBuffer, bool use_fsm)
  {
      Buffer buffer = InvalidBuffer;
      Page pageHeader;
--- 105,112 ----
   */
  Buffer
  RelationGetBufferForTuple(Relation relation, Size len,
!                           Buffer otherBuffer, bool use_fsm,
!                           BlockNumber suggestedBlock)
  {
      Buffer buffer = InvalidBuffer;
      Page pageHeader;
***************
*** 135,142 ****
          otherBlock = InvalidBlockNumber; /* just to keep compiler quiet */
  
      /*
!      * We first try to put the tuple on the same page we last inserted a tuple
!      * on, as cached in the relcache entry. If that doesn't work, we ask the
       * shared Free Space Map to locate a suitable page. Since the FSM's info
       * might be out of date, we have to be prepared to loop around and retry
       * multiple times. (To insure this isn't an infinite loop, we must update
--- 138,147 ----
          otherBlock = InvalidBlockNumber; /* just to keep compiler quiet */
  
      /*
!      * We first try to put the tuple on the page suggested by the caller, if
!      * any. Then we try to put the tuple on the same page we last inserted a
!      * tuple on, as cached in the relcache entry. If that doesn't work, we
!      * ask the
       * shared Free Space Map to locate a suitable page. Since the FSM's info
       * might be out of date, we have to be prepared to loop around and retry
       * multiple times. (To insure this isn't an infinite loop, we must update
***************
*** 144,152 ****
       * not to be suitable.) If the FSM has no record of a page with enough
       * free space, we give up and extend the relation.
       *
!      * When use_fsm is false, we either put the tuple onto the existing target
!      * page or extend the relation.
       */
      if (len + saveFreeSpace <= MaxTupleSize)
          targetBlock = relation->rd_targblock;
      else
--- 149,167 ----
       * not to be suitable.) If the FSM has no record of a page with enough
       * free space, we give up and extend the relation.
       *
!      * When use_fsm is false, we skip the fsm lookup if neither the suggested
!      * nor the cached last insertion page has enough room, and extend the
!      * relation.
!      *
!      * The fillfactor is taken into account when calculating the free space
!      * on the cached target block, and when using the FSM. The suggested page
!      * is used whenever there's enough room in it, regardless of the fillfactor,
!      * because that's exactly the purpose the space is reserved for in the
!      * first place.
       */
+     if (suggestedBlock != InvalidBlockNumber)
+         targetBlock = suggestedBlock;
+     else
      if (len + saveFreeSpace <= MaxTupleSize)
          targetBlock = relation->rd_targblock;
      else
***************
*** 219,224 ****
--- 234,244 ----
           */
          pageHeader = (Page) BufferGetPage(buffer);
          pageFreeSpace = PageGetFreeSpace(pageHeader);
+ 
+         /* If we're trying the suggested block, don't care about fillfactor */
+         if (targetBlock == suggestedBlock && len <= pageFreeSpace)
+             return buffer;
+ 
          if (len + saveFreeSpace <= pageFreeSpace)
          {
              /* use this page as future insert target, too */
***************
*** 241,246 ****
--- 261,275 ----
              ReleaseBuffer(buffer);
          }
  
+         /* If we just tried the suggested block, try the cached target
+          * block next, before consulting the FSM. */
+         if(suggestedBlock == targetBlock)
+         {
+             targetBlock = relation->rd_targblock;
+             suggestedBlock = InvalidBlockNumber;
+             continue;
+         }
+ 
          /* Without FSM, always fall out of the loop and extend */
          if (!use_fsm)
              break;
Index: src/backend/access/index/genam.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/index/genam.c,v
retrieving revision 1.58
diff -c -r1.58 genam.c
*** src/backend/access/index/genam.c 31 Jul 2006 20:08:59 -0000 1.58
--- src/backend/access/index/genam.c 8 Aug 2006 16:17:21 -0000
***************
*** 259,261 ****
--- 259,275 ----
  
      pfree(sysscan);
  }
+ 
+ /*
+  * This is a dummy implementation of amsuggestblock, to be used for index
+  * access methods that don't or can't support it. It just returns
+  * InvalidBlockNumber, which means "no preference".
+  *
+  * This is probably not a good best place for this function, but it doesn't
+  * fit naturally anywhere else either.
+  */
+ Datum
+ dummysuggestblock(PG_FUNCTION_ARGS)
+ {
+     PG_RETURN_UINT32(InvalidBlockNumber);
+ }
Index: src/backend/access/index/indexam.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/index/indexam.c,v
retrieving revision 1.94
diff -c -r1.94 indexam.c
*** src/backend/access/index/indexam.c 31 Jul 2006 20:08:59 -0000 1.94
--- src/backend/access/index/indexam.c 8 Aug 2006 16:17:21 -0000
***************
*** 18,23 ****
--- 18,24 ----
   * index_rescan - restart a scan of an index
   * index_endscan - end a scan
   * index_insert - insert an index tuple into a relation
+  * index_suggestblock - get desired insert location for a heap tuple
   * index_markpos - mark a scan position
   * index_restrpos - restore a scan position
   * index_getnext - get the next tuple from a scan
***************
*** 202,207 ****
--- 203,237 ----
                                        BoolGetDatum(check_uniqueness)));
  }
  
+ /* ----------------
+  * index_suggestblock - get desired insert location for a heap tuple
+  *
+  * The returned BlockNumber is the *heap* page that is the best place
+  * to insert the given tuple to, according to the index am. The best
+  * place is usually one that maintains the cluster order.
+  * ----------------
+  */
+ BlockNumber
+ index_suggestblock(Relation indexRelation,
+                    Datum *values,
+                    bool *isnull,
+                    Relation heapRelation)
+ {
+     FmgrInfo *procedure;
+ 
+     RELATION_CHECKS;
+     GET_REL_PROCEDURE(amsuggestblock);
+ 
+     /*
+      * have the am's suggestblock proc do all the work.
+      */
+     return DatumGetUInt32(FunctionCall4(procedure,
+                                         PointerGetDatum(indexRelation),
+                                         PointerGetDatum(values),
+                                         PointerGetDatum(isnull),
+                                         PointerGetDatum(heapRelation)));
+ }
+ 
  /*
   * index_beginscan - start a scan of an index with amgettuple
   *
Index: src/backend/access/nbtree/nbtinsert.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/nbtree/nbtinsert.c,v
retrieving revision 1.142
diff -c -r1.142 nbtinsert.c
*** src/backend/access/nbtree/nbtinsert.c 25 Jul 2006 19:13:00 -0000 1.142
--- src/backend/access/nbtree/nbtinsert.c 9 Aug 2006 17:51:33 -0000
***************
*** 146,151 ****
--- 146,221 ----
  }
  
  /*
+  * _bt_suggestblock() -- Find the heap block of the closest index tuple.
+  *
+  * The logic to find the target should match _bt_doinsert, otherwise
+  * we'll be making bad suggestions.
+  */
+ BlockNumber
+ _bt_suggestblock(Relation rel, IndexTuple itup, Relation heapRel)
+ {
+     int natts = rel->rd_rel->relnatts;
+     OffsetNumber offset;
+     Page page;
+     BTPageOpaque opaque;
+ 
+     ScanKey itup_scankey;
+     BTStack stack;
+     Buffer buf;
+     IndexTuple curitup;
+     BlockNumber suggestion = InvalidBlockNumber;
+ 
+     /* we need an insertion scan key to do our search, so build one */
+     itup_scankey = _bt_mkscankey(rel, itup);
+ 
+     /* find the first page containing this key */
+     stack = _bt_search(rel, natts, itup_scankey, false, &buf, BT_READ);
+     if(!BufferIsValid(buf))
+     {
+         /* The index was completely empty. No suggestion then. */
+         return InvalidBlockNumber;
+     }
+     /* we don't need the stack, so free it right away */
+     _bt_freestack(stack);
+ 
+     page = BufferGetPage(buf);
+     opaque = (BTPageOpaque) PageGetSpecialPointer(page);
+ 
+     /* Find the location in the page where the new index tuple would go to. */
+ 
+     offset = _bt_binsrch(rel, buf, natts, itup_scankey, false);
+     if (offset > PageGetMaxOffsetNumber(page))
+     {
+         /* _bt_binsrch returned pointer to end-of-page. It means that
+          * there was no equal items on the page, and the new item should
+          * be inserted as the last tuple of the page. There could be equal
+          * items on the next page, however.
+          *
+          * At the moment, we just ignore the potential equal items on the
+          * right, and pretend there isn't any. We could instead walk right
+          * to the next page to check that, but let's keep it simple for now.
+          */
+         offset = OffsetNumberPrev(offset);
+     }
+     if(offset < P_FIRSTDATAKEY(opaque))
+     {
+         /* We landed on an empty page. We could step left or right until
+          * we find some items, but let's keep it simple for now.
+          */
+     } else {
+         /* We're now positioned at the index tuple that we're interested in. */
+ 
+         curitup = (IndexTuple) PageGetItem(page, PageGetItemId(page, offset));
+         suggestion = ItemPointerGetBlockNumber(&curitup->t_tid);
+     }
+ 
+     _bt_relbuf(rel, buf);
+     _bt_freeskey(itup_scankey);
+ 
+     return suggestion;
+ }
+ 
+ /*
   * _bt_check_unique() -- Check for violation of unique index constraint
   *
   * Returns InvalidTransactionId if there is no conflict, else an xact ID
Index: src/backend/access/nbtree/nbtree.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/nbtree/nbtree.c,v
retrieving revision 1.149
diff -c -r1.149 nbtree.c
*** src/backend/access/nbtree/nbtree.c 10 May 2006 23:18:39 -0000 1.149
--- src/backend/access/nbtree/nbtree.c 9 Aug 2006 18:04:02 -0000
***************
*** 228,233 ****
--- 228,265 ----
  }
  
  /*
+  * btsuggestblock() -- find the best place in the heap to put a new tuple.
+  *
+  * This uses the same logic as btinsert to find the place where the index
+  * tuple would go if this was a btinsert call.
+  *
+  * There's room for improvement here. An insert operation will descend
+  * the tree twice, first by btsuggestblock, then by btinsert. Things
+  * might have changed in between, so that the heap tuple is actually
+  * not inserted in the optimal page, but since this is just an
+  * optimization, it's ok if it happens sometimes.
+  */
+ Datum
+ btsuggestblock(PG_FUNCTION_ARGS)
+ {
+     Relation rel = (Relation) PG_GETARG_POINTER(0);
+     Datum *values = (Datum *) PG_GETARG_POINTER(1);
+     bool *isnull = (bool *) PG_GETARG_POINTER(2);
+     Relation heapRel = (Relation) PG_GETARG_POINTER(3);
+     IndexTuple itup;
+     BlockNumber suggestion;
+ 
+     /* generate an index tuple */
+     itup = index_form_tuple(RelationGetDescr(rel), values, isnull);
+ 
+     suggestion =_bt_suggestblock(rel, itup, heapRel);
+ 
+     pfree(itup);
+ 
+     PG_RETURN_UINT32(suggestion);
+ }
+ 
+ /*
   * btgettuple() -- Get the next tuple in the scan.
   */
  Datum
Index: src/backend/executor/execMain.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/executor/execMain.c,v
retrieving revision 1.277
diff -c -r1.277 execMain.c
*** src/backend/executor/execMain.c 31 Jul 2006 01:16:37 -0000 1.277
--- src/backend/executor/execMain.c 8 Aug 2006 16:17:21 -0000
***************
*** 892,897 ****
--- 892,898 ----
      resultRelInfo->ri_RangeTableIndex = resultRelationIndex;
      resultRelInfo->ri_RelationDesc = resultRelationDesc;
      resultRelInfo->ri_NumIndices = 0;
+     resultRelInfo->ri_ClusterIndex = -1;
      resultRelInfo->ri_IndexRelationDescs = NULL;
      resultRelInfo->ri_IndexRelationInfo = NULL;
      /* make a copy so as not to depend on relcache info not changing... */
***************
*** 1388,1394 ****
      heap_insert(estate->es_into_relation_descriptor, tuple,
                  estate->es_snapshot->curcid,
                  estate->es_into_relation_use_wal,
!                 false); /* never any point in using FSM */
      /* we know there are no indexes to update */
      heap_freetuple(tuple);
      IncrAppended();
--- 1389,1396 ----
      heap_insert(estate->es_into_relation_descriptor, tuple,
                  estate->es_snapshot->curcid,
                  estate->es_into_relation_use_wal,
!                 false, /* never any point in using FSM */
!                 InvalidBlockNumber);
      /* we know there are no indexes to update */
      heap_freetuple(tuple);
      IncrAppended();
***************
*** 1419,1424 ****
--- 1421,1427 ----
      ResultRelInfo *resultRelInfo;
      Relation resultRelationDesc;
      Oid newId;
+     BlockNumber suggestedBlock;
  
      /*
       * get the heap tuple out of the tuple table slot, making sure we have a
***************
*** 1467,1472 ****
--- 1470,1479 ----
      if (resultRelationDesc->rd_att->constr)
          ExecConstraints(resultRelInfo, slot, estate);
  
+     /* Ask the index am of the clustered index for the
+      * best place to put it */
+     suggestedBlock = ExecSuggestBlock(slot, estate);
+ 
      /*
       * insert the tuple
       *
***************
*** 1475,1481 ****
       */
      newId = heap_insert(resultRelationDesc, tuple,
                          estate->es_snapshot->curcid,
!                         true, true);
  
      IncrAppended();
      (estate->es_processed)++;
--- 1482,1488 ----
       */
      newId = heap_insert(resultRelationDesc, tuple,
                          estate->es_snapshot->curcid,
!                         true, true, suggestedBlock);
  
      IncrAppended();
      (estate->es_processed)++;
Index: src/backend/executor/execUtils.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/executor/execUtils.c,v
retrieving revision 1.139
diff -c -r1.139 execUtils.c
*** src/backend/executor/execUtils.c 4 Aug 2006 21:33:36 -0000 1.139
--- src/backend/executor/execUtils.c 9 Aug 2006 18:05:05 -0000
***************
*** 31,36 ****
--- 31,37 ----
   * ExecOpenIndices \
   * ExecCloseIndices | referenced by InitPlan, EndPlan,
   * ExecInsertIndexTuples / ExecInsert, ExecUpdate
+  * ExecSuggestBlock Referenced by ExecInsert
   *
   * RegisterExprContextCallback Register function shutdown callback
   * UnregisterExprContextCallback Deregister function shutdown callback
***************
*** 874,879 ****
--- 875,881 ----
      IndexInfo **indexInfoArray;
  
      resultRelInfo->ri_NumIndices = 0;
+     resultRelInfo->ri_ClusterIndex = -1;
  
      /* fast path if no indexes */
      if (!RelationGetForm(resultRelation)->relhasindex)
***************
*** 913,918 ****
--- 915,925 ----
          /* extract index key information from the index's pg_index info */
          ii = BuildIndexInfo(indexDesc);
  
+         /* Remember which index is the clustered one.
+          * It's used to call the suggestblock-method on inserts */
+         if(indexDesc->rd_index->indisclustered)
+             resultRelInfo->ri_ClusterIndex = i;
+ 
          relationDescs[i] = indexDesc;
          indexInfoArray[i] = ii;
          i++;
***************
*** 1062,1067 ****
--- 1069,1137 ----
      }
  }
  
+ /* ----------------------------------------------------------------
+  * ExecSuggestBlock
+  *
+  * This routine asks the index am where a new heap tuple
+  * should be placed.
+  * ----------------------------------------------------------------
+  */
+ BlockNumber
+ ExecSuggestBlock(TupleTableSlot *slot,
+                  EState *estate)
+ {
+     ResultRelInfo *resultRelInfo;
+     int i;
+     Relation relationDesc;
+     Relation heapRelation;
+     ExprContext *econtext;
+     Datum values[INDEX_MAX_KEYS];
+     bool isnull[INDEX_MAX_KEYS];
+     IndexInfo *indexInfo;
+ 
+     /*
+      * Get information from the result relation info structure.
+      */
+     resultRelInfo = estate->es_result_relation_info;
+     i = resultRelInfo->ri_ClusterIndex;
+     if(i == -1)
+         return InvalidBlockNumber; /* there was no clustered index */
+ 
+     heapRelation = resultRelInfo->ri_RelationDesc;
+     relationDesc = resultRelInfo->ri_IndexRelationDescs[i];
+     indexInfo = resultRelInfo->ri_IndexRelationInfo[i];
+ 
+     /* You can't cluster on a partial index */
+     Assert(indexInfo->ii_Predicate == NIL);
+ 
+     /*
+      * We will use the EState's per-tuple context for evaluating
+      * index expressions (creating it if it's not already there).
+      */
+     econtext = GetPerTupleExprContext(estate);
+ 
+     /* Arrange for econtext's scan tuple to be the tuple under test */
+     econtext->ecxt_scantuple = slot;
+ 
+     /*
+      * FormIndexDatum fills in its values and isnull parameters with the
+      * appropriate values for the column(s) of the index.
+      */
+     FormIndexDatum(indexInfo,
+                    slot,
+                    estate,
+                    values,
+                    isnull);
+ 
+     /*
+      * The index AM does the rest.
+      */
+     return index_suggestblock(relationDesc, /* index relation */
+                               values, /* array of index Datums */
+                               isnull, /* null flags */
+                               heapRelation);
+ }
+ 
  /*
   * UpdateChangedParamSet
   * Add changed parameters to a plan node's chgParam set
Index: src/include/access/genam.h
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/access/genam.h,v
retrieving revision 1.65
diff -c -r1.65 genam.h
*** src/include/access/genam.h 31 Jul 2006 20:09:05 -0000 1.65
--- src/include/access/genam.h 9 Aug 2006 17:53:44 -0000
***************
*** 93,98 ****
--- 93,101 ----
               ItemPointer heap_t_ctid,
               Relation heapRelation,
               bool check_uniqueness);
+ extern BlockNumber index_suggestblock(Relation indexRelation,
+                    Datum *values, bool *isnull,
+                    Relation heapRelation);
  
  extern IndexScanDesc index_beginscan(Relation heapRelation,
                  Relation indexRelation,
***************
*** 123,128 ****
--- 126,133 ----
  extern FmgrInfo *index_getprocinfo(Relation irel, AttrNumber attnum,
                    uint16 procnum);
  
+ extern Datum dummysuggestblock(PG_FUNCTION_ARGS);
+ 
  /*
   * index access method support routines (in genam.c)
   */
Index: src/include/access/heapam.h
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/access/heapam.h,v
retrieving revision 1.114
diff -c -r1.114 heapam.h
*** src/include/access/heapam.h 3 Jul 2006 22:45:39 -0000 1.114
--- src/include/access/heapam.h 8 Aug 2006 16:17:21 -0000
***************
*** 156,162 ****
  extern void setLastTid(const ItemPointer tid);
  
  extern Oid heap_insert(Relation relation, HeapTuple tup, CommandId cid,
!             bool use_wal, bool use_fsm);
  extern HTSU_Result heap_delete(Relation relation, ItemPointer tid,
              ItemPointer ctid, TransactionId *update_xmax,
              CommandId cid, Snapshot crosscheck, bool wait);
--- 156,162 ----
  extern void setLastTid(const ItemPointer tid);
  
  extern Oid heap_insert(Relation relation, HeapTuple tup, CommandId cid,
!             bool use_wal, bool use_fsm, BlockNumber suggestedblk);
  extern HTSU_Result heap_delete(Relation relation, ItemPointer tid,
              ItemPointer ctid, TransactionId *update_xmax,
              CommandId cid, Snapshot crosscheck, bool wait);
Index: src/include/access/hio.h
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/access/hio.h,v
retrieving revision 1.32
diff -c -r1.32 hio.h
*** src/include/access/hio.h 13 Jul 2006 17:47:01 -0000 1.32
--- src/include/access/hio.h 8 Aug 2006 16:17:21 -0000
***************
*** 21,26 ****
  extern void RelationPutHeapTuple(Relation relation, Buffer buffer,
                       HeapTuple tuple);
  extern Buffer RelationGetBufferForTuple(Relation relation, Size len,
!                           Buffer otherBuffer, bool use_fsm);
  
  #endif /* HIO_H */
--- 21,26 ----
  extern void RelationPutHeapTuple(Relation relation, Buffer buffer,
                       HeapTuple tuple);
  extern Buffer RelationGetBufferForTuple(Relation relation, Size len,
!                           Buffer otherBuffer, bool use_fsm, BlockNumber suggestedblk);
  
  #endif /* HIO_H */
Index: src/include/access/nbtree.h
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/access/nbtree.h,v
retrieving revision 1.103
diff -c -r1.103 nbtree.h
*** src/include/access/nbtree.h 7 Aug 2006 16:57:57 -0000 1.103
--- src/include/access/nbtree.h 8 Aug 2006 16:17:21 -0000
***************
*** 467,472 ****
--- 467,473 ----
  extern Datum btbulkdelete(PG_FUNCTION_ARGS);
  extern Datum btvacuumcleanup(PG_FUNCTION_ARGS);
  extern Datum btoptions(PG_FUNCTION_ARGS);
+ extern Datum btsuggestblock(PG_FUNCTION_ARGS);
  
  /*
   * prototypes for functions in nbtinsert.c
***************
*** 476,481 ****
--- 477,484 ----
  extern Buffer _bt_getstackbuf(Relation rel, BTStack stack, int access);
  extern void _bt_insert_parent(Relation rel, Buffer buf, Buffer rbuf,
                    BTStack stack, bool is_root, bool is_only);
+ extern BlockNumber _bt_suggestblock(Relation rel, IndexTuple itup,
+                   Relation heapRel);
  
  /*
   * prototypes for functions in nbtpage.c
Index: src/include/catalog/pg_am.h
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/catalog/pg_am.h,v
retrieving revision 1.46
diff -c -r1.46 pg_am.h
*** src/include/catalog/pg_am.h 31 Jul 2006 20:09:05 -0000 1.46
--- src/include/catalog/pg_am.h 8 Aug 2006 16:17:21 -0000
***************
*** 65,70 ****
--- 65,71 ----
      regproc amvacuumcleanup; /* post-VACUUM cleanup function */
      regproc amcostestimate; /* estimate cost of an indexscan */
      regproc amoptions; /* parse AM-specific parameters */
+     regproc amsuggestblock; /* suggest a block where to put heap tuple */
  } FormData_pg_am;
  
  /* ----------------
***************
*** 78,84 ****
   * compiler constants for pg_am
   * ----------------
   */
! #define Natts_pg_am 23
  #define Anum_pg_am_amname 1
  #define Anum_pg_am_amstrategies 2
  #define Anum_pg_am_amsupport 3
--- 79,85 ----
   * compiler constants for pg_am
   * ----------------
   */
! #define Natts_pg_am 24
  #define Anum_pg_am_amname 1
  #define Anum_pg_am_amstrategies 2
  #define Anum_pg_am_amsupport 3
***************
*** 102,123 ****
  #define Anum_pg_am_amvacuumcleanup 21
  #define Anum_pg_am_amcostestimate 22
  #define Anum_pg_am_amoptions 23
  
  /* ----------------
   * initial contents of pg_am
   * ----------------
   */
  
! DATA(insert OID = 403 ( btree 5 1 1 t t t t f t btinsert btbeginscan btgettuple btgetmulti btrescan btendscan btmarkpos btrestrpos btbuild btbulkdelete btvacuumcleanup btcostestimate btoptions ));
  DESCR("b-tree index access method");
  #define BTREE_AM_OID 403
! DATA(insert OID = 405 ( hash 1 1 0 f f f f f f hashinsert hashbeginscan hashgettuple hashgetmulti hashrescan hashendscan hashmarkpos hashrestrpos hashbuild hashbulkdelete hashvacuumcleanup hashcostestimate hashoptions ));
  DESCR("hash index access method");
  #define HASH_AM_OID 405
! DATA(insert OID = 783 ( gist 100 7 0 f t t t t t gistinsert gistbeginscan gistgettuple gistgetmulti gistrescan gistendscan gistmarkpos gistrestrpos gistbuild gistbulkdelete gistvacuumcleanup gistcostestimate gistoptions ));
  DESCR("GiST index access method");
  #define GIST_AM_OID 783
! DATA(insert OID = 2742 ( gin 100 4 0 f f f f t f gininsert ginbeginscan gingettuple gingetmulti ginrescan ginendscan ginmarkpos ginrestrpos ginbuild ginbulkdelete ginvacuumcleanup gincostestimate ginoptions ));
  DESCR("GIN index access method");
  #define GIN_AM_OID 2742
  
--- 103,125 ----
  #define Anum_pg_am_amvacuumcleanup 21
  #define Anum_pg_am_amcostestimate 22
  #define Anum_pg_am_amoptions 23
+ #define Anum_pg_am_amsuggestblock 24
  
  /* ----------------
   * initial contents of pg_am
   * ----------------
   */
  
! DATA(insert OID = 403 ( btree 5 1 1 t t t t f t btinsert btbeginscan btgettuple btgetmulti btrescan btendscan btmarkpos btrestrpos btbuild btbulkdelete btvacuumcleanup btcostestimate btoptions btsuggestblock));
  DESCR("b-tree index access method");
  #define BTREE_AM_OID 403
! DATA(insert OID = 405 ( hash 1 1 0 f f f f f f hashinsert hashbeginscan hashgettuple hashgetmulti hashrescan hashendscan hashmarkpos hashrestrpos hashbuild hashbulkdelete hashvacuumcleanup hashcostestimate hashoptions dummysuggestblock));
  DESCR("hash index access method");
  #define HASH_AM_OID 405
! DATA(insert OID = 783 ( gist 100 7 0 f t t t t t gistinsert gistbeginscan gistgettuple gistgetmulti gistrescan gistendscan gistmarkpos gistrestrpos gistbuild gistbulkdelete gistvacuumcleanup gistcostestimate gistoptions dummysuggestblock));
  DESCR("GiST index access method");
  #define GIST_AM_OID 783
! DATA(insert OID = 2742 ( gin 100 4 0 f f f f t f gininsert ginbeginscan gingettuple gingetmulti ginrescan ginendscan ginmarkpos ginrestrpos ginbuild ginbulkdelete ginvacuumcleanup gincostestimate ginoptions dummysuggestblock ));
  DESCR("GIN index access method");
  #define GIN_AM_OID 2742
  
Index: src/include/catalog/pg_proc.h
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/catalog/pg_proc.h,v
retrieving revision 1.420
diff -c -r1.420 pg_proc.h
*** src/include/catalog/pg_proc.h 6 Aug 2006 03:53:44 -0000 1.420
--- src/include/catalog/pg_proc.h 9 Aug 2006 18:06:44 -0000
***************
*** 682,687 ****
--- 682,689 ----
  DESCR("btree(internal)");
  DATA(insert OID = 2785 ( btoptions PGNSP PGUID 12 f f t f s 2 17 "1009 16" _null_ _null_ _null_ btoptions - _null_ ));
  DESCR("btree(internal)");
+ DATA(insert OID = 2852 ( btsuggestblock PGNSP PGUID 12 f f t f v 4 23 "2281 2281 2281 2281" _null_ _null_ _null_ btsuggestblock - _null_ ));
+ DESCR("btree(internal)");
  
  DATA(insert OID = 339 ( poly_same PGNSP PGUID 12 f f t f i 2 16 "604 604" _null_ _null_ _null_ poly_same - _null_));
  DESCR("same as?");
***************
*** 3936,3941 ****
--- 3938,3946 ----
  DATA(insert OID = 2749 ( arraycontained PGNSP PGUID 12 f f t f i 2 16 "2277 2277" _null_ _null_ _null_ arraycontained - _null_ ));
  DESCR("anyarray contained");
  
+ DATA(insert OID = 2853 ( dummysuggestblock PGNSP PGUID 12 f f t f v 4 23 "2281 2281 2281 2281" _null_ _null_ _null_ dummysuggestblock - _null_ ));
+ DESCR("dummy amsuggestblock implementation (internal)");
+ 
  /*
   * Symbolic values for provolatile column: these indicate whether the result
   * of a function is dependent *only* on the values of its explicit arguments,
Index: src/include/executor/executor.h
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/executor/executor.h,v
retrieving revision 1.128
diff -c -r1.128 executor.h
*** src/include/executor/executor.h 4 Aug 2006 21:33:36 -0000 1.128
--- src/include/executor/executor.h 8 Aug 2006 16:17:21 -0000
***************
*** 271,276 ****
--- 271,277 ----
  extern void ExecCloseIndices(ResultRelInfo *resultRelInfo);
  extern void ExecInsertIndexTuples(TupleTableSlot *slot, ItemPointer tupleid,
                        EState *estate, bool is_vacuum);
+ extern BlockNumber ExecSuggestBlock(TupleTableSlot *slot, EState *estate);
  
  extern void RegisterExprContextCallback(ExprContext *econtext,
                              ExprContextCallbackFunction function,
Index: src/include/nodes/execnodes.h
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/nodes/execnodes.h,v
retrieving revision 1.158
diff -c -r1.158 execnodes.h
*** src/include/nodes/execnodes.h 4 Aug 2006 21:33:36 -0000 1.158
--- src/include/nodes/execnodes.h 8 Aug 2006 16:17:21 -0000
***************
*** 257,262 ****
--- 257,264 ----
   * NumIndices # of indices existing on result relation
   * IndexRelationDescs array of relation descriptors for indices
   * IndexRelationInfo array of key/attr info for indices
+  * ClusterIndex index to the IndexRelationInfo array of the
+  * clustered index, or -1 if there's none
   * TrigDesc triggers to be fired, if any
   * TrigFunctions cached lookup info for trigger functions
   * TrigInstrument optional runtime measurements for triggers
***************
*** 272,277 ****
--- 274,280 ----
      int ri_NumIndices;
      RelationPtr ri_IndexRelationDescs;
      IndexInfo **ri_IndexRelationInfo;
+     int ri_ClusterIndex;
      TriggerDesc *ri_TrigDesc;
      FmgrInfo *ri_TrigFunctions;
      struct Instrumentation *ri_TrigInstrument;
Index: src/include/utils/rel.h
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/utils/rel.h,v
retrieving revision 1.91
diff -c -r1.91 rel.h
*** src/include/utils/rel.h 3 Jul 2006 22:45:41 -0000 1.91
--- src/include/utils/rel.h 8 Aug 2006 16:17:21 -0000
***************
*** 116,121 ****
--- 116,122 ----
      FmgrInfo amvacuumcleanup;
      FmgrInfo amcostestimate;
      FmgrInfo amoptions;
+     FmgrInfo amsuggestblock;
  } RelationAmInfo;
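
For readers who want the gist of _bt_suggestblock without the btree machinery, the lookup can be modelled on a sorted array. This is a simplified sketch and not part of the patch: ToyIndexEntry, toy_suggest_block and TOY_INVALID_BLOCK are hypothetical names, and a single sorted array stands in for the whole index, so there is no page descent, locking or empty-page case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for InvalidBlockNumber. */
#define TOY_INVALID_BLOCK UINT32_MAX

/* One index entry: a key and the heap block its tuple lives on. */
typedef struct ToyIndexEntry
{
    int      key;
    uint32_t heap_block;
} ToyIndexEntry;

/*
 * Suggest a heap block for a new tuple with the given key, based on the
 * entry that would follow it in index order.  `entries` must be sorted
 * by key in ascending order.
 */
uint32_t
toy_suggest_block(const ToyIndexEntry *entries, size_t n, int key)
{
    size_t lo = 0;
    size_t hi = n;

    /* Empty index: no suggestion. */
    if (n == 0)
        return TOY_INVALID_BLOCK;

    assert(entries != NULL);

    /* Binary search for the first entry whose key is >= the new key. */
    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (entries[mid].key < key)
            lo = mid + 1;
        else
            hi = mid;
    }

    /* Past the end: step back to the last entry, as the patch does. */
    if (lo == n)
        lo = n - 1;

    return entries[lo].heap_block;
}
```

With entries (10, block 7), (20, block 7), (30, block 9), a new key 25 is steered to block 9 and a new key 99 falls back to the last entry, also block 9.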