Index: doc/src/sgml/indices.sgml =================================================================== RCS file: /projects/cvsroot/pgsql/doc/src/sgml/indices.sgml,v retrieving revision 1.57 diff -c -p -c -r1.57 indices.sgml *** doc/src/sgml/indices.sgml 24 May 2006 11:01:39 -0000 1.57 --- doc/src/sgml/indices.sgml 13 Jul 2006 17:25:24 -0000 *************** SELECT am.amname AS index_method, *** 824,829 **** --- 824,911 ---- + + Operational Issues Related to Indexes + + + index + operational issues + + + + For large tables or for index types that require extensive work to create + creating indexes can be a long operation. In the case of large data + warehousing applications it can easily take hours or even days to build + indexes. It's important to understand the impact creating indexes has on a + system. + + + + Normally Postgres locks the table and performs the entire + index build in a single operation. Other transactions can still read the + table, but if they try to update, insert, or delete from the table they + will block until the index build is finished. This is the fastest way to + build an index. + + + Postgres does support building indexes without locking the + table using the ONLINE option to CREATE + INDEX: + + + CREATE INDEX test_index ON test_table ONLINE; + + + + + When the ONLINE option is used Postgres must perform an + additional step to build the index. As a result the index build will take + longer, sometimes as much as twice as long. This is in addition to the + added time it will take due to the other load from queries executing in + parallel. Online builds of all index types which allow concurrent access are + supported. This includes all standard index types included with Postgres. + + + + In addition to requiring extra processing, before beginning the extra step + Postgres must wait until other transactions accessing the + table being indexed have finished. Depending on the types of work being run + on your database this step could take arbitrarily long. Long running + queries which update, insert or delete from the table will prevent the + build from continuing until they commit or roll back. Even a busy system with + many short transactions reading from the table may make it hard for the + index build to proceed. + + + + If a problem arises during this second phase such as a unique constraint + violation an invalid index will remain. This index will be + ignored by Postgres when executing queries because the + index build operation did not complete correctly. It must be either dropped + with DROP INDEX or repaired with REINDEX INDEX. + Note that Postgres does not support online reindexes so to + avoid downtime it is necessary to drop and redefine the index. + + + + Online builds of unique indexes are supported however there are some + caveats. As mentioned above constraint violations may sometimes not be + detected until after the index is created resulting in an invalid index. + Also the constraint begins being enforced for other transactions as soon as + the invalid index is created, ie, before the second phase begins. This + could result in constraint violations being detected in other queries prior + to the index becoming available for use or even in cases where the index + build eventually fails. + + + + Online builds of expression indexes and partial indexes are supported. + Errors occurring in the evaluation of these expressions could cause + behaviour similar to that described above for unique constraint violations. + + + + Examining Index Usage Index: doc/src/sgml/ref/create_index.sgml =================================================================== RCS file: /projects/cvsroot/pgsql/doc/src/sgml/ref/create_index.sgml,v retrieving revision 1.55 diff -c -p -c -r1.55 create_index.sgml *** doc/src/sgml/ref/create_index.sgml 11 Jul 2006 21:05:57 -0000 1.55 --- doc/src/sgml/ref/create_index.sgml 13 Jul 2006 17:25:24 -0000 *************** CREATE [ UNIQUE ] INDEX column | ( expression ) } [ opclass ] [, ...] ) [ WITH ( storage_parameter = value [, ... ] ) ] [ TABLESPACE tablespace ] + [ ONLINE] [ WHERE predicate ] *************** CREATE [ UNIQUE ] INDEX + When the ONLINE clause is present Postgres + will build the index without taking any locks that prevent concurrent DML. + This allows a new index to be created in a production system without + interfering with regular operation, though the added I/O load will still + of course have some impact. Additionally, to achieve this + Postgres must do extra work resulting in the index build + taking substantially longer than a regular index build. There are other + caveats for online index builds, especially if an error such as a unique + constraint violation occurs, that should be understood when using it. See + the chapter of the Postgres Manual titled + Indexes for more information. + + + When the WHERE clause is present, a partial index is created. A partial index is an index that contains entries for only a portion of Index: doc/src/sgml/ref/reindex.sgml =================================================================== RCS file: /projects/cvsroot/pgsql/doc/src/sgml/ref/reindex.sgml,v retrieving revision 1.31 diff -c -p -c -r1.31 reindex.sgml *** doc/src/sgml/ref/reindex.sgml 4 Jul 2006 18:07:24 -0000 1.31 --- doc/src/sgml/ref/reindex.sgml 13 Jul 2006 17:25:24 -0000 *************** REINDEX { INDEX | TABLE | DATABASE | SYS *** 61,66 **** --- 61,78 ---- for an index, and wish to ensure that the change has taken full effect. + + + + An index build with the ONLINE option failed resulting in an + invalid index. Such indexes are useless but it can be + convenient to use REINDEX to rebuild them. Note that the + REINDEX will not be an online build. To build the index + without interfering with production you should drop and reissue the + CREATE INDEX with the ONLINE option. + + + Index: src/backend/bootstrap/bootparse.y =================================================================== RCS file: /projects/cvsroot/pgsql/src/backend/bootstrap/bootparse.y,v retrieving revision 1.82 diff -c -p -c -r1.82 bootparse.y *** src/backend/bootstrap/bootparse.y 3 Jul 2006 22:45:37 -0000 1.82 --- src/backend/bootstrap/bootparse.y 13 Jul 2006 17:25:24 -0000 *************** Boot_DeclareIndexStmt: *** 255,261 **** $10, NULL, NIL, NIL, false, false, false, ! false, false, true, false); do_end(); } ; --- 255,261 ---- $10, NULL, NIL, NIL, false, false, false, ! false, false, true, false, false); do_end(); } ; *************** Boot_DeclareUniqueIndexStmt: *** 273,279 **** $11, NULL, NIL, NIL, true, false, false, ! false, false, true, false); do_end(); } ; --- 273,279 ---- $11, NULL, NIL, NIL, true, false, false, ! false, false, true, false, false); do_end(); } ; Index: src/backend/catalog/index.c =================================================================== RCS file: /projects/cvsroot/pgsql/src/backend/catalog/index.c,v retrieving revision 1.268 diff -c -p -c -r1.268 index.c *** src/backend/catalog/index.c 3 Jul 2006 22:45:37 -0000 1.268 --- src/backend/catalog/index.c 13 Jul 2006 17:25:25 -0000 *************** *** 32,43 **** --- 32,45 ---- #include "catalog/index.h" #include "catalog/indexing.h" #include "catalog/pg_constraint.h" + #include "catalog/pg_operator.h" #include "catalog/pg_opclass.h" #include "catalog/pg_type.h" #include "executor/executor.h" #include "miscadmin.h" #include "optimizer/clauses.h" #include "parser/parse_expr.h" + #include "storage/proc.h" #include "storage/procarray.h" #include "storage/smgr.h" #include "utils/builtins.h" *************** *** 47,52 **** --- 49,55 ---- #include "utils/memutils.h" #include "utils/relcache.h" #include "utils/syscache.h" + #include "utils/tuplesort.h" /* non-export function prototypes */ *************** static void AppendAttributeTuples(Relati *** 59,65 **** static void UpdateIndexRelation(Oid indexoid, Oid heapoid, IndexInfo *indexInfo, Oid *classOids, ! bool primary); static void index_update_stats(Relation rel, bool hasindex, bool isprimary, Oid reltoastidxid, double reltuples); static Oid IndexGetRelation(Oid indexId); --- 62,69 ---- static void UpdateIndexRelation(Oid indexoid, Oid heapoid, IndexInfo *indexInfo, Oid *classOids, ! bool primary, ! bool isvalid); static void index_update_stats(Relation rel, bool hasindex, bool isprimary, Oid reltoastidxid, double reltuples); static Oid IndexGetRelation(Oid indexId); *************** UpdateIndexRelation(Oid indexoid, *** 313,319 **** Oid heapoid, IndexInfo *indexInfo, Oid *classOids, ! bool primary) { int2vector *indkey; oidvector *indclass; --- 317,324 ---- Oid heapoid, IndexInfo *indexInfo, Oid *classOids, ! bool primary, ! bool isvalid) { int2vector *indkey; oidvector *indclass; *************** UpdateIndexRelation(Oid indexoid, *** 381,386 **** --- 386,392 ---- values[Anum_pg_index_indisunique - 1] = BoolGetDatum(indexInfo->ii_Unique); values[Anum_pg_index_indisprimary - 1] = BoolGetDatum(primary); values[Anum_pg_index_indisclustered - 1] = BoolGetDatum(false); + values[Anum_pg_index_indisvalid - 1] = BoolGetDatum(isvalid); values[Anum_pg_index_indkey - 1] = PointerGetDatum(indkey); values[Anum_pg_index_indclass - 1] = PointerGetDatum(indclass); values[Anum_pg_index_indexprs - 1] = exprsDatum; *************** UpdateIndexRelation(Oid indexoid, *** 426,432 **** * isconstraint: index is owned by a PRIMARY KEY or UNIQUE constraint * allow_system_table_mods: allow table to be a system catalog * skip_build: true to skip the index_build() step for the moment; caller ! * must do it later (typically via reindex_index()) * * Returns OID of the created index. */ --- 432,439 ---- * isconstraint: index is owned by a PRIMARY KEY or UNIQUE constraint * allow_system_table_mods: allow table to be a system catalog * skip_build: true to skip the index_build() step for the moment; caller ! * must do it later (typically via reindex_index()) ! * online: Avoid taking locks blocking writers to the table while building. * * Returns OID of the created index. */ *************** index_create(Oid heapRelationId, *** 443,449 **** bool istoast, bool isconstraint, bool allow_system_table_mods, ! bool skip_build) { Relation pg_class; Relation heapRelation; --- 450,457 ---- bool istoast, bool isconstraint, bool allow_system_table_mods, ! bool skip_build, ! bool online) { Relation pg_class; Relation heapRelation; *************** index_create(Oid heapRelationId, *** 456,464 **** pg_class = heap_open(RelationRelationId, RowExclusiveLock); /* ! * Only SELECT ... FOR UPDATE/SHARE are allowed while doing this ! */ ! heapRelation = heap_open(heapRelationId, ShareLock); /* * The index will be in the same namespace as its parent table, and is --- 464,474 ---- pg_class = heap_open(RelationRelationId, RowExclusiveLock); /* ! * Only SELECT ... FOR UPDATE/SHARE are allowed while doing this unless ! * we're doing an online index build in which case we block vacuum but not ! * UPDATE/INSERT/DELETE ! */ ! heapRelation = heap_open(heapRelationId, (online ? ShareUpdateExclusiveLock : ShareLock)); /* * The index will be in the same namespace as its parent table, and is *************** index_create(Oid heapRelationId, *** 582,588 **** * ---------------- */ UpdateIndexRelation(indexRelationId, heapRelationId, indexInfo, ! classObjectId, isprimary); /* * Register constraint and dependencies for the index. --- 592,598 ---- * ---------------- */ UpdateIndexRelation(indexRelationId, heapRelationId, indexInfo, ! classObjectId, isprimary, !online); /* * Register constraint and dependencies for the index. *************** IndexBuildHeapScan(Relation heapRelation *** 1420,1457 **** break; case HEAPTUPLE_INSERT_IN_PROGRESS: ! /* ! * Since caller should hold ShareLock or better, we should ! * not see any tuples inserted by open transactions --- ! * unless it's our own transaction. (Consider INSERT ! * followed by CREATE INDEX within a transaction.) An ! * exception occurs when reindexing a system catalog, ! * because we often release lock on system catalogs before ! * committing. */ - if (!TransactionIdIsCurrentTransactionId( - HeapTupleHeaderGetXmin(heapTuple->t_data)) - && !IsSystemRelation(heapRelation)) - elog(ERROR, "concurrent insert in progress"); indexIt = true; tupleIsAlive = true; break; case HEAPTUPLE_DELETE_IN_PROGRESS: ! ! /* ! * Since caller should hold ShareLock or better, we should ! * not see any tuples deleted by open transactions --- ! * unless it's our own transaction. (Consider DELETE ! * followed by CREATE INDEX within a transaction.) An ! * exception occurs when reindexing a system catalog, ! * because we often release lock on system catalogs before ! * committing. ! */ ! Assert(!(heapTuple->t_data->t_infomask & HEAP_XMAX_IS_MULTI)); ! if (!TransactionIdIsCurrentTransactionId( ! HeapTupleHeaderGetXmax(heapTuple->t_data)) ! && !IsSystemRelation(heapRelation)) ! elog(ERROR, "concurrent delete in progress"); indexIt = true; tupleIsAlive = false; break; --- 1430,1447 ---- break; case HEAPTUPLE_INSERT_IN_PROGRESS: ! /* We used to insist the caller held ShareLock or better ! * but we no longer do. Online index builds may well get ! * here holding a weaker lock allowing concurrent updates. ! * In that case we could index or not index the tuple but ! * we always prefer to index it to aim for a better quality ! * index structure. */ indexIt = true; tupleIsAlive = true; break; case HEAPTUPLE_DELETE_IN_PROGRESS: ! /* See INSERT_IN_PROGRESS case above */ indexIt = true; tupleIsAlive = false; break; *************** reindex_index(Oid indexId) *** 1632,1637 **** --- 1622,1948 ---- heap_close(heapRelation, NoLock); } + + + /* State and callback function for online index builds. + * + * The strategy for online index builds is to build the index regularly + * recognizing that some tuples may be missing from the index because they were + * inserted by a concurrent transaction after the index build had passed the + * page they were inserted into. At this point the index is marked as not being + * usable for query plans. + * + * Then validate_index is called on the index to insert any missing index + * tuples. It gathers up all the existing TIDs from the index using bulkdelete + * (hence validate_index_callback_from_indexscan which merely stores the TIDs + * in a tuple store and returns false). Sorts them, and does a kind of "merge" + * against the heap to find any missing tuples. We can't reuse + * IndexBuildHeapScan for this purpose since we need to exclude + * INSERT_IN_PROGRESS tuples and in any case we want to avoid executing + * expressions or predicates for records already in the index. + * + * This is a brute force strategy. We could try to be clever storing new tuples + * in a special area of the table (such making the table append-only by setting + * use_fsm) or by using clever visibility checks and probing the index instead + * of scanning the whole thing. However most of these strategies would be + * impossible or at least very hard to support for arbitrary index methods. + * They would also be more fragile and harder to protect against duplicate or + * missing index entries. This strategy is nicely robust. + */ + + struct v_i_state + { + Tuplesortstate *tuplesort; + ItemPointer cursor; + bool finished; + Relation heapRel; + int itups, htups, tups_inserted; + }; + + /* This is called from bulkdelete to gather the tids in a tuplesort */ + static bool + validate_index_callback_from_indexscan(ItemPointer itemptr, void *opaque) + { + struct v_i_state *state = opaque; + tuplesort_putdatum(state->tuplesort, PointerGetDatum(itemptr), false); + state->itups++; + return false; /* of course we're not actually deleting anything */ + } + + /* validate_index_heapscan - This is mostly copied from IndexBuildHeapScan but + * the expression executions and predicate tests are postponed until after the + * tid check. Also we need to be able to distinguish INSERT_IN_PROGRESS from + * other cases + * + * The only tricky bit is that caller (ie validate_index) *MUST* ensure that at + * the time this scan begins no in progress transactions are unaware of the new + * index. We must take care here not duplicate inserts so we ignore + * INSERT_IN_PROGRESS tuples since we can safely assume they will be indexed by + * the inserted. + */ + + static void + validate_index_heapscan(Relation heapRelation, + Relation indexRelation, + IndexInfo *indexInfo, + struct v_i_state *state) + { + HeapScanDesc scan; + HeapTuple heapTuple; + Datum values[INDEX_MAX_KEYS]; + bool isnull[INDEX_MAX_KEYS]; + List *predicate; + TupleTableSlot *slot; + EState *estate; + ExprContext *econtext; + TransactionId OldestXmin; + /* state variables used to merge */ + ItemPointer heapcursor, indexcursor = NULL; + bool tuplesort_empty = false; + + /* + * sanity checks + */ + Assert(OidIsValid(indexRelation->rd_rel->relam)); + + /* + * Need an EState for evaluation of index expressions and partial-index + * predicates. Also a slot to hold the current tuple. + */ + estate = CreateExecutorState(); + econtext = GetPerTupleExprContext(estate); + slot = MakeSingleTupleTableSlot(RelationGetDescr(heapRelation)); + + /* Arrange for econtext's scan tuple to be the tuple under test */ + econtext->ecxt_scantuple = slot; + + /* Set up execution state for predicate, if any. */ + predicate = (List *) + ExecPrepareExpr((Expr *) indexInfo->ii_Predicate, + estate); + + /* + * Ok, begin our scan of the base relation. We use SnapshotAny because we + * must retrieve all tuples and do our own time qual checks. + */ + OldestXmin = GetOldestXmin(heapRelation->rd_rel->relisshared); + + scan = heap_beginscan(heapRelation, /* relation */ + SnapshotAny, /* seeself */ + 0, /* number of keys */ + NULL); /* scan key */ + + /* + * Scan all tuples in the base relation. + */ + while ((heapTuple = heap_getnext(scan, ForwardScanDirection)) != NULL) + { + bool indexIt; + CHECK_FOR_INTERRUPTS(); + + /* + * We could possibly get away with not locking the buffer here, + * since caller should hold ShareLock on the relation, but let's + * be conservative about it. + * + * XXX Is that still true for validate_index? + */ + LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE); + switch (HeapTupleSatisfiesVacuum(heapTuple->t_data, OldestXmin, + scan->rs_cbuf)) + { + /* If the tuple is inserted by a transaction that could still be running + * it's important that we ignore the heap tuple. The transaction will + * definitely index it itself (that's why we waited before starting this + * process for all transactions to see the index) and we may have caught it + * between inserting the heap tuple and the index entry + */ + case HEAPTUPLE_DEAD: + case HEAPTUPLE_INSERT_IN_PROGRESS: + case HEAPTUPLE_DELETE_IN_PROGRESS: + /* connot continue here because we must release the buffer lock */ + indexIt = false; + break; + + /* Normal case, we have to make sure the tuple is in the index */ + case HEAPTUPLE_LIVE: + case HEAPTUPLE_RECENTLY_DEAD: + indexIt = true; + break; + + default: + indexIt = false; /* shut the compiler up */ + elog(ERROR, "unexpected HeapTupleSatisfiesVacuum result"); + } + LockBuffer(scan->rs_cbuf, BUFFER_LOCK_UNLOCK); + + if (!indexIt) + continue; + + state->htups++; + + /* "merge" by skipping through the index tuples until we get to (or past) + * the current heap tuple. + */ + + heapcursor = &heapTuple->t_self;; + + while (!tuplesort_empty && (!indexcursor || tid_cmp_internal(indexcursor, heapcursor) < 0)) { + bool ts_isnull; + if (indexcursor) + pfree(indexcursor); + tuplesort_empty = !tuplesort_getdatum(state->tuplesort, true, (Datum*)&indexcursor, &ts_isnull); + Assert(tuplesort_empty || !ts_isnull); + } + + MemoryContextReset(econtext->ecxt_per_tuple_memory); + + if (tuplesort_empty || tid_cmp_internal(indexcursor, heapcursor) > 0) { + /* we've overshot which means this heap tuple is missing from the index + * + * We've still postponed predicate checks until here, and expression + * calcualtions */ + + ExecStoreTuple(heapTuple, slot, InvalidBuffer, false); + if (predicate != NIL && !ExecQual(predicate, econtext, false)) + continue; + + /* + * For the current heap tuple, extract all the attributes we use in + * this index, and note which are null. This also performs evaluation + * of any expressions needed. + */ + FormIndexDatum(indexInfo, + slot, + estate, + values, + isnull); + + /* If the unique constraint fails then we throw an error. Currently + * this means the DBA has to manually drop the invalid index but we + * do have a solution for this to be implemented in the future */ + index_insert(indexRelation, + values, + isnull, + heapcursor, + heapRelation, + indexInfo->ii_Unique); + state->tups_inserted++; + } + } + + heap_endscan(scan); + + ExecDropSingleTupleTableSlot(slot); + + FreeExecutorState(estate); + + /* These may have been pointing to the now-gone estate */ + indexInfo->ii_ExpressionsState = NIL; + indexInfo->ii_PredicateState = NIL; + } + + void + validate_index(Oid IndexId) + { + Oid heapId; + Relation heapRelation, indexRelation, pg_index; + IndexInfo *indexInfo; + HeapTuple tuple; + Form_pg_index index; + struct v_i_state state; + IndexVacuumInfo ivinfo; + IndexBulkDeleteResult stats; + + pg_index = heap_open(IndexRelationId, RowExclusiveLock); + + tuple = SearchSysCacheCopy(INDEXRELID, + ObjectIdGetDatum(IndexId), + 0, 0, 0); + if (!HeapTupleIsValid(tuple)) + elog(ERROR, "cache lookup failed for index %u", IndexId); + index = (Form_pg_index) GETSTRUCT(tuple); + Assert(index->indexrelid = IndexId); + + /* Open and lock the parent heap relation. */ + heapId = IndexGetRelation(IndexId); + heapRelation = heap_open(heapId, ShareUpdateExclusiveLock); /* Allow concurrent DML but Vacuum so we can do bulkdelete */ + /* And the target index relation */ + indexRelation = index_open(IndexId); + LockRelation(indexRelation, AccessShareLock); + + /* Check this after we open the parent heap so we don't get two + * validate_indexes running concurrently. It wouldn't cause any damage just + * be pointless */ + if (index->indisvalid) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("index does not need validation"))); + + + /* Fetch info needed for index_build */ + indexInfo = BuildIndexInfo(indexRelation); + + ivinfo.index = indexRelation; + ivinfo.vacuum_full = false; + ivinfo.message_level = DEBUG2; + ivinfo.num_heap_tuples = -1; + + /* We have to sort the tids to "merge" them with the heap.*/ + state.tuplesort = tuplesort_begin_datum(TIDOID, TID_LT_OP, maintenance_work_mem, false); + state.itups = 0; + /* Now gather up all the tids in the index */ + index_bulk_delete(&ivinfo, &stats, validate_index_callback_from_indexscan, &state); + + /* And we're off */ + tuplesort_performsort(state.tuplesort); + + /* Now scan the heap and "merge" it with the index + * + * We don't do a real symmetric "merge", instead we read in all the heap + * pages and pull out index tuples until we find the matching tid. If we + * don't find one before we get past the tid we're looking for then it's + * missing. + * + * We reuse the same state object -- why not + */ + + state.htups = state.tups_inserted = 0; + + validate_index_heapscan(heapRelation, + indexRelation, + indexInfo, + &state); + + tuplesort_end(state.tuplesort); + + ereport(NOTICE, + (errmsg("Found %d heap tuples, %d index tuples. Inserted %d missing tuples", + state.htups, state.itups, state.tups_inserted))); + + /* Index is now valid -- update pg_index and catalog index */ + index->indisvalid = true; + simple_heap_update(pg_index, &tuple->t_self, tuple); + CatalogUpdateIndexes(pg_index, tuple); + + /* Make the new index using by queries in our transaction. This ought not + * be necessary if we're wrapped up in a single command */ + CommandCounterIncrement(); + + /* I don't think this is necessary, the index is always listed in the + * relcache anyways and is just ignored by plancat which is reconstructed + * on each command + */ + CacheInvalidateRelcache(heapRelation); + + /* Close rels, but keep locks */ + index_close(indexRelation); + heap_close(heapRelation, NoLock); + heap_close(pg_index, NoLock); + } + + + /* * reindex_relation - This routine is used to recreate all indexes * of a relation (and optionally its toast relation too, if any). Index: src/backend/commands/indexcmds.c =================================================================== RCS file: /projects/cvsroot/pgsql/src/backend/commands/indexcmds.c,v retrieving revision 1.144 diff -c -p -c -r1.144 indexcmds.c *** src/backend/commands/indexcmds.c 11 Jul 2006 17:04:13 -0000 1.144 --- src/backend/commands/indexcmds.c 13 Jul 2006 17:25:25 -0000 *************** static bool relationHasPrimaryKey(Relati *** 84,89 **** --- 84,90 ---- * 'skip_build': make the catalog entries but leave the index file empty; * it will be filled later. * 'quiet': suppress the NOTICE chatter ordinarily provided for constraints. + * 'online': Avoid taking locks blocking writers to the table while building. */ void DefineIndex(RangeVar *heapRelation, *************** DefineIndex(RangeVar *heapRelation, *** 101,107 **** bool is_alter_table, bool check_rights, bool skip_build, ! bool quiet) { Oid *classObjectId; Oid accessMethodId; --- 102,109 ---- bool is_alter_table, bool check_rights, bool skip_build, ! bool quiet, ! bool online) { Oid *classObjectId; Oid accessMethodId; *************** DefineIndex(RangeVar *heapRelation, *** 115,120 **** --- 117,123 ---- Datum reloptions; IndexInfo *indexInfo; int numberOfAttributes; + HeapTuple indexTuple; /* * count attributes in index *************** DefineIndex(RangeVar *heapRelation, *** 132,139 **** /* * Open heap relation, acquire a suitable lock on it, remember its OID ! */ ! rel = heap_openrv(heapRelation, ShareLock); /* Note: during bootstrap may see uncataloged relation */ if (rel->rd_rel->relkind != RELKIND_RELATION && --- 135,146 ---- /* * Open heap relation, acquire a suitable lock on it, remember its OID ! * ! * Only SELECT ... FOR UPDATE/SHARE are allowed while doing this unless ! * we're doing an online index build in which case we block vacuum but not ! * UPDATE/INSERT/DELETE ! */ ! rel = heap_openrv(heapRelation, (online ? ShareUpdateExclusiveLock : ShareLock)); /* Note: during bootstrap may see uncataloged relation */ if (rel->rd_rel->relkind != RELKIND_RELATION && *************** DefineIndex(RangeVar *heapRelation, *** 409,418 **** primary ? "PRIMARY KEY" : "UNIQUE", indexRelationName, RelationGetRelationName(rel)))); ! index_create(relationId, indexRelationName, indexRelationId, indexInfo, accessMethodId, tablespaceId, classObjectId, reloptions, primary, false, isconstraint, ! allowSystemTableMods, skip_build); } --- 416,542 ---- primary ? "PRIMARY KEY" : "UNIQUE", indexRelationName, RelationGetRelationName(rel)))); ! indexRelationId = ! index_create(relationId, indexRelationName, indexRelationId, indexInfo, accessMethodId, tablespaceId, classObjectId, reloptions, primary, false, isconstraint, ! allowSystemTableMods, skip_build, online); ! ! if (!online) { ! return; ! } ! ! /* Phase 2 of online index build */ ! ! CommitTransactionCommand(); ! StartTransactionCommand(); ! ! #if 0 ! /* If we fail at this point do some fancy footwork to drop the table */ ! PG_TRY(); ! #endif ! { ! unsigned sleeps; ! ! /* we need to do this so we can get the latest snapshots below without ! * triggering an error */ ! GetTransactionSnapshot(); ! ! /* We're ready to go ahead when no running transaction could have the table ! * open with the old list of indexes. If we can take an exclusive lock then ! * there are none and anybody who opens it now will get the new index in ! * their relcache entry. Alternatively if our Xmin reaches our own ! * transaction then we know no running transactions that started before us ! * are left anyways. ! */ ! CHECK_FOR_INTERRUPTS(); ! for (sleeps=0; /* until break */ ; sleeps++ ) { ! Relation rel = RelationIdGetRelation(relationId); ! if (!RelationIsValid(rel)) ! ereport(ERROR, ! (errcode(ERRCODE_INTERNAL_ERROR), /* XXX */ ! errmsg("online index build failed due to concurrent DROP TABLE"))); ! ! if (ConditionalLockRelation(rel, ExclusiveLock)) { ! UnlockRelation(rel,ExclusiveLock); ! RelationClose(rel); ! break; ! } ! RelationClose(rel); ! ! if (TransactionIdEquals(GetLatestSnapshot()->xmin, GetTopTransactionId())) ! break; ! ! if (!sleeps) ! ereport(NOTICE, ! (errmsg("waiting for new index to be visible to other transactions"))); ! ! ! /* .5s */ ! pg_usleep(500000L); ! CHECK_FOR_INTERRUPTS(); ! } ! ! /* We have to protect against someone jumping in the window between ! * transactions and changing our index or table structure. The only piece ! * of state we care about here is the OID of the index however. Make sure ! * it hasn't been dropped (or worse, though extremely unlikley, replaced ! * with some other index entirely). Make sure it's still the index we want ! * with some belt-and-suspenders checks that probably can't ever happen. ! * Most other DDL either can work just fine on invalid indexes and ! * validate_index will work fine or else it's their responsibility to ! * detect and fail if they're run on an invalid index. ! */ ! ! indexTuple = SearchSysCache(INDEXRELID, ! ObjectIdGetDatum(indexRelationId), ! 0, 0, 0); ! if (!HeapTupleIsValid(indexTuple)) ! ereport(ERROR, ! (errcode(ERRCODE_INTERNAL_ERROR), /* XXX */ ! errmsg("online index build failed due to concurrent DROP INDEX"))); ! { ! Form_pg_index result = (Form_pg_index) GETSTRUCT(indexTuple); ! if (result->indrelid != relationId || result->indisvalid) ! ereport(ERROR, ! (errcode(ERRCODE_INTERNAL_ERROR), /* XXX */ ! errmsg("online index build failed due to concurrent DDL"))); ! } ! ReleaseSysCache(indexTuple); ! ! /* functions in indexes may want a snapshot set */ ! ActiveSnapshot = CopySnapshot(GetTransactionSnapshot()); ! ! validate_index(indexRelationId); ! ! CommitTransactionCommand(); ! } ! ! #if 0 ! /* Some tedious fancy footwork to drop the index if a unique constraint or ! * other problem arises */ ! PG_CATCH(); ! { ! 1 Copy error context (in portal memory context?) ! 2 abort the transaction ! 3 start new transaction ! ! ObjectAddress object; ! object.classId = RelationRelationId; ! object.objectId = indexRelationId; ! object.objectSubId = 0; ! performDeletion(&object, behavior); ! ! 5 commit ! 6 start new transaction ! 7 restore error context ! 8 throw said error ! ! PG_RE_THROW(); ! } ! #endif ! ! StartTransactionCommand(); } *************** ReindexIndex(RangeVar *indexRelation) *** 940,946 **** { Oid indOid; HeapTuple tuple; ! indOid = RangeVarGetRelid(indexRelation, false); tuple = SearchSysCache(RELOID, ObjectIdGetDatum(indOid), --- 1064,1073 ---- { Oid indOid; HeapTuple tuple; ! Relation pg_index; ! HeapTuple pg_index_tuple; ! Form_pg_index index; ! indOid = RangeVarGetRelid(indexRelation, false); tuple = SearchSysCache(RELOID, ObjectIdGetDatum(indOid), *************** ReindexIndex(RangeVar *indexRelation) *** 953,964 **** (errcode(ERRCODE_WRONG_OBJECT_TYPE), errmsg("\"%s\" is not an index", indexRelation->relname))); ! /* Check permissions */ if (!pg_class_ownercheck(indOid, GetUserId())) aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS, indexRelation->relname); ReleaseSysCache(tuple); reindex_index(indOid); --- 1080,1111 ---- (errcode(ERRCODE_WRONG_OBJECT_TYPE), errmsg("\"%s\" is not an index", indexRelation->relname))); ! /* Check permissions */ if (!pg_class_ownercheck(indOid, GetUserId())) aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS, indexRelation->relname); + + /* Check if this is an invalid index left over from an online create. This + * gives DBAs two outs if ithey're create fails. They can drop the table or + * they can reindex it. */ + pg_index = heap_open(IndexRelationId, RowExclusiveLock); + pg_index_tuple = SearchSysCacheCopy(INDEXRELID, + ObjectIdGetDatum(indOid), + 0, 0, 0); + if (!HeapTupleIsValid(pg_index_tuple)) + elog(ERROR, "cache lookup failed for index %u", indOid); + + index = (Form_pg_index)GETSTRUCT(pg_index_tuple); + + if (!index->indisvalid) { + index->indisvalid = true; + simple_heap_update(pg_index, &pg_index_tuple->t_self, pg_index_tuple); + CatalogUpdateIndexes(pg_index, pg_index_tuple); + } + heap_close(pg_index, NoLock); + ReleaseSysCache(tuple); reindex_index(indOid); Index: src/backend/commands/tablecmds.c =================================================================== RCS file: /projects/cvsroot/pgsql/src/backend/commands/tablecmds.c,v retrieving revision 1.194 diff -c -p -c -r1.194 tablecmds.c *** src/backend/commands/tablecmds.c 11 Jul 2006 18:26:10 -0000 1.194 --- src/backend/commands/tablecmds.c 13 Jul 2006 17:25:26 -0000 *************** ATExecAddIndex(AlteredTableInfo *tab, Re *** 3874,3880 **** true, /* is_alter_table */ check_rights, skip_build, ! quiet); } /* --- 3874,3881 ---- true, /* is_alter_table */ check_rights, skip_build, ! quiet, ! false); } /* *************** AlterTableCreateToastTable(Oid relOid, b *** 6710,6716 **** BTREE_AM_OID, rel->rd_rel->reltablespace, classObjectId, (Datum) 0, ! true, true, false, true, false); /* * Store the toast table's OID in the parent relation's pg_class row --- 6711,6717 ---- BTREE_AM_OID, rel->rd_rel->reltablespace, classObjectId, (Datum) 0, ! true, true, false, true, false, false); /* * Store the toast table's OID in the parent relation's pg_class row Index: src/backend/nodes/copyfuncs.c =================================================================== RCS file: /projects/cvsroot/pgsql/src/backend/nodes/copyfuncs.c,v retrieving revision 1.342 diff -c -p -c -r1.342 copyfuncs.c *** src/backend/nodes/copyfuncs.c 3 Jul 2006 22:45:38 -0000 1.342 --- src/backend/nodes/copyfuncs.c 13 Jul 2006 17:25:26 -0000 *************** _copyIndexStmt(IndexStmt *from) *** 2028,2033 **** --- 2028,2034 ---- COPY_SCALAR_FIELD(unique); COPY_SCALAR_FIELD(primary); COPY_SCALAR_FIELD(isconstraint); + COPY_SCALAR_FIELD(online); return newnode; } Index: src/backend/nodes/equalfuncs.c =================================================================== RCS file: /projects/cvsroot/pgsql/src/backend/nodes/equalfuncs.c,v retrieving revision 1.276 diff -c -p -c -r1.276 equalfuncs.c *** src/backend/nodes/equalfuncs.c 3 Jul 2006 22:45:38 -0000 1.276 --- src/backend/nodes/equalfuncs.c 13 Jul 2006 17:25:26 -0000 *************** _equalIndexStmt(IndexStmt *a, IndexStmt *** 958,963 **** --- 958,964 ---- COMPARE_SCALAR_FIELD(unique); COMPARE_SCALAR_FIELD(primary); COMPARE_SCALAR_FIELD(isconstraint); + COMPARE_SCALAR_FIELD(online); return true; } Index: src/backend/nodes/outfuncs.c =================================================================== RCS file: /projects/cvsroot/pgsql/src/backend/nodes/outfuncs.c,v retrieving revision 1.277 diff -c -p -c -r1.277 outfuncs.c *** src/backend/nodes/outfuncs.c 3 Jul 2006 22:45:39 -0000 1.277 --- src/backend/nodes/outfuncs.c 13 Jul 2006 17:25:26 -0000 *************** _outIndexStmt(StringInfo str, IndexStmt *** 1346,1351 **** --- 1346,1352 ---- WRITE_BOOL_FIELD(unique); WRITE_BOOL_FIELD(primary); WRITE_BOOL_FIELD(isconstraint); + WRITE_BOOL_FIELD(online); } static void Index: src/backend/optimizer/util/plancat.c =================================================================== RCS file: /projects/cvsroot/pgsql/src/backend/optimizer/util/plancat.c,v retrieving revision 1.120 diff -c -p -c -r1.120 plancat.c *** src/backend/optimizer/util/plancat.c 11 Jul 2006 17:26:58 -0000 1.120 --- src/backend/optimizer/util/plancat.c 13 Jul 2006 17:25:26 -0000 *************** get_relation_info(Oid relationObjectId, *** 133,138 **** --- 133,150 ---- indexRelation = index_open(indexoid); index = indexRelation->rd_index; + /* Do not include invalid indexes in the results -- this means it's + * important that this function only be used for planner and + * optimizer consumers. Not for DDL or for finding indexes to be + * updated. Those functions need to know about invalid indexes so + * they can update them appropriately */ + + if (!index->indisvalid) { + index_close(indexRelation); + continue; + } + + info = makeNode(IndexOptInfo); info->indexoid = index->indexrelid; Index: src/backend/parser/gram.y =================================================================== RCS file: /projects/cvsroot/pgsql/src/backend/parser/gram.y,v retrieving revision 2.551 diff -c -p -c -r2.551 gram.y *** src/backend/parser/gram.y 3 Jul 2006 22:45:39 -0000 2.551 --- src/backend/parser/gram.y 13 Jul 2006 17:25:27 -0000 *************** static void doNegateFloat(Value *v); *** 270,276 **** %type overlay_placing substr_from substr_for %type opt_instead opt_analyze ! %type index_opt_unique opt_verbose opt_full %type opt_freeze opt_default opt_recheck %type opt_binary opt_oids copy_delimiter --- 270,276 ---- %type overlay_placing substr_from substr_for %type opt_instead opt_analyze ! %type index_opt_unique index_opt_online opt_verbose opt_full %type opt_freeze opt_default opt_recheck %type opt_binary opt_oids copy_delimiter *************** static void doNegateFloat(Value *v); *** 402,408 **** NOCREATEROLE NOCREATEUSER NOINHERIT NOLOGIN_P NONE NOSUPERUSER NOT NOTHING NOTIFY NOTNULL NOWAIT NULL_P NULLIF NUMERIC ! OBJECT_P OF OFF OFFSET OIDS OLD ON ONLY OPERATOR OPTION OR ORDER OUT_P OUTER_P OVERLAPS OVERLAY OWNED OWNER PARTIAL PASSWORD PLACING POSITION --- 402,408 ---- NOCREATEROLE NOCREATEUSER NOINHERIT NOLOGIN_P NONE NOSUPERUSER NOT NOTHING NOTIFY NOTNULL NOWAIT NULL_P NULLIF NUMERIC ! OBJECT_P OF OFF OFFSET OIDS OLD ON ONLINE ONLY OPERATOR OPTION OR ORDER OUT_P OUTER_P OVERLAPS OVERLAY OWNED OWNER PARTIAL PASSWORD PLACING POSITION *************** opt_granted_by: GRANTED BY RoleId { *** 3653,3659 **** *****************************************************************************/ IndexStmt: CREATE index_opt_unique INDEX index_name ON qualified_name ! access_method_clause '(' index_params ')' opt_definition OptTableSpace where_clause { IndexStmt *n = makeNode(IndexStmt); n->unique = $2; --- 3653,3659 ---- *****************************************************************************/ IndexStmt: CREATE index_opt_unique INDEX index_name ON qualified_name ! access_method_clause '(' index_params ')' opt_definition OptTableSpace index_opt_online where_clause { IndexStmt *n = makeNode(IndexStmt); n->unique = $2; *************** IndexStmt: CREATE index_opt_unique INDEX *** 3661,3669 **** n->relation = $6; n->accessMethod = $7; n->indexParams = $9; n->options = $11; n->tableSpace = $12; ! n->whereClause = $13; $$ = (Node *)n; } ; --- 3661,3670 ---- n->relation = $6; n->accessMethod = $7; n->indexParams = $9; + n->online = $13; n->options = $11; n->tableSpace = $12; ! n->whereClause = $14; $$ = (Node *)n; } ; *************** index_params: index_elem { $$ = li *** 3682,3687 **** --- 3683,3692 ---- | index_params ',' index_elem { $$ = lappend($1, $3); } ; + index_opt_online: ONLINE { $$ = TRUE; } + | /*EMPTY*/ { $$ = FALSE; } + ; + /* * Index attributes can be either simple column references, or arbitrary * expressions in parens. For backwards-compatibility reasons, we allow *************** unreserved_keyword: *** 8596,8601 **** --- 8601,8607 ---- | OBJECT_P | OF | OIDS + | ONLINE | OPERATOR | OPTION | OWNED Index: src/backend/parser/keywords.c =================================================================== RCS file: /projects/cvsroot/pgsql/src/backend/parser/keywords.c,v retrieving revision 1.172 diff -c -p -c -r1.172 keywords.c *** src/backend/parser/keywords.c 27 Jun 2006 03:43:20 -0000 1.172 --- src/backend/parser/keywords.c 13 Jul 2006 17:25:27 -0000 *************** static const ScanKeyword ScanKeywords[] *** 244,249 **** --- 244,250 ---- {"oids", OIDS}, {"old", OLD}, {"on", ON}, + {"online", ONLINE}, {"only", ONLY}, {"operator", OPERATOR}, {"option", OPTION}, Index: src/backend/tcop/utility.c =================================================================== RCS file: /projects/cvsroot/pgsql/src/backend/tcop/utility.c,v retrieving revision 1.260 diff -c -p -c -r1.260 utility.c *** src/backend/tcop/utility.c 11 Jul 2006 18:26:11 -0000 1.260 --- src/backend/tcop/utility.c 13 Jul 2006 17:25:27 -0000 *************** ProcessUtility(Node *parsetree, *** 784,789 **** --- 784,794 ---- { IndexStmt *stmt = (IndexStmt *) parsetree; + /* Must do this here because DefineIndex API doesn't take stmt object */ + if (stmt->online) { + PreventTransactionChain(stmt, "CREATE INDEX ... ONLINE"); + } + CheckRelationOwnership(stmt->relation, true); DefineIndex(stmt->relation, /* relation */ *************** ProcessUtility(Node *parsetree, *** 801,807 **** false, /* is_alter_table */ true, /* check_rights */ false, /* skip_build */ ! false); /* quiet */ } break; --- 806,813 ---- false, /* is_alter_table */ true, /* check_rights */ false, /* skip_build */ ! false, /* quiet */ ! stmt->online); /* shall this be an online index build */ } break; Index: src/backend/utils/adt/tid.c =================================================================== RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/tid.c,v retrieving revision 1.52 diff -c -p -c -r1.52 tid.c *** src/backend/utils/adt/tid.c 5 Mar 2006 15:58:44 -0000 1.52 --- src/backend/utils/adt/tid.c 13 Jul 2006 17:25:27 -0000 *************** tidsend(PG_FUNCTION_ARGS) *** 164,178 **** * PUBLIC ROUTINES * *****************************************************************************/ Datum tideq(PG_FUNCTION_ARGS) { ItemPointer arg1 = PG_GETARG_ITEMPOINTER(0); ItemPointer arg2 = PG_GETARG_ITEMPOINTER(1); ! PG_RETURN_BOOL(BlockIdGetBlockNumber(&(arg1->ip_blkid)) == ! BlockIdGetBlockNumber(&(arg2->ip_blkid)) && ! arg1->ip_posid == arg2->ip_posid); } Datum --- 164,197 ---- * PUBLIC ROUTINES * *****************************************************************************/ + int + tid_cmp_internal(void * a1, void *a2) + { + ItemPointer arg1 = (ItemPointer) a1; + ItemPointer arg2 = (ItemPointer) a2; + + BlockNumber b1 = BlockIdGetBlockNumber(&(arg1->ip_blkid)); + BlockNumber b2 = BlockIdGetBlockNumber(&(arg2->ip_blkid)); + + if (b1 < b2) + return -1; + else if (b1 > b2) + return 1; + else if (arg1->ip_posid < arg2->ip_posid) + return -1; + else if (arg1->ip_posid > arg2->ip_posid) + return 1; + else + return 0; + } + Datum tideq(PG_FUNCTION_ARGS) { ItemPointer arg1 = PG_GETARG_ITEMPOINTER(0); ItemPointer arg2 = PG_GETARG_ITEMPOINTER(1); ! PG_RETURN_BOOL(tid_cmp_internal(arg1,arg2) == 0); } Datum *************** tidne(PG_FUNCTION_ARGS) *** 181,190 **** ItemPointer arg1 = PG_GETARG_ITEMPOINTER(0); ItemPointer arg2 = PG_GETARG_ITEMPOINTER(1); ! PG_RETURN_BOOL(BlockIdGetBlockNumber(&(arg1->ip_blkid)) != ! BlockIdGetBlockNumber(&(arg2->ip_blkid)) || ! arg1->ip_posid != arg2->ip_posid); } /* * Functions to get latest tid of a specified tuple. --- 200,254 ---- ItemPointer arg1 = PG_GETARG_ITEMPOINTER(0); ItemPointer arg2 = PG_GETARG_ITEMPOINTER(1); ! PG_RETURN_BOOL(tid_cmp_internal(arg1,arg2) != 0); ! } ! ! Datum ! tidlt(PG_FUNCTION_ARGS) ! { ! ItemPointer arg1 = PG_GETARG_ITEMPOINTER(0); ! ItemPointer arg2 = PG_GETARG_ITEMPOINTER(1); ! ! PG_RETURN_BOOL(tid_cmp_internal(arg1,arg2) == -1); ! } ! ! Datum ! tidle(PG_FUNCTION_ARGS) ! { ! ItemPointer arg1 = PG_GETARG_ITEMPOINTER(0); ! ItemPointer arg2 = PG_GETARG_ITEMPOINTER(1); ! ! PG_RETURN_BOOL(tid_cmp_internal(arg1,arg2) <= 0); ! } ! ! Datum ! tidgt(PG_FUNCTION_ARGS) ! { ! ItemPointer arg1 = PG_GETARG_ITEMPOINTER(0); ! ItemPointer arg2 = PG_GETARG_ITEMPOINTER(1); ! ! PG_RETURN_BOOL(tid_cmp_internal(arg1,arg2) == 1); ! } ! ! Datum ! tidge(PG_FUNCTION_ARGS) ! { ! ItemPointer arg1 = PG_GETARG_ITEMPOINTER(0); ! ItemPointer arg2 = PG_GETARG_ITEMPOINTER(1); ! ! PG_RETURN_BOOL(tid_cmp_internal(arg1,arg2) >= 0); ! } ! ! Datum ! bttidcmp(PG_FUNCTION_ARGS) ! { ! ItemPointer arg1 = PG_GETARG_ITEMPOINTER(0); ! ItemPointer arg2 = PG_GETARG_ITEMPOINTER(1); ! ! PG_RETURN_INT32(tid_cmp_internal(arg1, arg2)); } + + /* * Functions to get latest tid of a specified tuple. Index: src/bin/psql/describe.c =================================================================== RCS file: /projects/cvsroot/pgsql/src/bin/psql/describe.c,v retrieving revision 1.140 diff -c -p -c -r1.140 describe.c *** src/bin/psql/describe.c 14 Jun 2006 16:49:02 -0000 1.140 --- src/bin/psql/describe.c 13 Jul 2006 17:25:28 -0000 *************** describeOneTableDetails(const char *sche *** 910,916 **** PGresult *result; printfPQExpBuffer(&buf, ! "SELECT i.indisunique, i.indisprimary, i.indisclustered, a.amname, c2.relname,\n" " pg_catalog.pg_get_expr(i.indpred, i.indrelid, true)\n" "FROM pg_catalog.pg_index i, pg_catalog.pg_class c, pg_catalog.pg_class c2, pg_catalog.pg_am a\n" "WHERE i.indexrelid = c.oid AND c.oid = '%s' AND c.relam = a.oid\n" --- 910,916 ---- PGresult *result; printfPQExpBuffer(&buf, ! "SELECT i.indisunique, i.indisprimary, i.indisclustered, i.indisvalid, a.amname, c2.relname,\n" " pg_catalog.pg_get_expr(i.indpred, i.indrelid, true)\n" "FROM pg_catalog.pg_index i, pg_catalog.pg_class c, pg_catalog.pg_class c2, pg_catalog.pg_am a\n" "WHERE i.indexrelid = c.oid AND c.oid = '%s' AND c.relam = a.oid\n" *************** describeOneTableDetails(const char *sche *** 930,938 **** char *indisunique = PQgetvalue(result, 0, 0); char *indisprimary = PQgetvalue(result, 0, 1); char *indisclustered = PQgetvalue(result, 0, 2); ! char *indamname = PQgetvalue(result, 0, 3); ! char *indtable = PQgetvalue(result, 0, 4); ! char *indpred = PQgetvalue(result, 0, 5); int count_footers = 0; if (strcmp(indisprimary, "t") == 0) --- 930,939 ---- char *indisunique = PQgetvalue(result, 0, 0); char *indisprimary = PQgetvalue(result, 0, 1); char *indisclustered = PQgetvalue(result, 0, 2); ! char *indisvalid = PQgetvalue(result, 0, 3); ! char *indamname = PQgetvalue(result, 0, 4); ! char *indtable = PQgetvalue(result, 0, 5); ! char *indpred = PQgetvalue(result, 0, 6); int count_footers = 0; if (strcmp(indisprimary, "t") == 0) *************** describeOneTableDetails(const char *sche *** 953,958 **** --- 954,962 ---- if (strcmp(indisclustered, "t") == 0) appendPQExpBuffer(&tmpbuf, _(", clustered")); + if (strcmp(indisvalid, "f") == 0) + appendPQExpBuffer(&tmpbuf, _(", invalid")); + footers = pg_malloc_zero(4 * sizeof(*footers)); footers[count_footers++] = pg_strdup(tmpbuf.data); add_tablespace_footer(tableinfo.relkind, tableinfo.tablespace, *************** describeOneTableDetails(const char *sche *** 1035,1041 **** if (tableinfo.hasindex) { printfPQExpBuffer(&buf, ! "SELECT c2.relname, i.indisprimary, i.indisunique, i.indisclustered, " "pg_catalog.pg_get_indexdef(i.indexrelid, 0, true), c2.reltablespace\n" "FROM pg_catalog.pg_class c, pg_catalog.pg_class c2, pg_catalog.pg_index i\n" "WHERE c.oid = '%s' AND c.oid = i.indrelid AND i.indexrelid = c2.oid\n" --- 1039,1045 ---- if (tableinfo.hasindex) { printfPQExpBuffer(&buf, ! "SELECT c2.relname, i.indisprimary, i.indisunique, i.indisclustered, i.indisvalid, " "pg_catalog.pg_get_indexdef(i.indexrelid, 0, true), c2.reltablespace\n" "FROM pg_catalog.pg_class c, pg_catalog.pg_class c2, pg_catalog.pg_index i\n" "WHERE c.oid = '%s' AND c.oid = i.indrelid AND i.indexrelid = c2.oid\n" *************** describeOneTableDetails(const char *sche *** 1169,1175 **** ? " UNIQUE," : "")); /* Everything after "USING" is echoed verbatim */ ! indexdef = PQgetvalue(result1, i, 4); usingpos = strstr(indexdef, " USING "); if (usingpos) indexdef = usingpos + 7; --- 1173,1179 ---- ? " UNIQUE," : "")); /* Everything after "USING" is echoed verbatim */ ! indexdef = PQgetvalue(result1, i, 5); usingpos = strstr(indexdef, " USING "); if (usingpos) indexdef = usingpos + 7; *************** describeOneTableDetails(const char *sche *** 1179,1189 **** if (strcmp(PQgetvalue(result1, i, 3), "t") == 0) appendPQExpBuffer(&buf, " CLUSTER"); /* Print tablespace of the index on the same line */ count_footers += 1; initPQExpBuffer(&tmpbuf); if (add_tablespace_footer('i', ! atooid(PQgetvalue(result1, i, 5)), footers, &count_footers, tmpbuf, false)) { appendPQExpBuffer(&buf, ", "); --- 1183,1196 ---- if (strcmp(PQgetvalue(result1, i, 3), "t") == 0) appendPQExpBuffer(&buf, " CLUSTER"); + if (strcmp(PQgetvalue(result1, i, 4), "f") == 0) + appendPQExpBuffer(&buf, " INVALID"); + /* Print tablespace of the index on the same line */ count_footers += 1; initPQExpBuffer(&tmpbuf); if (add_tablespace_footer('i', ! atooid(PQgetvalue(result1, i, 6)), footers, &count_footers, tmpbuf, false)) { appendPQExpBuffer(&buf, ", "); Index: src/include/catalog/index.h =================================================================== RCS file: /projects/cvsroot/pgsql/src/include/catalog/index.h,v retrieving revision 1.68 diff -c -p -c -r1.68 index.h *** src/include/catalog/index.h 3 Jul 2006 22:45:40 -0000 1.68 --- src/include/catalog/index.h 13 Jul 2006 17:25:28 -0000 *************** extern Oid index_create(Oid heapRelation *** 42,48 **** bool istoast, bool isconstraint, bool allow_system_table_mods, ! bool skip_build); extern void index_drop(Oid indexId); --- 42,49 ---- bool istoast, bool isconstraint, bool allow_system_table_mods, ! bool skip_build, ! bool online); extern void index_drop(Oid indexId); *************** extern double IndexBuildHeapScan(Relatio *** 71,74 **** --- 72,77 ---- extern void reindex_index(Oid indexId); extern bool reindex_relation(Oid relid, bool toast_too); + extern void validate_index(Oid IndexId); + #endif /* INDEX_H */ Index: src/include/catalog/pg_amop.h =================================================================== RCS file: /projects/cvsroot/pgsql/src/include/catalog/pg_amop.h,v retrieving revision 1.72 diff -c -p -c -r1.72 pg_amop.h *** src/include/catalog/pg_amop.h 11 Jul 2006 19:49:13 -0000 1.72 --- src/include/catalog/pg_amop.h 13 Jul 2006 17:25:28 -0000 *************** DATA(insert ( 1989 0 4 f 612 )); *** 157,162 **** --- 157,172 ---- DATA(insert ( 1989 0 5 f 610 )); /* + * btree tid_ops + */ + + DATA(insert ( 2785 0 1 f 2786 )); + DATA(insert ( 2785 0 2 f 2788 )); + DATA(insert ( 2785 0 3 f 387 )); + DATA(insert ( 2785 0 4 f 2789 )); + DATA(insert ( 2785 0 5 f 2787 )); + + /* * btree oidvector_ops */ Index: src/include/catalog/pg_amproc.h =================================================================== RCS file: /projects/cvsroot/pgsql/src/include/catalog/pg_amproc.h,v retrieving revision 1.58 diff -c -p -c -r1.58 pg_amproc.h *** src/include/catalog/pg_amproc.h 2 May 2006 15:23:16 -0000 1.58 --- src/include/catalog/pg_amproc.h 13 Jul 2006 17:25:28 -0000 *************** DATA(insert ( 2098 0 1 2187 )); *** 124,130 **** DATA(insert ( 2099 0 1 377 )); DATA(insert ( 2233 0 1 380 )); DATA(insert ( 2234 0 1 381 )); ! /* hash */ DATA(insert ( 427 0 1 1080 )); --- 124,130 ---- DATA(insert ( 2099 0 1 377 )); DATA(insert ( 2233 0 1 380 )); DATA(insert ( 2234 0 1 381 )); ! DATA(insert ( 2785 0 1 2794 )); /* hash */ DATA(insert ( 427 0 1 1080 )); Index: src/include/catalog/pg_attribute.h =================================================================== RCS file: /projects/cvsroot/pgsql/src/include/catalog/pg_attribute.h,v retrieving revision 1.123 diff -c -p -c -r1.123 pg_attribute.h *** src/include/catalog/pg_attribute.h 10 Jul 2006 16:20:51 -0000 1.123 --- src/include/catalog/pg_attribute.h 13 Jul 2006 17:25:28 -0000 *************** DATA(insert ( 1259 tableoid 26 0 4 - *** 460,468 **** { 0, {"indisunique"}, 16, -1, 1, 4, 0, -1, -1, true, 'p', 'c', true, false, false, true, 0 }, \ { 0, {"indisprimary"}, 16, -1, 1, 5, 0, -1, -1, true, 'p', 'c', true, false, false, true, 0 }, \ { 0, {"indisclustered"}, 16, -1, 1, 6, 0, -1, -1, true, 'p', 'c', true, false, false, true, 0 }, \ ! { 0, {"indkey"}, 22, -1, -1, 7, 1, -1, -1, false, 'p', 'i', true, false, false, true, 0 }, \ ! { 0, {"indclass"}, 30, -1, -1, 8, 1, -1, -1, false, 'p', 'i', true, false, false, true, 0 }, \ ! { 0, {"indexprs"}, 25, -1, -1, 9, 0, -1, -1, false, 'x', 'i', false, false, false, true, 0 }, \ ! { 0, {"indpred"}, 25, -1, -1, 10, 0, -1, -1, false, 'x', 'i', false, false, false, true, 0 } #endif /* PG_ATTRIBUTE_H */ --- 460,469 ---- { 0, {"indisunique"}, 16, -1, 1, 4, 0, -1, -1, true, 'p', 'c', true, false, false, true, 0 }, \ { 0, {"indisprimary"}, 16, -1, 1, 5, 0, -1, -1, true, 'p', 'c', true, false, false, true, 0 }, \ { 0, {"indisclustered"}, 16, -1, 1, 6, 0, -1, -1, true, 'p', 'c', true, false, false, true, 0 }, \ ! { 0, {"indisvalid"}, 16, -1, 1, 7, 0, -1, -1, true, 'p', 'c', true, false, false, true, 0 }, \ ! { 0, {"indkey"}, 22, -1, -1, 8, 1, -1, -1, false, 'p', 'i', true, false, false, true, 0 }, \ ! { 0, {"indclass"}, 30, -1, -1, 9, 1, -1, -1, false, 'p', 'i', true, false, false, true, 0 }, \ ! { 0, {"indexprs"}, 25, -1, -1, 10, 0, -1, -1, false, 'x', 'i', false, false, false, true, 0 }, \ ! { 0, {"indpred"}, 25, -1, -1, 11, 0, -1, -1, false, 'x', 'i', false, false, false, true, 0 } #endif /* PG_ATTRIBUTE_H */ Index: src/include/catalog/pg_index.h =================================================================== RCS file: /projects/cvsroot/pgsql/src/include/catalog/pg_index.h,v retrieving revision 1.39 diff -c -p -c -r1.39 pg_index.h *** src/include/catalog/pg_index.h 5 Mar 2006 15:58:54 -0000 1.39 --- src/include/catalog/pg_index.h 13 Jul 2006 17:25:28 -0000 *************** CATALOG(pg_index,2610) BKI_WITHOUT_OIDS *** 41,46 **** --- 41,47 ---- bool indisunique; /* is this a unique index? */ bool indisprimary; /* is this index for primary key? */ bool indisclustered; /* is this the index last clustered by? */ + bool indisvalid; /* is this index actually valid for use by queries */ /* VARIABLE LENGTH FIELDS: */ int2vector indkey; /* column numbers of indexed cols, or 0 */ *************** typedef FormData_pg_index *Form_pg_index *** 63,78 **** * compiler constants for pg_index * ---------------- */ ! #define Natts_pg_index 10 #define Anum_pg_index_indexrelid 1 #define Anum_pg_index_indrelid 2 #define Anum_pg_index_indnatts 3 #define Anum_pg_index_indisunique 4 #define Anum_pg_index_indisprimary 5 #define Anum_pg_index_indisclustered 6 ! #define Anum_pg_index_indkey 7 ! #define Anum_pg_index_indclass 8 ! #define Anum_pg_index_indexprs 9 ! #define Anum_pg_index_indpred 10 #endif /* PG_INDEX_H */ --- 64,80 ---- * compiler constants for pg_index * ---------------- */ ! #define Natts_pg_index 11 #define Anum_pg_index_indexrelid 1 #define Anum_pg_index_indrelid 2 #define Anum_pg_index_indnatts 3 #define Anum_pg_index_indisunique 4 #define Anum_pg_index_indisprimary 5 #define Anum_pg_index_indisclustered 6 ! #define Anum_pg_index_indisvalid 7 ! #define Anum_pg_index_indkey 8 ! #define Anum_pg_index_indclass 9 ! #define Anum_pg_index_indexprs 10 ! #define Anum_pg_index_indpred 11 #endif /* PG_INDEX_H */ Index: src/include/catalog/pg_opclass.h =================================================================== RCS file: /projects/cvsroot/pgsql/src/include/catalog/pg_opclass.h,v retrieving revision 1.70 diff -c -p -c -r1.70 pg_opclass.h *** src/include/catalog/pg_opclass.h 2 May 2006 15:23:16 -0000 1.70 --- src/include/catalog/pg_opclass.h 13 Jul 2006 17:25:28 -0000 *************** DATA(insert OID = 2222 ( 405 bool_ops *** 162,167 **** --- 162,168 ---- #define BOOL_HASH_OPS_OID 2222 DATA(insert OID = 2223 ( 405 bytea_ops PGNSP PGUID 17 t 0 )); DATA(insert OID = 2224 ( 405 int2vector_ops PGNSP PGUID 22 t 0 )); + DATA(insert OID = 2785 ( 403 tid_ops PGNSP PGUID 27 t 0 )); DATA(insert OID = 2225 ( 405 xid_ops PGNSP PGUID 28 t 0 )); DATA(insert OID = 2226 ( 405 cid_ops PGNSP PGUID 29 t 0 )); DATA(insert OID = 2227 ( 405 abstime_ops PGNSP PGUID 702 t 0 )); *************** DATA(insert OID = 2778 ( 2742 _money_ops *** 207,210 **** --- 208,212 ---- DATA(insert OID = 2779 ( 2742 _reltime_ops PGNSP PGUID 1024 t 703 )); DATA(insert OID = 2780 ( 2742 _tinterval_ops PGNSP PGUID 1025 t 704 )); + #endif /* PG_OPCLASS_H */ Index: src/include/catalog/pg_operator.h =================================================================== RCS file: /projects/cvsroot/pgsql/src/include/catalog/pg_operator.h,v retrieving revision 1.143 diff -c -p -c -r1.143 pg_operator.h *** src/include/catalog/pg_operator.h 2 May 2006 11:28:55 -0000 1.143 --- src/include/catalog/pg_operator.h 13 Jul 2006 17:25:28 -0000 *************** DATA(insert OID = 388 ( "!" PGNSP PG *** 128,136 **** DATA(insert OID = 389 ( "!!" PGNSP PGUID l f 0 20 1700 0 0 0 0 0 0 numeric_fac - - )); DATA(insert OID = 385 ( "=" PGNSP PGUID b t 29 29 16 385 0 0 0 0 0 cideq eqsel eqjoinsel )); DATA(insert OID = 386 ( "=" PGNSP PGUID b t 22 22 16 386 0 0 0 0 0 int2vectoreq eqsel eqjoinsel )); DATA(insert OID = 387 ( "=" PGNSP PGUID b f 27 27 16 387 402 0 0 0 0 tideq eqsel eqjoinsel )); #define TIDEqualOperator 387 ! DATA(insert OID = 402 ( "<>" PGNSP PGUID b f 27 27 16 402 387 0 0 0 0 tidne neqsel neqjoinsel )); DATA(insert OID = 410 ( "=" PGNSP PGUID b t 20 20 16 410 411 412 412 412 413 int8eq eqsel eqjoinsel )); DATA(insert OID = 411 ( "<>" PGNSP PGUID b f 20 20 16 411 410 0 0 0 0 int8ne neqsel neqjoinsel )); --- 128,142 ---- DATA(insert OID = 389 ( "!!" PGNSP PGUID l f 0 20 1700 0 0 0 0 0 0 numeric_fac - - )); DATA(insert OID = 385 ( "=" PGNSP PGUID b t 29 29 16 385 0 0 0 0 0 cideq eqsel eqjoinsel )); DATA(insert OID = 386 ( "=" PGNSP PGUID b t 22 22 16 386 0 0 0 0 0 int2vectoreq eqsel eqjoinsel )); + DATA(insert OID = 387 ( "=" PGNSP PGUID b f 27 27 16 387 402 0 0 0 0 tideq eqsel eqjoinsel )); #define TIDEqualOperator 387 ! DATA(insert OID = 402 ( "<>" PGNSP PGUID b f 27 27 16 402 387 0 0 0 0 tidne neqsel neqjoinsel )); ! DATA(insert OID = 2786 ( "<" PGNSP PGUID b f 27 27 16 2787 2789 0 0 0 0 tidlt scalarltsel scalarltjoinsel )); ! #define TID_LT_OP 2786 ! DATA(insert OID = 2787 ( ">" PGNSP PGUID b f 27 27 16 2786 2788 0 0 0 0 tidgt scalargtsel scalargtjoinsel )); ! DATA(insert OID = 2788 ( "<=" PGNSP PGUID b f 27 27 16 2789 2787 0 0 0 0 tidle scalarltsel scalarltjoinsel )); ! DATA(insert OID = 2789 ( ">=" PGNSP PGUID b f 27 27 16 2788 2786 0 0 0 0 tidge scalargtsel scalargtjoinsel )); DATA(insert OID = 410 ( "=" PGNSP PGUID b t 20 20 16 410 411 412 412 412 413 int8eq eqsel eqjoinsel )); DATA(insert OID = 411 ( "<>" PGNSP PGUID b f 20 20 16 411 410 0 0 0 0 int8ne neqsel neqjoinsel )); *************** DATA(insert OID = 596 ( "|/" PGNSP P *** 262,267 **** --- 268,274 ---- DATA(insert OID = 597 ( "||/" PGNSP PGUID l f 0 701 701 0 0 0 0 0 0 dcbrt - - )); DATA(insert OID = 1284 ( "|" PGNSP PGUID l f 0 704 702 0 0 0 0 0 0 tintervalstart - - )); DATA(insert OID = 606 ( "<#>" PGNSP PGUID b f 702 702 704 0 0 0 0 0 0 mktinterval - - )); + DATA(insert OID = 607 ( "=" PGNSP PGUID b t 26 26 16 607 608 609 609 609 610 oideq eqsel eqjoinsel )); #define MIN_OIDCMP 607 /* used by cache code */ DATA(insert OID = 608 ( "<>" PGNSP PGUID b f 26 26 16 608 607 0 0 0 0 oidne neqsel neqjoinsel )); Index: src/include/catalog/pg_proc.h =================================================================== RCS file: /projects/cvsroot/pgsql/src/include/catalog/pg_proc.h,v retrieving revision 1.415 diff -c -p -c -r1.415 pg_proc.h *** src/include/catalog/pg_proc.h 3 Jul 2006 22:45:40 -0000 1.415 --- src/include/catalog/pg_proc.h 13 Jul 2006 17:25:29 -0000 *************** DESCR("anyarray contains"); *** 3872,3877 **** --- 3872,3889 ---- DATA(insert OID = 2749 ( arraycontained PGNSP PGUID 12 f f t f i 2 16 "2277 2277" _null_ _null_ _null_ arraycontained - _null_ )); DESCR("anyarray contained"); + /* new TID functions (tideq and tidne are above) */ + DATA(insert OID = 2790 ( tidgt PGNSP PGUID 12 f f t f i 2 16 "27 27" _null_ _null_ _null_ tidgt - _null_)); + DESCR("greater-than"); + DATA(insert OID = 2791 ( tidlt PGNSP PGUID 12 f f t f i 2 16 "27 27" _null_ _null_ _null_ tidlt - _null_)); + DESCR("less-than"); + DATA(insert OID = 2792 ( tidge PGNSP PGUID 12 f f t f i 2 16 "27 27" _null_ _null_ _null_ tidge - _null_)); + DESCR("greater-than-or-equal"); + DATA(insert OID = 2793 ( tidle PGNSP PGUID 12 f f t f i 2 16 "27 27" _null_ _null_ _null_ tidle - _null_)); + DESCR("less-than-or-equal"); + DATA(insert OID = 2794 ( bttidcmp PGNSP PGUID 12 f f t f i 2 23 "27 27" _null_ _null_ _null_ bttidcmp - _null_)); + DESCR("btree less-equal-greater"); + /* * Symbolic values for provolatile column: these indicate whether the result * of a function is dependent *only* on the values of its explicit arguments, Index: src/include/commands/defrem.h =================================================================== RCS file: /projects/cvsroot/pgsql/src/include/commands/defrem.h,v retrieving revision 1.74 diff -c -p -c -r1.74 defrem.h *** src/include/commands/defrem.h 3 Jul 2006 22:45:40 -0000 1.74 --- src/include/commands/defrem.h 13 Jul 2006 17:25:29 -0000 *************** extern void DefineIndex(RangeVar *heapRe *** 33,39 **** bool is_alter_table, bool check_rights, bool skip_build, ! bool quiet); extern void RemoveIndex(RangeVar *relation, DropBehavior behavior); extern void ReindexIndex(RangeVar *indexRelation); extern void ReindexTable(RangeVar *relation); --- 33,41 ---- bool is_alter_table, bool check_rights, bool skip_build, ! bool quiet, ! bool online); ! extern void ValidateIndex(RangeVar *index); extern void RemoveIndex(RangeVar *relation, DropBehavior behavior); extern void ReindexIndex(RangeVar *indexRelation); extern void ReindexTable(RangeVar *relation); Index: src/include/nodes/parsenodes.h =================================================================== RCS file: /projects/cvsroot/pgsql/src/include/nodes/parsenodes.h,v retrieving revision 1.315 diff -c -p -c -r1.315 parsenodes.h *** src/include/nodes/parsenodes.h 3 Jul 2006 22:45:40 -0000 1.315 --- src/include/nodes/parsenodes.h 13 Jul 2006 17:25:29 -0000 *************** typedef struct IndexStmt *** 1436,1443 **** --- 1436,1445 ---- bool unique; /* is index unique? */ bool primary; /* is index on primary key? */ bool isconstraint; /* is it from a CONSTRAINT clause? */ + bool online; /* should this be an online index build? */ } IndexStmt; + /* ---------------------- * Create Function Statement * ---------------------- Index: src/include/utils/builtins.h =================================================================== RCS file: /projects/cvsroot/pgsql/src/include/utils/builtins.h,v retrieving revision 1.279 diff -c -p -c -r1.279 builtins.h *** src/include/utils/builtins.h 8 Apr 2006 18:49:52 -0000 1.279 --- src/include/utils/builtins.h 13 Jul 2006 17:25:29 -0000 *************** extern Datum tidrecv(PG_FUNCTION_ARGS); *** 538,543 **** --- 538,549 ---- extern Datum tidsend(PG_FUNCTION_ARGS); extern Datum tideq(PG_FUNCTION_ARGS); extern Datum tidne(PG_FUNCTION_ARGS); + extern Datum tidgt(PG_FUNCTION_ARGS); + extern Datum tidlt(PG_FUNCTION_ARGS); + extern Datum tidge(PG_FUNCTION_ARGS); + extern Datum tidle(PG_FUNCTION_ARGS); + extern int tid_cmp_internal(void *, void *); + extern Datum bttidcmp(PG_FUNCTION_ARGS); extern Datum currtid_byreloid(PG_FUNCTION_ARGS); extern Datum currtid_byrelname(PG_FUNCTION_ARGS); Index: src/test/regress/sql/create_index.sql =================================================================== RCS file: /projects/cvsroot/pgsql/src/test/regress/sql/create_index.sql,v retrieving revision 1.19 diff -c -p -c -r1.19 create_index.sql *** src/test/regress/sql/create_index.sql 11 Jul 2006 19:49:14 -0000 1.19 --- src/test/regress/sql/create_index.sql 13 Jul 2006 17:25:29 -0000 *************** INSERT INTO func_index_heap VALUES('QWER *** 219,221 **** --- 219,262 ---- create unique index hash_f8_index_1 on hash_f8_heap(abs(random)); create unique index hash_f8_index_2 on hash_f8_heap((seqno + 1), random); create unique index hash_f8_index_3 on hash_f8_heap(random) where seqno > 1000; + + + -- + -- Try some online index bulds + -- + -- Unfortunately this only tests about half the code paths because there are no + -- concurrent updates happening to the table at the same time. + + CREATE TABLE online_heap (f1 text, f2 text); + -- empty table + CREATE INDEX online_index1 ON online_heap(f2,f1) ONLINE; + INSERT INTO online_index VALUES ('a','b'); + INSERT INTO online_index VALUES ('b','b'); + -- unique index + CREATE UNIQUE INDEX online_index2 ON online_heap(f1) ONLINE; + -- check if constraint is set up properly to be enforced + INSERT INTO online_index VALUES ('b','x'); + -- check if constraint is enforced properly at build time + CREATE UNIQUE INDEX online_index3 ON online_heap(f2) ONLINE; + -- test that expression indexes and partial indexes work online + CREATE INDEX online_index4 on online_heap(f2) ONLINE WHERE f1='a'; + CREATE INDEX online_index5 on online_heap(f2) ONLINE WHERE f1='x'; + CREATE INDEX online_index6 on online_heap(f2||f1) ONLINE; + + -- You can't do an online index build in a transaction + BEGIN; + CREATE INDEX online_index7 ON online_heap ONLINE; + COMMIT; + + -- But you can do a regular index build in a transaction + BEGIN; + CREATE INDEX offline_index on online_heap; + COMMIT; + + -- check to make sure that the failed indexes were cleaned up properly and the + -- successful indexes are created properly. Notably that they do NOT have the + -- "invalid" flag set. + + \d online_heap + + DROP TABLE online_heap;