Re: Partitioning option for COPY - Mailing list pgsql-hackers

From Emmanuel Cecchet
Subject Re: Partitioning option for COPY
Date
Msg-id 4B00919B.4050309@asterdata.com
Whole thread Raw
In response to Re: Partitioning option for COPY  (Jan Urbański <wulczer@wulczer.org>)
Responses Re: Partitioning option for COPY
List pgsql-hackers
Jan,

Here is a new version of the patch. Find the response to your comments
embedded in the text.
>> partitioning option for COPY
>>
>
> Here's the review:
>
> == Submission ==
> The patch is contextual, applies cleanly to current HEAD, compiles fine.
> The docs build cleanly.
>
> == Docs ==
> They're reasonably clear, although they still mention ERROR_LOGGING,
> which was taken out of this patch. They could use some wordsmithing, but
> I didn't go into details, as there were more severe issues with the patch.
>
Removed the text related to ERROR_LOGGING.
> One thing that made me cautious was the mention that triggers modifying
> tuples will make random errors appear. As is demonstrated later,
> triggers are a big issue.
>
Whichever way routing is implemented we will have to decide what we want
to do with triggers. We can decide to fire them or not (there was
already a debate whether COPY is an insert statement or not and should
fire the statement trigger for insert). This is not a design problem
with this patch, we just have to chose what we want to do with triggers
when partitioning is involved. IMHO we should disable them altogether
but there are scenarios where one could argue that there are still useful.
> == Regression tests ==
> They ran fine, there's one additional regression test that exercises the
> new option.
>
> == Style/nitpicks ==
> Minor gripes include:
>  o instead of using an ad-hoc data structure for the LRU cache list, I'd
> suggest an OidList from pg_list.h.
>
Will do if we decide to go further with this patch.
>  o some mentions of "method" in comments should be changed to "function"
>  o trailing whitespace in the patch (it's not the end of the world, of
> course)
>
I guess the committer will run pg_indent anyway so I'm not too worried
about spaces.
> == Issues ==
> Attached are 3 files that demonstrate problems the patch has.
>  o test1.sql always segfaults for me, poking around with gdb suggests
> it's a case of an uninitialised cache list (another reason to use the
> builtin one).
>
I was never able to reproduce that problem. I don't know where this
comes from.
>  o test2.sql demonstrates, that indices on child tables are not being
> updated, probably because after resultRelInfo in
> check_tuple_constraints() gets created is never has ri_NumIndices set,
> and so the code that was supposed to take care of indices is never
> called. Looks like a copy-paste error.
>
Fixed, actually there was a leak in relcache for the index.
>  o test3.sql demonstrates, that some triggers that I would expect to be
> fired are in fact not fired. I guess it's the same reason as mentioned:
> ri_TrigDesc never gets set, so the code that calls triggers is dead.
>
There is a problem with after row triggers that I did not completely
figure out. For some reason, if I use the regular mechanism by calling
ExecARInsertTrigger that differ the execution of the trigger until the
after row event is triggered, the child relation is not closed and there
is a leak in the relcache. I forced the after row triggers to execute
synchronously after inserting in the child table to work around the
problem. If someone has an explanation, I am willing to do a cleaner
implementation!
> I stopped there, because unfortunately, apart from all that there's one
> fundamental problem with this patch, namely "we probably don't want it".
>
> As it stands it's more of a proof of concept than a really usable
> solution, it feels like built from spare (copied from around copy.c)
> parts. IMHO it's much too narrow for a general partitioning solution,
> even if the design it's based upon would be accepted. It's assuming a
> lot of things about the presence of child tables (with proper
> constraints), the absence of triggers, and so on.
>
> Granted, it solves a particular problem (bulk loading into a partitioned
> table, with not extra features like triggers and with standard
> inheritance/exclusive check constraints setup), but that's not good
> enough in my opinion, even if all other issues would be addressed.
>
Well, as Postgres does not have any support for real partitioning
besides inheritance, and so far it is unlikely that another
implementation will happen in the 8.5 timeframe, this feature fills the
need for people doing data warehouses. This is a scenario used with
every single Aster customer. Now if the Postgres community does not
think that the Aster use case is general enough or of interest to be
integrated in the code base, this is a different issue and I won't spent
time arguing if this is a philosophical/political issue.
Note that the new patch works with triggers but you can easily generate
corrupt data if your triggers are modifying the data on which the
routing decision is based.
> Now I'm not a real Postgres user, it's been a while since I worked in a
> PG shop (or a DB shop for that matter), but from what I understand from
> following this community for a while, a patch like that doesn't have a
> lot of chances to be committed. That said, my puny experience with real
> PG installations and their needs must be taken into account here.
>
I don't really understand why a new option of COPY should be solving a
general problem. It's an option, and like every option, it is to solve a
particular use case. I don't see what is wrong with that.
> I'll mark this patch as "Waiting on Author", but I have little doubt
> that even after fixing those probably trivial segfaults etc. the patch
> would be promptly rejected by a committer. I suggest withdrawing it from
> this commitfest and trying to work out a more complete design first that
> would address the needs of a bigger variety of users, or joining some of
> the already underway efforts to bring full-featured partitioning into
> Postgres.
>
I have integrated your tests in the regression test suite and I was
never able to reproduce the segfault you mentioned. What platform are
you using?

Thanks for your valuable feedback
Emmanuel

--
Emmanuel Cecchet
Aster Data
Web: http://www.asterdata.com

Index: src/backend/commands/trigger.c
===================================================================
RCS file: /home/manu/cvsrepo/pgsql/src/backend/commands/trigger.c,v
retrieving revision 1.256
diff -c -r1.256 trigger.c
*** src/backend/commands/trigger.c    27 Oct 2009 20:14:27 -0000    1.256
--- src/backend/commands/trigger.c    15 Nov 2009 23:12:50 -0000
***************
*** 1756,1761 ****
--- 1756,1802 ----
      return newtuple;
  }

+ HeapTuple
+ ExecARInsertTriggersNow(EState *estate, ResultRelInfo *relinfo,
+                      HeapTuple trigtuple)
+ {
+     TriggerDesc *trigdesc = relinfo->ri_TrigDesc;
+     int            ntrigs = trigdesc->n_after_row[TRIGGER_EVENT_INSERT];
+     int           *tgindx = trigdesc->tg_after_row[TRIGGER_EVENT_INSERT];
+     HeapTuple    newtuple = trigtuple;
+     HeapTuple    oldtuple;
+     TriggerData LocTriggerData;
+     int            i;
+
+     LocTriggerData.type = T_TriggerData;
+     LocTriggerData.tg_event = TRIGGER_EVENT_INSERT |
+         TRIGGER_EVENT_ROW;
+     LocTriggerData.tg_relation = relinfo->ri_RelationDesc;
+     LocTriggerData.tg_newtuple = NULL;
+     LocTriggerData.tg_newtuplebuf = InvalidBuffer;
+     for (i = 0; i < ntrigs; i++)
+     {
+         Trigger    *trigger = &trigdesc->triggers[tgindx[i]];
+
+         if (!TriggerEnabled(trigger, LocTriggerData.tg_event, NULL))
+             continue;
+
+         LocTriggerData.tg_trigtuple = oldtuple = newtuple;
+         LocTriggerData.tg_trigtuplebuf = InvalidBuffer;
+         LocTriggerData.tg_trigger = trigger;
+         newtuple = ExecCallTriggerFunc(&LocTriggerData,
+                                        tgindx[i],
+                                        relinfo->ri_TrigFunctions,
+                                        relinfo->ri_TrigInstrument,
+                                        GetPerTupleMemoryContext(estate));
+         if (oldtuple != newtuple && oldtuple != trigtuple)
+             heap_freetuple(oldtuple);
+         if (newtuple == NULL)
+             break;
+     }
+     return newtuple;
+ }
+
  void
  ExecARInsertTriggers(EState *estate, ResultRelInfo *relinfo,
                       HeapTuple trigtuple, List *recheckIndexes)
Index: src/backend/commands/copy.c
===================================================================
RCS file: /home/manu/cvsrepo/pgsql/src/backend/commands/copy.c,v
retrieving revision 1.317
diff -c -r1.317 copy.c
*** src/backend/commands/copy.c    21 Sep 2009 20:10:21 -0000    1.317
--- src/backend/commands/copy.c    15 Nov 2009 23:12:50 -0000
***************
*** 43,48 ****
--- 43,56 ----
  #include "utils/memutils.h"
  #include "utils/snapmgr.h"

+ /* For tuple routing */
+ #include "catalog/pg_inherits.h"
+ #include "catalog/pg_inherits_fn.h"
+ #include "nodes/makefuncs.h"
+ #include "nodes/pg_list.h"
+ #include "utils/fmgroids.h"
+ #include "utils/relcache.h"
+ #include "utils/tqual.h"

  #define ISOCTAL(c) (((c) >= '0') && ((c) <= '7'))
  #define OCTVALUE(c) ((c) - '0')
***************
*** 117,122 ****
--- 125,131 ----
      char       *escape;            /* CSV escape char (must be 1 byte) */
      bool       *force_quote_flags;        /* per-column CSV FQ flags */
      bool       *force_notnull_flags;    /* per-column CSV FNN flags */
+      bool        partitioning;    /* tuple routing in table hierarchy */

      /* these are just for error messages, see copy_in_error_callback */
      const char *cur_relname;    /* table name for error messages */
***************
*** 173,178 ****
--- 182,208 ----
  } DR_copy;


+ /**
+  * Size of the LRU list of relations to keep in cache for routing
+  */
+ int partitioningCacheSize;
+
+ typedef struct OidCell OidCell;
+
+ typedef struct OidLinkedList
+ {
+     int        length;
+     OidCell *head;
+ } OidLinkedList;
+
+ struct OidCell
+ {
+     Oid        oid_value;
+     OidCell *next;
+ };
+
+ OidLinkedList *child_table_lru = NULL;
+
  /*
   * These macros centralize code used to process line_buf and raw_buf buffers.
   * They are macros because they often do continue/break control and to avoid
***************
*** 839,844 ****
--- 869,882 ----
                       errmsg("argument to option \"%s\" must be a list of column names",
                              defel->defname)));
          }
+          else if (strcmp(defel->defname, "partitioning") == 0)
+          {
+              if (cstate->partitioning)
+                  ereport(ERROR,
+                          (errcode(ERRCODE_SYNTAX_ERROR),
+                           errmsg("conflicting or redundant options")));
+              cstate->partitioning = defGetBoolean(defel);
+          }
          else
              ereport(ERROR,
                      (errcode(ERRCODE_SYNTAX_ERROR),
***************
*** 1662,1667 ****
--- 1700,1992 ----
      return res;
  }

+ /**
+  * Check that the given tuple matches the constraints of the given child table
+  * and performs an insert if the constraints are matched. insert_tuple specifies
+  * if the tuple must be inserted in the table if the constraint is satisfied.
+  * The method returns true if the constraint is satisfied (and insert was
+  * performed if insert_tuple is true), false otherwise (constraints not
+  * satisfied for this tuple on this child table).
+  */
+ static bool
+ check_tuple_constraints(Relation child_table_relation, HeapTuple tuple,
+     bool insert_tuple, int hi_options, ResultRelInfo *parentResultRelInfo)
+ {
+     /* Check the constraints */
+     ResultRelInfo    *resultRelInfo;
+     TupleTableSlot    *slot;
+     EState             *estate = CreateExecutorState();
+     bool            result = false;
+
+     resultRelInfo = makeNode(ResultRelInfo);
+     resultRelInfo->ri_RangeTableIndex = 1;        /* dummy */
+     resultRelInfo->ri_RelationDesc = child_table_relation;
+     resultRelInfo->ri_TrigDesc = CopyTriggerDesc(child_table_relation->trigdesc);
+     if (resultRelInfo->ri_TrigDesc)
+         resultRelInfo->ri_TrigFunctions = (FmgrInfo *)
+             palloc0(resultRelInfo->ri_TrigDesc->numtriggers * sizeof(FmgrInfo));
+     resultRelInfo->ri_TrigInstrument = NULL;
+
+     ExecOpenIndices(resultRelInfo);
+
+     estate->es_result_relations = resultRelInfo;
+     estate->es_num_result_relations = 1;
+     estate->es_result_relation_info = resultRelInfo;
+
+     /* Set up a tuple slot too */
+     slot = MakeSingleTupleTableSlot(child_table_relation->rd_att);
+     ExecStoreTuple(tuple, slot, InvalidBuffer, false);
+
+     if (ExecRelCheck(resultRelInfo, slot, estate) == NULL)
+     {
+         /* Constraints satisfied */
+         if (insert_tuple)
+         {
+             /* Insert the row in the child table */
+             List *recheckIndexes = NIL;
+
+             /* BEFORE ROW INSERT Triggers */
+             if (resultRelInfo->ri_TrigDesc &&
+                 resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_INSERT] > 0)
+             {
+                 HeapTuple    newtuple;
+                 newtuple = ExecBRInsertTriggers(estate, resultRelInfo, tuple);
+
+                 if (newtuple != tuple)
+                 {
+                     /* modified by Trigger(s) */
+                     heap_freetuple(tuple);
+                     tuple = newtuple;
+                 }
+             }
+
+             /* Perform the insert
+              * TODO: Check that we detect constraint violation if before row
+              * insert does something bad
+              */
+             /* OK, store the tuple and create index entries for it */
+             heap_insert(child_table_relation, tuple, GetCurrentCommandId(true),
+                     hi_options, NULL);
+
+             /* Update indices */
+             if (resultRelInfo->ri_NumIndices > 0)
+                 recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
+                                                        estate, false);
+
+             /* AFTER ROW INSERT Triggers */
+
+             if (resultRelInfo->ri_TrigDesc &&
+                     resultRelInfo->ri_TrigDesc->n_after_row[TRIGGER_EVENT_INSERT] > 0)
+                 ExecARInsertTriggersNow(estate, resultRelInfo, tuple);
+         }
+         result = true;
+     }
+
+      /* Free resources */
+     FreeExecutorState(estate);
+     ExecDropSingleTupleTableSlot(slot);
+     ExecCloseIndices(resultRelInfo);
+
+     return result;
+ }
+
+
+ /**
+  * Route a tuple into a child table that matches the constraints of the tuple
+  * to be inserted.
+  * @param parent_relation_id Oid of the parent relation
+  * @param tuple the tuple to be routed
+  */
+ static bool route_tuple_to_child(Relation parent_relation, HeapTuple tuple, int hi_options, ResultRelInfo
*parentResultRelInfo)
+ {
+     Relation    child_table_relation;
+     bool        result = false;
+     Relation    catalog_relation;
+     HeapTuple    inherits_tuple;
+     HeapScanDesc scan;
+     ScanKeyData key[1];
+
+     /* Try to exploit locality for bulk inserts
+      * We expect consecutive insert to go to the same child table */
+     if (partitioningCacheSize > 0 && child_table_lru != NULL)
+     {
+          /* Try the child table LRU */
+         OidCell *child_oid_cell;
+         OidCell *previous_cell = NULL;
+         Oid child_relation_id;
+
+         for (child_oid_cell = child_table_lru->head ; child_oid_cell != NULL ;
+                 child_oid_cell = child_oid_cell->next)
+         {
+             child_relation_id = child_oid_cell->oid_value;
+             child_table_relation = try_relation_open(child_relation_id,
+                 RowExclusiveLock);
+
+             if (child_table_relation == NULL)
+             {
+                 /* Child table does not exist anymore, purge cache entry */
+                 if (previous_cell == NULL)
+                 {
+                     child_table_lru->head = child_oid_cell->next;
+                 }
+                 else
+                 {
+                     previous_cell->next = child_oid_cell->next;
+                 }
+
+                 pfree(child_oid_cell);
+                 child_table_lru->length--;
+                 continue;
+             }
+
+             if (check_tuple_constraints(child_table_relation, tuple, true, hi_options, parentResultRelInfo))
+             {
+                 /* Hit, move in front if not already the head
+                  * Close the relation but keep the lock until the end of
+                  * the transaction */
+                 relation_close(child_table_relation, NoLock);
+
+                 if (previous_cell != NULL)
+                 {
+                     previous_cell->next = child_oid_cell->next;
+                     child_oid_cell->next = child_table_lru->head;
+                     child_table_lru->head = child_oid_cell;
+                 }
+                 return true;
+             }
+             relation_close(child_table_relation, RowExclusiveLock);
+             previous_cell = child_oid_cell;
+         }
+         /* We got a miss */
+     }
+
+     /* Looking up child tables */
+     ScanKeyInit(&key[0],
+             Anum_pg_inherits_inhparent,
+             BTEqualStrategyNumber, F_OIDEQ,
+             ObjectIdGetDatum(parent_relation->rd_id));
+     catalog_relation = heap_open(InheritsRelationId, AccessShareLock);
+     scan = heap_beginscan(catalog_relation, SnapshotNow, 1, key);
+     while ((inherits_tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
+     {
+         TupleConstr *constr;
+         Form_pg_inherits inh = (Form_pg_inherits) GETSTRUCT(inherits_tuple);
+         Oid child_relation_id = inh->inhrelid;
+
+         /* Check if the child table satisfy the constraints, if the relation
+          * cannot be opened this throws an exception */
+         child_table_relation = (Relation) relation_open(child_relation_id,
+             RowExclusiveLock);
+
+         constr = child_table_relation->rd_att->constr;
+         if (constr->num_check == 0)
+         {
+             ereport(ERROR,
+                     (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+                      errmsg("partition routing found no constraint for relation \"%s\"",
+                              RelationGetRelationName(child_table_relation))));
+         }
+
+         if (has_subclass(child_table_relation->rd_id))
+         {
+             /* This is a parent table, check its constraints first */
+             if (check_tuple_constraints(child_table_relation, tuple, false, hi_options, parentResultRelInfo))
+             {
+                 /* Constraint satisfied, explore the child tables */
+                 result = route_tuple_to_child(child_table_relation, tuple, hi_options, parentResultRelInfo);
+                 if (result)
+                 {
+                     /* Success, one of our child tables matched.
+                      * Release the lock on this parent relation, we did not use it */
+                     relation_close(child_table_relation, RowExclusiveLock);
+                     break;
+                 }
+                 else
+                 {
+                     ereport(ERROR,
+                             (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+                              errmsg("tuple matched constraints of relation \"%s\" but none of "
+                              "its children",
+                              RelationGetRelationName(child_table_relation))));
+                 }
+             }
+         }
+         else
+         {
+             /* Child table, try it */
+             result = check_tuple_constraints(child_table_relation, tuple, true, hi_options, parentResultRelInfo);
+         }
+
+         if (result)
+         {
+             /* We found the one, update the LRU and exit the loop!
+              *
+              * Close the relation but keep the lock until the end of
+              * the transaction */
+             relation_close(child_table_relation, NoLock);
+
+             if (partitioningCacheSize > 0)
+             {
+                 OidCell *new_head;
+
+                 if (child_table_lru == NULL)
+                 {
+                     /* Create the list if it does not exist */
+                     child_table_lru = (OidLinkedList *)MemoryContextAlloc(
+                         CacheMemoryContext, sizeof(OidLinkedList));
+                     child_table_lru->length = 0;
+                     child_table_lru->head = NULL;
+                 }
+
+                 /* Add the new entry in head of the list */
+                 new_head = (OidCell *) MemoryContextAlloc(
+                     CacheMemoryContext, sizeof(OidCell));
+                 new_head->oid_value = child_relation_id;
+                 new_head->next = child_table_lru->head;
+                 child_table_lru->head = new_head;
+                 child_table_lru->length++;
+
+                 /* Adjust list size if needed */
+                 if (child_table_lru->length > partitioningCacheSize)
+                 {
+                     OidCell *child_oid_cell;
+                     OidCell *previous_cell = NULL;
+                     int length = 1;
+
+                     for (child_oid_cell = child_table_lru->head ;
+                         child_oid_cell != NULL ; child_oid_cell = child_oid_cell->next)
+                     {
+                         /* Note that partitioningCacheSize is at least 1 so we don't
+                          * have to worry about the head. */
+                         if (length > partitioningCacheSize)
+                         {
+                             /* Remove entry */
+                             previous_cell->next = child_oid_cell->next;
+                             pfree(child_oid_cell);
+                             child_oid_cell = previous_cell;
+                         }
+                         else
+                         {
+                             previous_cell = child_oid_cell;
+                         }
+                         length++;
+                     }
+                     child_table_lru->length = partitioningCacheSize;
+                 }
+             }
+             break;
+         }
+         else
+         {
+             /* Release the lock on that relation, we did not use it */
+             relation_close(child_table_relation, RowExclusiveLock);
+         }
+     }
+     heap_endscan(scan);
+     heap_close(catalog_relation, AccessShareLock);
+     return result;
+ }
+
  /*
   * Copy FROM file to relation.
   */
***************
*** 2149,2178 ****
          {
              List *recheckIndexes = NIL;

!             /* Place tuple in tuple slot */
!             ExecStoreTuple(tuple, slot, InvalidBuffer, false);
!
!             /* Check the constraints of the tuple */
!             if (cstate->rel->rd_att->constr)
!                 ExecConstraints(resultRelInfo, slot, estate);
!
!             /* OK, store the tuple and create index entries for it */
!             heap_insert(cstate->rel, tuple, mycid, hi_options, bistate);

!             if (resultRelInfo->ri_NumIndices > 0)
!                 recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
!                                                        estate, false);

!             /* AFTER ROW INSERT Triggers */
!             ExecARInsertTriggers(estate, resultRelInfo, tuple,
!                                  recheckIndexes);

!             /*
!              * We count only tuples not suppressed by a BEFORE INSERT trigger;
!              * this is the same definition used by execMain.c for counting
!              * tuples inserted by an INSERT command.
!              */
!             cstate->processed++;
          }
      }

--- 2474,2526 ----
          {
              List *recheckIndexes = NIL;

!             /* If routing is enabled and table has child tables, let's try routing */
!             if (cstate->partitioning && has_subclass(cstate->rel->rd_id))
!             {
!                 if (route_tuple_to_child(cstate->rel, tuple, hi_options, resultRelInfo))
!                 {
!                     /* increase the counter so that we return how many
!                      * tuples got copied into all tables in total */
!                     cstate->processed++;
!                 }
!                 else
!                 {
!                     ereport(ERROR, (
!                             errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
!                             errmsg("tuple does not satisfy any child table constraint")
!                     ));
!                 }
!             }
!             else
!             {
!                 /* No partitioning, prepare the tuple and
!                  * check the constraints */

!                 /* Place tuple in tuple slot */
!                 ExecStoreTuple(tuple, slot, InvalidBuffer, false);

!                 /* Check the constraints of the tuple */
!                 if (cstate->rel->rd_att->constr)
!                     ExecConstraints(resultRelInfo, slot, estate);
!
!                 /* OK, store the tuple and create index entries for it */
!                 heap_insert(cstate->rel, tuple, mycid, hi_options, bistate);
!
!                 if (resultRelInfo->ri_NumIndices > 0)
!                     recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
!                             estate, false);
!
!                 /* AFTER ROW INSERT Triggers */
!                 ExecARInsertTriggers(estate, resultRelInfo, tuple,
!                         recheckIndexes);

!                 /*
!                  * We count only tuples not suppressed by a BEFORE INSERT trigger;
!                  * this is the same definition used by execMain.c for counting
!                  * tuples inserted by an INSERT command.
!                  */
!                 cstate->processed++;
!             }
          }
      }

Index: src/include/commands/trigger.h
===================================================================
RCS file: /home/manu/cvsrepo/pgsql/src/include/commands/trigger.h,v
retrieving revision 1.77
diff -c -r1.77 trigger.h
*** src/include/commands/trigger.h    26 Oct 2009 02:26:41 -0000    1.77
--- src/include/commands/trigger.h    15 Nov 2009 23:12:50 -0000
***************
*** 130,135 ****
--- 130,138 ----
  extern HeapTuple ExecBRInsertTriggers(EState *estate,
                       ResultRelInfo *relinfo,
                       HeapTuple trigtuple);
+ extern HeapTuple ExecARInsertTriggersNow(EState *estate,
+                      ResultRelInfo *relinfo,
+                      HeapTuple trigtuple);
  extern void ExecARInsertTriggers(EState *estate,
                       ResultRelInfo *relinfo,
                       HeapTuple trigtuple,
Index: src/include/commands/copy.h
===================================================================
RCS file: /home/manu/cvsrepo/pgsql/src/include/commands/copy.h,v
retrieving revision 1.32
diff -c -r1.32 copy.h
*** src/include/commands/copy.h    1 Jan 2009 17:23:58 -0000    1.32
--- src/include/commands/copy.h    15 Nov 2009 23:12:50 -0000
***************
*** 22,25 ****
--- 22,30 ----

  extern DestReceiver *CreateCopyDestReceiver(void);

+ /**
+  * Size of the LRU list of relations to keep in cache for partitioning in COPY
+  */
+ extern int partitioningCacheSize;
+
  #endif   /* COPY_H */
Index: src/include/executor/executor.h
===================================================================
RCS file: /home/manu/cvsrepo/pgsql/src/include/executor/executor.h,v
retrieving revision 1.163
diff -c -r1.163 executor.h
*** src/include/executor/executor.h    26 Oct 2009 02:26:41 -0000    1.163
--- src/include/executor/executor.h    15 Nov 2009 23:12:50 -0000
***************
*** 166,171 ****
--- 166,173 ----
  extern bool ExecContextForcesOids(PlanState *planstate, bool *hasoids);
  extern void ExecConstraints(ResultRelInfo *resultRelInfo,
                  TupleTableSlot *slot, EState *estate);
+ extern const char *ExecRelCheck(ResultRelInfo *resultRelInfo,
+              TupleTableSlot *slot, EState *estate);
  extern TupleTableSlot *EvalPlanQual(EState *estate, EPQState *epqstate,
               Relation relation, Index rti,
               ItemPointer tid, TransactionId priorXmax);
Index: src/backend/executor/execMain.c
===================================================================
RCS file: /home/manu/cvsrepo/pgsql/src/backend/executor/execMain.c,v
retrieving revision 1.334
diff -c -r1.334 execMain.c
*** src/backend/executor/execMain.c    26 Oct 2009 02:26:29 -0000    1.334
--- src/backend/executor/execMain.c    15 Nov 2009 23:12:50 -0000
***************
*** 1235,1241 ****
  /*
   * ExecRelCheck --- check that tuple meets constraints for result relation
   */
! static const char *
  ExecRelCheck(ResultRelInfo *resultRelInfo,
               TupleTableSlot *slot, EState *estate)
  {
--- 1235,1241 ----
  /*
   * ExecRelCheck --- check that tuple meets constraints for result relation
   */
! const char *
  ExecRelCheck(ResultRelInfo *resultRelInfo,
               TupleTableSlot *slot, EState *estate)
  {
Index: src/backend/utils/misc/guc.c
===================================================================
RCS file: /home/manu/cvsrepo/pgsql/src/backend/utils/misc/guc.c,v
retrieving revision 1.523
diff -c -r1.523 guc.c
*** src/backend/utils/misc/guc.c    21 Oct 2009 20:38:58 -0000    1.523
--- src/backend/utils/misc/guc.c    15 Nov 2009 23:12:50 -0000
***************
*** 32,37 ****
--- 32,38 ----
  #include "access/xact.h"
  #include "catalog/namespace.h"
  #include "commands/async.h"
+ #include "commands/copy.h"
  #include "commands/prepare.h"
  #include "commands/vacuum.h"
  #include "commands/variable.h"
***************
*** 534,539 ****
--- 535,542 ----
      gettext_noop("Customized Options"),
      /* DEVELOPER_OPTIONS */
      gettext_noop("Developer Options"),
+     /* COPY_OPTIONS */
+     gettext_noop("Copy Options"),
      /* help_config wants this array to be null-terminated */
      NULL
  };
***************
*** 1955,1960 ****
--- 1958,2019 ----
          1024, 100, 102400, NULL, NULL
      },

+     {
+         {
+             /* variable name */
+             "copy_partitioning_cache_size",
+
+             /* context, we want the user to set it */
+             PGC_USERSET,
+
+             /* category for this configuration variable */
+             COPY_OPTIONS,
+
+             /* short description */
+             gettext_noop("Size of the LRU list of child tables to keep in cache "
+                     " when partitioning tuples in COPY."),
+
+             /* long description */
+             gettext_noop("When tuples are automatically routed in COPY, all "
+                 "tables are scanned until the constraints are matched. When "
+                 "a large number of child tables are present the scanning "
+                 "overhead can be large. To reduce that overhead, the routing "
+                 "mechanism keeps a cache of the last child tables in which "
+                 "tuples where inserted and try these tables first before "
+                 "performing a full scan. This variable defines the cache size "
+                 "with 0 meaning no caching, 1 keep the last matching child table"
+                 ", x keep the last x child tables in which tuples were inserted."
+                 " Note that the list is managed with an LRU policy."),
+
+
+             /* flags: this option is not in the postgresql.conf.sample
+              * file and should not be allowed in the config.
+              * NOTE: this is not currently enforced.
+              */
+             GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+         },
+
+         /* pointer to the variable, this one is present in
+          * src/backend/commands/copy.c
+          */
+         &partitioningCacheSize,
+
+         /* default value */
+         2,
+
+         /* min value */
+         0,
+
+         /* max value */
+         INT_MAX,
+
+         /* assign hook function */
+         NULL,
+
+         /* show hook function */
+         NULL
+     },
+
      /* End-of-list marker */
      {
          {NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL
Index: src/test/regress/parallel_schedule
===================================================================
RCS file: /home/manu/cvsrepo/pgsql/src/test/regress/parallel_schedule,v
retrieving revision 1.57
diff -c -r1.57 parallel_schedule
*** src/test/regress/parallel_schedule    24 Aug 2009 03:10:16 -0000    1.57
--- src/test/regress/parallel_schedule    15 Nov 2009 23:12:50 -0000
***************
*** 47,53 ****
  # execute two copy tests parallel, to check that copy itself
  # is concurrent safe.
  # ----------
! test: copy copyselect

  # ----------
  # Another group of parallel tests
--- 47,53 ----
  # execute two copy tests parallel, to check that copy itself
  # is concurrent safe.
  # ----------
! test: copy copyselect copy_partitioning

  # ----------
  # Another group of parallel tests
Index: doc/src/sgml/ref/copy.sgml
===================================================================
RCS file: /home/manu/cvsrepo/pgsql/doc/src/sgml/ref/copy.sgml,v
retrieving revision 1.92
diff -c -r1.92 copy.sgml
*** doc/src/sgml/ref/copy.sgml    21 Sep 2009 20:10:21 -0000    1.92
--- doc/src/sgml/ref/copy.sgml    15 Nov 2009 23:12:50 -0000
***************
*** 41,46 ****
--- 41,47 ----
      ESCAPE '<replaceable class="parameter">escape_character</replaceable>'
      FORCE_QUOTE { ( <replaceable class="parameter">column</replaceable> [, ...] ) | * }
      FORCE_NOT_NULL ( <replaceable class="parameter">column</replaceable> [, ...] )
+     PARTITIONING [ <replaceable class="parameter">boolean</replaceable> ]
  </synopsis>
   </refsynopsisdiv>

***************
*** 282,287 ****
--- 283,298 ----
      </listitem>
     </varlistentry>

+    <varlistentry>
+     <term><literal>PARTITIONING</></term>
+     <listitem>
+      <para>
+       In <literal>PARTITIONING</> mode, <command>COPY TO</> a parent
+       table will automatically move each row to the child table that
+       has the matching constraints.
+      </para>
+     </listitem>
+    </varlistentry>
    </variablelist>
   </refsect1>

***************
*** 384,389 ****
--- 395,418 ----
      <command>VACUUM</command> to recover the wasted space.
     </para>

+    <para>
+     <literal>PARTITIONING</> mode scans for each child table constraint in the
+     hierarchy to find a match. As an optimization, a cache of the last child
+     tables where tuples have been routed is kept and tried first. The size
+     of the cache is set by the <literal>copy_partitioning_cache_size</literal>
+     session variable. It the size is set to 0, the cache is disabled otherwise
+     the indicated number of child tables is kept in the cache (at most).
+    </para>
+
+    <para>
+     <literal>PARTITIONING</> mode assumes that every child table has at least
+     one constraint defined otherwise an error is thrown. If child tables have
+     overlapping constraints, the row is inserted in the first child table found
+     (be it a cached table or the first table to appear in the lookup).
+     ROW and STATEMENT triggers that modify the tuple value after routing has
+     been performed will lead to unpredictable errors.
+    </para>
+
   </refsect1>

   <refsect1>
***************
*** 828,833 ****
--- 857,1000 ----
  0000200   M   B   A   B   W   E 377 377 377 377 377 377
  </programlisting>
    </para>
+
+   <para>
+   Multiple options are separated by a comma like:
+ <programlisting>
+ COPY (SELECT t FROM foo WHERE id = 1) TO STDOUT (FORMAT CSV, HEADER, FORCE_QUOTE (t));
+ </programlisting>
+   </para>
+
+   <refsect2>
+   <title>Partitioning examples</title>
+   <para>
+   Here is an example on how to use partitioning. Let's first create a parent
+   table and 3 child tables as follows:
+ <programlisting>
+     CREATE TABLE y2008 (
+       id     int not null,
+       date   date not null,
+       value  int
+     );
+
+     CREATE TABLE jan2008 (
+         CHECK ( date >= DATE '2008-01-01' AND date < DATE '2008-02-01' )
+     ) INHERITS (y2008);
+
+     CREATE TABLE feb2008 (
+         CHECK ( date >= DATE '2008-02-01' AND date < DATE '2008-03-01' )
+     ) INHERITS (y2008);
+
+     CREATE TABLE mar2008 (
+         CHECK ( date >= DATE '2008-03-01' AND date < DATE '2008-04-01' )
+     ) INHERITS (y2008);
+ </programlisting>
+ We prepare the following data file (1 row for each child table):
+ copy_input.data content:
+ <programlisting>
+ 11  '2008-01-10'    11
+ 12  '2008-02-15'    12
+ 13  '2008-03-15'    13
+ 21  '2008-01-10'    11
+ 31  '2008-01-10'    11
+ 41  '2008-01-10'    11
+ 22  '2008-02-15'    12
+ 23  '2008-03-15'    13
+ 32  '2008-02-15'    12
+ 33  '2008-03-15'    13
+ 42  '2008-02-15'    12
+ 43  '2008-03-15'    13
+ </programlisting>
+     If we COPY the data in the parent table without partitioning enabled, all
+     rows are inserted in the master table as in this example:
+ <programlisting>
+ COPY y2008 FROM 'copy_input.data';
+
+ SELECT COUNT(*) FROM y2008;
+  count
+ -------
+     12
+ (1 row)
+
+ SELECT COUNT(*) FROM jan2008;
+  count
+ -------
+      0
+ (1 row)
+
+ SELECT COUNT(*) FROM feb2008;
+  count
+ -------
+      0
+ (1 row)
+
+ SELECT COUNT(*) FROM mar2008;
+  count
+ -------
+      0
+ (1 row)
+
+ DELETE FROM y2008;
+ </programlisting>
+     If we execute COPY with partitioning enabled, rows are loaded in the
+     appropriate child table automatically as in this example:
+ <programlisting>
+     COPY y2008 FROM 'copy_input.data' (PARTITIONING);
+
+ SELECT * FROM y2008;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-10-2008 |    11
+  21 | 01-10-2008 |    11
+  31 | 01-10-2008 |    11
+  41 | 01-10-2008 |    11
+  12 | 02-15-2008 |    12
+  22 | 02-15-2008 |    12
+  32 | 02-15-2008 |    12
+  42 | 02-15-2008 |    12
+  13 | 03-15-2008 |    13
+  23 | 03-15-2008 |    13
+  33 | 03-15-2008 |    13
+  43 | 03-15-2008 |    13
+ (12 rows)
+
+ SELECT * FROM jan2008;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-10-2008 |    11
+  21 | 01-10-2008 |    11
+  31 | 01-10-2008 |    11
+  41 | 01-10-2008 |    11
+ (4 rows)
+
+ SELECT * FROM feb2008;
+  id |    date    | value
+ ----+------------+-------
+  12 | 02-15-2008 |    12
+  22 | 02-15-2008 |    12
+  32 | 02-15-2008 |    12
+  42 | 02-15-2008 |    12
+ (4 rows)
+
+ SELECT * FROM mar2008;
+  id |    date    | value
+ ----+------------+-------
+  13 | 03-15-2008 |    13
+  23 | 03-15-2008 |    13
+  33 | 03-15-2008 |    13
+  43 | 03-15-2008 |    13
+ (4 rows)
+ </programlisting>
+     The cache size can be tuned using:
+ <programlisting>
+ set copy_partitioning_cache_size = 3;
+ </programlisting>
+     Repeating the COPY command will now be faster:
+ <programlisting>
+     COPY y2008 FROM 'copy_input.data' (PARTITIONING);
+ </programlisting>
+     </para>
+   </refsect2>
   </refsect1>

   <refsect1>
Index: src/include/utils/guc_tables.h
===================================================================
RCS file: /home/manu/cvsrepo/pgsql/src/include/utils/guc_tables.h,v
retrieving revision 1.46
diff -c -r1.46 guc_tables.h
*** src/include/utils/guc_tables.h    11 Jun 2009 14:49:13 -0000    1.46
--- src/include/utils/guc_tables.h    15 Nov 2009 23:12:50 -0000
***************
*** 76,82 ****
      COMPAT_OPTIONS_CLIENT,
      PRESET_OPTIONS,
      CUSTOM_OPTIONS,
!     DEVELOPER_OPTIONS
  };

  /*
--- 76,83 ----
      COMPAT_OPTIONS_CLIENT,
      PRESET_OPTIONS,
      CUSTOM_OPTIONS,
!     DEVELOPER_OPTIONS,
!     COPY_OPTIONS
  };

  /*
Index: src/test/regress/input/copy_partitioning.source
===================================================================
RCS file: src/test/regress/input/copy_partitioning.source
diff -N src/test/regress/input/copy_partitioning.source
*** /dev/null    1 Jan 1970 00:00:00 -0000
--- src/test/regress/input/copy_partitioning.source    1 Jan 1970 00:00:00 -0000
***************
*** 0 ****
--- 1,180 ----
+ -- test 1
+ create table parent(i int);
+ create table c1 (check (i > 0 and i <= 1)) inherits (parent);
+ copy parent from stdin with (partitioning);
+ 1
+ \.
+
+ drop table parent cascade;
+
+ create table parent(i int);
+ create table c1 (check (i > 0 and i <= 1)) inherits (parent);
+ copy parent from stdin with (partitioning);
+ 1
+ \.
+
+ drop table parent cascade;
+
+ -- test 2
+ set copy_partitioning_cache_size = 0;
+ create table parent(i int, j int);
+ create table c1 (check (i > 0 and i <= 1)) inherits (parent);
+ create table c2 (check (i > 1 and i <= 2)) inherits (parent);
+ create table c3 (check (i > 2 and i <= 3)) inherits (parent);
+
+ create index c1_idx on c1(j);
+ copy (select i % 3 + 1, i from generate_series(1, 1000) s(i)) to '/tmp/parent';
+ copy parent from '/tmp/parent' with (partitioning);
+ analyse;
+
+ set enable_seqscan to false;
+ -- no rows if index was not updated
+ select * from c1 where j = 3;
+
+ set enable_seqscan to true;
+ set enable_indexscan to false;
+ -- 1 row
+ select * from c1 where j = 3;
+ drop table parent cascade;
+
+ -- test 3
+ set copy_partitioning_cache_size = 0;
+
+ create table parent(i int);
+ create table c1 (check (i > 0 and i <= 1)) inherits (parent);
+ create table c2 (check (i > 1 and i <= 2)) inherits (parent);
+ create table c3 (check (i > 2 and i <= 3)) inherits (parent);
+
+ create table audit(i int);
+
+ create function audit() returns trigger as $$ begin insert into audit(i) values (new.i); return new; end; $$ language
plpgsql;
+
+ create trigger parent_a after insert on parent for each row execute procedure audit();
+ -- the before trigger on the parent would get fired
+ -- create trigger parent_a2 before insert on parent for each row execute procedure audit();
+ create trigger c1_a before insert on c1 for each row execute procedure audit();
+ create trigger c1_a2 after insert on c1 for each row execute procedure audit();
+
+ copy parent from stdin with (partitioning);
+ 1
+ 2
+ 3
+ \.
+
+ -- no rows if trigger does not work
+ select * from audit;
+
+ drop table parent cascade;
+ drop table audit cascade;
+ drop function audit();
+
+ -- test cache size
+ CREATE TABLE y2008 (
+   id     int not null,
+   date   date not null,
+   value  int,
+   primary key(id)
+ );
+
+ CREATE TABLE jan2008 (
+     CHECK ( date >= DATE '2008-01-01' AND date < DATE '2008-02-01' )
+ ) INHERITS (y2008);
+
+ CREATE TABLE jan2008half1 (
+     CHECK ( date >= DATE '2008-01-01' AND date < DATE '2008-01-15' )
+ ) INHERITS (jan2008);
+
+ CREATE TABLE jan2008half2 (
+     CHECK ( date >= DATE '2008-01-16' AND date < DATE '2008-01-31' )
+ ) INHERITS (jan2008);
+
+ CREATE TABLE feb2008 (
+     CHECK ( date >= DATE '2008-02-01' AND date < DATE '2008-03-01' )
+ ) INHERITS (y2008);
+
+ CREATE TABLE mar2008 (
+     CHECK ( date >= DATE '2008-03-01' AND date < DATE '2008-04-01' )
+ ) INHERITS (y2008);
+
+ COPY y2008 FROM '@abs_srcdir@/data/copy_input.data';
+
+ SELECT COUNT(*) FROM y2008;
+ SELECT COUNT(*) FROM jan2008;
+ SELECT COUNT(*) FROM jan2008half1;
+ SELECT COUNT(*) FROM jan2008half2;
+ SELECT COUNT(*) FROM feb2008;
+ SELECT COUNT(*) FROM mar2008;
+
+ DELETE FROM y2008;
+
+ set copy_partitioning_cache_size = 0;
+ COPY y2008 FROM '@abs_srcdir@/data/copy_input.data' (PARTITIONING);
+ SELECT * FROM y2008 ORDER BY id;
+ SELECT * FROM jan2008 ORDER BY id;
+ SELECT * FROM jan2008half1 ORDER BY id;
+ SELECT * FROM jan2008half2 ORDER BY id;
+ SELECT * FROM feb2008 ORDER BY id;
+ SELECT * FROM mar2008 ORDER BY id;
+ DELETE FROM y2008;
+
+ set copy_partitioning_cache_size = 1;
+ COPY y2008 FROM '@abs_srcdir@/data/copy_input.data' (PARTITIONING);
+ SELECT * FROM y2008 ORDER BY id;
+ SELECT * FROM jan2008 ORDER BY id;
+ SELECT * FROM jan2008half1 ORDER BY id;
+ SELECT * FROM jan2008half2 ORDER BY id;
+ SELECT * FROM feb2008 ORDER BY id;
+ SELECT * FROM mar2008 ORDER BY id;
+ DELETE FROM y2008;
+
+ set copy_partitioning_cache_size = 2;
+ COPY y2008 FROM '@abs_srcdir@/data/copy_input.data' (PARTITIONING);
+ SELECT * FROM y2008 ORDER BY id;
+ SELECT * FROM jan2008 ORDER BY id;
+ SELECT * FROM jan2008half1 ORDER BY id;
+ SELECT * FROM jan2008half2 ORDER BY id;
+ SELECT * FROM feb2008 ORDER BY id;
+ SELECT * FROM mar2008 ORDER BY id;
+ DELETE FROM y2008;
+
+ set copy_partitioning_cache_size = 3;
+ COPY y2008 FROM '@abs_srcdir@/data/copy_input.data' (PARTITIONING);
+ SELECT * FROM y2008 ORDER BY id;
+ SELECT * FROM jan2008 ORDER BY id;
+ SELECT * FROM jan2008half1 ORDER BY id;
+ SELECT * FROM jan2008half2 ORDER BY id;
+ SELECT * FROM feb2008 ORDER BY id;
+ SELECT * FROM mar2008 ORDER BY id;
+ DELETE FROM y2008;
+
+ set copy_partitioning_cache_size = 2;
+ COPY y2008 FROM '@abs_srcdir@/data/copy_input.data' (PARTITIONING);
+ SELECT * FROM y2008 ORDER BY id;
+ SELECT * FROM jan2008 ORDER BY id;
+ SELECT * FROM jan2008half1 ORDER BY id;
+ SELECT * FROM jan2008half2 ORDER BY id;
+ SELECT * FROM feb2008 ORDER BY id;
+ SELECT * FROM mar2008 ORDER BY id;
+ DELETE FROM y2008;
+
+ set copy_partitioning_cache_size = 1;
+ COPY y2008 FROM '@abs_srcdir@/data/copy_input.data' (PARTITIONING);
+ SELECT * FROM y2008 ORDER BY id;
+ SELECT * FROM jan2008 ORDER BY id;
+ SELECT * FROM jan2008half1 ORDER BY id;
+ SELECT * FROM jan2008half2 ORDER BY id;
+ SELECT * FROM feb2008 ORDER BY id;
+ SELECT * FROM mar2008 ORDER BY id;
+ DELETE FROM y2008;
+
+ set copy_partitioning_cache_size = 0;
+ COPY y2008 FROM '@abs_srcdir@/data/copy_input.data' (PARTITIONING);
+ SELECT * FROM y2008 ORDER BY id;
+ SELECT * FROM jan2008 ORDER BY id;
+ SELECT * FROM jan2008half1 ORDER BY id;
+ SELECT * FROM jan2008half2 ORDER BY id;
+ SELECT * FROM feb2008 ORDER BY id;
+ SELECT * FROM mar2008 ORDER BY id;
+ DELETE FROM y2008;
+
+ DROP TABLE y2008 CASCADE;
Index: src/test/regress/output/copy_partitioning.source
===================================================================
RCS file: src/test/regress/output/copy_partitioning.source
diff -N src/test/regress/output/copy_partitioning.source
*** /dev/null    1 Jan 1970 00:00:00 -0000
--- src/test/regress/output/copy_partitioning.source    1 Jan 1970 00:00:00 -0000
***************
*** 0 ****
--- 1,567 ----
+ -- test 1
+ create table parent(i int);
+ create table c1 (check (i > 0 and i <= 1)) inherits (parent);
+ copy parent from stdin with (partitioning);
+ drop table parent cascade;
+ NOTICE:  drop cascades to table c1
+ create table parent(i int);
+ create table c1 (check (i > 0 and i <= 1)) inherits (parent);
+ copy parent from stdin with (partitioning);
+ drop table parent cascade;
+ NOTICE:  drop cascades to table c1
+ -- test 2
+ set copy_partitioning_cache_size = 0;
+ create table parent(i int, j int);
+ create table c1 (check (i > 0 and i <= 1)) inherits (parent);
+ create table c2 (check (i > 1 and i <= 2)) inherits (parent);
+ create table c3 (check (i > 2 and i <= 3)) inherits (parent);
+ create index c1_idx on c1(j);
+ copy (select i % 3 + 1, i from generate_series(1, 1000) s(i)) to '/tmp/parent';
+ copy parent from '/tmp/parent' with (partitioning);
+ analyse;
+ set enable_seqscan to false;
+ -- no rows if index was not updated
+ select * from c1 where j = 3;
+  i | j
+ ---+---
+  1 | 3
+ (1 row)
+
+ set enable_seqscan to true;
+ set enable_indexscan to false;
+ -- 1 row
+ select * from c1 where j = 3;
+  i | j
+ ---+---
+  1 | 3
+ (1 row)
+
+ drop table parent cascade;
+ NOTICE:  drop cascades to 3 other objects
+ DETAIL:  drop cascades to table c1
+ drop cascades to table c2
+ drop cascades to table c3
+ -- test 3
+ set copy_partitioning_cache_size = 0;
+ create table parent(i int);
+ create table c1 (check (i > 0 and i <= 1)) inherits (parent);
+ create table c2 (check (i > 1 and i <= 2)) inherits (parent);
+ create table c3 (check (i > 2 and i <= 3)) inherits (parent);
+ create table audit(i int);
+ create function audit() returns trigger as $$ begin insert into audit(i) values (new.i); return new; end; $$ language
plpgsql;
+ create trigger parent_a after insert on parent for each row execute procedure audit();
+ -- the before trigger on the parent would get fired
+ -- create trigger parent_a2 before insert on parent for each row execute procedure audit();
+ create trigger c1_a before insert on c1 for each row execute procedure audit();
+ create trigger c1_a2 after insert on c1 for each row execute procedure audit();
+ copy parent from stdin with (partitioning);
+ -- no rows if trigger does not work
+ select * from audit;
+  i
+ ---
+  1
+  1
+ (2 rows)
+
+ drop table parent cascade;
+ NOTICE:  drop cascades to 3 other objects
+ DETAIL:  drop cascades to table c1
+ drop cascades to table c2
+ drop cascades to table c3
+ drop table audit cascade;
+ drop function audit();
+ -- test cache size
+ CREATE TABLE y2008 (
+   id     int not null,
+   date   date not null,
+   value  int,
+   primary key(id)
+ );
+ NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "y2008_pkey" for table "y2008"
+ CREATE TABLE jan2008 (
+     CHECK ( date >= DATE '2008-01-01' AND date < DATE '2008-02-01' )
+ ) INHERITS (y2008);
+ CREATE TABLE jan2008half1 (
+     CHECK ( date >= DATE '2008-01-01' AND date < DATE '2008-01-15' )
+ ) INHERITS (jan2008);
+ CREATE TABLE jan2008half2 (
+     CHECK ( date >= DATE '2008-01-16' AND date < DATE '2008-01-31' )
+ ) INHERITS (jan2008);
+ CREATE TABLE feb2008 (
+     CHECK ( date >= DATE '2008-02-01' AND date < DATE '2008-03-01' )
+ ) INHERITS (y2008);
+ CREATE TABLE mar2008 (
+     CHECK ( date >= DATE '2008-03-01' AND date < DATE '2008-04-01' )
+ ) INHERITS (y2008);
+ COPY y2008 FROM '@abs_srcdir@/data/copy_input.data';
+ SELECT COUNT(*) FROM y2008;
+  count
+ -------
+     12
+ (1 row)
+
+ SELECT COUNT(*) FROM jan2008;
+  count
+ -------
+      0
+ (1 row)
+
+ SELECT COUNT(*) FROM jan2008half1;
+  count
+ -------
+      0
+ (1 row)
+
+ SELECT COUNT(*) FROM jan2008half2;
+  count
+ -------
+      0
+ (1 row)
+
+ SELECT COUNT(*) FROM feb2008;
+  count
+ -------
+      0
+ (1 row)
+
+ SELECT COUNT(*) FROM mar2008;
+  count
+ -------
+      0
+ (1 row)
+
+ DELETE FROM y2008;
+ set copy_partitioning_cache_size = 0;
+ COPY y2008 FROM '@abs_srcdir@/data/copy_input.data' (PARTITIONING);
+ SELECT * FROM y2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-19-2008 |    11
+  12 | 02-15-2008 |    12
+  13 | 03-15-2008 |    13
+  21 | 01-10-2008 |    11
+  22 | 02-15-2008 |    12
+  23 | 03-15-2008 |    13
+  31 | 01-10-2008 |    11
+  32 | 02-15-2008 |    12
+  33 | 03-15-2008 |    13
+  41 | 01-10-2008 |    11
+  42 | 02-15-2008 |    12
+  43 | 03-15-2008 |    13
+ (12 rows)
+
+ SELECT * FROM jan2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-19-2008 |    11
+  21 | 01-10-2008 |    11
+  31 | 01-10-2008 |    11
+  41 | 01-10-2008 |    11
+ (4 rows)
+
+ SELECT * FROM jan2008half1 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  21 | 01-10-2008 |    11
+  31 | 01-10-2008 |    11
+  41 | 01-10-2008 |    11
+ (3 rows)
+
+ SELECT * FROM jan2008half2 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-19-2008 |    11
+ (1 row)
+
+ SELECT * FROM feb2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  12 | 02-15-2008 |    12
+  22 | 02-15-2008 |    12
+  32 | 02-15-2008 |    12
+  42 | 02-15-2008 |    12
+ (4 rows)
+
+ SELECT * FROM mar2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  13 | 03-15-2008 |    13
+  23 | 03-15-2008 |    13
+  33 | 03-15-2008 |    13
+  43 | 03-15-2008 |    13
+ (4 rows)
+
+ DELETE FROM y2008;
+ set copy_partitioning_cache_size = 1;
+ COPY y2008 FROM '@abs_srcdir@/data/copy_input.data' (PARTITIONING);
+ SELECT * FROM y2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-19-2008 |    11
+  12 | 02-15-2008 |    12
+  13 | 03-15-2008 |    13
+  21 | 01-10-2008 |    11
+  22 | 02-15-2008 |    12
+  23 | 03-15-2008 |    13
+  31 | 01-10-2008 |    11
+  32 | 02-15-2008 |    12
+  33 | 03-15-2008 |    13
+  41 | 01-10-2008 |    11
+  42 | 02-15-2008 |    12
+  43 | 03-15-2008 |    13
+ (12 rows)
+
+ SELECT * FROM jan2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-19-2008 |    11
+  21 | 01-10-2008 |    11
+  31 | 01-10-2008 |    11
+  41 | 01-10-2008 |    11
+ (4 rows)
+
+ SELECT * FROM jan2008half1 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  21 | 01-10-2008 |    11
+  31 | 01-10-2008 |    11
+  41 | 01-10-2008 |    11
+ (3 rows)
+
+ SELECT * FROM jan2008half2 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-19-2008 |    11
+ (1 row)
+
+ SELECT * FROM feb2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  12 | 02-15-2008 |    12
+  22 | 02-15-2008 |    12
+  32 | 02-15-2008 |    12
+  42 | 02-15-2008 |    12
+ (4 rows)
+
+ SELECT * FROM mar2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  13 | 03-15-2008 |    13
+  23 | 03-15-2008 |    13
+  33 | 03-15-2008 |    13
+  43 | 03-15-2008 |    13
+ (4 rows)
+
+ DELETE FROM y2008;
+ set copy_partitioning_cache_size = 2;
+ COPY y2008 FROM '@abs_srcdir@/data/copy_input.data' (PARTITIONING);
+ SELECT * FROM y2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-19-2008 |    11
+  12 | 02-15-2008 |    12
+  13 | 03-15-2008 |    13
+  21 | 01-10-2008 |    11
+  22 | 02-15-2008 |    12
+  23 | 03-15-2008 |    13
+  31 | 01-10-2008 |    11
+  32 | 02-15-2008 |    12
+  33 | 03-15-2008 |    13
+  41 | 01-10-2008 |    11
+  42 | 02-15-2008 |    12
+  43 | 03-15-2008 |    13
+ (12 rows)
+
+ SELECT * FROM jan2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-19-2008 |    11
+  21 | 01-10-2008 |    11
+  31 | 01-10-2008 |    11
+  41 | 01-10-2008 |    11
+ (4 rows)
+
+ SELECT * FROM jan2008half1 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  21 | 01-10-2008 |    11
+  31 | 01-10-2008 |    11
+  41 | 01-10-2008 |    11
+ (3 rows)
+
+ SELECT * FROM jan2008half2 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-19-2008 |    11
+ (1 row)
+
+ SELECT * FROM feb2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  12 | 02-15-2008 |    12
+  22 | 02-15-2008 |    12
+  32 | 02-15-2008 |    12
+  42 | 02-15-2008 |    12
+ (4 rows)
+
+ SELECT * FROM mar2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  13 | 03-15-2008 |    13
+  23 | 03-15-2008 |    13
+  33 | 03-15-2008 |    13
+  43 | 03-15-2008 |    13
+ (4 rows)
+
+ DELETE FROM y2008;
+ set copy_partitioning_cache_size = 3;
+ COPY y2008 FROM '@abs_srcdir@/data/copy_input.data' (PARTITIONING);
+ SELECT * FROM y2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-19-2008 |    11
+  12 | 02-15-2008 |    12
+  13 | 03-15-2008 |    13
+  21 | 01-10-2008 |    11
+  22 | 02-15-2008 |    12
+  23 | 03-15-2008 |    13
+  31 | 01-10-2008 |    11
+  32 | 02-15-2008 |    12
+  33 | 03-15-2008 |    13
+  41 | 01-10-2008 |    11
+  42 | 02-15-2008 |    12
+  43 | 03-15-2008 |    13
+ (12 rows)
+
+ SELECT * FROM jan2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-19-2008 |    11
+  21 | 01-10-2008 |    11
+  31 | 01-10-2008 |    11
+  41 | 01-10-2008 |    11
+ (4 rows)
+
+ SELECT * FROM jan2008half1 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  21 | 01-10-2008 |    11
+  31 | 01-10-2008 |    11
+  41 | 01-10-2008 |    11
+ (3 rows)
+
+ SELECT * FROM jan2008half2 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-19-2008 |    11
+ (1 row)
+
+ SELECT * FROM feb2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  12 | 02-15-2008 |    12
+  22 | 02-15-2008 |    12
+  32 | 02-15-2008 |    12
+  42 | 02-15-2008 |    12
+ (4 rows)
+
+ SELECT * FROM mar2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  13 | 03-15-2008 |    13
+  23 | 03-15-2008 |    13
+  33 | 03-15-2008 |    13
+  43 | 03-15-2008 |    13
+ (4 rows)
+
+ DELETE FROM y2008;
+ set copy_partitioning_cache_size = 2;
+ COPY y2008 FROM '@abs_srcdir@/data/copy_input.data' (PARTITIONING);
+ SELECT * FROM y2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-19-2008 |    11
+  12 | 02-15-2008 |    12
+  13 | 03-15-2008 |    13
+  21 | 01-10-2008 |    11
+  22 | 02-15-2008 |    12
+  23 | 03-15-2008 |    13
+  31 | 01-10-2008 |    11
+  32 | 02-15-2008 |    12
+  33 | 03-15-2008 |    13
+  41 | 01-10-2008 |    11
+  42 | 02-15-2008 |    12
+  43 | 03-15-2008 |    13
+ (12 rows)
+
+ SELECT * FROM jan2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-19-2008 |    11
+  21 | 01-10-2008 |    11
+  31 | 01-10-2008 |    11
+  41 | 01-10-2008 |    11
+ (4 rows)
+
+ SELECT * FROM jan2008half1 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  21 | 01-10-2008 |    11
+  31 | 01-10-2008 |    11
+  41 | 01-10-2008 |    11
+ (3 rows)
+
+ SELECT * FROM jan2008half2 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-19-2008 |    11
+ (1 row)
+
+ SELECT * FROM feb2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  12 | 02-15-2008 |    12
+  22 | 02-15-2008 |    12
+  32 | 02-15-2008 |    12
+  42 | 02-15-2008 |    12
+ (4 rows)
+
+ SELECT * FROM mar2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  13 | 03-15-2008 |    13
+  23 | 03-15-2008 |    13
+  33 | 03-15-2008 |    13
+  43 | 03-15-2008 |    13
+ (4 rows)
+
+ DELETE FROM y2008;
+ set copy_partitioning_cache_size = 1;
+ COPY y2008 FROM '@abs_srcdir@/data/copy_input.data' (PARTITIONING);
+ SELECT * FROM y2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-19-2008 |    11
+  12 | 02-15-2008 |    12
+  13 | 03-15-2008 |    13
+  21 | 01-10-2008 |    11
+  22 | 02-15-2008 |    12
+  23 | 03-15-2008 |    13
+  31 | 01-10-2008 |    11
+  32 | 02-15-2008 |    12
+  33 | 03-15-2008 |    13
+  41 | 01-10-2008 |    11
+  42 | 02-15-2008 |    12
+  43 | 03-15-2008 |    13
+ (12 rows)
+
+ SELECT * FROM jan2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-19-2008 |    11
+  21 | 01-10-2008 |    11
+  31 | 01-10-2008 |    11
+  41 | 01-10-2008 |    11
+ (4 rows)
+
+ SELECT * FROM jan2008half1 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  21 | 01-10-2008 |    11
+  31 | 01-10-2008 |    11
+  41 | 01-10-2008 |    11
+ (3 rows)
+
+ SELECT * FROM jan2008half2 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-19-2008 |    11
+ (1 row)
+
+ SELECT * FROM feb2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  12 | 02-15-2008 |    12
+  22 | 02-15-2008 |    12
+  32 | 02-15-2008 |    12
+  42 | 02-15-2008 |    12
+ (4 rows)
+
+ SELECT * FROM mar2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  13 | 03-15-2008 |    13
+  23 | 03-15-2008 |    13
+  33 | 03-15-2008 |    13
+  43 | 03-15-2008 |    13
+ (4 rows)
+
+ DELETE FROM y2008;
+ set copy_partitioning_cache_size = 0;
+ COPY y2008 FROM '@abs_srcdir@/data/copy_input.data' (PARTITIONING);
+ SELECT * FROM y2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-19-2008 |    11
+  12 | 02-15-2008 |    12
+  13 | 03-15-2008 |    13
+  21 | 01-10-2008 |    11
+  22 | 02-15-2008 |    12
+  23 | 03-15-2008 |    13
+  31 | 01-10-2008 |    11
+  32 | 02-15-2008 |    12
+  33 | 03-15-2008 |    13
+  41 | 01-10-2008 |    11
+  42 | 02-15-2008 |    12
+  43 | 03-15-2008 |    13
+ (12 rows)
+
+ SELECT * FROM jan2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-19-2008 |    11
+  21 | 01-10-2008 |    11
+  31 | 01-10-2008 |    11
+  41 | 01-10-2008 |    11
+ (4 rows)
+
+ SELECT * FROM jan2008half1 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  21 | 01-10-2008 |    11
+  31 | 01-10-2008 |    11
+  41 | 01-10-2008 |    11
+ (3 rows)
+
+ SELECT * FROM jan2008half2 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  11 | 01-19-2008 |    11
+ (1 row)
+
+ SELECT * FROM feb2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  12 | 02-15-2008 |    12
+  22 | 02-15-2008 |    12
+  32 | 02-15-2008 |    12
+  42 | 02-15-2008 |    12
+ (4 rows)
+
+ SELECT * FROM mar2008 ORDER BY id;
+  id |    date    | value
+ ----+------------+-------
+  13 | 03-15-2008 |    13
+  23 | 03-15-2008 |    13
+  33 | 03-15-2008 |    13
+  43 | 03-15-2008 |    13
+ (4 rows)
+
+ DELETE FROM y2008;
+ DROP TABLE y2008 CASCADE;
+ NOTICE:  drop cascades to 5 other objects
+ DETAIL:  drop cascades to table jan2008
+ drop cascades to table jan2008half1
+ drop cascades to table jan2008half2
+ drop cascades to table feb2008
+ drop cascades to table mar2008

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Summary and Plan for Hot Standby
Next
From: Roger Leigh
Date:
Subject: Re: Unicode UTF-8 table formatting for psql text output