Re: best practice for moving millions of rows to child table when setting up partitioning? - Mailing list pgsql-admin

From Mark Stosberg
Subject Re: best practice for moving millions of rows to child table when setting up partitioning?
Date
Msg-id ip9tkc$jek$1@dough.gmane.org
Whole thread Raw
In response to best practice for moving millions of rows to child table when setting up partitioning?  (Mark Stosberg <mark@summersault.com>)
Responses Re: Re: best practice for moving millions of rows to child table when setting up partitioning?
Re: Re: best practice for moving millions of rows to child table when setting up partitioning?
List pgsql-admin
On 04/27/2011 10:48 AM, Mark Stosberg wrote:
>
> Hello,
>
> I'm working on moving a table with over 30 million to rows to be
> partitioned. The table seeing several inserts per second. It's
> essentially an activity log that only sees insert activity and is
> lightly used for reporting, such that queries against it can safely be
> disabled during a transition.
>
> I'm looking for recommendations for a way to do this that will be least
> disruptive to the flow of inserts statements that will continue to
> stream in.
>
> Here's the plan which seems best to me at the moment. Is it is
> reasonable?

I revised my plan based on feedback and mentioned resources here, and
also due to my own concerns about the resource and performance impact of
using the INSERT .. SELECT pattern on millions of rows at a time.

Here's my revised plan for the transition, which avoids using large
INSERT..SELECT statements, and only requires a minimal amount of the
transition to work to happen once the application has started to insert
data into the child tables. Seem reasonable?

New Plan
---------

1. Create the child tables targeted to contain data from the parent as
   standalone tables (including archive tables and the current month).

2. Take a full database backup from this morning and extract the COPY
   statement for the parent table. Manually split it up by date to create
   multiple copy statements, one for each partition.

3. Run the COPY statements to load the data into each child table.  So
   far, everything has happened outside of application access.

4. Find the max ID that has been inserted in the current child table.

5. INSERT .. SELECT the missing rows from the last backup from the parent
   table to the current child table to be. Again, note the max ID.

6. Now, during a maintenance window:
   - alter the child tables to inherit the parent
   - Set up the trigger which starts redirecting inserts
     from the parent table to the child table.
   - INSERT .. SELECT the file the final few missing rows from
     the parent to the current child
   - TRUNCATE then CLUSTER the parent table.

pgsql-admin by date:

Previous
From: Tom Lane
Date:
Subject: Re: Starting psql without a database?
Next
From: ramon cruel
Date:
Subject: Please, i want exit here