Re: partitioning an existing table - Mailing list pgsql-performance

From Robert Blayzor
Subject Re: partitioning an existing table
Date
Msg-id 942C6568-41A7-46EC-B391-D523CC36B3AB@inoc.net
Whole thread Raw
In response to Re: partitioning an existing table  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: partitioning an existing table - efficient pg_dump  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-performance
On Dec 30, 2017, at 12:38 AM, Justin Pryzby <pryzby@telsasoft.com> wrote:
> For inheritance (available in and before PG10), the parent may be nonempty,
> which works fine, although someone else might find it unintuitive.  (Does the
> doc actually say "should" somewhere ?)

Well it doesn’t say should, but says “normally”..

"The parent table itself is normally empty; it exists just to represent the entire data set. …


> Just curious: are your constraints/indices on starting time or ending time?

Yes, the child tables will be strictly on a months worth of data.

CREATE TABLE table_201801
  (CHECK (ts >= DATE ‘2018-01-01' AND ts < DATE ‘2018-02-01'))
  INHERITS …


The application will insert directly into the child tables, so no need for triggers or rules.


> BTW depending on your requirements, it may be possible to make pg_dump much
> more efficient.  For our data, it's reasonable to assume that a table is
> "final" if its constraints exclude data older than a few days ago, and it can
> be permanently dumped and excluded from future, daily backups, which makes the
> backups smaller and faster, and probably causes less cache churn, etc.  But I
> imagine you might have different requirements, so that may be infeasible, or
> you'd maybe have to track insertions, either via p

The idea is only only keep a # of months available for searching over a period of months. Those months could be 3 or
more,up to a year, etc. But being able to just drop and entire child table for pruning is very attractive. Right now
theaverage months data is about 2-3 million rows each. Data is just inserted and then only searched. Never updated… 

I also like the idea of skipping all this older data from a PGdump. We archive records inserted into these tables daily
intocold storage. ie: export and compressed. So the data is saved cold. We dump the DB nightly also, but probably would
makesense to skip anything outside of the newest child table. Just not sure how to make that happen, yet…. 






pgsql-performance by date:

Previous
From: Justin Pryzby
Date:
Subject: analyze stats: child vs parent
Next
From: Justin Pryzby
Date:
Subject: Re: partitioning an existing table - efficient pg_dump