Re: Partitioning of a dependent table not based on date - Mailing list pgsql-general

From Herouth Maoz
Subject Re: Partitioning of a dependent table not based on date
Date
Msg-id 53AC2E6F-CC02-4EC7-B358-5DD54B6CCC5A@unicell.co.il
In response to Re: Partitioning of a dependent table not based on date  (Andy Colson <andy@squeakycode.net>)
List pgsql-general
On 01/12/2014, at 19:26, Andy Colson wrote:

> On 12/1/2014 11:14 AM, Herouth Maoz wrote:
>> I am currently in the process of creating a huge archive database that
>> contains data from all of our systems, going back for almost a decade.
>>
>> Most of the tables fall into one of two categories:
>>
>> 1. Static tables, which are rarely updated, such as lookup tables or
>> user lists. I don't intend to partition these, I'll just refresh them
>> periodically from production.
>> 2. Transaction tables, that have a timestamp field, for which I have the
>> data archived in COPY format by month. Of course a monolithic table over
>> a decade is not feasible, so I am partitioning these by month.
>>
>> (I don't mean "transaction" in the database sense, but in the sense that
>> the data represents historical activity, e.g. message sent, file
>> downloaded etc.)
>>
>> I have one table, though, that doesn't fall into this pattern. It's a
>> many-to-one table relating to one of the transaction tables. So on one
>> hand, it doesn't have a time stamp field, and on the other hand, it has
>> accumulated lots of data over the last decade so I can't keep it
>> unpartitioned.
>>
>
> Let's stop here. One big table with lots of rows (and a good index) isn't a problem. As long as you are not table scanning everything, there isn't a reason to partition the table.
>
> Lots and lots of rows isn't a problem except for a few usage patterns:
> 1) delete from bigtable where (some huge percent of the rows)
> 2) select * from bigtable where (lots and lots of table scanning and can't really index)
>
> If your index is selective enough, you'll be fine.


Hmm. I suppose you're right. I planned the whole partition thing in the first place because most of my "transaction" tables are still alive, so I'll need to continue bulk-inserting data every month, and inserting into a fresh partition is better than inserting into a huge table.
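For the record, the monthly load I had in mind looks roughly like this. It's only a sketch with made-up names ("messages" standing in for one of the transaction tables), using inheritance-based partitioning:

    -- New partition for November 2014; the CHECK constraint lets
    -- constraint exclusion skip it for queries on other months.
    CREATE TABLE messages_2014_11 (
        CHECK (sent_at >= DATE '2014-11-01' AND sent_at < DATE '2014-12-01')
    ) INHERITS (messages);

    -- Bulk-load the month's archived COPY file into the empty partition
    -- (server-side COPY, so it needs superuser).
    COPY messages_2014_11 FROM '/archive/messages/2014_11.copy';

    -- Build the index after loading rather than maintaining it row by row.
    CREATE INDEX ON messages_2014_11 (sent_at);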

But in this case, since we stopped working on this application in January, there will be no fresh inserts, so it's not as important. We just need the archive for legal purposes.

One thing, though: I noticed on my other system (a reports system that holds a year's worth of data) that after I partitioned the largest tables, backup time dropped. I suppose pg_dump of a single huge table is not as fast as pg_dump of multiple smaller ones.
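If I had to guess why: with the directory format (9.3 and later), pg_dump can dump several tables in parallel, but each individual table is handled by a single worker, so one monolithic table becomes the bottleneck while many monthly partitions keep all the workers busy. Something along these lines (database name and path are made up):

    pg_dump -Fd -j 4 -f /backup/reports.dump reports_db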

Herouth
