Re: Recommendations for partitioning? - Mailing list pgsql-performance

From Jeff Janes
Subject Re: Recommendations for partitioning?
Date
Msg-id CAMkU=1wWcL9rzMzvsaV_OrqtexieVe16dEZvd2Q70hmB=JqLug@mail.gmail.com
Whole thread Raw
In response to Recommendations for partitioning?  (Dave Johansen <davejohansen@gmail.com>)
Responses Re: Recommendations for partitioning?
List pgsql-performance
On Thu, Dec 5, 2013 at 7:36 AM, Dave Johansen <davejohansen@gmail.com> wrote:
I'm managing a database that is adding about 10-20M records per day to a table and time is a core part of most queries,


What is the nature of how the time column is used in the queries?   Depending on how it is used, you might not get much improvement at all, or you might get N fold improvement, or you might find that re-designing your indexes could get you the same query improvement that partitioning would, but with less headache.
 
so I've been looking into seeing if I need to start using partitioning based on the time column and I've found these general guidelines:

Don't use more than about 50 paritions ( http://www.postgresql.org/message-id/17147.1271696670@sss.pgh.pa.us )

Using triggers slows INSERTs down by a lot (unless they were already slow due to the need to hit disk to maintain the indexes or something like that).  Are you sure you can handle that slow down, given your insertion rate?  You could get the best of both worlds by having your bulk loaders target the correct partition directly, but also have the triggers on the parent table for any programs that don't get the message.
 

The only data I found fell inline with what you'd expect (i.e. speeds up selects but slows down inserts/updates http://www.if-not-true-then-false.com/2009/performance-testing-between-partitioned-and-non-partitioned-postgresql-tables-part-3/ )


One of the big benefits of partitioning can be to speed up insertions tremendously, by keeping the hot part of the indices that need to be maintained upon insertion together in shared_buffers.
 

So I was thinking that partitioning based on month to keep the number of partitions low, so that would mean about 0.5G records in each table. Does that seem like a reasonable number of records in each partition? Is there anything else that I should consider or be aware of?

How will data be expired?  Hows does the size of one of your intended partitions compare to your RAM and shared_buffers.
 
Cheers,

Jeff

pgsql-performance by date:

Previous
From: desmodemone
Date:
Subject: Re: Recommendations for partitioning?
Next
From: Scott Marlowe
Date:
Subject: Re: Recommendations for partitioning?