Re: Fast data, slow data - Mailing list pgsql-general

From Shaun Thomas
Subject Re: Fast data, slow data
Date
Msg-id 53AD7469.2060903@optionshouse.com
Whole thread Raw
In response to Re: Fast data, slow data  (Tim Uckun <timuckun@gmail.com>)
List pgsql-general
On 06/26/2014 05:31 PM, Tim Uckun wrote:

> 1. If there is a very large set of data in the table that needs to be
>  moved this will be slow and might throw locks which would impact the
>  performance of the inserts and the updates.

Well, the locks would only affect the rows being moved. If this is
primarily insert-based, the only performance hit would be from the extra
IO. If you move records frequently enough, these small batches should
not be a problem.

Though I strongly suggest you build a tier library that creates a
partition for every day/week/month so the target movement tables
themselves don't get overly large.

> 2. Constantly deleting large chunks of data might cause vacuum problems.

This goes back to PostgreSQL and MVCC. If your interval moves roughly
the same amount of data each time, the table will no longer grow past
that point. In the movement function, just have it run a manual
vacuum/analyze which is not restricted by autovacuum rules. It'll use
more IO, but will finish faster and check growth more consistently.

I wrote a tier library I'm working on getting permission to open source
that does exactly this. It supports tier granularity up to 1 day, and
will age out old data into the partitions beyond that threshold. You
could probably tool something similar to be more aggressive if you want
to keep that 15-minute (or shorter) expiration interval.

I haven't used it myself, but pg_partman might have what you need:

http://pgxn.org/dist/pg_partman/doc/pg_partman.html

--
Shaun Thomas
OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
sthomas@optionshouse.com

______________________________________________

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email


pgsql-general by date:

Previous
From: Oliver
Date:
Subject: Re: [pgadmin-support] Best backup strategy for production systems
Next
From: David G Johnston
Date:
Subject: Re: Fast data, slow data