Re: Strategies/Best Practises Handling Large Tables - Mailing list pgsql-general

From Chitra Creta
Subject Re: Strategies/Best Practises Handling Large Tables
Date
Msg-id CABkVLeNDW0XUyqQpPpB=hrMmgv-Q4VC5Ph5iaHX+O5wK6bUR0A@mail.gmail.com
Whole thread Raw
In response to Re: Strategies/Best Practises Handling Large Tables  (John R Pierce <pierce@hogranch.com>)
Responses Re: Strategies/Best Practises Handling Large Tables  (Ryan Kelly <rpkelly22@gmail.com>)
List pgsql-general

Thank you all for your suggestions. Since all of you recommended the Partition option, I decided to try it out myself.

I found a function that created partition tables for every month in the last two years. A trigger was also added to the parent table to ensure that every insert into it from hence forth will be inserted into the approapriate child table.

However, there were a few observations that I made which I would appreciate your comments on:

1. Since existing data was in the parent table, I had to do a pg_dump on it, drop it, and then to a restore on it to force the trigger to work on existing data. Is this how partitioning existing data should be done?

2. I noticed that there are two copies of the same record - i.e the one that was inserted into the parent table and another that was inserted in the child table. If I delete the record in the parent table, the child record gets automatically deleted. I was under the impression that partitioning meant that my parent table will not be large anymore because the data will be moved to smaller child tables. Is this the case?

3. Is there a way for me to evaluate the effectiveness of the partitioned table? Would performing an Explain Analyse allow me to determine whether querying the parent table for statistics is quicker than querying against a massive non-partitioned table?

Thank you.

On Oct 13, 2012 3:49 AM, "John R Pierce" <pierce@hogranch.com> wrote:
On 10/12/12 7:44 AM, Chitra Creta wrote:

1. Purge old data
2. Reindex
3. Partition
4. Creation of daily, monthly, yearly summary tables that contains aggregated data specific to the statistics required


if most of your queries read the majority of the tables,  indexing will be of little help

parittioning will aid in purging old data, as you can partitions by date (for instance, by week) and drop whole partitions rather than deleting individual records.

aggregate tables likely will be the biggest win for your statistics if they reduce the mount of data you need to query.


--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

pgsql-general by date:

Previous
From: Maximilian Tyrtania
Date:
Subject: Re: return query execute SQL-problem
Next
From: Ryan Kelly
Date:
Subject: Re: Strategies/Best Practises Handling Large Tables