Re: Partitioned table performance - Mailing list pgsql-performance
From | Jim C. Nasby |
---|---|
Subject | Re: Partitioned table performance |
Date | |
Msg-id | 20041221231155.GV18180@decibel.org Whole thread Raw |
In response to | Re: Partitioned table performance (Josh Berkus <josh@agliodbs.com>) |
List | pgsql-performance |
On Wed, Dec 15, 2004 at 11:56:40AM -0800, Josh Berkus wrote: > Greg, > > > Well Oracle has lots of partitioning intelligence pushed up to the planner > > to avoid overhead. > > > > If you have a query with something like "WHERE date = '2004-01-01'" and > > date is your partition key (even if it's a range) then Oracle will figure > > out which partition it will need at planning time. > > Hmmm ... well, we're looking at making a spec for Postgres Table Partitioning. > Maybe you could help? This is something I've been thinking about doing for http://stats.distributed.net; is there a formal project for this somewhere? On a different note, has anyone looked at the savings you get by ommitting the partition field from the child tables? ISTM that the savings would be substantial for narrow tables. Of course that most likely means doing a union view instead of inheritence, but I'm guessing here. The table I'm thinking of partitioning is quite narrow (see below), so I suspect that dropping project_id out would result in a substantial savings (there's basically nothing that ever queries across the whole table). With the data distribution, I suspect just breaking project ID's 205, 5, and 25 into partitioned tables that didn't contain project_id would save about 450M (4bytes * 95% * 130M). (the table has ~130M rows) Table "public.email_contrib" Column | Type | Modifiers ------------+---------+----------- project_id | integer | not null id | integer | not null date | date | not null team_id | integer | work_units | bigint | not null Indexes: "email_contrib_pkey" primary key, btree (project_id, id, date) "email_contrib__pk24" btree (id, date) WHERE (project_id = 24) "email_contrib__pk25" btree (id, date) WHERE (project_id = 25) "email_contrib__pk8" btree (id, date) WHERE (project_id = 8) "email_contrib__project_date" btree (project_id, date) Foreign-key constraints: "fk_email_contrib__id" FOREIGN KEY (id) REFERENCES stats_participant(id) ON UPDATE CASCADE "fk_email_contrib__team_id" FOREIGN KEY (team_id) REFERENCES stats_team(team) ON UPDATE CASCADE stats=# select * from pg_stats where tablename='email_contrib' and attname='project_id'; schemaname | tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs| histogram_bounds | correlation ------------+---------------+------------+-----------+-----------+------------+-------------------+---------------------------------------------------------+------------------+------------- public | email_contrib | project_id | 0 | 4 | 6 | {205,5,25,8,24,3} | {0.461133,0.4455,0.0444333,0.0418667,0.0049,0.00216667}| | 0.703936 -- Jim C. Nasby, Database Consultant decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
pgsql-performance by date: