Re: Table Partitioning and Rules - Mailing list pgsql-sql

From Steve Crawford
Subject Re: Table Partitioning and Rules
Date 2003-07-17 12:39
Msg-id 200307171239.51837.scrawford@pinpointresearch.com
In response to Re: Table Partitioning and Rules  ("Girish Bajaj" <gbajaj@tietronix.com>)
Responses Re: Table Partitioning and Rules  (Richard Huxton <dev@archonet.com>)
List pgsql-sql
OK, so basically you are trying to keep a hundred-some-odd attributes on 
everyone in the US. It's possible that a 150-column table is properly 
normalized (I have a similar situation) but it is rare.

Suppose it really is properly normalized. You can still benefit from indexes 
on just some of the columns by choosing those most commonly used in queries. 
You may also want to research partial indexes (create index foo ... where 
bar=baz), which can under certain circumstances be far smaller and faster 
than full indexes.
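
For example (a minimal sketch - the table and column names here are 
invented, since I don't know your actual schema):

  -- An ordinary index on a commonly queried column:
  CREATE INDEX contact_last_name_idx ON contact (last_name);

  -- A partial index: it only contains entries for rows matching the
  -- WHERE clause, so it can stay far smaller than a full index:
  CREATE INDEX contact_active_name_idx ON contact (last_name)
    WHERE status = 'active';

The planner will only consider the partial index when the query's WHERE 
clause implies the index's condition, e.g.:

  SELECT * FROM contact WHERE status = 'active' AND last_name = 'Smith';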

Breaking the table up won't improve the full table scan - in fact, as each 
sub-table's file grows it will probably fragment on disk much worse than a 
single growing file would, and that, along with all the overhead of joining 
the tables back together, will make things worse.
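
To see why (again just a sketch with invented names): if the pieces are 
glued back together with a view like

  CREATE VIEW contact AS
    SELECT * FROM contact_a_g
    UNION ALL
    SELECT * FROM contact_h_n
    UNION ALL
    SELECT * FROM contact_o_z;

then a query such as

  SELECT * FROM contact WHERE last_name = 'Smith';

still has to visit all three underlying tables - the planner has no way of 
knowing that 'Smith' rows can only be in contact_h_n - so you pay the scan 
cost three times. And if the view uses plain UNION rather than UNION ALL, 
you add a duplicate-eliminating step on top of that.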

Review your structure carefully. Plan on $$$ for the hardware.

Cheers,
Steve


On Thursday 17 July 2003 12:22 pm, Girish Bajaj wrote:
> The problem is that I'm worried about sequential scans. This particular
> table can have up to 150 cols and 250 million records. Now we have a
> reporting requirement that someone could select on ANY col and filter on
> any col as well. Meaning someone could do a SELECT on col number
> 1,2,310,1000 from contact where col num 75='X' and col num 139 = 'Y'.
>
> I can't possibly index all the cols in the table. So I thought I'd best
> manage the data by splitting up the table into multiple partitions and
> eventually, depending on application logic, only scan those tables that
> are necessary to scan sequentially instead of the whole big table.
>
> I'm getting a little confused here because eventually I would want to
> join in this gigantic 250-million-row table as well.. and that would be
> a real big problem causing loads of sequential scans, wouldn't it?
>
> Thanks,
> Girish
>
>
> -----Original Message-----
> From: Josh Berkus [mailto:josh@agliodbs.com]
> Sent: Thursday, July 17, 2003 2:03 PM
> To: Richard Huxton; Girish Bajaj; pgsql-sql@postgresql.org
> Subject: Re: [SQL] Table Partitioning and Rules
>
> Girish,
>
> > > Essentially I'm trying to store a person's information in a table in
> > > the database. Since we could have millions of people, with duplicates!
> > > I've decided we need to partition the table into segments where all
> > > people with the LastName starting from A to G will be in one table.
> > > H-N will be in another table and O-Z in the third. I've created a VIEW
> > > that does a UNION on all the tables.
>
> This sounds hideously inefficient and a management headache besides. I
> think PostgreSQL will accept up to 2 billion rows in any one table, and
> splitting stuff into 3 tables will not improve your performance ... quite
> the opposite.
>
> Change your database design.


