Home > mailing lists

general question on two-partition table - Mailing list pgsql-general

From	Janet Jacobsen
Subject	general question on two-partition table
Date	July 27, 2009 23:53:13
Msg-id	4A6E3DDA.2080606@lbl.gov Whole thread Raw
Responses	Re: general question on two-partition table Re: general question on two-partition table
List	pgsql-general

Tree view

Hi.  We have a table with 30 M records that is growing by
about 100 K records per day.

The experimentalists, whose data are in the table, have
decided that they will focus on the records for which the
value of one field, rbscore, is greater than a cut-off.
However, they want to continue to store all of the data
- even the records for which rbscore is less than the cutoff
- in the database.

For the current table, there are about 400 K (out of 30 M)
records that meet the 'good' criteria.

Each record in the table has about 40 fields, and the
experimentalists do in fact write queries that use many of
those fields, some more than others.  (They are building a
model and have not pinned down exactly which fields are
more important than others, so that's why they store and
query by so many fields.)

If they are going to spend 95% of their time querying the
records that meet the 'good' criteria, what are the good
strategies for ensuring good performance for those queries?
(1) Should I partition the table into two partitions based on
the value of rbscore?
(2) Should I create two separate tables?

Are (1) and (2) more or less equivalent in terms of
performance?

I think that partitioning the table is a more flexible option
(i.e., what if the cutoff value changes, no need to change
the name of the table being queried, etc.), but would (2)
give better performance given that 95% of their queries
are for rbscores greater than a threshold value?

Can you suggest other strategies?

Thank you,
Janet

pgsql-general by date:

From: Tom Lane
Date: 27 July 2009, 23:45:07
Subject: Re: For production: 8.4 or 8.3?

From: Christophe Pettus
Date: 27 July 2009, 23:53:40
Subject: Building from source vs RPMs

general question on two-partition table - Mailing list pgsql-general

Previous

Next