Re: general question on two-partition table - Mailing list pgsql-general
From | Janet Jacobsen |
---|---|
Subject | Re: general question on two-partition table |
Date | |
Msg-id | 4A6F76C5.6010506@lbl.gov Whole thread Raw |
In response to | Re: general question on two-partition table (Stephen Frost <sfrost@snowman.net>) |
Responses |
Re: general question on two-partition table
|
List | pgsql-general |
Hi. Thanks for your reply. I looked at the documentation for partitions - it is the case, right, that I have to create the master table and the two partition tables (depending on the value of rbscore) and then copy the records from the existing table into the two partitions? Stephen Frost wrote: > * Janet Jacobsen (jsjacobsen@lbl.gov) wrote: > >> If they are going to spend 95% of their time querying the >> records that meet the 'good' criteria, what are the good >> strategies for ensuring good performance for those queries? >> (1) Should I partition the table into two partitions based on >> the value of rbscore? >> (2) Should I create two separate tables? >> >> Are (1) and (2) more or less equivalent in terms of >> performance? >> > > It's not clear to me what you plan here.. How would you handle (2) for > the users? Would you construct a view across them, or expect them to > query the right table(s)? Options, as I see them, and in the order of > 'best-to-worst' wrt user friendlyness and performance, I believe, are: > > 1- Partitioning (with CHECK constraints and constraint_exclusion) > 2- View across two tables (with appropriate WHERE clauses) > 3- Functional index (as suggested by someone else) > 4- separate tables (users have to figure out how to use them) > 5- single table with everything > > My recommendation would be #1, followed by #2. Be sure to look up how > to do partitioning by using inheiritance in PG, and, if you need to, > look at how to implement a trigger to handle inserts on the parent > table. Make sure you create your CHECK() constraints correctly, and > that you have constraint_exclusion enabled, and that it *works*. > Is there some reason why you emphasized "works"? Would I use EXPLAIN to verify that the constraints are being checked? >> I think that partitioning the table is a more flexible option >> (i.e., what if the cutoff value changes, no need to change >> the name of the table being queried, etc.), but would (2) >> give better performance given that 95% of their queries >> are for rbscores greater than a threshold value? >> > > If you have your partitioning set up correctly, I don't believe having > actual separate tables would be that much of a performance gain for > queries. It would help some with inserts, tho if you know which table > to insert into, you could just insert into that child rather than the > main and using a trigger. > > Enjoy, > > Stephen > Since the insert would be based on the value of rbscore, and there is only one cut-off value, I could skip using a trigger. At this point it seems like it was be fairly easy to try both the partial index on rbscore and partitioning to see whether there are performance differences. Does anyone have any experience with performances difference between using a partial index and partitions? (Or is it not so usual to have a two-partition table.) Also, does anyone have experience with the performance of partitions as the size of the table increases, say from 10 million rows to 10 billion rows? Janet
pgsql-general by date: