Re: Indexing Strategy - Mailing list pgsql-novice
From | Matthew Engelbert |
---|---|
Subject | Re: Indexing Strategy |
Date | |
Msg-id | 20041211005702.75029.qmail@web60302.mail.yahoo.com Whole thread Raw |
In response to | Indexing Strategy (Matthew Engelbert <mje1975@yahoo.com>) |
Responses |
Re: Indexing Strategy
|
List | pgsql-novice |
The select query will be like this: select * from quote where sybmol = 'MSFT' and extract(doy from time) = 340 order by time I need to select all the records for a particular symbol, for a particular day, ordered by time. I might need to select the records in the same way, but for a shorter time interval, like some hour during the day. I'm using Postgresql version 7.4. Other info I forgot to mention.... this data is historical, and so I won't be doing lots of inserts, updates or deletes - I'll just load a big chunk of data once in a while, and rebuild the index. I realize that I can also add a lookup table for symbol, as there are only about 150 unique values. Time values are about 99% unique. Yes, I'm sure I need to tune the db better. I'm new to Postgresql. Thanks for the help. --- Josh Berkus <josh@agliodbs.com> wrote: > Matthew, > > > I'm having trouble correctly indexing a large table. The table will be > > receiving about 30 million rows/day, and has the following definition: > > > > CREATE TABLE quote ( > > symbol varchar (5), > > price numeric (7,4), > > time timestamp > > ); > > > > I need to configure the indices so that I can minimize the time to request > > all the records for a particular symbol, for a particular day, ordered by > > time. > > Please post the query you're using. How are you selecting the day? What > version of PostgreSQL are you using? > > > I've tried clustering a single index based on symbol, then time, but this > > operation takes much too long. I think this is because this operation > > would have O(N^2) complexity(?). > > Nope, just because it's a very large table and you're probably swapping. > Try > increasing sort_mem dramatically, like half your RAM. > > > I think my next best option would be to > > cluster on an index based on symbol, then have another index on time. > > Not that either. You need a multicolumn index. When you answer the above > quesitons, I'll explain it. > > > Does any one know of some resources on the web that discusses indexing > > strategies ? Any help would be appreciated? Thanks. > > Not that covers your question. I'm writing a book which covers this right > now, but it won't be published until January 2006. > > -- > --Josh > > Josh Berkus > Aglio Database Solutions > San Francisco >
pgsql-novice by date: