Thread: Performance problems on a fairly big table with two key columns.
Hi, I have a table that looks like this: DATA ID TIME |------|----|------| The table holds app. 14M rows now and grows by app. 350k rows a day. The ID-column holds about 1500 unique values (integer). The TIME-columns is of type timestamp without timezone. I have one index (b-tree) on the ID-column and one index (b-tree) on the time-column. My queries most often look like this: SELECT DATA FROM <tbl> WHERE ID = 1 AND TIME > now() - '1 day'::interval; or SELECT DATA FROM <tbl> WHERE ID = 2 AND TIME > now() - '1 week'::interval; Since I have about 350000 rows the last 24 hours the query planner chooses to use my ID-index to get hold of the rows - then using only a filter on the time column. This takes a lot of time (over a minute) on a P4 1900MHz which unfortenately isn't good enough for my purpose (webpages times out and so on..). If I SELECT only the rows with a certain ID (regardless of time): SELECT DATA FROM <tbl> WHERE ID = 3; ..it still takes almost a minute so I guess this is the problem (not the filtering on the TIME-column), especially since it recieves a lot of rows which will be descarded using my filter anyway. (I recieve ~6000 rows and want about 250). But using the TIME-column as a first subset of rows and discarding using the ID-column as a filter is even worse since I then get 350k rows and discards about 349750 of them using the filter. I tried applying a multicolumn index on ID and TIME, but that one won't even be used (after ANALYZE). My only option here seems to have like a "daily" table which will only carry the rows for the past 24 hours which will give my SELECT a result of 6000 initial rows out of ~350k (instead of 14M like now) and then 250 when filtered. But I really hope there is a cleaner solution to the problem - actually I though a multicolumn index would do it. -ra
On Thursday 04 September 2003 23:53, Rasmus Aveskogh wrote: > Hi, > > I have a table that looks like this: > > DATA ID TIME > > |------|----|------| > > The table holds app. 14M rows now and grows by app. 350k rows a day. > > The ID-column holds about 1500 unique values (integer). > The TIME-columns is of type timestamp without timezone. > > I have one index (b-tree) on the ID-column and one index (b-tree) on the > time-column. > > My queries most often look like this: > > SELECT DATA FROM <tbl> WHERE ID = 1 AND TIME > now() - '1 day'::interval; [snip] > I tried applying a multicolumn index on ID and TIME, but that one won't > even be used (after ANALYZE). The problem is likely to be that the parser isn't spotting that now()-'1 day' is constant. Try an explicit time and see if the index is used. If so, you can write a wrapper function for your expression (mark it STABLE so the planner knows it won't change during the statement). Alternatively, you can do the calculation in the application and use an explicit time. HTH -- Richard Huxton Archonet Ltd
Richard, Thanks a lot! You were right - the query parser "misunderstood" now() - '1 day'::interval and only used one of the indexes (as I already noticed). Actually all I had to do was to cast the result like this: (now() - '1 day'::interval)::date 75s is not between 10ms and 200ms. Thanks again! -ra > On Thursday 04 September 2003 23:53, Rasmus Aveskogh wrote: >> Hi, >> >> I have a table that looks like this: >> >> DATA ID TIME >> >> |------|----|------| >> >> The table holds app. 14M rows now and grows by app. 350k rows a day. >> >> The ID-column holds about 1500 unique values (integer). >> The TIME-columns is of type timestamp without timezone. >> >> I have one index (b-tree) on the ID-column and one index (b-tree) on the >> time-column. >> >> My queries most often look like this: >> >> SELECT DATA FROM <tbl> WHERE ID = 1 AND TIME > now() - '1 >> day'::interval; > [snip] >> I tried applying a multicolumn index on ID and TIME, but that one won't >> even be used (after ANALYZE). > > The problem is likely to be that the parser isn't spotting that now()-'1 > day' > is constant. Try an explicit time and see if the index is used. If so, you > can write a wrapper function for your expression (mark it STABLE so the > planner knows it won't change during the statement). > > Alternatively, you can do the calculation in the application and use an > explicit time. > > HTH > -- > Richard Huxton > Archonet Ltd >
On Friday 05 September 2003 16:36, Rasmus Aveskogh wrote: > Richard, > > Thanks a lot! You were right - the query parser "misunderstood" > now() - '1 day'::interval and only used one of the indexes (as I already > noticed). > > Actually all I had to do was to cast the result like this: > > (now() - '1 day'::interval)::date > > 75s is not between 10ms and 200ms. > > Thanks again! Ah - good. You also want to be careful with differences between timestamp with/without time zone etc. -- Richard Huxton Archonet Ltd