Re: Creating Index - Mailing list pgsql-sql
From | Stephan Szabo |
---|---|
Subject | Re: Creating Index |
Date | |
Msg-id | 20031001222050.E53419@megazone.bigpanda.com Whole thread Raw |
In response to | Re: Creating Index ("CN" <cnliou9@fastmail.fm>) |
List | pgsql-sql |
On Wed, 1 Oct 2003, CN wrote: > > You do realize that extract returns a double precision value not an > > integer, and it's probably not going to be willing to push clauses down > > through the union where the types are different . > > > > Argh! I didn't noticed that. Thanks for the reminder. > > Let's do not consider table2 and view1 for this moment and focus only on > table1. > Table1 in my original post was incorrect. Please forgive me! (I posted it > midnight when my head was not clear and tried to make my case simple for > understanding.) The correct one is: > > CREATE TABLE table1 > ( id VARCHAR(20) PRIMARY KEY, > d DATE, > amount INTEGER > ); > CREATE INDEX itable1 ON table1 (d); > > EXPLAIN ANALYZE SELECT COUNT(*) FROM table1 WHERE EXTRACT(YEAR FROM d) >= > 2001.0 AND EXTRACT(MONTH FROM d) >= 1.; > > takes 630 msec on my AMD 450MHz machine. While > > EXPLAIN ANALYZE SELECT COUNT(*) FROM table1 WHERE d >= '2001-1-1'; These two queries seem fairly equivalent, but WHERE EXTRACT(YEAR FROM d) >=2001 AND EXTRACT(MONTH FROM d)>=2is not equivalent to WHERE d>='2001-2-1' Are you trying to get certain months in a group of years or all months after a given fixed time point? If the former, only the former form in general will work, if the latter the former form really doesn't work at all with the exception of the case where you're doing month>=1 (which might as well mean you don't do a month test at all). > Aggregate (cost=535.20..535.20 rows=1 width=0) (actual > time=625.10..625.11 rows=1 loops=1) > -> Seq Scan on table1 (cost=0.00..532.58 rows=1048 width=0) (actual > time=14.84..605.85 rows=3603 loops=1) > Filter: ((date_part('year'::text, f2) > 2001::double precision) > AND (date_part('month'::text, f2) >= > 1::double precision)) > Total runtime: 626.61 msec > > ----------------------- > Aggregate (cost=464.12..464.12 rows=1 width=0) (actual > time=114.28..114.28 rows=1 loops=1) > -> Seq Scan on table1 (cost=0.00..461.86 rows=902 width=0) (actual > time=10.71..102.99 rows=3603 loops=1) > Filter: (f2 >= '2002-01-01'::date) > Total runtime: 114.50 msec > > Does the first query perform sequential scan? They both are. I'd have to guess that most of the real cost is coming from evaluating the conditions, which seems wierd. > If a composit index (year,month) derived from column "d" helps and is > available, then someone please show me how to build that index like: > > CREATE INDEX i1 ON table1 <EXTRACT(YEAR FROM d)::TEXT || EXTRACT(MONTH > FROM d)::TEXT> > > Is creating a function that eats DATE as argument to build that index my > only solution? For 7.3 and earlier, yes I think so and you'd have to use that form in the query. However that wouldn't help for the union query. In 7.4, you can make an index on table((extract(year from d)), (extract(month from d))) and I believe once the type issues were resolved that would get used. However, I think the evaluations of the extracts would mean that it'd still probably lose to a comparison on date (unless you want the fraction of a set of years solution).