Re: Creating Index - Mailing list pgsql-sql

From Stephan Szabo
Subject Re: Creating Index
Date
Msg-id 20031001222050.E53419@megazone.bigpanda.com
Whole thread Raw
In response to Re: Creating Index  ("CN" <cnliou9@fastmail.fm>)
List pgsql-sql
On Wed, 1 Oct 2003, CN wrote:

> > You do realize that extract returns a double precision value not an
> > integer, and it's probably not going to be willing to push clauses down
> > through the union where the types are different .
> >
>
> Argh! I didn't noticed that. Thanks for the reminder.
>
> Let's do not consider table2 and view1 for this moment and focus only on
> table1.
> Table1 in my original post was incorrect. Please forgive me! (I posted it
> midnight when my head was not clear and tried to make my case simple for
> understanding.) The correct one is:
>
> CREATE TABLE table1
> ( id VARCHAR(20) PRIMARY KEY,
>   d DATE,
>   amount INTEGER
> );
> CREATE INDEX itable1 ON table1 (d);
>
> EXPLAIN ANALYZE SELECT COUNT(*) FROM table1 WHERE EXTRACT(YEAR FROM d) >=
> 2001.0 AND EXTRACT(MONTH FROM d) >= 1.;
>
> takes 630 msec on my AMD 450MHz machine. While
>
> EXPLAIN ANALYZE SELECT COUNT(*) FROM table1 WHERE d >= '2001-1-1';

These two queries seem fairly equivalent, but
WHERE EXTRACT(YEAR FROM d) >=2001 AND EXTRACT(MONTH FROM d)>=2is not equivalent to
WHERE d>='2001-2-1'

Are you trying to get certain months in a group of years or all months
after a given fixed time point? If the former, only the former form in
general will work, if the latter the former form really doesn't work at
all with the exception of the case where you're doing month>=1 (which
might as well mean you don't do a month test at all).

>  Aggregate (cost=535.20..535.20 rows=1 width=0) (actual
>  time=625.10..625.11 rows=1 loops=1)
>    ->  Seq Scan on table1  (cost=0.00..532.58 rows=1048 width=0) (actual
>    time=14.84..605.85 rows=3603 loops=1)
>          Filter: ((date_part('year'::text, f2) > 2001::double precision)
>          AND (date_part('month'::text, f2) >=
> 1::double precision))
>  Total runtime: 626.61 msec
>
> -----------------------
>  Aggregate  (cost=464.12..464.12 rows=1 width=0) (actual
>  time=114.28..114.28 rows=1 loops=1)
>    ->  Seq Scan on table1  (cost=0.00..461.86 rows=902 width=0) (actual
>    time=10.71..102.99 rows=3603 loops=1)
>          Filter: (f2 >= '2002-01-01'::date)
>  Total runtime: 114.50 msec
>
> Does the first query perform sequential scan?

They both are. I'd have to guess that most of the real cost is coming from
evaluating the conditions, which seems wierd.

> If a composit index (year,month) derived from column "d" helps and is
> available, then someone please show me how to build that index like:
>
> CREATE INDEX i1 ON table1 <EXTRACT(YEAR FROM d)::TEXT || EXTRACT(MONTH
> FROM d)::TEXT>
>
> Is creating a function that eats DATE as argument to build that index my
> only solution?

For 7.3 and earlier, yes I think so and you'd have to use that form in
the query.  However that wouldn't help for the union query.

In 7.4, you can make an index on table((extract(year from d)),
(extract(month from d))) and I believe once the type issues were resolved
that would get used.  However, I think the evaluations of the extracts
would mean that it'd still probably lose to a comparison on date (unless
you want the fraction of a set of years solution).



pgsql-sql by date:

Previous
From: Tom Lane
Date:
Subject: Re: Creating Index
Next
From: Popeanga Marian
Date:
Subject: output