Re: Partial index on date column - Mailing list pgsql-hackers

From scott.marlowe
Subject Re: Partial index on date column
Date
Msg-id Pine.LNX.4.33.0303061054240.5572-100000@css120.ihs.com
Whole thread Raw
In response to Partial index on date column  ("Dave Page" <dpage@vale-housing.co.uk>)
List pgsql-hackers
On Thu, 6 Mar 2003, Dave Page wrote:

> I have a table that is likely to grow over the next few years at a rate
> of 1K-2K rows/day. As the vast majority of the activity on the table
> (other than the inserts) will be selects of data for the current day, I
> have a cron job that drops and recreates a partial index just after
> midnight. It also vacuum analyzes the table.
> 
> -- Index: public.pbx_log_today_idx
> CREATE INDEX pbx_log_today_idx ON pbx_log USING btree (pbx_time,
> pbx_call_type, pbx_digits_source, pbx_digits_destination) WHERE
> (pbx_date = '2003-03-06'::date);
> 
> I'm surprised by the following behaviour:
> 
> EXPLAIN SELECT * FROM pbx_log WHERE pbx_date = CURRENT_DATE;
> 
> Seq Scan on pbx_log  (cost=0.00..286.20 rows=1274 width=384)
>   Filter: (pbx_date = ('now'::text)::date)
> 
> 
> EXPLAIN SELECT * FROM pbx_log WHERE pbx_date = '2003-03-06';
> 
> Index Scan using pbx_log_today_idx on pbx_log  (cost=0.00..5.00 rows=0
> width=384)
>   Filter: (pbx_date = '2003-03-06'::date)
> 
> Is this just an oddity because I don't have masses of data yet (4500
> rows right now), or is this something the optimizer cannot handle?

It's not an oddity, it's the planner trying to decide which is the better 
choice, a seq scan or an index scan.  If you had 150,000,000 rows and 
asked for 149,999,000 of them, it would be counterproductive to use an 
index, since you're gonna visit nearly every page of the table anyway.

If you were gonna get 1,000 rows out of 150,000,000 then an index scan 
makes more sense,

Somewhere in between is the switch point where going from one to the other 
makes sense.

Since you've got 4500 rows and asked for 1274 of them it's likely that the 
database will have to read the whole table anyway, so it goes ahead and 
does it.

Look for random_page_cost and a few other settings near it in the 
$PGDATA/postgresql.conf file.



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: TODO: DROP COLUMN .. CASCADE
Next
From: Tom Lane
Date:
Subject: Re: Partial index on date column