Home > mailing lists

Re: Advice for optimizing queries using Large Tables - Mailing list pgsql-general

From	Tom Lane
Subject	Re: Advice for optimizing queries using Large Tables
Date	March 10, 2002 14:15:31
Msg-id	27024.1015776894@sss.pgh.pa.us Whole thread Raw
In response to	Advice for optimizing queries using Large Tables ("Shaun Grannis" <shaun_grannis@hotmail.com>)
List	pgsql-general

Tree view

"Shaun Grannis" <shaun_grannis@hotmail.com> writes:
> I'm working with a table containing over 65 million records in Postgres v
> 7.1.3.

> This query:
>     SELECT count(*) FROM table WHERE value=1999;
> takes approximately 45 minutes to execute, and returns a count of approx 2.2
> million records.

A query scanning 1/30th of the table almost certainly should use a
seqscan not an indexscan.  Does it get faster if you do "set
enable_seqscan to off"?

>     Aggregate (cost=477861.60..477861.60 rows=1 width=0)
>         -> Index Scan using value_idx on table (cost=0.00..477553.70
> rows=123157 width=0)

Hmm.  The reason that the planner is making the wrong plan choice is the
drastic underestimation of the number of matched rows.  With so few
distinct values in the column I'd have expected 7.1 to get a more
accurate estimate, but it's probably not worth worrying about at this
point.  The short answer is to update to 7.2 --- it has much better
statistics-gathering code and should pick the right plan.

            regards, tom lane

PS: this is a refreshing change from the usual "I want an indexscan,
why aren't I getting one?" type of planner mistake ;-)

pgsql-general by date:

From: Tom Lane
Date: 10 March 2002, 14:02:37
Subject: Re: Referential Integrity Triggers

From: Francisco Reyes
Date: 10 March 2002, 16:04:21
Subject: Wal_buffers memory utilization

Re: Advice for optimizing queries using Large Tables - Mailing list pgsql-general

Previous

Next