
Optimizing a condition based on a very unequally distributed value. - Mailing list pgsql-admin

From	Nick Fankhauser
Subject	Optimizing a condition based on a very unequally distributed value.
Date	March 13, 2002 13:33:54
Msg-id	NEBBLAAHGLEEPCGOBHDGKEMPEIAA.nickf@ontko.com
Responses	Re: Optimizing a condition based on an a very unequally
List	pgsql-admin


Hi-

I have a field that is very rarely set to 'YES', but I need to filter my
results so that only rows where it is set to 'NO' appear.

Here is the distribution:


temp=#  select count(*) from case_data where case_impound = 'YES';
 count
-------
     1
(1 row)

temp=#  select count(*) from case_data where case_impound = 'NO';
 count
-------
 23768
(1 row)
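
The same distribution can also be checked in one pass with a grouped count. This is only a sketch using the table and column names from the queries above; it would additionally surface any NULLs or unexpected values in case_impound:

```sql
-- One row per distinct value of case_impound, with its row count.
select case_impound, count(*)
  from case_data
 group by case_impound;
```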


Since I always test this field, I want to make sure an index is used, but
depending on what I look for, I get different query plans:



temp=# explain select count(*) from case_data where case_impound = 'NO';
NOTICE:  QUERY PLAN:

Aggregate  (cost=815.52..815.52 rows=1 width=0)
  ->  Seq Scan on case_data  (cost=0.00..756.10 rows=23768 width=0)

EXPLAIN
temp=# explain select count(*) from case_data where case_impound = 'YES';
NOTICE:  QUERY PLAN:

Aggregate  (cost=2.23..2.23 rows=1 width=0)
  ->  Index Scan using case_data_case_impound on case_data  (cost=0.00..2.22 rows=1 width=0)

EXPLAIN
temp=# explain select count(*) from case_data where case_impound != 'NO';
NOTICE:  QUERY PLAN:

Aggregate  (cost=756.10..756.10 rows=1 width=0)
  ->  Seq Scan on case_data  (cost=0.00..756.10 rows=1 width=0)

EXPLAIN
temp=# explain select count(*) from case_data where case_impound != 'YES';
NOTICE:  QUERY PLAN:

Aggregate  (cost=815.52..815.52 rows=1 width=0)
  ->  Seq Scan on case_data  (cost=0.00..756.10 rows=23768 width=0)



So my general question is: why does PGSQL opt to use the index when
looking for the single 'YES' row, but not when looking for the other
23768 rows?

More specifically, is there a trick to make it use the index for the condition
I actually want to test, which could be either [ = 'NO' ] or [ != 'YES' ]?
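
One diagnostic that might help here, sketched under the assumption that the enable_seqscan planner setting is available in this server version, is to discourage sequential scans for the session and compare the estimated costs:

```sql
-- Discourage (not forbid) sequential scans for this session only.
set enable_seqscan = off;

-- Re-run the plan for the condition of interest and note its estimated cost.
explain select count(*) from case_data where case_impound = 'NO';

-- Restore the default planner behavior afterwards.
set enable_seqscan = on;
```

If the estimated cost of the resulting plan comes out higher than the 815.52 shown above for the sequential scan, then the planner's original choice was at least consistent with its own estimates.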

Thanks!

-NickF



--------------------------------------------------------------------------
Nick Fankhauser  nickf@ontko.com  Phone 1.765.935.4283  Fax 1.765.962.9788
Ray Ontko & Co.     Software Consulting Services     http://www.ontko.com/
