Re: planner question.. - Mailing list pgsql-sql

From Tom Lane
Subject Re: planner question..
Msg-id 10432.1050589603@sss.pgh.pa.us
In response to planner question..  (Rajesh Kumar Mallah <mallah@trade-india.com>)
Responses Re: planner question..
List pgsql-sql
Rajesh Kumar Mallah <mallah@trade-india.com> writes:
> For a distribution of data like below why does the planner
> choose to do an index scan by default for source = 'REGIS' when > 50%
> of the rows have source='REGIS'?

Are there a huge number of dead rows in the table?  ("vacuum verbose"
would give some info)

The given result seems suspect; an indexscan couldn't possibly read >50%
of the rows in less than a quarter of the time for a seqscan, unless
(a) the table contains vast amounts of empty space that the seqscan has to
slog through, or (b) your second measurement is bogus due to caching
performed by the first measurement.
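
To put rough numbers on point (a), here is a back-of-the-envelope sketch.
It is a toy model, not the planner's cost formula: it counts only heap
pages read, ignores index pages and the random-versus-sequential I/O
difference, and assumes the matching rows are perfectly clustered (the
best case for the indexscan). The function names are made up for
illustration.

```python
def best_case_index_to_seq_ratio(selectivity, live_fraction):
    """Heap pages read by an ideal indexscan divided by pages read by a seqscan.

    selectivity   -- fraction of live rows matching the WHERE clause
    live_fraction -- fraction of the table's pages that hold live rows
                     (the remainder is dead/empty space)

    Toy assumption: the seqscan reads every page, live or not, while a
    perfectly clustered indexscan reads only the live pages holding matches.
    """
    return selectivity * live_fraction


def max_live_fraction(selectivity, observed_ratio):
    """Largest live fraction consistent with an observed indexscan/seqscan ratio."""
    return min(1.0, observed_ratio / selectivity)


if __name__ == "__main__":
    # >50% of rows match, indexscan took under a quarter of the seqscan time.
    print(max_live_fraction(selectivity=0.5, observed_ratio=0.25))
```

Under these assumptions, a 50% match rate and a 4x speedup would require at
most about half of the table's pages to hold live rows, i.e. the rest would
have to be empty space. Any real-world overhead for the indexscan only makes
the required amount of dead space larger.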

Also, might the table be in order by the "source" column?  A
sufficiently high correlation might have persuaded the planner to try an
indexscan even if point (a) isn't true.
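
As a rough illustration of what "correlation" means here, the sketch below
computes a plain Pearson correlation between each row's physical position
and the sort rank of its column value. This is not the ANALYZE code, only a
simplified stand-in; the function name and the 'OTHER' value are
hypothetical, and the row counts are made up.

```python
import math


def physical_order_correlation(values):
    """Pearson correlation between row position and the sort rank of each value.

    `values` lists a column's values in physical (on-disk) order.
    A result near +1 or -1 means the table is stored nearly in column order;
    a result near 0 means the values are scattered across the table.
    """
    n = len(values)
    if n < 2:
        return 0.0
    # Rank each distinct value by its sort order; equal values share a rank.
    rank = {v: i for i, v in enumerate(sorted(set(values)))}
    ys = [rank[v] for v in values]
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    var_x = sum((x - mean_x) ** 2 for x in range(n))
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: correlation is undefined, report 0
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    clustered = ["OTHER"] * 500 + ["REGIS"] * 500    # stored in column order
    interleaved = ["OTHER", "REGIS"] * 500           # values alternate
    print(physical_order_correlation(clustered))
    print(physical_order_correlation(interleaved))
```

The clustered layout scores high and the interleaved one scores close to
zero. With a high score, the rows for one "source" value sit on adjacent
pages, so an indexscan touches far fewer distinct pages than it would on a
scattered table.
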
        regards, tom lane


