Re: Indicies work on FreeBSD, not on Linux - Mailing list pgsql-hackers

From Michael Fuhr
Subject Re: Indicies work on FreeBSD, not on Linux
Date
Msg-id 20050308065948.GA53014@winnie.fuhr.org
Whole thread Raw
In response to Indicies work on FreeBSD, not on Linux  (Christopher Farley <chris@northernbrewer.com>)
List pgsql-hackers
On Sat, Mar 05, 2005 at 02:18:37AM -0600, Christopher Farley wrote:

> I'm sure this is something simple, but my Linux development machine
> running Postgresql 7.4.7 has very different behavior than my FreeBSD
> production machine running Postgresql 7.4.2. I've got the same table
> definitions, but I do have different data in the databases.

FreeBSD vs. Linux is probably a red herring -- different data is
more likely the cause of different behavior.  You also ran different
queries (one set has ORDER BY, one set doesn't); that might not
matter in this case, but when comparing test results the queries
should be identical.  And it might be a good idea to ANALYZE the
tables to make sure the planners' statistics are current.

Here's what you show for FreeBSD after the index is created:

> freebsd=# explain analyze select * from orders where or_status = 'N';
>                                                          QUERY PLAN 
>
----------------------------------------------------------------------------------------------------------------------------
>  Index Scan using or_status_idx on orders  (cost=0.00..382.26 rows=96
> width=759) (actual time=0.098..1.648 rows=38 loops=1)
>    Index Cond: (or_status = 'N'::bpchar)
>  Total runtime: 1.866 ms
> (3 rows)

The planner estimates that this query will return 96 rows (it
actually returns 38 rows).  How many pages and rows does the planner
think this table has?  That is, what's the result of the following
query?

SELECT relpages, reltuples FROM pg_class WHERE relname = 'orders';

Retrieving a small percentage of the total number of rows is more
likely to favor an index scan than retrieving a large percentage.

Here's what you show for Linux after creating the index:

> linux=# explain analyze select * from orders where or_status = 'N' order
> by or_number;
>                                                     QUERY PLAN 
> -------------------------------------------------------------------------------------------------------------------
>  Sort  (cost=3981.40..3988.35 rows=2779 width=770) (actual
> time=1338.112..1384.465 rows=7760 loops=1)
>    Sort Key: or_number
>    ->  Seq Scan on orders  (cost=0.00..3260.54 rows=2779 width=770)
> (actual time=0.185..891.342 rows=7760 loops=1)
>          Filter: (or_status = 'N'::bpchar)
>  Total runtime: 1394.538 ms
> (5 rows)

The planner estimates that this query will return 2779 rows (it
actually returns 7760).  Again, how many pages and rows does the
planner think this table has?  If 2279 rows (the estimate) is a
significant percentage of the total then the planner might decide
that a sequential scan would be faster than an index scan.

What output do you get if you ANALYZE the tables?  What output do
you get on Linux if you disable sequential scans ("SET enable_seqscan
TO off")?  Updating the statistics might not help the Linux query
since a more accurate row estimate (7760) would be higher than the
inaccurate estimate (2779), which might make a sequential scan even
more likely (unless perhaps a more accurate reltuples is also much
higher), but in general it's a good idea to let the planner work
with accurate statistics.

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/


pgsql-hackers by date:

Previous
From: Josh Berkus
Date:
Subject: Re: A bad plan
Next
From: David Fetter
Date:
Subject: Re: Best practices: MERGE