Strange behaviour - Mailing list pgsql-performance

From Richard Rowell
Subject Strange behaviour
Date
Msg-id 1154036222.22231.40.camel@meowth
Whole thread Raw
Responses Re: Strange behaviour
List pgsql-performance
We are using a BI tool that generates some rather ugly queries.  One of
the ugly queries is taking much longer to return thin I think it
should.

The select expression when run alone returns in 2 seconds with 35k rows
(http://www.bowmansystems.com/~richard/explain_select.analyze)

The "where clause" when run alone returns 5200 rows in 10 seconds
(http://www.bowmansystems.com/~richard/explain_where.analyze)

However when I put to two together it takes much, much longer to run.
(http://www.bowmansystems.com/~richard/full.analyze)

Can anyone shed any light on what is going on here?  Why does the
optimizer choose such a slow plan in the combined query when the only
real difference between the full query and the "where only" query is the
number of rows in the result set on the "outside" of the "IN" clause?

A few pertinent observations/facts below

1.  The query is generated by a BI tool, I know it is ugly and stupid in
many cases. However, please try to see the larger issue, that if the
select and where portions are run separately they are both fast but
together it is insanely slow.

2.  The database has vacuumdb -f -z run on it nightly.

3.  Modifications to the stock postgresql.conf:
shared_buffers = 15000
work_mem = 131072
default_statistics_target = 100

4.  Dual Dual core Opterons, 4 gigs of ram, 6 disk Serial ATA hardware
RAID 10 running Postgres 8.03 compiled from source running on Debian
stable.

5. The tables being queried are only 200 megs or so combined on disk,
the whole DB is ~ 4 gigs
SELECT sum(relpages*8/1024) AS size_M FROM pg_class;
 size_m
--------
   4178

Thanks!


pgsql-performance by date:

Previous
From: "NbForYou"
Date:
Subject: Query 200x slower on server [PART 2]
Next
From: Jignesh Shah
Date:
Subject: Re: PostgreSQL scalability on Sun UltraSparc T1