Thread: Strange behaviour

Strange behaviour

From

Richard Rowell

Date:

30 July 2006, 23:52:49

We are using a BI tool that generates some rather ugly queries.  One of
the ugly queries is taking much longer to return thin I think it
should.

The select expression when run alone returns in 2 seconds with 35k rows
(http://www.bowmansystems.com/~richard/explain_select.analyze)

The "where clause" when run alone returns 5200 rows in 10 seconds
(http://www.bowmansystems.com/~richard/explain_where.analyze)

However when I put to two together it takes much, much longer to run.
(http://www.bowmansystems.com/~richard/full.analyze)

Can anyone shed any light on what is going on here?  Why does the
optimizer choose such a slow plan in the combined query when the only
real difference between the full query and the "where only" query is the
number of rows in the result set on the "outside" of the "IN" clause?

A few pertinent observations/facts below

1.  The query is generated by a BI tool, I know it is ugly and stupid in
many cases. However, please try to see the larger issue, that if the
select and where portions are run separately they are both fast but
together it is insanely slow.

2.  The database has vacuumdb -f -z run on it nightly.

3.  Modifications to the stock postgresql.conf:
shared_buffers = 15000
work_mem = 131072
default_statistics_target = 100

4.  Dual Dual core Opterons, 4 gigs of ram, 6 disk Serial ATA hardware
RAID 10 running Postgres 8.03 compiled from source running on Debian
stable.

5. The tables being queried are only 200 megs or so combined on disk,
the whole DB is ~ 4 gigs
SELECT sum(relpages*8/1024) AS size_M FROM pg_class;
 size_m
--------
   4178

Thanks!

Re: Strange behaviour

From

Tom Lane

Date:

03 August 2006, 13:39:56

Richard Rowell <richard@bowmansystems.com> writes:
> We are using a BI tool that generates some rather ugly queries.  One of
> the ugly queries is taking much longer to return thin I think it
> should.
> (http://www.bowmansystems.com/~richard/full.analyze)
> Can anyone shed any light on what is going on here?

Seems like you have some bad rowcount estimates leading to poor plan
selection.  Most of the problem looks to be coming from the FunctionScan
nodes, wherein the planner doesn't have any real way to estimate how
many rows come out.  You might look into whether you can replace those
functions with views, so that the planner isn't dealing with "black boxes".

            regards, tom lane