Re: 7.3.1 New install, large queries are slow - Mailing list pgsql-performance

From Stephan Szabo
Subject Re: 7.3.1 New install, large queries are slow
Date
Msg-id 20030116085358.E5729-100000@megazone23.bigpanda.com
Whole thread Raw
In response to Re: 7.3.1 New install, large queries are slow  ("Roman Fail" <rfail@posportal.com>)
List pgsql-performance
On Thu, 16 Jan 2003, Roman Fail wrote:

> ***********************

Hmm, I wonder if maybe we're going about things backwards in this
case.  Does the original database have something like EXPLAIN
that'll show what it's doing? Perhaps that'll give an idea.

> > What does vacuum verbose batchdetail give you (it'll give an idea of pages anyway)
>
> trans=# VACUUM VERBOSE batchdetail;
> INFO:  --Relation public.batchdetail--
> INFO:  Pages 1669047: Changed 0, Empty 0; Tup 23316674: Vac 0, Keep 0, UnUsed 0.

So about 12 gigabytes of data, then?


> It seems to me that the big, big isolated problem is the index scan on
> batchdetail.tranamount.  During this small query, 'sar -b' showed
> consistent 90,000 block reads/sec. (contrast with only 6,000 with
> larger query index scan).  'top' shows the CPU is at 20% user, 30%
> system the whole time (contrast with 2% total in larger query above).

Note that in this case below, you've gotten a sequence scan not an
index scan. (similar to setting enable_indexscan=off performance)

> This results here still seem pretty bad (although not as bad as
> above), but I still don't know what is the bottleneck.  And the
> strange sar stats are confusing me.
>
> EXPLAIN ANALYZE SELECT * FROM batchdetail WHERE tranamount BETWEEN 300 AND 499;
> Seq Scan on batchdetail  (cost=0.00..2018797.11 rows=783291 width=440) (actual time=45.66..283926.58 rows=783687
loops=1)
>   Filter: ((tranamount >= 300::numeric) AND (tranamount <= 499::numeric))
> Total runtime: 285032.47 msec

I'd assume that tranamount values are fairly randomly distributed
throughout the table, right?  It takes about 5 minutes for the
system to read the entire table and more for the index scan, so
you're probably reading most of the table randomly and the index
as well.

What values on batchdetail do you use in query where clauses
regularly?  It's possible that occasional clusters would help
if this was the main field you filtered on.  The cluster
itself is time consuming, but it might help make the index
scans actually read fewer pages.


pgsql-performance by date:

Previous
From: Andrew Sullivan
Date:
Subject: Re: schema/db design wrt performance
Next
From: "Josh Berkus"
Date:
Subject: Re: 7.3.1 New install, large queries are slow