500rows = 1min/2.5k rows=20min/6K rows 2 hours and still running - Mailing list pgsql-performance

From: El-Lotso
Subject: 500rows = 1min/2.5k rows=20min/6K rows 2 hours and still running
Date:
Msg-id: 1189529868.17184.17.camel@neuromancer.home.net
Responses: Re: 500rows = 1min/2.5k rows=20min/6K rows 2 hours and still running
List: pgsql-performance
Hi,

I'd appreciate it if someone could offer some pointers on this.

PG 8.2.4 on a 1.4 GHz Centrino (single core), 1.5 GB RAM, 5400 rpm laptop HD

3 main tables whose relevant rows have already been selected "out" into
separate tables (using create table foo as select * from foo_main where x = y)

These test tables contain only a very small subset of the main tables'
data (1k to 10k rows max, vs. 1.5 to 7 million rows in the main tables).
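
For concreteness, a sketch of that pattern (foo/foo_main and the x = y
filter are the same placeholder names as above, not the real schema):

-- foo_main stands in for one of the full-size tables; foo is the
-- small test copy holding only the rows under test.
create table foo as
select * from foo_main where x = y;

-- create table as doesn't gather statistics, so analyze the copy
-- before testing queries against it.
analyze foo;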

Table definitions and the actual query are attached (names have been
altered to protect the innocent).

I've played around with tweaking the postgresql.conf settings per
guidance from jdavis (in IRC) without much (any) improvement. Rewriting
the queries to NOT use subselects (per depesz, also in IRC) likewise
yielded nothing spectacular.

The only thing I've noticed is that when a subquery joins more than
3 tables, PG chokes. With only 3 joined tables per subquery, the
results come out fast, even for 6K rows.
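
To make that threshold concrete, the shape is roughly this (t1..t4 are
illustrative stand-ins; the real query is in the attachment):

-- Illustrative only: three stand-in tables joined inside the
-- subquery come back in seconds; joining in a fourth is where
-- it falls over.
select s.id, s.b, s.c
from (
    select t1.id, t2.b, t3.c
    from t1
    join t2 on t2.id = t1.id
    join t3 on t3.id = t1.id
    -- join t4 on t4.id = t1.id  -- the 4th join is the tipping point
) as s;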

But if the subqueries (each of which, run on its own, executes fast and
returns results in 1 to 10 seconds) are run independently, their results
placed into temp tables, and those finally joined together using a query
such as

select a.a, b.b, c.c
from a
inner join b on (a.x = b.x)
left outer join c on (a.x = c.y)

then it is also fast.
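
A sketch of that workaround, with t1..t6 standing in for the real
tables (each temp table captures one of the subqueries):

-- Each create temp table materialises one subquery; run on their
-- own these finish in 1 to 10 seconds.
create temp table a as
select t1.x, t1.a from t1 join t2 on t2.x = t1.x;

create temp table b as
select t3.x, t3.b from t3 join t4 on t4.x = t3.x;

create temp table c as
select t5.y, t5.c from t5 join t6 on t6.y = t5.y;

-- Give the planner statistics on the temps before running the
-- final join above.
analyze a;
analyze b;
analyze c;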

Settings I've tried (slash-separated where I tried several values; a
session-level way to flip these is sketched after the list):

work_mem = 8MB / 32MB / 128MB (32MB in my setup)
effective_cache_size = 128MB / 500MB (500MB default)
shared_buffers = 200MB
geqo_threshold = 5 (default 12)
geqo_effort = 2 (default 5)
random_page_cost = 8.0 (default 4)
maintenance_work_mem = 64MB
join_collapse_limit = 1 / 8 / 15 (default 8)
from_collapse_limit = 1 / 8 / 15 (default 8)
enable_nestloop = off (on by default)
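
All of these except shared_buffers (which only changes at server start)
can also be flipped per session with SET, which saves editing
postgresql.conf between runs:

-- Session-local overrides; no restart or reload needed.
set work_mem = '32MB';
set effective_cache_size = '500MB';
set random_page_cost = 8.0;
set geqo_threshold = 5;
set geqo_effort = 2;
set maintenance_work_mem = '64MB';
set join_collapse_limit = 8;
set from_collapse_limit = 8;
set enable_nestloop = off;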

Based on current performance, even with a small number of rows in the
individual tables (20k max), I can't get a result out in 2 hours once
more than 3 tables are joined per subquery, which is making me rethink
PG's usefulness.



BTW, I also tried the 8.2 stable branch from CVS.

