Re: Finding bottleneck - Mailing list pgsql-performance

From Tom Lane
Subject Re: Finding bottleneck
Date
Msg-id 29384.1123520200@sss.pgh.pa.us
Whole thread Raw
In response to Re: Finding bottleneck  (Kari Lavikka <tuner@bdb.fi>)
Responses Re: Finding bottleneck  (Kari Lavikka <tuner@bdb.fi>)
Re: Finding bottleneck  (Kari Lavikka <tuner@bdb.fi>)
List pgsql-performance
Kari Lavikka <tuner@bdb.fi> writes:
> We are having performance problems with some smaller tables and very
> simple queries. For example:

> SELECT u.uid, u.nick, extract(epoch from uc.stamp) AS stamp FROM
> user_channel uc INNER JOIN users u USING (uid) WHERE channel_id = 281321
> AND u.status = 'a' ORDER BY uc.channel_id, upper(uc.nick)

> And explain analyze:
>   Nested Loop  (cost=0.00..200.85 rows=35 width=48) (actual time=0.414..38.128 rows=656 loops=1)
>     ->  Index Scan using user_channel_channel_id_nick on user_channel uc  (cost=0.00..40.18 rows=47 width=27) (actual
time=0.090..0.866rows=667 loops=1) 
>           Index Cond: (channel_id = 281321)
>     ->  Index Scan using users_pkey on users u  (cost=0.00..3.40 rows=1 width=25) (actual time=0.048..0.051 rows=1
loops=667)
>           Index Cond: ("outer".uid = u.uid)
>           Filter: (status = 'a'::bpchar)
>   Total runtime: 38.753 ms

> Under heavy load these queries tend to take several minutes to execute
> although there's plenty of free cpu available.

What that sounds like to me is a machine with inadequate disk I/O bandwidth.
Your earlier comment that checkpoint drives the machine into the ground
fits right into that theory, too.  You said there is "almost no IO-wait"
but are you sure you are measuring that correctly?

Something else just struck me from your first post:

> Queries accumulate and when checkpointing is over, there can be
> something like 400 queries running but over 50% of cpu is just idling.

400 queries?  Are you launching 400 separate backends to do that?
Some sort of connection pooling seems like a good idea, if you don't
have it in place already.  If the system's effective behavior in the
face of heavy load is to start even more concurrent backends, that
could easily drive things into the ground.

            regards, tom lane

pgsql-performance by date:

Previous
From: Kari Lavikka
Date:
Subject: Re: Finding bottleneck
Next
From: Rhett Garber
Date:
Subject: Re: Why hash join instead of nested loop?