unstable SELECT performance under load - Mailing list pgsql-general

From Piotr Foremniak
Subject unstable SELECT performance under load
Date
Msg-id gh851j$41m$1@news.hub.org
Whole thread Raw
Responses Re: unstable SELECT performance under load  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Hello,

When executing some quite complex select queries our 8.3.5 Postgres
often hangs. The query that we used to reproduce this problem on our
isolated test environment is a quite complex join of ca.15 tables.
When executed only once, it always takes ca. 500ms to finish. When
executed by a JMeter stress test running a couple of threads
simultaneously, ca. 95% of queries also take 500ms to finish. In the
remaining 5% cases something weird happens and the same query finishes
after several minutes, with the postgres process generating gigabytes
of temp files and consuming 100% processor resources. In extreme cases
the process has to be killed because the whole DB system becomes
unresponsive. Any ideas what is going on? Might this be a problem with
a query structure or maybe a poor Postgres config?

We are running 8.3.5-1 from debian unstable, the same behaviour was
observed on 8.3.4. The machine is 2xQuadCore Xeon with 4GB memory and
2.6.27.7-bigmem kernel. We have experimented with several combinations
of effective_cache_size, work_mem and shared_buffers, but with no
success. DB size is 4GB, the relations involved in the example query
have althogether ca. 80MB of data, with 40000 rows in the largest
table.

The OS settings relevant to Postgres are:
kernel.shmall = 2097152
kernel.shmmax = 2147483648
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128

The postgresql.conf looks like this:
shared_buffers = 512MB # min 128kB or max_connections*16kB
work_mem = 768MB # min 64kB
max_fsm_pages = 204800 # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 1000 # min 100, ~70 bytes each
fsync = on # turns forced synchronization on or off
wal_buffers = 64kB # min 32kB
checkpoint_segments = 1 # in logfile segments, min 1, 16MB each
effective_cache_size = 2560MB
geqo = off
autovacuum = on # Enable autovacuum subprocess? 'on'

The explain plan for the sample query looks like this:
QUERY PLAN
Limit (cost=24061.64..24061.71 rows=27 width=261) (actual
time=469.604..469.623 rows=30 loops=1)
-> Sort (cost=24061.64..24061.71 rows=27 width=261) (actual
time=469.602..469.608 rows=30 loops=1)
Sort Key: w.low_id
Sort Method: top-N heapsort Memory: 32kB
-> Nested Loop (cost=4225.03..24061.00 rows=27 width=261) (actual
time=145.873..460.510 rows=4703 loops=1)
-> Nested Loop (cost=4225.03..24032.48 rows=27 width=227) (actual
time=145.817..391.480 rows=4703 loops=1)
-> Hash Join (cost=4225.03..24004.64 rows=27 width=222) (actual
time=145.799..365.989 rows=4703 loops=1)
Hash Cond: (w.stat_id = s.stat_id)
-> Hash Join (cost=4223.64..24002.89 rows=27 width=200) (actual
time=145.739..360.872 rows=4703 loops=1)
Hash Cond: (w.stat_edy_id = s_e.stat_edy_id)
-> Nested Loop (cost=4222.58..24001.45 rows=27 width=178) (actual
time=145.719..355.134 rows=4703 loops=1)
-> Hash Join (cost=4222.58..23991.88 rows=27 width=158) (actual
time=145.702..324.409 rows=4703 loops=1)
Hash Cond: (prz.przdmt_rdz_id = p_r.przdmt_rdz_id)
-> Nested Loop (cost=4221.35..23990.28 rows=27 width=149) (actual
time=145.673..319.483 rows=4703 loops=1)
-> Nested Loop (cost=4221.35..23974.43 rows=33 width=141) (actual
time=145.659..295.064 rows=4703 loops=1)
-> Nested Loop (cost=3904.50..5377.76 rows=54 width=78) (actual
time=145.632..246.488 rows=4703 loops=1)
-> Nested Loop (cost=3904.50..5347.89 rows=74 width=51) (actual
time=145.614..220.865 rows=4703 loops=1)
-> Hash Join (cost=3904.50..5141.64 rows=74 width=51) (actual
time=145.590..190.008 rows=4703 loops=1)
Hash Cond: (p.stat_id = u_s.stat_id)
-> Hash IN Join (cost=3903.12..5139.07 rows=120 width=29) (actual
time=145.560..185.442 rows=4703 loops=1)
Hash Cond: (p.lop_id = a.lop_id)
-> Seq Scan on lprzedmiot p (cost=2447.40..3621.93 rows=16061 width=29)
(actual time=48.715..72.959 rows=31687 loops=1)
Filter: (NOT (hashed subplan))
SubPlan
-> Hash Join (cost=817.66..2445.50 rows=759 width=4) (actual
time=26.282..48.440 rows=454 loops=1)
Hash Cond: (d.dok_id = l_d.dok_id)
-> Seq Scan on dokument d (cost=0.00..1608.55 rows=1040 width=4) (actual
time=0.044..21.419 rows=455 loops=1)
Filter: (dok_typ_id = 21)
-> Hash (cost=467.85..467.85 rows=27985 width=8) (actual
time=26.207..26.207 rows=27805 loops=1)
-> Seq Scan on lwniosek_dokument l_d (cost=0.00..467.85 rows=27985
width=8) (actual time=0.008..13.145 rows=27805 loops=1)
-> Hash (cost=1452.70..1452.70 rows=241 width=4) (actual
time=95.279..95.279 rows=5151 loops=1)
-> Nested Loop (cost=5.30..1452.70 rows=241 width=4) (actual
time=2.498..92.160 rows=5151 loops=1)
Join Filter: ((b.stat_edy_id = w.stat_edy_id) OR (w.stat_edy_id IS NULL))
-> Hash Join (cost=5.30..1307.50 rows=411 width=12) (actual
time=2.469..63.682 rows=5331 loops=1)
Hash Cond: (a.flow_id = w.flow_id)
Join Filter: ((a.stat_id = w.stat_id) OR (w.stat_id IS NULL))
-> Seq Scan on lprzedmiot a (cost=0.00..1094.22 rows=32122 width=16)
(actual time=0.007..11.325 rows=32141 loops=1)
-> Hash (cost=5.29..5.29 rows=1 width=12) (actual time=0.127..0.127
rows=3 loops=1)
-> Hash Join (cost=2.74..5.29 rows=1 width=12) (actual time=0.116..0.123
rows=3 loops=1)
Hash Cond: (w.oper_id = o.oper_id)
-> Seq Scan on wniosek_workflow_link_start w (cost=0.00..2.10 rows=110
width=16) (actual time=0.005..0.032 rows=110 loops=1)
-> Hash (cost=2.72..2.72 rows=1 width=8) (actual time=0.047..0.047
rows=1 loops=1)
-> Hash Join (cost=1.34..2.72 rows=1 width=8) (actual time=0.043..0.045
rows=1 loops=1)
Hash Cond: (k.oper_id = o.oper_id)
-> Seq Scan on wniosek_operacja_kanal k (cost=0.00..1.35 rows=7 width=4)
(actual time=0.015..0.018 rows=7 loops=1)
Filter: (kan_id = 3)
-> Hash (cost=1.32..1.32 rows=1 width=4) (actual time=0.014..0.014
rows=1 loops=1)
-> Seq Scan on wniosek_operacja o (cost=0.00..1.32 rows=1 width=4)
(actual time=0.012..0.013 rows=1 loops=1)
Filter: ((symbol)::text = 'ENTITY_REGISTRATION_REPLACE_REG_DOC'::text)
-> Index Scan using idx_lwniosek_lof_id on lwniosek b (cost=0.00..0.34
rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=5331)
Index Cond: (b.lof_id = a.lof_id)
-> Hash (cost=1.17..1.17 rows=17 width=30) (actual time=0.017..0.017
rows=17 loops=1)
-> Seq Scan on wniosek_status u_s (cost=0.00..1.17 rows=17 width=30)
(actual time=0.002..0.011 rows=17 loops=1)
-> Index Scan using lkalkulacja_pkey on lkalkulacja kalk
(cost=0.00..2.77 rows=1 width=8) (actual time=0.004..0.005 rows=1
loops=4703)
Index Cond: (kalk.lkalk_id = p.lkalk_id)
-> Index Scan using przedmioty_pkey on przedmioty prz (cost=0.00..0.39
rows=1 width=35) (actual time=0.003..0.004 rows=1 loops=4703)
Index Cond: (prz.przdmt_id = kalk.przdmt_id)
-> Index Scan using idx_lwniosek_lof_id on lwniosek w
(cost=316.85..344.37 rows=1 width=63) (actual time=0.008..0.009 rows=1
loops=4703)
Index Cond: (w.lof_id = p.lof_id)
Filter: (((w.pmask & 64) > 0) OR ((w.o_kontr_id = 1) AND ((w.pmask &
1024) > 0)) OR (((w.pmask & 16) > 0) AND (w.o_kontr_id = 1)) OR
(((w.pmask & 1) > 0) AND (w.o_uz_id = 1107)) OR (((w.pmask & 4) > 0) AND
(subplan)) OR (((w.pmask & 4096) > 0) AND (w.o_kontr_id = 1)) OR
(((w.pmask & 256) > 0) AND ((hashed subplan) OR (subplan) OR (subplan))))
SubPlan
-> Index Scan using idx_acl_kontr_aclk_acl_id_aclk_kontr_id on acl_kontr
(cost=0.00..8.28 rows=1 width=4) (never executed)
Index Cond: ((aclk_acl_id = $3) AND (aclk_kontr_id = 1))
Filter: ((aclk_pmask & 1) > 0)
-> Nested Loop (cost=0.00..29.30 rows=1 width=4) (never executed)
-> Seq Scan on acl_gr a (cost=0.00..4.45 rows=3 width=8) (never executed)
Filter: ((aclg_acl_id = $3) AND ((aclg_pmask & 1) > 0))
-> Index Scan using grupa_pkey on grupa g (cost=0.00..8.27 rows=1
width=4) (never executed)
Index Cond: (g.gr_id = a.aclg_gr_id)
Filter: (g.kontr_id = 1)
-> Hash Join (cost=24.13..316.84 rows=7 width=4) (actual
time=0.171..0.435 rows=64 loops=1)
Hash Cond: (u.uz_id = a.aclu_uz_id)
-> Bitmap Heap Scan on uzytkownik u (cost=21.29..312.58 rows=134
width=4) (actual time=0.065..0.254 rows=134 loops=1)
Recheck Cond: (kontr_id = 1)
-> Bitmap Index Scan on idx_uzytkownik__kontrahent (cost=0.00..21.26
rows=134 width=0) (actual time=0.046..0.046 rows=135 loops=1)
Index Cond: (kontr_id = 1)
-> Hash (cost=2.00..2.00 rows=67 width=8) (actual time=0.080..0.080
rows=67 loops=1)
-> Seq Scan on acl_uz a (cost=0.00..2.00 rows=67 width=8) (actual
time=0.017..0.051 rows=67 loops=1)
Filter: ((aclu_pmask & 1) > 0)
-> Index Scan using idx_grupa_uzytkownik__gr_uz on grupa_uzytkownik
(cost=0.00..8.27 rows=1 width=4) (actual time=0.003..0.003 rows=0
loops=4501)
Index Cond: ((gr_id = $1) AND (uz_id = 1107))
-> Index Scan using loferta_pkey on loferta o (cost=0.00..0.47 rows=1
width=16) (actual time=0.003..0.004 rows=1 loops=4703)
Index Cond: (o.lof_id = w.lof_id)
-> Hash (cost=1.10..1.10 rows=10 width=17) (actual time=0.011..0.011
rows=10 loops=1)
-> Seq Scan on przedmioty_rodzaje p_r (cost=0.00..1.10 rows=10 width=17)
(actual time=0.003..0.006 rows=10 loops=1)
-> Index Scan using uzytkownik_pkey on uzytkownik u (cost=0.00..0.34
rows=1 width=28) (actual time=0.003..0.004 rows=1 loops=4703)
Index Cond: (u.uz_id = o.sprzedawca_id)
-> Hash (cost=1.03..1.03 rows=3 width=30) (actual time=0.006..0.006
rows=3 loops=1)
-> Seq Scan on wniosek_status_edycja s_e (cost=0.00..1.03 rows=3
width=30) (actual time=0.003..0.004 rows=3 loops=1)
-> Hash (cost=1.17..1.17 rows=17 width=30) (actual time=0.035..0.035
rows=17 loops=1)
-> Seq Scan on wniosek_status s (cost=0.00..1.17 rows=17 width=30)
(actual time=0.016..0.024 rows=17 loops=1)
-> Index Scan using kontrahent_pkey on kontrahent dealer
(cost=0.00..1.02 rows=1 width=13) (actual time=0.003..0.004 rows=1
loops=4703)
Index Cond: (dealer.kontr_id = o.dealer_id)
-> Index Scan using kontrahent_pkey on kontrahent k (cost=0.00..1.02
rows=1 width=42) (actual time=0.004..0.004 rows=1 loops=4703)
Index Cond: (k.kontr_id = o.odbiorca_id)
Total runtime: 470.605 ms

Regards,
Piotr Foremniak

pgsql-general by date:

Previous
From: "Filip Rembiałkowski"
Date:
Subject: Re: psql verbose mode
Next
From: Bill Moran
Date:
Subject: Re: equal and like