Thread: Increasing GROUP BY CHAR columns speed
Group by using CHAR columns takes abnormally big time. How to speed it ? Andrus. 8.1.4, cluster locale is en-us, db encoding is utf-8 set search_path to firma2,public; explain analyze SELECT CASE WHEN bilkaib.raha='EEK' THEN 0 ELSE bilkaib.id END, bilkaib.DB, CASE WHEN dbkonto.objekt1='+' THEN bilkaib.DBOBJEKT ELSE null END:: CHAR(10) AS dbobjekt, CASE WHEN dbkonto.objekt2='+' THEN bilkaib.DB2OBJEKT ELSE null END:: CHAR(10) AS db2objekt, CASE WHEN dbkonto.objekt3='+' THEN bilkaib.DB3OBJEKT ELSE null END:: CHAR(10) AS db3objekt, CASE WHEN dbkonto.objekt4='+' THEN bilkaib.DB4OBJEKT ELSE null END:: CHAR(10) AS db4objekt, CASE WHEN dbkonto.objekt5='+' THEN bilkaib.DB5OBJEKT ELSE null END:: CHAR(10) AS db5objekt, CASE WHEN dbkonto.objekt6='+' THEN bilkaib.DB6OBJEKT ELSE null END:: CHAR(10) AS db6objekt, CASE WHEN dbkonto.objekt7='+' THEN bilkaib.DB7OBJEKT ELSE null END:: CHAR(10) AS db7objekt, CASE WHEN dbkonto.objekt8='+' THEN bilkaib.DB8OBJEKT ELSE null END:: CHAR(10) AS db8objekt, CASE WHEN dbkonto.objekt9='+' THEN bilkaib.DB9OBJEKT ELSE null END:: CHAR(10) AS db9objekt, bilkaib.CR, CASE WHEN crkonto.objekt1='+' THEN bilkaib.crOBJEKT ELSE null END:: CHAR(10) AS crobjekt, CASE WHEN crkonto.objekt2='+' THEN bilkaib.cr2OBJEKT ELSE null END:: CHAR(10) AS cr2objekt, CASE WHEN crkonto.objekt3='+' THEN bilkaib.cr3OBJEKT ELSE null END:: CHAR(10) AS cr3objekt, CASE WHEN crkonto.objekt4='+' THEN bilkaib.cr4OBJEKT ELSE null END:: CHAR(10) AS cr4objekt, CASE WHEN crkonto.objekt5='+' THEN bilkaib.cr5OBJEKT ELSE null END:: CHAR(10) AS cr5objekt, CASE WHEN crkonto.objekt6='+' THEN bilkaib.cr6OBJEKT ELSE null END:: CHAR(10) AS cr6objekt, CASE WHEN crkonto.objekt7='+' THEN bilkaib.cr7OBJEKT ELSE null END:: CHAR(10) AS cr7objekt, CASE WHEN crkonto.objekt8='+' THEN bilkaib.cr8OBJEKT ELSE null END:: CHAR(10) AS cr8objekt, CASE WHEN crkonto.objekt9='+' THEN bilkaib.cr9OBJEKT ELSE null END:: CHAR(10) AS cr9objekt, bilkaib.RAHA, CASE WHEN crkonto.klienkaupa OR dbkonto.klienkaupa OR crkonto.tyyp IN ('K','I') OR dbkonto.tyyp IN ('K','I') THEN bilkaib.KLIENT ELSE NULL END AS klient, bilkaib.EXCHRATE, CASE WHEN crkonto.klienkaupa OR dbkonto.klienkaupa OR crkonto.tyyp IN ('K','I') OR dbkonto.tyyp IN ('K','I') THEN klient.nimi ELSE NULL END AS kliendinim, -- 24. CAST(CASE WHEN crkonto.arvekaupa OR dbkonto.arvekaupa OR (bilkaib.cr<>'00' AND crkonto.tyyp='K') OR (bilkaib.db<>'00' AND dbkonto.tyyp='K') THEN bilkaib.doknr ELSE NULL END AS CHAR(25)) AS doknr ,bilkaib.ratediffer ,CASE WHEN bilkaib.raha='EEK' THEN DATE'20070101' ELSE bilkaib.kuupaev END AS kuupaev ,SUM(bilkaib.summa)::numeric(14,2) AS summa from BILKAIB join KONTO CRKONTO ON bilkaib.cr=crkonto.kontonr AND crkonto.iseloom='A' join KONTO DBKONTO ON bilkaib.db=dbkonto.kontonr AND dbkonto.iseloom='A' left join klient on bilkaib.klient=klient.kood where ( bilkaib.cr LIKE '112'||'%' OR bilkaib.db LIKE '112'||'%' ) AND bilkaib.kuupaev BETWEEN '2007-01-01' AND '2008-11-26' GROUP BY 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28 "GroupAggregate (cost=52316.23..61434.48 rows=41923 width=838) (actual time=10771.337..11372.135 rows=577 loops=1)" " -> Sort (cost=52316.23..52421.03 rows=41923 width=838) (actual time=10770.529..11012.651 rows=52156 loops=1)" " Sort Key: CASE WHEN (bilkaib.raha = 'EEK'::bpchar) THEN 0 ELSE bilkaib.id END, bilkaib.db, (CASE WHEN (dbkonto.objekt1 = '+'::bpchar) THEN bilkaib.dbobjekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt2 = '+'::bpchar) THEN bilkaib.db2objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt3 = '+'::bpchar) THEN bilkaib.db3objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt4 = '+'::bpchar) THEN bilkaib.db4objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt5 = '+'::bpchar) THEN bilkaib.db5objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt6 = '+'::bpchar) THEN bilkaib.db6objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt7 = '+'::bpchar) THEN bilkaib.db7objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt8 = '+'::bpchar) THEN bilkaib.db8objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt9 = '+'::bpchar) THEN bilkaib.db9objekt ELSE NULL::bpchar END)::character(10), bilkaib.cr, (CASE WHEN (crkonto.objekt1 = '+'::bpchar) THEN bilkaib.crobjekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt2 = '+'::bpchar) THEN bilkaib.cr2objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt3 = '+'::bpchar) THEN bilkaib.cr3objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt4 = '+'::bpchar) THEN bilkaib.cr4objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt5 = '+'::bpchar) THEN bilkaib.cr5objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt6 = '+'::bpchar) THEN bilkaib.cr6objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt7 = '+'::bpchar) THEN bilkaib.cr7objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt8 = '+'::bpchar) THEN bilkaib.cr8objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt9 = '+'::bpchar) THEN bilkaib.cr9objekt ELSE NULL::bpchar END)::character(10), bilkaib.raha, CASE WHEN ((crkonto.klienkaupa)::boolean OR (dbkonto.klienkaupa)::boolean OR (crkonto.tyyp = 'K'::bpchar) OR (crkonto.tyyp = 'I'::bpchar) OR (dbkonto.tyyp = 'K'::bpchar) OR (dbkonto.tyyp = 'I'::bpchar)) THEN bilkaib.klient ELSE NULL::bpchar END, bilkaib.exchrate, CASE WHEN ((crkonto.klienkaupa)::boolean OR (dbkonto.klienkaupa)::boolean OR (crkonto.tyyp = 'K'::bpchar) OR (crkonto.tyyp = 'I'::bpchar) OR (dbkonto.tyyp = 'K'::bpchar) OR (dbkonto.tyyp = 'I'::bpchar)) THEN klient.nimi ELSE NULL::bpchar END, (CASE WHEN ((crkonto.arvekaupa)::boolean OR (dbkonto.arvekaupa)::boolean OR ((bilkaib.cr <> '00'::bpchar) AND (crkonto.tyyp = 'K'::bpchar)) OR ((bilkaib.db <> '00'::bpchar) AND (dbkonto.tyyp = 'K'::bpchar))) THEN bilkaib.doknr ELSE NULL::bpchar END)::character(25), bilkaib.ratediffer, CASE WHEN (bilkaib.raha = 'EEK'::bpchar) THEN '2007-01-01'::date ELSE bilkaib.kuupaev END" " -> Hash Left Join (cost=936.48..40184.64 rows=41923 width=838) (actual time=46.000..2820.944 rows=52156 loops=1)" " Hash Cond: ("outer".klient = "inner".kood)" " -> Hash Join (cost=785.35..34086.74 rows=41923 width=764) (actual time=34.547..1563.790 rows=52156 loops=1)" " Hash Cond: ("outer".cr = "inner".kontonr)" " -> Hash Join (cost=764.26..33403.76 rows=48533 width=712) (actual time=32.069..1082.505 rows=52156 loops=1)" " Hash Cond: ("outer".db = "inner".kontonr)" " -> Bitmap Heap Scan on bilkaib (cost=743.17..32616.41 rows=56185 width=660) (actual time=29.652..518.289 rows=52156 loops=1)" " Recheck Cond: ((cr ~~ '112%'::text) OR (db ~~ '112%'::text))" " Filter: (((cr ~~ '112%'::text) OR (db ~~ '112%'::text)) AND (kuupaev >= '2007-01-01'::date) AND (kuupaev <= '2008-11-26'::date))" " -> BitmapOr (cost=743.17..743.17 rows=65862 width=0) (actual time=26.539..26.539 rows=0 loops=1)" " -> Bitmap Index Scan on bilkaib_cr_pattern_idx (cost=0.00..236.63 rows=20939 width=0) (actual time=8.510..8.510 rows=21028 loops=1)" " Index Cond: ((cr ~>=~ '112'::bpchar) AND (cr ~<~ '113'::bpchar))" " -> Bitmap Index Scan on bilkaib_db_pattern_idx (cost=0.00..506.54 rows=44923 width=0) (actual time=18.013..18.013 rows=45426 loops=1)" " Index Cond: ((db ~>=~ '112'::bpchar) AND (db ~<~ '113'::bpchar))" " -> Hash (cost=20.49..20.49 rows=241 width=66) (actual time=2.375..2.375 rows=241 loops=1)" " -> Seq Scan on konto dbkonto (cost=0.00..20.49 rows=241 width=66) (actual time=0.011..1.207 rows=241 loops=1)" " Filter: (iseloom = 'A'::bpchar)" " -> Hash (cost=20.49..20.49 rows=241 width=66) (actual time=2.451..2.451 rows=241 loops=1)" " -> Seq Scan on konto crkonto (cost=0.00..20.49 rows=241 width=66) (actual time=0.022..1.259 rows=241 loops=1)" " Filter: (iseloom = 'A'::bpchar)" " -> Hash (cost=147.90..147.90 rows=1290 width=90) (actual time=11.371..11.371 rows=1290 loops=1)" " -> Seq Scan on klient (cost=0.00..147.90 rows=1290 width=90) (actual time=0.009..5.587 rows=1290 loops=1)" "Total runtime: 11380.437 ms" If group by is removed same query runs 8 times (!) faster: set search_path to firma2,public; explain analyze SELECT SUM(bilkaib.summa)::numeric(14,2) AS summa from BILKAIB join KONTO CRKONTO ON bilkaib.cr=crkonto.kontonr AND crkonto.iseloom='A' join KONTO DBKONTO ON bilkaib.db=dbkonto.kontonr AND dbkonto.iseloom='A' left join klient on bilkaib.klient=klient.kood where ( bilkaib.cr LIKE '112'||'%' OR bilkaib.db LIKE '112'||'%' ) AND bilkaib.kuupaev BETWEEN '2007-01-01' AND '2008-11-26' "Aggregate (cost=34944.27..34944.28 rows=1 width=11) (actual time=1781.456..1781.460 rows=1 loops=1)" " -> Hash Left Join (cost=936.48..34839.46 rows=41923 width=11) (actual time=41.194..1545.105 rows=52156 loops=1)" " Hash Cond: ("outer".klient = "inner".kood)" " -> Hash Join (cost=785.35..34086.74 rows=41923 width=27) (actual time=30.372..1120.431 rows=52156 loops=1)" " Hash Cond: ("outer".cr = "inner".kontonr)" " -> Hash Join (cost=764.26..33403.76 rows=48533 width=41) (actual time=28.168..710.336 rows=52156 loops=1)" " Hash Cond: ("outer".db = "inner".kontonr)" " -> Bitmap Heap Scan on bilkaib (cost=743.17..32616.41 rows=56185 width=55) (actual time=25.970..294.638 rows=52156 loops=1)" " Recheck Cond: ((cr ~~ '112%'::text) OR (db ~~ '112%'::text))" " Filter: (((cr ~~ '112%'::text) OR (db ~~ '112%'::text)) AND (kuupaev >= '2007-01-01'::date) AND (kuupaev <= '2008-11-26'::date))" " -> BitmapOr (cost=743.17..743.17 rows=65862 width=0) (actual time=23.056..23.056 rows=0 loops=1)" " -> Bitmap Index Scan on bilkaib_cr_pattern_idx (cost=0.00..236.63 rows=20939 width=0) (actual time=7.414..7.414 rows=21028 loops=1)" " Index Cond: ((cr ~>=~ '112'::bpchar) AND (cr ~<~ '113'::bpchar))" " -> Bitmap Index Scan on bilkaib_db_pattern_idx (cost=0.00..506.54 rows=44923 width=0) (actual time=15.627..15.627 rows=45426 loops=1)" " Index Cond: ((db ~>=~ '112'::bpchar) AND (db ~<~ '113'::bpchar))" " -> Hash (cost=20.49..20.49 rows=241 width=14) (actual time=2.164..2.164 rows=241 loops=1)" " -> Seq Scan on konto dbkonto (cost=0.00..20.49 rows=241 width=14) (actual time=0.012..1.205 rows=241 loops=1)" " Filter: (iseloom = 'A'::bpchar)" " -> Hash (cost=20.49..20.49 rows=241 width=14) (actual time=2.177..2.177 rows=241 loops=1)" " -> Seq Scan on konto crkonto (cost=0.00..20.49 rows=241 width=14) (actual time=0.019..1.203 rows=241 loops=1)" " Filter: (iseloom = 'A'::bpchar)" " -> Hash (cost=147.90..147.90 rows=1290 width=16) (actual time=10.782..10.782 rows=1290 loops=1)" " -> Seq Scan on klient (cost=0.00..147.90 rows=1290 width=16) (actual time=0.009..5.597 rows=1290 loops=1)" "Total runtime: 1781.673 ms"
The below query is spending most of its time in the sort, or perhaps the complicated check condition before it. The explain has a 8 second gap in time between the 2.8 seconds after the Hash Left Join and before the Sort. I'm guessingits hidden in the sort. You can get the planner to switch from a sort to a hash aggregate with a large work_mem. Try calling SET work_mem = '100MB'; before this query first. It may not help that much if the check time is as expensive as it looks in the plan below, but its very easy to try. If it does help, you may want to temporarily increase that value only for this query rather than making it a default in theconfig file. ________________________________________ From: pgsql-performance-owner@postgresql.org [pgsql-performance-owner@postgresql.org] On Behalf Of Andrus [kobruleht2@hot.ee] Sent: Friday, November 28, 2008 7:04 AM To: pgsql-performance@postgresql.org Subject: [PERFORM] Increasing GROUP BY CHAR columns speed Group by using CHAR columns takes abnormally big time. How to speed it ? Andrus. 8.1.4, cluster locale is en-us, db encoding is utf-8 set search_path to firma2,public; explain analyze SELECT CASE WHEN bilkaib.raha='EEK' THEN 0 ELSE bilkaib.id END, bilkaib.DB, CASE WHEN dbkonto.objekt1='+' THEN bilkaib.DBOBJEKT ELSE null END:: CHAR(10) AS dbobjekt, CASE WHEN dbkonto.objekt2='+' THEN bilkaib.DB2OBJEKT ELSE null END:: CHAR(10) AS db2objekt, CASE WHEN dbkonto.objekt3='+' THEN bilkaib.DB3OBJEKT ELSE null END:: CHAR(10) AS db3objekt, CASE WHEN dbkonto.objekt4='+' THEN bilkaib.DB4OBJEKT ELSE null END:: CHAR(10) AS db4objekt, CASE WHEN dbkonto.objekt5='+' THEN bilkaib.DB5OBJEKT ELSE null END:: CHAR(10) AS db5objekt, CASE WHEN dbkonto.objekt6='+' THEN bilkaib.DB6OBJEKT ELSE null END:: CHAR(10) AS db6objekt, CASE WHEN dbkonto.objekt7='+' THEN bilkaib.DB7OBJEKT ELSE null END:: CHAR(10) AS db7objekt, CASE WHEN dbkonto.objekt8='+' THEN bilkaib.DB8OBJEKT ELSE null END:: CHAR(10) AS db8objekt, CASE WHEN dbkonto.objekt9='+' THEN bilkaib.DB9OBJEKT ELSE null END:: CHAR(10) AS db9objekt, bilkaib.CR, CASE WHEN crkonto.objekt1='+' THEN bilkaib.crOBJEKT ELSE null END:: CHAR(10) AS crobjekt, CASE WHEN crkonto.objekt2='+' THEN bilkaib.cr2OBJEKT ELSE null END:: CHAR(10) AS cr2objekt, CASE WHEN crkonto.objekt3='+' THEN bilkaib.cr3OBJEKT ELSE null END:: CHAR(10) AS cr3objekt, CASE WHEN crkonto.objekt4='+' THEN bilkaib.cr4OBJEKT ELSE null END:: CHAR(10) AS cr4objekt, CASE WHEN crkonto.objekt5='+' THEN bilkaib.cr5OBJEKT ELSE null END:: CHAR(10) AS cr5objekt, CASE WHEN crkonto.objekt6='+' THEN bilkaib.cr6OBJEKT ELSE null END:: CHAR(10) AS cr6objekt, CASE WHEN crkonto.objekt7='+' THEN bilkaib.cr7OBJEKT ELSE null END:: CHAR(10) AS cr7objekt, CASE WHEN crkonto.objekt8='+' THEN bilkaib.cr8OBJEKT ELSE null END:: CHAR(10) AS cr8objekt, CASE WHEN crkonto.objekt9='+' THEN bilkaib.cr9OBJEKT ELSE null END:: CHAR(10) AS cr9objekt, bilkaib.RAHA, CASE WHEN crkonto.klienkaupa OR dbkonto.klienkaupa OR crkonto.tyyp IN ('K','I') OR dbkonto.tyyp IN ('K','I') THEN bilkaib.KLIENT ELSE NULL END AS klient, bilkaib.EXCHRATE, CASE WHEN crkonto.klienkaupa OR dbkonto.klienkaupa OR crkonto.tyyp IN ('K','I') OR dbkonto.tyyp IN ('K','I') THEN klient.nimi ELSE NULL END AS kliendinim, -- 24. CAST(CASE WHEN crkonto.arvekaupa OR dbkonto.arvekaupa OR (bilkaib.cr<>'00' AND crkonto.tyyp='K') OR (bilkaib.db<>'00' AND dbkonto.tyyp='K') THEN bilkaib.doknr ELSE NULL END AS CHAR(25)) AS doknr ,bilkaib.ratediffer ,CASE WHEN bilkaib.raha='EEK' THEN DATE'20070101' ELSE bilkaib.kuupaev END AS kuupaev ,SUM(bilkaib.summa)::numeric(14,2) AS summa from BILKAIB join KONTO CRKONTO ON bilkaib.cr=crkonto.kontonr AND crkonto.iseloom='A' join KONTO DBKONTO ON bilkaib.db=dbkonto.kontonr AND dbkonto.iseloom='A' left join klient on bilkaib.klient=klient.kood where ( bilkaib.cr LIKE '112'||'%' OR bilkaib.db LIKE '112'||'%' ) AND bilkaib.kuupaev BETWEEN '2007-01-01' AND '2008-11-26' GROUP BY 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28 "GroupAggregate (cost=52316.23..61434.48 rows=41923 width=838) (actual time=10771.337..11372.135 rows=577 loops=1)" " -> Sort (cost=52316.23..52421.03 rows=41923 width=838) (actual time=10770.529..11012.651 rows=52156 loops=1)" " Sort Key: CASE WHEN (bilkaib.raha = 'EEK'::bpchar) THEN 0 ELSE bilkaib.id END, bilkaib.db, (CASE WHEN (dbkonto.objekt1 = '+'::bpchar) THEN bilkaib.dbobjekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt2 = '+'::bpchar) THEN bilkaib.db2objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt3 = '+'::bpchar) THEN bilkaib.db3objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt4 = '+'::bpchar) THEN bilkaib.db4objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt5 = '+'::bpchar) THEN bilkaib.db5objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt6 = '+'::bpchar) THEN bilkaib.db6objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt7 = '+'::bpchar) THEN bilkaib.db7objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt8 = '+'::bpchar) THEN bilkaib.db8objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt9 = '+'::bpchar) THEN bilkaib.db9objekt ELSE NULL::bpchar END)::character(10), bilkaib.cr, (CASE WHEN (crkonto.objekt1 = '+'::bpchar) THEN bilkaib.crobjekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt2 = '+'::bpchar) THEN bilkaib.cr2objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt3 = '+'::bpchar) THEN bilkaib.cr3objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt4 = '+'::bpchar) THEN bilkaib.cr4objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt5 = '+'::bpchar) THEN bilkaib.cr5objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt6 = '+'::bpchar) THEN bilkaib.cr6objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt7 = '+'::bpchar) THEN bilkaib.cr7objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt8 = '+'::bpchar) THEN bilkaib.cr8objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt9 = '+'::bpchar) THEN bilkaib.cr9objekt ELSE NULL::bpchar END)::character(10), bilkaib.raha, CASE WHEN ((crkonto.klienkaupa)::boolean OR (dbkonto.klienkaupa)::boolean OR (crkonto.tyyp = 'K'::bpchar) OR (crkonto.tyyp = 'I'::bpchar) OR (dbkonto.tyyp = 'K'::bpchar) OR (dbkonto.tyyp = 'I'::bpchar)) THEN bilkaib.klient ELSE NULL::bpchar END, bilkaib.exchrate, CASE WHEN ((crkonto.klienkaupa)::boolean OR (dbkonto.klienkaupa)::boolean OR (crkonto.tyyp = 'K'::bpchar) OR (crkonto.tyyp = 'I'::bpchar) OR (dbkonto.tyyp = 'K'::bpchar) OR (dbkonto.tyyp = 'I'::bpchar)) THEN klient.nimi ELSE NULL::bpchar END, (CASE WHEN ((crkonto.arvekaupa)::boolean OR (dbkonto.arvekaupa)::boolean OR ((bilkaib.cr <> '00'::bpchar) AND (crkonto.tyyp = 'K'::bpchar)) OR ((bilkaib.db <> '00'::bpchar) AND (dbkonto.tyyp = 'K'::bpchar))) THEN bilkaib.doknr ELSE NULL::bpchar END)::character(25), bilkaib.ratediffer, CASE WHEN (bilkaib.raha = 'EEK'::bpchar) THEN '2007-01-01'::date ELSE bilkaib.kuupaev END" " -> Hash Left Join (cost=936.48..40184.64 rows=41923 width=838) (actual time=46.000..2820.944 rows=52156 loops=1)" " Hash Cond: ("outer".klient = "inner".kood)" " -> Hash Join (cost=785.35..34086.74 rows=41923 width=764) (actual time=34.547..1563.790 rows=52156 loops=1)" " Hash Cond: ("outer".cr = "inner".kontonr)" " -> Hash Join (cost=764.26..33403.76 rows=48533 width=712) (actual time=32.069..1082.505 rows=52156 loops=1)" " Hash Cond: ("outer".db = "inner".kontonr)" " -> Bitmap Heap Scan on bilkaib (cost=743.17..32616.41 rows=56185 width=660) (actual time=29.652..518.289 rows=52156 loops=1)" " Recheck Cond: ((cr ~~ '112%'::text) OR (db ~~ '112%'::text))" " Filter: (((cr ~~ '112%'::text) OR (db ~~ '112%'::text)) AND (kuupaev >= '2007-01-01'::date) AND (kuupaev <= '2008-11-26'::date))" " -> BitmapOr (cost=743.17..743.17 rows=65862 width=0) (actual time=26.539..26.539 rows=0 loops=1)" " -> Bitmap Index Scan on bilkaib_cr_pattern_idx (cost=0.00..236.63 rows=20939 width=0) (actual time=8.510..8.510 rows=21028 loops=1)" " Index Cond: ((cr ~>=~ '112'::bpchar) AND (cr ~<~ '113'::bpchar))" " -> Bitmap Index Scan on bilkaib_db_pattern_idx (cost=0.00..506.54 rows=44923 width=0) (actual time=18.013..18.013 rows=45426 loops=1)" " Index Cond: ((db ~>=~ '112'::bpchar) AND (db ~<~ '113'::bpchar))" " -> Hash (cost=20.49..20.49 rows=241 width=66) (actual time=2.375..2.375 rows=241 loops=1)" " -> Seq Scan on konto dbkonto (cost=0.00..20.49 rows=241 width=66) (actual time=0.011..1.207 rows=241 loops=1)" " Filter: (iseloom = 'A'::bpchar)" " -> Hash (cost=20.49..20.49 rows=241 width=66) (actual time=2.451..2.451 rows=241 loops=1)" " -> Seq Scan on konto crkonto (cost=0.00..20.49 rows=241 width=66) (actual time=0.022..1.259 rows=241 loops=1)" " Filter: (iseloom = 'A'::bpchar)" " -> Hash (cost=147.90..147.90 rows=1290 width=90) (actual time=11.371..11.371 rows=1290 loops=1)" " -> Seq Scan on klient (cost=0.00..147.90 rows=1290 width=90) (actual time=0.009..5.587 rows=1290 loops=1)" "Total runtime: 11380.437 ms" If group by is removed same query runs 8 times (!) faster: set search_path to firma2,public; explain analyze SELECT SUM(bilkaib.summa)::numeric(14,2) AS summa from BILKAIB join KONTO CRKONTO ON bilkaib.cr=crkonto.kontonr AND crkonto.iseloom='A' join KONTO DBKONTO ON bilkaib.db=dbkonto.kontonr AND dbkonto.iseloom='A' left join klient on bilkaib.klient=klient.kood where ( bilkaib.cr LIKE '112'||'%' OR bilkaib.db LIKE '112'||'%' ) AND bilkaib.kuupaev BETWEEN '2007-01-01' AND '2008-11-26' "Aggregate (cost=34944.27..34944.28 rows=1 width=11) (actual time=1781.456..1781.460 rows=1 loops=1)" " -> Hash Left Join (cost=936.48..34839.46 rows=41923 width=11) (actual time=41.194..1545.105 rows=52156 loops=1)" " Hash Cond: ("outer".klient = "inner".kood)" " -> Hash Join (cost=785.35..34086.74 rows=41923 width=27) (actual time=30.372..1120.431 rows=52156 loops=1)" " Hash Cond: ("outer".cr = "inner".kontonr)" " -> Hash Join (cost=764.26..33403.76 rows=48533 width=41) (actual time=28.168..710.336 rows=52156 loops=1)" " Hash Cond: ("outer".db = "inner".kontonr)" " -> Bitmap Heap Scan on bilkaib (cost=743.17..32616.41 rows=56185 width=55) (actual time=25.970..294.638 rows=52156 loops=1)" " Recheck Cond: ((cr ~~ '112%'::text) OR (db ~~ '112%'::text))" " Filter: (((cr ~~ '112%'::text) OR (db ~~ '112%'::text)) AND (kuupaev >= '2007-01-01'::date) AND (kuupaev <= '2008-11-26'::date))" " -> BitmapOr (cost=743.17..743.17 rows=65862 width=0) (actual time=23.056..23.056 rows=0 loops=1)" " -> Bitmap Index Scan on bilkaib_cr_pattern_idx (cost=0.00..236.63 rows=20939 width=0) (actual time=7.414..7.414 rows=21028 loops=1)" " Index Cond: ((cr ~>=~ '112'::bpchar) AND (cr ~<~ '113'::bpchar))" " -> Bitmap Index Scan on bilkaib_db_pattern_idx (cost=0.00..506.54 rows=44923 width=0) (actual time=15.627..15.627 rows=45426 loops=1)" " Index Cond: ((db ~>=~ '112'::bpchar) AND (db ~<~ '113'::bpchar))" " -> Hash (cost=20.49..20.49 rows=241 width=14) (actual time=2.164..2.164 rows=241 loops=1)" " -> Seq Scan on konto dbkonto (cost=0.00..20.49 rows=241 width=14) (actual time=0.012..1.205 rows=241 loops=1)" " Filter: (iseloom = 'A'::bpchar)" " -> Hash (cost=20.49..20.49 rows=241 width=14) (actual time=2.177..2.177 rows=241 loops=1)" " -> Seq Scan on konto crkonto (cost=0.00..20.49 rows=241 width=14) (actual time=0.019..1.203 rows=241 loops=1)" " Filter: (iseloom = 'A'::bpchar)" " -> Hash (cost=147.90..147.90 rows=1290 width=16) (actual time=10.782..10.782 rows=1290 loops=1)" " -> Seq Scan on klient (cost=0.00..147.90 rows=1290 width=16) (actual time=0.009..5.597 rows=1290 loops=1)" "Total runtime: 1781.673 ms" -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Scott, Thank you. >The below query is spending most of its time in the sort, or perhaps the >complicated check condition before it. >The explain has a 8 second gap in time between the 2.8 seconds after the >Hash Left Join and before the Sort. I'm guessing its hidden in the sort. >You can get the planner to switch from a sort to a hash aggregate with a >large work_mem. Try calling >SET work_mem = '100MB'; >before this query first. >It may not help that much if the check time is as expensive as it looks in >the plan below, but its very easy to try. >If it does help, you may want to temporarily increase that value only for >this query rather than making it a default in the config file. SET work_mem = 2097151 (this is max allowed value) or SET work_mem = 97151 decreases query time from 12 seconds to 9 seconds. My application may ran in servers with 1 GB RAM only. I'm afraid than in those servers 2097151 will cause error and abort query. Is it reasonable to add SET work_mem = 97151 before this query and SET work_mem TO DEFAULT after this query ? Or should I use max value in cases where there are much more data ? This query may return a much more data for longer period and more accounts. CASE WHEN dbkonto.objekt1='+' THEN bilkaib.DBOBJEKT ELSE null END:: CHAR(10) AS dbobjekt is ugly. I tried to rewrite it using NullIfNot(dbkonto.objekt1, '+') AS dbobjekt, bot got error ERROR: function nullifnot(character, "unknown") does not exist How to re-write this in nicer and faster way ? For most of rows checks WHEN objektn='+' will fail: objektn values are usually rarely equal to '+': they are empty or null mostly. Maybe this can be used to optimize the query. Btw. Tom Lane's reply from earlier discussion about this query speed (then there were '' instead of NULL in group columns) some years ago: "I think the problem is probably that you're sorting two dozen CHAR columns, and that in many of the rows all these entries are '' forcing the sort code to compare all two dozen columns (not so)? So the sort ends up doing lots and lots and lots of CHAR comparisons. Which can be slow, especially in non-C locales." locale specific check is not nessecary for those CHAR(10) columns. How to force PostgreSql to use binary check for grouping ? Some dbms allow to mark columns as C locale. I havent found this nor chartobin() function in PostgreSql. Will creating BinaryNullIfNot(dbkonto.objekt1, '+') function solve this ? Andrus. New testcase: set search_path to firma2,public; SET work_mem = 2097151; -- 9 seconds -- SET work_mem = 1097151; -- 9 seconds --SET work_mem to default; -- 12 seconds explain analyze SELECT CASE WHEN bilkaib.raha='EEK' THEN 0 ELSE bilkaib.id END, bilkaib.DB, CASE WHEN dbkonto.objekt1='+' THEN bilkaib.DBOBJEKT ELSE null END:: CHAR(10) AS dbobjekt, CASE WHEN dbkonto.objekt2='+' THEN bilkaib.DB2OBJEKT ELSE null END:: CHAR(10) AS db2objekt, CASE WHEN dbkonto.objekt3='+' THEN bilkaib.DB3OBJEKT ELSE null END:: CHAR(10) AS db3objekt, CASE WHEN dbkonto.objekt4='+' THEN bilkaib.DB4OBJEKT ELSE null END:: CHAR(10) AS db4objekt, CASE WHEN dbkonto.objekt5='+' THEN bilkaib.DB5OBJEKT ELSE null END:: CHAR(10) AS db5objekt, CASE WHEN dbkonto.objekt6='+' THEN bilkaib.DB6OBJEKT ELSE null END:: CHAR(10) AS db6objekt, CASE WHEN dbkonto.objekt7='+' THEN bilkaib.DB7OBJEKT ELSE null END:: CHAR(10) AS db7objekt, CASE WHEN dbkonto.objekt8='+' THEN bilkaib.DB8OBJEKT ELSE null END:: CHAR(10) AS db8objekt, CASE WHEN dbkonto.objekt9='+' THEN bilkaib.DB9OBJEKT ELSE null END:: CHAR(10) AS db9objekt, bilkaib.CR, CASE WHEN crkonto.objekt1='+' THEN bilkaib.crOBJEKT ELSE null END:: CHAR(10) AS crobjekt, CASE WHEN crkonto.objekt2='+' THEN bilkaib.cr2OBJEKT ELSE null END:: CHAR(10) AS cr2objekt, CASE WHEN crkonto.objekt3='+' THEN bilkaib.cr3OBJEKT ELSE null END:: CHAR(10) AS cr3objekt, CASE WHEN crkonto.objekt4='+' THEN bilkaib.cr4OBJEKT ELSE null END:: CHAR(10) AS cr4objekt, CASE WHEN crkonto.objekt5='+' THEN bilkaib.cr5OBJEKT ELSE null END:: CHAR(10) AS cr5objekt, CASE WHEN crkonto.objekt6='+' THEN bilkaib.cr6OBJEKT ELSE null END:: CHAR(10) AS cr6objekt, CASE WHEN crkonto.objekt7='+' THEN bilkaib.cr7OBJEKT ELSE null END:: CHAR(10) AS cr7objekt, CASE WHEN crkonto.objekt8='+' THEN bilkaib.cr8OBJEKT ELSE null END:: CHAR(10) AS cr8objekt, CASE WHEN crkonto.objekt9='+' THEN bilkaib.cr9OBJEKT ELSE null END:: CHAR(10) AS cr9objekt, bilkaib.RAHA, CASE WHEN crkonto.klienkaupa OR dbkonto.klienkaupa OR crkonto.tyyp IN ('K','I') OR dbkonto.tyyp IN ('K','I') THEN bilkaib.KLIENT ELSE NULL END AS klient, bilkaib.EXCHRATE, CASE WHEN crkonto.klienkaupa OR dbkonto.klienkaupa OR crkonto.tyyp IN ('K','I') OR dbkonto.tyyp IN ('K','I') THEN klient.nimi ELSE NULL END AS kliendinim, -- 24. CAST(CASE WHEN crkonto.arvekaupa OR dbkonto.arvekaupa OR (bilkaib.cr<>'00' AND crkonto.tyyp='K') OR (bilkaib.db<>'00' AND dbkonto.tyyp='K') THEN bilkaib.doknr ELSE NULL END AS CHAR(25)) AS doknr ,bilkaib.ratediffer ,CASE WHEN bilkaib.raha='EEK' THEN DATE'20070101' ELSE bilkaib.kuupaev END AS kuupaev ,SUM(bilkaib.summa)::numeric(14,2) AS summa from BILKAIB join KONTO CRKONTO ON bilkaib.cr=crkonto.kontonr AND crkonto.iseloom='A' join KONTO DBKONTO ON bilkaib.db=dbkonto.kontonr AND dbkonto.iseloom='A' left join klient on bilkaib.klient=klient.kood where ( bilkaib.cr LIKE '112'||'%' OR bilkaib.db LIKE '112'||'%' ) AND bilkaib.kuupaev BETWEEN '2007-01-01' AND '2008-11-26' GROUP BY 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28 "GroupAggregate (cost=43403.38..52521.63 rows=41923 width=838) (actual time=8083.171..8620.908 rows=577 loops=1)" " -> Sort (cost=43403.38..43508.19 rows=41923 width=838) (actual time=8082.456..8273.259 rows=52156 loops=1)" " Sort Key: CASE WHEN (bilkaib.raha = 'EEK'::bpchar) THEN 0 ELSE bilkaib.id END, bilkaib.db, (CASE WHEN (dbkonto.objekt1 = '+'::bpchar) THEN bilkaib.dbobjekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt2 = '+'::bpchar) THEN bilkaib.db2objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt3 = '+'::bpchar) THEN bilkaib.db3objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt4 = '+'::bpchar) THEN bilkaib.db4objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt5 = '+'::bpchar) THEN bilkaib.db5objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt6 = '+'::bpchar) THEN bilkaib.db6objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt7 = '+'::bpchar) THEN bilkaib.db7objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt8 = '+'::bpchar) THEN bilkaib.db8objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (dbkonto.objekt9 = '+'::bpchar) THEN bilkaib.db9objekt ELSE NULL::bpchar END)::character(10), bilkaib.cr, (CASE WHEN (crkonto.objekt1 = '+'::bpchar) THEN bilkaib.crobjekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt2 = '+'::bpchar) THEN bilkaib.cr2objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt3 = '+'::bpchar) THEN bilkaib.cr3objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt4 = '+'::bpchar) THEN bilkaib.cr4objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt5 = '+'::bpchar) THEN bilkaib.cr5objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt6 = '+'::bpchar) THEN bilkaib.cr6objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt7 = '+'::bpchar) THEN bilkaib.cr7objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt8 = '+'::bpchar) THEN bilkaib.cr8objekt ELSE NULL::bpchar END)::character(10), (CASE WHEN (crkonto.objekt9 = '+'::bpchar) THEN bilkaib.cr9objekt ELSE NULL::bpchar END)::character(10), bilkaib.raha, CASE WHEN ((crkonto.klienkaupa)::boolean OR (dbkonto.klienkaupa)::boolean OR (crkonto.tyyp = 'K'::bpchar) OR (crkonto.tyyp = 'I'::bpchar) OR (dbkonto.tyyp = 'K'::bpchar) OR (dbkonto.tyyp = 'I'::bpchar)) THEN bilkaib.klient ELSE NULL::bpchar END, bilkaib.exchrate, CASE WHEN ((crkonto.klienkaupa)::boolean OR (dbkonto.klienkaupa)::boolean OR (crkonto.tyyp = 'K'::bpchar) OR (crkonto.tyyp = 'I'::bpchar) OR (dbkonto.tyyp = 'K'::bpchar) OR (dbkonto.tyyp = 'I'::bpchar)) THEN klient.nimi ELSE NULL::bpchar END, (CASE WHEN ((crkonto.arvekaupa)::boolean OR (dbkonto.arvekaupa)::boolean OR ((bilkaib.cr <> '00'::bpchar) AND (crkonto.tyyp = 'K'::bpchar)) OR ((bilkaib.db <> '00'::bpchar) AND (dbkonto.tyyp = 'K'::bpchar))) THEN bilkaib.doknr ELSE NULL::bpchar END)::character(25), bilkaib.ratediffer, CASE WHEN (bilkaib.raha = 'EEK'::bpchar) THEN '2007-01-01'::date ELSE bilkaib.kuupaev END" " -> Hash Left Join (cost=936.48..40184.64 rows=41923 width=838) (actual time=47.409..2427.059 rows=52156 loops=1)" " Hash Cond: ("outer".klient = "inner".kood)" " -> Hash Join (cost=785.35..34086.74 rows=41923 width=764) (actual time=35.669..1414.794 rows=52156 loops=1)" " Hash Cond: ("outer".cr = "inner".kontonr)" " -> Hash Join (cost=764.26..33403.76 rows=48533 width=712) (actual time=32.839..954.784 rows=52156 loops=1)" " Hash Cond: ("outer".db = "inner".kontonr)" " -> Bitmap Heap Scan on bilkaib (cost=743.17..32616.41 rows=56185 width=660) (actual time=30.337..448.153 rows=52156 loops=1)" " Recheck Cond: ((cr ~~ '112%'::text) OR (db ~~ '112%'::text))" " Filter: (((cr ~~ '112%'::text) OR (db ~~ '112%'::text)) AND (kuupaev >= '2007-01-01'::date) AND (kuupaev <= '2008-11-26'::date))" " -> BitmapOr (cost=743.17..743.17 rows=65862 width=0) (actual time=27.194..27.194 rows=0 loops=1)" " -> Bitmap Index Scan on bilkaib_cr_pattern_idx (cost=0.00..236.63 rows=20939 width=0) (actual time=8.833..8.833 rows=21028 loops=1)" " Index Cond: ((cr ~>=~ '112'::bpchar) AND (cr ~<~ '113'::bpchar))" " -> Bitmap Index Scan on bilkaib_db_pattern_idx (cost=0.00..506.54 rows=44923 width=0) (actual time=18.345..18.345 rows=45426 loops=1)" " Index Cond: ((db ~>=~ '112'::bpchar) AND (db ~<~ '113'::bpchar))" " -> Hash (cost=20.49..20.49 rows=241 width=66) (actual time=2.450..2.450 rows=241 loops=1)" " -> Seq Scan on konto dbkonto (cost=0.00..20.49 rows=241 width=66) (actual time=0.014..1.232 rows=241 loops=1)" " Filter: (iseloom = 'A'::bpchar)" " -> Hash (cost=20.49..20.49 rows=241 width=66) (actual time=2.799..2.799 rows=241 loops=1)" " -> Seq Scan on konto crkonto (cost=0.00..20.49 rows=241 width=66) (actual time=0.029..1.536 rows=241 loops=1)" " Filter: (iseloom = 'A'::bpchar)" " -> Hash (cost=147.90..147.90 rows=1290 width=90) (actual time=11.661..11.661 rows=1290 loops=1)" " -> Seq Scan on klient (cost=0.00..147.90 rows=1290 width=90) (actual time=0.014..5.808 rows=1290 loops=1)" "Total runtime: 8634.630 ms"
On Fri, Nov 28, 2008 at 10:58 AM, Andrus <kobruleht2@hot.ee> wrote: > > SET work_mem = 2097151 (this is max allowed value) or SET work_mem = 97151 > decreases query time from 12 seconds to 9 seconds. Setting work_mem so high that it allocates memory that isn't there WILL slow down your queries, because the OS will then wind up swapping out one part of the swap to make room for another part. There are values between 100M and 2G. Run it with increasing work_mem from 100 to 500 or so Meg and see if that helps. Keep an eye on vmstat 1 or something to see if your machine starts swapping out while running the query. If it does you've overallocated memory and things will start to slow down a LOT.
I'm not sure that postgres allocates the whole work_mem each time, and in any event, the OS won't try and allocate to RAMuntil the page is actually used. My experience is that oversized work_mem doesn't hurt until it is actually used. Else,my configuration with 1000 concurrent connections and work_mem = 1GB would have blown up. I don't have that much RAM+ SWAP * overcommit. Of the 1000 connections, only 8 run queries that would ever need more than 2 or 3 MB of space toexecute. Of course, one has to be certain what the client connections do for it to be very over sized, so I would notrecommend the above in general. ---------- Back to this query: In the query case shown, the explain analyze shows: "GroupAggregate (cost=43403.38..52521.63 rows=41923 width=838) (actual time=8083.171..8620.908 rows=577 loops=1)" Thus, the planner thought that it needed ~40K ish rows for results of ~800 bytes in size, hence an approximation of the requiredhash space is 80M. However, it returns only 577 rows, so the actual needed hash space is much smaller. This isa common problem I've seen -- the query planner has very conservative estimates for result row counts from any non-trivialfilter condition / aggregate which leads to poor query plans. I'd be suprised if this query used more than 1MB total work_mem in reality for that last step if it used a hash. As it stands,sorting will actually use much more. I'm still not sure why the planner chose to sort rather than hash with oversized work_mem (is there an implied order in thequery results I missed?). My guess is that this query can still get much faster if a hash is possible on the last part. It looks like the gain so far has more to do with sorting purely in memory which reduced the number of compares required. But that is just a guess. ________________________________________ From: Scott Marlowe [scott.marlowe@gmail.com] Sent: Friday, November 28, 2008 10:24 AM To: Andrus Cc: Scott Carey; pgsql-performance@postgresql.org Subject: Re: [PERFORM] Increasing GROUP BY CHAR columns speed On Fri, Nov 28, 2008 at 10:58 AM, Andrus <kobruleht2@hot.ee> wrote: > > SET work_mem = 2097151 (this is max allowed value) or SET work_mem = 97151 > decreases query time from 12 seconds to 9 seconds. Setting work_mem so high that it allocates memory that isn't there WILL slow down your queries, because the OS will then wind up swapping out one part of the swap to make room for another part. There are values between 100M and 2G. Run it with increasing work_mem from 100 to 500 or so Meg and see if that helps. Keep an eye on vmstat 1 or something to see if your machine starts swapping out while running the query. If it does you've overallocated memory and things will start to slow down a LOT.
Application should work with any server starting at 8.1 with any RAM size (probably starting at 1 GB). How to find safe value which does not cause error in SET work_mem command ? If I use 2 GB maybe this can cause excaption when running in server with 1 GB RAM where this setting may be not allowed. Andrus.
I it seems that slowness is caused by grouping by column exchrate numeric(13,8) if this column is excluded, query takes 12 seconds if this column in present, query takes 27 (!) seconds. How to fix this ? Andrus. set search_path to firma2,public; SET work_mem = 2097151 ; explain analyze SELECT CASE WHEN bilkaib.raha='EEK' THEN 0 ELSE bilkaib.id END, bilkaib.DB, bilkaib.CR, bilkaib.RAHA, -- 12 sek bilkaib.EXCHRATE, -- 27 sec SUM(bilkaib.summa)::numeric(14,2) AS summa from BILKAIB join KONTO CRKONTO ON bilkaib.cr=crkonto.kontonr AND crkonto.iseloom='A' join KONTO DBKONTO ON bilkaib.db=dbkonto.kontonr AND dbkonto.iseloom='A' left join klient on bilkaib.klient=klient.kood WHERE --(bilkaib.cr LIKE ''||'%' OR bilkaib.db LIKE ''||'%') bilkaib.kuupaev BETWEEN '2008-01-01' AND '2008-12-31' GROUP BY 1,2,3,4,5 "GroupAggregate (cost=71338.72..79761.05 rows=240638 width=58) (actual time=24570.085..27382.022 rows=217 loops=1)" " -> Sort (cost=71338.72..71940.31 rows=240638 width=58) (actual time=24566.700..25744.006 rows=322202 loops=1)" " Sort Key: CASE WHEN (bilkaib.raha = 'EEK'::bpchar) THEN 0 ELSE bilkaib.id END, bilkaib.db, bilkaib.cr, bilkaib.raha, bilkaib.exchrate" " -> Hash Left Join (cost=193.31..49829.89 rows=240638 width=58) (actual time=17.072..9901.578 rows=322202 loops=1)" " Hash Cond: ("outer".klient = "inner".kood)" " -> Hash Join (cost=42.18..45624.00 rows=240638 width=74) (actual time=4.715..7151.111 rows=322202 loops=1)" " Hash Cond: ("outer".cr = "inner".kontonr)" " -> Hash Join (cost=21.09..41803.63 rows=278581 width=74) (actual time=2.306..4598.703 rows=322202 loops=1)" " Hash Cond: ("outer".db = "inner".kontonr)" " -> Seq Scan on bilkaib (cost=0.00..37384.19 rows=322507 width=74) (actual time=0.075..1895.027 rows=322202 loops=1)" " Filter: ((kuupaev >= '2008-01-01'::date) AND (kuupaev <= '2008-12-31'::date))" " -> Hash (cost=20.49..20.49 rows=241 width=14) (actual time=2.193..2.193 rows=241 loops=1)" " -> Seq Scan on konto dbkonto (cost=0.00..20.49 rows=241 width=14) (actual time=0.011..1.189 rows=241 loops=1)" " Filter: (iseloom = 'A'::bpchar)" " -> Hash (cost=20.49..20.49 rows=241 width=14) (actual time=2.386..2.386 rows=241 loops=1)" " -> Seq Scan on konto crkonto (cost=0.00..20.49 rows=241 width=14) (actual time=0.020..1.394 rows=241 loops=1)" " Filter: (iseloom = 'A'::bpchar)" " -> Hash (cost=147.90..147.90 rows=1290 width=16) (actual time=12.319..12.319 rows=1290 loops=1)" " -> Seq Scan on klient (cost=0.00..147.90 rows=1290 width=16) (actual time=0.032..6.979 rows=1290 loops=1)" "Total runtime: 27434.724 ms" set search_path to firma2,public; SET work_mem = 2097151 ; explain analyze SELECT CASE WHEN bilkaib.raha='EEK' THEN 0 ELSE bilkaib.id END, bilkaib.DB, bilkaib.CR, bilkaib.RAHA, SUM(bilkaib.summa)::numeric(14,2) AS summa from BILKAIB join KONTO CRKONTO ON bilkaib.cr=crkonto.kontonr AND crkonto.iseloom='A' join KONTO DBKONTO ON bilkaib.db=dbkonto.kontonr AND dbkonto.iseloom='A' left join klient on bilkaib.klient=klient.kood WHERE bilkaib.kuupaev BETWEEN '2008-01-01' AND '2008-12-31' GROUP BY 1,2,3,4 "HashAggregate (cost=52837.86..57049.03 rows=240638 width=50) (actual time=11744.137..11745.578 rows=215 loops=1)" " -> Hash Left Join (cost=193.31..49829.89 rows=240638 width=50) (actual time=17.330..9826.549 rows=322202 loops=1)" " Hash Cond: ("outer".klient = "inner".kood)" " -> Hash Join (cost=42.18..45624.00 rows=240638 width=66) (actual time=4.804..7141.983 rows=322202 loops=1)" " Hash Cond: ("outer".cr = "inner".kontonr)" " -> Hash Join (cost=21.09..41803.63 rows=278581 width=66) (actual time=2.343..4600.683 rows=322202 loops=1)" " Hash Cond: ("outer".db = "inner".kontonr)" " -> Seq Scan on bilkaib (cost=0.00..37384.19 rows=322507 width=66) (actual time=0.081..1939.376 rows=322202 loops=1)" " Filter: ((kuupaev >= '2008-01-01'::date) AND (kuupaev <= '2008-12-31'::date))" " -> Hash (cost=20.49..20.49 rows=241 width=14) (actual time=2.207..2.207 rows=241 loops=1)" " -> Seq Scan on konto dbkonto (cost=0.00..20.49 rows=241 width=14) (actual time=0.014..1.179 rows=241 loops=1)" " Filter: (iseloom = 'A'::bpchar)" " -> Hash (cost=20.49..20.49 rows=241 width=14) (actual time=2.426..2.426 rows=241 loops=1)" " -> Seq Scan on konto crkonto (cost=0.00..20.49 rows=241 width=14) (actual time=0.029..1.444 rows=241 loops=1)" " Filter: (iseloom = 'A'::bpchar)" " -> Hash (cost=147.90..147.90 rows=1290 width=16) (actual time=12.477..12.477 rows=1290 loops=1)" " -> Seq Scan on klient (cost=0.00..147.90 rows=1290 width=16) (actual time=0.034..7.081 rows=1290 loops=1)" "Total runtime: 11748.066 ms"
On Fri, Nov 28, 2008 at 12:51 PM, Scott Carey <scott@richrelevance.com> wrote: > I'm not sure that postgres allocates the whole work_mem each time, and in any event, the OS won't try and allocate to RAMuntil the page is actually used. My experience is that oversized work_mem doesn't hurt until it is actually used. Else,my configuration with 1000 concurrent connections and work_mem = 1GB would have blown up. I don't have that much RAM+ SWAP * overcommit. Of the 1000 connections, only 8 run queries that would ever need more than 2 or 3 MB of space toexecute. Of course, one has to be certain what the client connections do for it to be very over sized, so I would notrecommend the above in general. That's kinda like saying I have a revolver with 100 chambers and only one bullet, and it seems perfectly safe to put it to my head and keep pulling the trigger. Of course pg doesn't allocate the whole amount every time. It allocates what it needs, up to the max you allow. by setting it to 1G it's quite possible that eventually enough queries will be running that need a fair bit of work_mem and suddenly bang, your server is in a swap storm and goes non-responsive. It's far better to set it to something reasonable, like 4 or 8 Meg, then for the individual queries that need more set it higher at run time.
>I it seems that slowness is caused by grouping by column > > exchrate numeric(13,8) exchrate has different values in few rows. It has usually value 0 In this sample query it is always 0. I tried not change exchrate with nullif( bilkaib.EXCHRATE,0) but this does not up speed query, no idea why. Andrus.
>I'm still not sure why the planner chose to sort rather than hash with >oversized work_mem (is there an implied order in the query results I >missed?). Group by contains decimal column exchrate. Maybe pg is not capable to use hash with numeric datatype. > My guess is that this query can still get much faster if a hash is > possible on the last part. It looks like the gain so far has more to do > with sorting purely in memory which reduced the number of compares > required. But that is just a guess. I fixed this by adding cast to :::float bilkaib.exchrate:::float In this case query is much faster. Hopefully this will not affect to result since numeric(13,8) can casted to float without data loss. Andrus.
On Sat, Nov 29, 2008 at 6:43 PM, Andrus <kobruleht2@hot.ee> wrote: >> I'm still not sure why the planner chose to sort rather than hash with >> oversized work_mem (is there an implied order in the query results I >> missed?). > > Group by contains decimal column exchrate. Maybe pg is not capable to use > hash with numeric datatype. It is in 8.3. I think sorting was improved dramatically since 8.1 as well. > I fixed this by adding cast to :::float > > bilkaib.exchrate:::float > > In this case query is much faster. > Hopefully this will not affect to result since numeric(13,8) can casted to > float without data loss. That's not true. Even pretty simple values like 1.1 cannot be represented precisely in a float. It would display properly though which might be all you're concerned with here. I'm not sure whether that's true for all values in numeric(13,8) though Do you really need to be grouping on so many columns? If they're normally all the same perhaps you can do two queries, one which fetches the common values without any group by, just a simple aggregate, and a second which groups by all these columns but only for the few exceptional records. You could avoid the collation support on the char() columns by casting them to bytea first. That might be a bit of a headache though. -- greg