Thread: performance with query
Hi everybody, I'm creating my database on Postgres and after some days of hard work I've arrived at good performance and awful performance with the same configuration.

I have a complex query that performs very well with mergejoin on and nestloop off. If I activate nestloop, Postgres tries to use it and the query execution becomes inconclusive: after 3 hours still no answer, so I kill the query. That's OK, but with this configuration a very simple and little query like "select column from table where primarykey = value" becomes incredibly slow. The only solution I found at the moment is to set mergejoin to off before running this kind of query. That is an awful solution, because it means I have to change all the software (a big, old application) in the (many) points in which this kind of query is used (and I have the same problem if I set mergejoin to off for the whole system and activate it only on the connections that have to run the hard query).

Do you have any suggestion to accelerate both the complex and the simple queries? I've tried a lot of combinations of the "Planner Method Configuration" settings, but the only one that really accelerates the hard query is mergejoin on and nestloop off; the other settings seem to be useless.

Thanks in advance.
On Tue, Jun 16, 2009 at 03:37:42PM +0200, Alberto Dalmaso wrote:
> Hi everybody, I'm creating my database on Postgres and after some days
> of hard work I've arrived at good performance and awful performance
> with the same configuration.
> I have a complex query that performs very well with mergejoin on and
> nestloop off. If I activate nestloop, Postgres tries to use it and the
> query execution becomes inconclusive: after 3 hours still no answer,
> so I kill the query. That's OK, but with this configuration a very
> simple and little query like "select column from table where
> primarykey = value" becomes incredibly slow. The only solution I found
> at the moment is to set mergejoin to off before running this kind of
> query.
> [...]
> Do you have any suggestion to accelerate both the complex and the
> simple queries?

It would be helpful if you posted EXPLAIN ANALYZE results for both queries. This will require you to run each query to completion; if that's not possible for the 3 hour query, at least run EXPLAIN and post those results.

- Josh / eggyknap
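P.S. Something along these lines (a sketch only; substitute your real queries for the placeholder names from your mail):

    -- Runs the query and reports actual times and row counts per plan node.
    EXPLAIN ANALYZE SELECT column FROM table WHERE primarykey = value;

    -- Plans the query without executing it; use this one for the 3-hour query.
    EXPLAIN SELECT column FROM table WHERE primarykey = value;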
Alberto Dalmaso <dalmaso@clesius.it> wrote:
> I have a complex query that performs very well with mergejoin on and
> nestloop off. If I activate nestloop, Postgres tries to use it and the
> query execution becomes inconclusive: after 3 hours still no answer,
> so I kill the query. That's OK, but with this configuration a very
> simple and little query like "select column from table where
> primarykey = value" becomes incredibly slow. The only solution I found
> at the moment is to set mergejoin to off before running this kind of
> query.

We'll need a lot more information to be able to provide useful advice. What version of PostgreSQL? What OS? What does the hardware look like? (CPUs, drives, memory, etc.) Do you have autovacuum running? What other regular maintenance do you do? What does your postgresql.conf file look like? (If you can strip out all the comments and show the rest, that would be great.)

With that as background, if you can show us the schema for the table(s) involved and the text of a query, along with the EXPLAIN ANALYZE output (or just EXPLAIN, if the query runs too long to get the EXPLAIN ANALYZE results), that would allow us to see where things are going wrong. Please show this information without setting any of the optimizer options off; but then, as a diagnostic step, *also* show EXPLAIN ANALYZE results when you set options to a configuration that runs faster.

-Kevin
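P.S. An easy way to show only the changed settings is to query the pg_settings view from psql (a sketch):

    -- Lists just the settings that come from postgresql.conf.
    SELECT name, setting
    FROM pg_settings
    WHERE source = 'configuration file'
    ORDER BY name;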
> What version of PostgreSQL?
8.3, the one that comes with openSUSE 11.1

> What OS?
Linux, openSUSE 11.1, 64 bit

> What does the hardware look like? (CPUs, drives, memory, etc.)
2 * Opteron dual core, 8 GB RAM, 70 GB SCSI U320 RAID 1

> Do you have autovacuum running? What other regular maintenance do you do?
Yes, autovacuum and analyze are running; the only other activity is the WAL backup.

> What does your postgresql.conf file look like? (If you can strip out
> all the comments and show the rest, that would be great.)
I'll post only the values I've changed:

shared_buffers = 1536MB
temp_buffers = 5MB
max_prepared_transactions = 30
work_mem = 50MB    # I have a lot of work in ORDER BY
maintenance_work_mem = 50MB
max_stack_depth = 6MB
max_fsm_pages = 160000
max_fsm_relations = 5000
wal_buffers = 3072kB
enable_bitmapscan = on
enable_hashagg = on
enable_hashjoin = off
enable_indexscan = on
enable_mergejoin = on
enable_nestloop = off
enable_seqscan = off
enable_sort = off
enable_tidscan = on
effective_cache_size = 3600MB
geqo = off
default_statistics_target = 100

> With that as background, if you can show us the schema for the
> table(s) involved and the text of a query, along with the EXPLAIN
> ANALYZE output [...] but then, as a diagnostic step, *also* show
> EXPLAIN ANALYZE results when you set options to a configuration that
> runs faster.
>
> -Kevin

The problem is that the simple query uses a merge join instead of a nasted loop (obvious, given the parameters I set), and in this situation it becomes very, very slow (15 sec vs. 5 ms when I set mergejoin to off).
This is the EXPLAIN of the complex query, which runs with more than acceptable performance:

"Merge Right Join (cost=508603077.17..508603195.59 rows=1 width=227)"
" Merge Cond: (ve_edil_rendite.id_domanda = domande.id_domanda)"
" -> GroupAggregate (cost=0.00..105.51 rows=1031 width=11)"
" -> Index Scan using pk_ve_edil_rendite on ve_edil_rendite (cost=0.00..86.84 rows=1157 width=11)"
" -> Materialize (cost=508603077.17..508603077.18 rows=1 width=195)"
" -> Nested Loop (cost=506932259.90..508603077.17 rows=1 width=195)"
" -> Merge Join (cost=406932259.90..408603074.89 rows=1 width=188)"
" Merge Cond: (domande.id_domanda = c_elaout_7.id_domanda)"
" -> Merge Join (cost=406932259.90..408188339.97 rows=1 width=240)"
" Merge Cond: (c_elaout_5.id_domanda = domande.id_domanda)"
" -> Merge Join (cost=3895.15..1259628.81 rows=138561 width=41)"
" Merge Cond: (edil_veneto.id_domanda = c_elaout_5.id_domanda)"
" -> Merge Join (cost=1123.18..372710.75 rows=98122 width=29)"
" Merge Cond: (edil_veneto.id_domanda = c_elaout_6.id_domanda)"
" -> Index Scan using "IDX_pk_Edil_Veneto" on edil_veneto (cost=0.00..11825.14 rows=232649 width=17)"
" -> Index Scan using "IDX_3_c_elaout" on c_elaout c_elaout_6 (cost=0.00..359914.34 rows=98122 width=12)"
" Index Cond: ((c_elaout_6.node)::text = 'contributo_sociale'::text)"
" -> Index Scan using "IDX_3_c_elaout" on c_elaout c_elaout_5 (cost=0.00..887091.20 rows=245306 width=12)"
" Index Cond: ((c_elaout_5.node)::text = 'contributo'::text)"
" -> Materialize (cost=406928364.74..406928364.75 rows=1 width=199)"
" -> Nested Loop (cost=402583154.89..406928364.74 rows=1 width=199)"
" Join Filter: ((r_enti.codice_ente)::text = (r_luoghi.cod_catastale)::text)"
" -> Merge Join (cost=202583154.89..206928031.60 rows=1 width=198)"
" Merge Cond: (domande.id_domanda = c_elaout_4.id_domanda)"
" -> Merge Join (cost=202583154.89..206425374.54 rows=1 width=186)"
" Merge Cond: (domande.id_domanda = c_elain_3.id_domanda)"
" -> Merge Join (cost=201328203.80..205170407.27 rows=41 width=138)"
" Merge Cond: (domande.id_domanda = c_elain_7.id_domanda)"
" -> Merge Join (cost=201328203.80..204498966.35 rows=93 width=126)"
" Merge Cond: (domande.id_domanda = c_elain_9.id_domanda)"
" -> Merge Join (cost=201322293.83..203828121.81 rows=424 width=114)"
" Merge Cond: (domande.id_domanda = c_elain_8.id_domanda)"
" -> Nested Loop (cost=201318498.02..203164011.74 rows=2431 width=102)"
" -> Merge Join (cost=101318498.02..103147289.10 rows=2431 width=79)"
" Merge Cond: (domande.id_domanda = doc.id)"
" -> Merge Join (cost=101318487.80..103060677.64 rows=2493 width=75)"
" Merge Cond: (domande.id_domanda = c_elain_1.id_domanda)"
" -> Merge Join (cost=101316002.90..102447327.03 rows=15480 width=63)"
" Merge Cond: (domande.id_domanda = c_elain.id_domanda)"
" -> Merge Join (cost=101314975.72..101780946.74 rows=88502 width=51)"
" Merge Cond: (c_elain_2.id_domanda = domande.id_domanda)"
" -> Index Scan using "IDX_1_c_elain" on c_elain c_elain_2 (cost=0.00..461104.96 rows=129806 width=12)"
" Index Cond: ((node)::text = 'N_componenti'::text)"
" -> Sort (cost=101314967.66..101316800.15 rows=732995 width=39)"
" Sort Key: domande.id_domanda"
" -> Merge Join (cost=119414.31..1243561.32 rows=732995 width=39)"
" Merge Cond: (domande.id_dichiarazione = generiche_data_nascita_piu_anziano.id_dichiarazione)"
" -> Merge Join (cost=18770.82..1126115.64 rows=123933 width=39)"
" Merge Cond: (domande.id_dichiarazione = c_elaout.id_domanda)"
" -> Index Scan using "IDX_5_domande" on domande (cost=0.00..91684.40 rows=31967 width=27)"
" Index Cond: (id_servizio = 11002)"
" Filter: (id_ente > 0)"
" -> Index Scan using "IDX_2_c_elaout" on c_elaout (cost=0.00..1031179.16 rows=805279 width=12)"
" Filter: ((c_elaout.node)::text = 'ISEE'::text)"
" -> Materialize (cost=100643.49..106653.58 rows=601009 width=12)"
" -> Subquery Scan generiche_data_nascita_piu_anziano (cost=0.00..100042.48 rows=601009 width=12)"
" -> GroupAggregate (cost=0.00..94032.39 rows=601009 width=12)"
" -> Index Scan using "IDX_1_componenti" on componenti (cost=0.00..76403.45 rows=2023265 width=12)"
" -> Index Scan using "IDX_1_c_elain" on c_elain (cost=0.00..665581.51 rows=188052 width=12)"
" Index Cond: ((c_elain.node)::text = 'VSE'::text)"
" -> Index Scan using "IDX_1_c_elain" on c_elain c_elain_1 (cost=0.00..613000.48 rows=173074 width=12)"
" Index Cond: ((c_elain_1.node)::text = 'AffittoISEE'::text)"
" -> Index Scan using pk_doc on doc (cost=0.00..81963.12 rows=1847118 width=4)"
" Filter: (doc.id_tp_stato_doc = 1)"
" -> Index Scan using "IDX_pk_R_Enti" on r_enti (cost=0.00..6.87 rows=1 width=31)"
" Index Cond: (r_enti.id_ente = domande.id_ente)"
" -> Index Scan using "IDX_1_c_elain" on c_elain c_elain_8 (cost=0.00..663631.02 rows=187497 width=12)"
" Index Cond: ((c_elain_8.node)::text = 'Spese'::text)"
" -> Index Scan using "IDX_2_c_elain" on c_elain c_elain_9 (cost=0.00..670253.16 rows=235758 width=12)"
" Filter: ((c_elain_9.node)::text = 'Mesi'::text)"
" -> Index Scan using "IDX_2_c_elain" on c_elain c_elain_7 (cost=0.00..670253.16 rows=474845 width=12)"
" Filter: ((c_elain_7.node)::text = 'Affitto'::text)"
" -> Materialize (cost=1254951.09..1254963.95 rows=1286 width=48)"
" -> Merge Join (cost=2423.84..1254949.80 rows=1286 width=48)"
" Merge Cond: (c_elain_3.id_domanda = c_elaout_1.id_domanda)"
" -> Merge Join (cost=1094.64..606811.53 rows=1492 width=36)"
" Merge Cond: (c_elain_3.id_domanda = c_elaout_3.id_domanda)"
" -> Merge Join (cost=224.20..182997.39 rows=2667 width=24)"
" Merge Cond: (c_elain_3.id_domanda = c_elaout_2.id_domanda)"
" -> Index Scan using "IDX_1_c_elain" on c_elain c_elain_3 (cost=0.00..74101.14 rows=19621 width=12)"
" Index Cond: ((node)::text = 'Solo_anziani'::text)"
" -> Index Scan using "IDX_3_c_elaout" on c_elaout c_elaout_2 (cost=0.00..108761.74 rows=28155 width=12)"
" Index Cond: ((c_elaout_2.node)::text = 'ise_fsa'::text)"
" -> Index Scan using "IDX_3_c_elaout" on c_elaout c_elaout_3 (cost=0.00..423543.07 rows=115886 width=12)"
" Index Cond: ((c_elaout_3.node)::text = 'incidenza'::text)"
" -> Index Scan using "IDX_3_c_elaout" on c_elaout c_elaout_1 (cost=0.00..647740.85 rows=178481 width=12)"
" Index Cond: ((c_elaout_1.node)::text = 'isee_fsa'::text)"
" -> Index Scan using "IDX_3_c_elaout" on c_elaout c_elaout_4 (cost=0.00..502312.35 rows=137879 width=12)"
" Index Cond: ((c_elaout_4.node)::text = 'esito'::text)"
" -> Seq Scan on r_luoghi (cost=100000000.00..100000200.84 rows=10584 width=11)"
" -> Index Scan using "IDX_3_c_elaout" on c_elaout c_elaout_7 (cost=0.00..414451.53 rows=113348 width=12)"
" Index Cond: ((c_elaout_7.node)::text = 'contributo_regolare'::text)"
" -> Index Scan using "IDX_pk_VE_EDIL_tp_superfici" on ve_edil_tp_superfici (cost=0.00..2.27 rows=1 width=11)"
" Index Cond: (ve_edil_tp_superfici.id_tp_superficie = edil_veneto.id_tp_superficie)"

And this is the EXPLAIN of the too-slow simple query:

"Merge Join (cost=0.00..1032305.52 rows=4 width=12)"
" Merge Cond: (domande.id_dichiarazione = c_elaout.id_domanda)"
" -> Index Scan using "IDX_8_domande" on domande (cost=0.00..8.39 rows=1 width=4)"
" Index Cond: (id_domanda = 4165757)"
" -> Index Scan using "IDX_2_c_elaout" on c_elaout (cost=0.00..1030283.89 rows=805279 width=12)"
" Filter: ((c_elaout.node)::text = 'Invalido'::text)"

This takes 15 seconds. With mergejoin set to off:

"Nested Loop (cost=100000000.00..100000022.97 rows=4 width=12)"
" -> Index Scan using "IDX_8_domande" on domande (cost=0.00..8.39 rows=1 width=4)"
" Index Cond: (id_domanda = 4165757)"
" -> Index Scan using "IDX_2_c_elaout" on c_elaout (cost=0.00..14.54 rows=4 width=12)"
" Index Cond: (c_elaout.id_domanda = domande.id_dichiarazione)"
" Filter: ((c_elaout.node)::text = 'Invalido'::text)"

This takes 15 msec!!!

The query works fine even with:

set enable_mergejoin = 'on';
set enable_nestloop = 'on';

"Nested Loop (cost=0.00..22.97 rows=4 width=12) (actual time=10.110..10.122 rows=1 loops=1)"
" -> Index Scan using "IDX_8_domande" on domande (cost=0.00..8.39 rows=1 width=4) (actual time=0.071..0.075 rows=1 loops=1)"
" Index Cond: (id_domanda = 4165757)"
" -> Index Scan using "IDX_2_c_elaout" on c_elaout (cost=0.00..14.54 rows=4 width=12) (actual time=10.029..10.031 rows=1 loops=1)"
" Index Cond: (c_elaout.id_domanda = domande.id_dichiarazione)"
" Filter: ((c_elaout.node)::text = 'Invalido'::text)"
"Total runtime: 10.211 ms"

But in this situation the previous kind of query never reaches the end, and the plan becomes:

"Merge Right Join (cost=100707011.72..100707130.15 rows=1 width=227)"
" Merge Cond: (ve_edil_rendite.id_domanda = domande.id_domanda)"
" -> GroupAggregate (cost=0.00..105.51 rows=1031 width=11)"
" -> Index Scan using pk_ve_edil_rendite on ve_edil_rendite (cost=0.00..86.84 rows=1157 width=11)"
" -> Materialize (cost=100707011.72..100707011.73 rows=1 width=195)"
" -> Nested Loop (cost=100689558.36..100707011.72 rows=1 width=195)"
" -> Nested Loop (cost=100689558.36..100706997.17 rows=1 width=247)"
" -> Nested Loop (cost=100689558.36..100706982.62 rows=1 width=235)"
" Join Filter: ((r_enti.codice_ente)::text = (r_luoghi.cod_catastale)::text)"
" -> Nested Loop (cost=689558.36..706649.48 rows=1 width=234)"
" -> Nested Loop (cost=689558.36..706647.20 rows=1 width=227)"
" -> Nested Loop (cost=689558.36..706632.65 rows=1 width=215)"
" -> Nested Loop (cost=689558.36..706618.10 rows=1 width=203)"
" Join Filter: (domande.id_domanda = edil_veneto.id_domanda)"
" -> Index Scan using "IDX_pk_Edil_Veneto" on edil_veneto (cost=0.00..11825.14 rows=232649 width=17)"
" -> Materialize (cost=689558.36..689558.37 rows=1 width=186)"
" -> Nested Loop (cost=100643.49..689558.36 rows=1 width=186)"
" -> Nested Loop (cost=100643.49..689543.81 rows=1 width=174)"
" -> Nested Loop (cost=100643.49..689530.86 rows=1 width=162)"
" -> Nested Loop (cost=100643.49..689517.93 rows=1 width=150)"
" -> Nested Loop (cost=100643.49..689505.01 rows=1 width=138)"
" -> Nested Loop (cost=100643.49..689490.46 rows=1 width=126)"
" -> Nested Loop (cost=100643.49..688816.73 rows=44 width=114)"
" -> Merge Join (cost=100643.49..657277.54 rows=2431 width=102)"
" Merge Cond: (domande.id_dichiarazione = generiche_data_nascita_piu_anziano.id_dichiarazione)"
" -> Nested Loop (cost=0.00..549096.04 rows=412 width=102)"
" -> Nested Loop (cost=0.00..547345.02 rows=106 width=90)"
" -> Nested Loop (cost=0.00..546615.85 rows=106 width=67)"
" -> Nested Loop (cost=0.00..545694.51 rows=109 width=63)"
" -> Nested Loop (cost=0.00..537605.96 rows=621 width=51)"
" -> Nested Loop (cost=0.00..487675.59 rows=3860 width=39)"
" -> Index Scan using "IDX_5_domande" on domande (cost=0.00..91684.40 rows=31967 width=27)"
" Index Cond: (id_servizio = 11002)"
" Filter: (id_ente > 0)"
" -> Index Scan using "IDX_1_c_elain" on c_elain c_elain_2 (cost=0.00..12.37 rows=1 width=12)"
" Index Cond: (((c_elain_2.node)::text = 'N_componenti'::text) AND (c_elain_2.id_domanda = domande.id_domanda))"
" -> Index Scan using "IDX_1_c_elain" on c_elain c_elain_1 (cost=0.00..12.92 rows=1 width=12)"
" Index Cond: (((c_elain_1.node)::text = 'AffittoISEE'::text) AND (c_elain_1.id_domanda = domande.id_domanda))"
" -> Index Scan using "IDX_1_c_elain" on c_elain (cost=0.00..13.01 rows=1 width=12)"
" Index Cond: (((c_elain.node)::text = 'VSE'::text) AND (c_elain.id_domanda = domande.id_domanda))"
" -> Index Scan using pk_doc on doc (cost=0.00..8.44 rows=1 width=4)"
" Index Cond: (doc.id = domande.id_domanda)"
" Filter: (doc.id_tp_stato_doc = 1)"
" -> Index Scan using "IDX_pk_R_Enti" on r_enti (cost=0.00..6.87 rows=1 width=31)"
" Index Cond: (r_enti.id_ente = domande.id_ente)"
" -> Index Scan using "IDX_2_c_elaout" on c_elaout (cost=0.00..16.47 rows=4 width=12)"
" Index Cond: (c_elaout.id_domanda = domande.id_dichiarazione)"
" Filter: ((c_elaout.node)::text = 'ISEE'::text)"
" -> Materialize (cost=100643.49..106653.58 rows=601009 width=12)"
" -> Subquery Scan generiche_data_nascita_piu_anziano (cost=0.00..100042.48 rows=601009 width=12)"
" -> GroupAggregate (cost=0.00..94032.39 rows=601009 width=12)"
" -> Index Scan using "IDX_1_componenti" on componenti (cost=0.00..76403.45 rows=2023265 width=12)"
" -> Index Scan using "IDX_1_c_elain" on c_elain c_elain_3 (cost=0.00..12.96 rows=1 width=12)"
" Index Cond: (((c_elain_3.node)::text = 'Solo_anziani'::text) AND (c_elain_3.id_domanda = domande.id_domanda))"
" -> Index Scan using "IDX_3_c_elaout" on c_elaout c_elaout_2 (cost=0.00..15.30 rows=1 width=12)"
" Index Cond: (((c_elaout_2.node)::text = 'ise_fsa'::text) AND (c_elaout_2.id_domanda = domande.id_domanda))"
" -> Index Scan using "IDX_2_c_elaout" on c_elaout c_elaout_3 (cost=0.00..14.54 rows=1 width=12)"
" Index Cond: (c_elaout_3.id_domanda = domande.id_domanda)"
" Filter: ((c_elaout_3.node)::text = 'incidenza'::text)"
" -> Index Scan using "IDX_2_c_elain" on c_elain c_elain_9 (cost=0.00..12.91 rows=1 width=12)"
" Index Cond: (c_elain_9.id_domanda = domande.id_domanda)"
" Filter: ((c_elain_9.node)::text = 'Mesi'::text)"
" -> Index Scan using "IDX_2_c_elain" on c_elain c_elain_8 (cost=0.00..12.91 rows=1 width=12)"
" Index Cond: (c_elain_8.id_domanda = domande.id_domanda)"
" Filter: ((c_elain_8.node)::text = 'Spese'::text)"
" -> Index Scan using "IDX_2_c_elain" on c_elain c_elain_7 (cost=0.00..12.91 rows=3 width=12)"
" Index Cond: (c_elain_7.id_domanda = domande.id_domanda)"
" Filter: ((c_elain_7.node)::text = 'Affitto'::text)"
" -> Index Scan using "IDX_2_c_elaout" on c_elaout c_elaout_1 (cost=0.00..14.54 rows=1 width=12)"
" Index Cond: (c_elaout_1.id_domanda = domande.id_domanda)"
" Filter: ((c_elaout_1.node)::text = 'isee_fsa'::text)"
" -> Index Scan using "IDX_2_c_elaout" on c_elaout c_elaout_7 (cost=0.00..14.54 rows=1 width=12)"
" Index Cond: (c_elaout_7.id_domanda = domande.id_domanda)"
" Filter: ((c_elaout_7.node)::text = 'contributo_regolare'::text)"
" -> Index Scan using "IDX_2_c_elaout" on c_elaout c_elaout_6 (cost=0.00..14.54 rows=1 width=12)"
" Index Cond: (c_elaout_6.id_domanda = domande.id_domanda)"
" Filter: ((c_elaout_6.node)::text = 'contributo_sociale'::text)"
" -> Index Scan using "IDX_pk_VE_EDIL_tp_superfici" on ve_edil_tp_superfici (cost=0.00..2.27 rows=1 width=11)"
" Index Cond: (ve_edil_tp_superfici.id_tp_superficie = edil_veneto.id_tp_superficie)"
" -> Seq Scan on r_luoghi (cost=100000000.00..100000200.84 rows=10584 width=11)"
" -> Index Scan using "IDX_2_c_elaout" on c_elaout c_elaout_5 (cost=0.00..14.54 rows=1 width=12)"
" Index Cond: (c_elaout_5.id_domanda = domande.id_domanda)"
" Filter: ((c_elaout_5.node)::text = 'contributo'::text)"
" -> Index Scan using "IDX_2_c_elaout" on c_elaout c_elaout_4 (cost=0.00..14.54 rows=1 width=12)"
" Index Cond: (c_elaout_4.id_domanda = domande.id_domanda)"
" Filter: ((c_elaout_4.node)::text = 'esito'::text)"

Thanks very much for your interest and your help!
On Tue, 16 Jun 2009, Alberto Dalmaso wrote:
>> What does your postgresql.conf file look like?
> enable_hashjoin = off
> enable_nestloop = off
> enable_seqscan = off
> enable_sort = off

Why are these switched off?

> And this is the EXPLAIN of the too-slow simple query:
>
> "Merge Join (cost=0.00..1032305.52 rows=4 width=12)"
> " Merge Cond: (domande.id_dichiarazione = c_elaout.id_domanda)"
> " -> Index Scan using "IDX_8_domande" on domande (cost=0.00..8.39 rows=1 width=4)"
> " Index Cond: (id_domanda = 4165757)"
> " -> Index Scan using "IDX_2_c_elaout" on c_elaout (cost=0.00..1030283.89 rows=805279 width=12)"
> " Filter: ((c_elaout.node)::text = 'Invalido'::text)"
>
> This takes 15 seconds. With mergejoin set to off:
>
> "Nested Loop (cost=100000000.00..100000022.97 rows=4 width=12)"
> " -> Index Scan using "IDX_8_domande" on domande (cost=0.00..8.39 rows=1 width=4)"
> " Index Cond: (id_domanda = 4165757)"
> " -> Index Scan using "IDX_2_c_elaout" on c_elaout (cost=0.00..14.54 rows=4 width=12)"
> " Index Cond: (c_elaout.id_domanda = domande.id_dichiarazione)"
> " Filter: ((c_elaout.node)::text = 'Invalido'::text)"
>
> This takes 15 msec!!!

Well, duh. What you're effectively doing is telling Postgres to NEVER use a nested loop. Then you're getting upset because it isn't using a nested loop. When you tell it to NEVER use anything (switching all the join algorithms off), it ignores you and chooses the right plan anyway.

Matthew

--
You can configure Windows, but don't ask me how. -- Bill Gates
Alberto Dalmaso <dalmaso@clesius.it> wrote:
>> What version of PostgreSQL?
> 8.3, the one that comes with openSUSE 11.1

Could you show us the result of SELECT version(); ?

> max_prepared_transactions = 30

Unless you're using distributed transactions or need a lot of locks, that's just going to waste some RAM. Zero is fine for most people.

> maintenance_work_mem = 50MB

That's a little small -- this only comes into play for maintenance tasks like index builds. Not directly part of your reported problem, but maybe something to bump into the 1GB range.

> max_fsm_pages = 160000
> max_fsm_relations = 5000

Have you done any VACUUM VERBOSE lately and captured the output? If so, what do the last few lines say? (That's a lot of relations for the number of pages; I'm just curious how it maps to the actual numbers.)

> enable_hashjoin = off
> enable_nestloop = off
> enable_seqscan = off
> enable_sort = off

That's probably a bad idea. If particular queries aren't performing well, you can always set these temporarily on a particular connection. Even then, turning these off is rarely a good idea except for diagnostic purposes. I *strongly* recommend you put all of these back to the defaults of 'on' and start from there, turning off selected items as needed to get EXPLAIN ANALYZE output to demonstrate the better plans you've found for particular queries.

> effective_cache_size = 3600MB

That seems a little on the low side for an 8GB machine, unless you have other things on there using a lot of RAM. Do you?

If you could set the optimizer options back on and get new plans where you show specifically which options (if any) were turned off for the run, that would be good. Also, please attach the plans to the email instead of pasting -- the word wrap makes them hard to read. Finally, if you could do \d on the tables involved in the query, it would help. I'll hold off looking at these in hopes that you can do the above.

-Kevin
On Tue, 16/06/2009 at 15.58 +0100, Matthew Wakeling wrote:
> On Tue, 16 Jun 2009, Alberto Dalmaso wrote:
> >> What does your postgresql.conf file look like?
> >
> > enable_hashjoin = off
> > enable_nestloop = off
> > enable_seqscan = off
> > enable_sort = off
>
> Why are these switched off?

Because of the need to pump up the performance of the complex query. If I set them to on, it tries to use a nasted loop even in the complex query, and that query never arrives at a response... and, of course, I need a response from it!!!

So my problem is to find a configuration that keeps the performance for both kinds of query, but I'm not able to find one. Changing the RAM parameters can save about 10% of the time on the complex query, while I see no changes on the simple one...
Alberto Dalmaso <dalmaso@clesius.it> writes:
> On Tue, 16/06/2009 at 15.58 +0100, Matthew Wakeling wrote:
>>> enable_hashjoin = off
>>> enable_nestloop = off
>>> enable_seqscan = off
>>> enable_sort = off
>>
>> Why are these switched off?
>
> Because of the need to pump up the performance of the complex query.

That is *not* the way to improve the performance of a query. Turning off specific enable_ parameters can be helpful while investigating planner behavior, but it is never recommended as a production solution. You have already found out why.

regards, tom lane
Alberto Dalmaso <dalmaso@clesius.it> wrote:
> On Tue, 16/06/2009 at 15.58 +0100, Matthew Wakeling wrote:
>> On Tue, 16 Jun 2009, Alberto Dalmaso wrote:
>>> enable_hashjoin = off
>>> enable_nestloop = off
>>> enable_seqscan = off
>>> enable_sort = off
>>
>> Why are these switched off?
>
> Because of the need to pump up the performance of the complex query.

These really are meant primarily for diagnostic purposes. As a last resort, you could set them off right before running a problem query, and set them back on again afterward; but you will be much better off if you can cure the underlying problem. The best chance of that is to show us the plan you get with all of them turned on.

-Kevin
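P.S. If you do go the last-resort route, scope the change to a single transaction rather than to the whole connection (a sketch; substitute your real query):

    BEGIN;
    SET LOCAL enable_mergejoin = off;   -- applies to this transaction only
    SELECT column FROM table WHERE primarykey = value;
    COMMIT;                             -- the setting reverts automatically here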
> Could you show us the result of SELECT version(); ?
Of course I can:
PostgreSQL 8.3.7 on x86_64-unknown-linux-gnu, compiled by GCC gcc (SUSE Linux) 4.3.2 [gcc-4_3-branch revision 141291]

> Have you done any VACUUM VERBOSE lately and captured the output? If
> so, what do the last few lines say?
It needs a lot of time (it's a 20 GB database); when I have the answer I'll post it.

> > enable_hashjoin = off
> > enable_nestloop = off
> > enable_seqscan = off
> > enable_sort = off
>
> That's probably a bad idea. If particular queries aren't performing
> well, you can always set these temporarily on a particular connection.
> [...] I *strongly* recommend you put all of these back to the defaults
> of 'on' and start from there, turning off selected items as needed to
> get EXPLAIN ANALYZE output to demonstrate the better plans you've
> found for particular queries.

OK, that will become the reverse of what I'm doing now (set them to on globally and to off only on the appropriate connections, instead of setting them to off globally and to on only on some appropriate connections). But the question is: do you think it is impossible to find a configuration that works fine for both kinds of query? The application has to run against an Oracle DB as well... I don't want to have to write different source code for the two databases...

> > effective_cache_size = 3600MB
>
> That seems a little on the low side for an 8GB machine, unless you
> have other things on there using a lot of RAM. Do you?
Yes, there are two instances of Postgres running on the same server (the databases have to stay completely separated).

I attach the EXPLAIN of the long query after setting all the enables to on. In this condition the query will never finish...
Alberto Dalmaso <dalmaso@clesius.it> wrote:
> do you think it is impossible to find a configuration that works fine
> for both kinds of query?

No. We probably just need a little more information.

> The application has to run against an Oracle DB as well... I don't
> want to have to write different source code for the two databases...

I understand completely.

> I attach the EXPLAIN of the long query after setting all the enables
> to on. In this condition the query will never finish...

We're getting close. Can you share the table structure and the actual query you are running? It's a lot easier (for me, anyway) to put this puzzle together with all the pieces in hand. Also, if you can set off some of the optimizer options and get a fast plan, please show us an EXPLAIN ANALYZE for that, with information on which settings were turned off. That will help show where bad estimates may be causing a problem, or possibly give a hint of table or index bloat problems. I think we're getting close to critical mass for seeing the solution....

-Kevin
On Tue, 16/06/2009 at 11.31 -0400, Tom Lane wrote:
> Alberto Dalmaso <dalmaso@clesius.it> writes:
> > Because of the need to pump up the performance of the complex query.
>
> That is *not* the way to improve the performance of a query. Turning
> off specific enable_ parameters can be helpful while investigating
> planner behavior, but it is never recommended as a production
> solution. You have already found out why.
>
> regards, tom lane

OK, but the problem is that my very long query performs quite well when it works with a merge join, but it cannot arrive at an end if it uses any other kind of join.

If I put all the parameters to on, as both of you tell me, in the explanation I'll see that the db use nasted loop. If I put nasted loop to off, it will use a hash join.

How can I write the query so that the planner will use a merge join (the only option that permits the query to give me the expected answer) without changing the settings every time on the connection?
Alberto Dalmaso <dalmaso@clesius.it> wrote:
> I attach the EXPLAIN of the long query after setting all the enables
> to on. In this condition the query will never finish...

I notice that you have many joins in there. If the query can't be simplified, you probably need to boost the join_collapse_limit and from_collapse_limit quite a bit. If planning time goes through the roof in that case, you may need to enable geqo -- this is what it's intended to help with. If you try geqo, you may need to tune it; I'm not familiar with the knobs for tuning that, so maybe someone else will jump in if you get to that point.

-Kevin
Alberto Dalmaso <dalmaso@clesius.it> writes:
> OK, but the problem is that my very long query performs quite well
> when it works with a merge join, but it cannot arrive at an end if it
> uses any other kind of join.
> If I put all the parameters to on, as both of you tell me, in the
> explanation I'll see that the db use nasted loop.
> If I put nasted loop to off, it will use a hash join.
> How can I write the query so that the planner will use a merge join
> (the only option that permits the query to give me the expected
> answer) without changing the settings every time on the connection?

You have the wrong mindset completely. Instead of thinking "how can I force the planner to do it my way", you need to be thinking "why is the planner guessing wrong about which is the best way to do it? And how can I improve its guess?"

There's not really enough information in what you've posted so far to let people help you with that question, but one thing that strikes me from the EXPLAIN is that you have a remarkably large number of joins. Perhaps increasing from_collapse_limit and/or join_collapse_limit (to more than the number of tables in the query) would help.

regards, tom lane
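P.S. That is, something along these lines at the start of the session (a sketch; any value larger than the number of relations in the query will do):

    SET from_collapse_limit = 100;   -- let the planner flatten subqueries fully
    SET join_collapse_limit = 100;   -- let the planner reorder explicit JOINs freely
    EXPLAIN SELECT ... ;             -- then see what plan comes out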
Unfortunately the query needs that level of complexity, as the information I have to show is spread across different tables. I tried geqo on at the beginning, but only with the default parameters. Tomorrow (my working day here in Italy finished some minutes ago, so I will wait for the end of the EXPLAIN ANALYZE and then go home ;-P) I'll try to increase, as you suggest, join_collapse_limit and from_collapse_limit. If someone can give me some information on how to configure geqo, I'll try it again. In the meantime, tonight I'll leave the VACUUM VERBOSE to work for me.
Even though the query ends in approximately 200 sec, the EXPLAIN ANALYZE is still working and more than 1000 sec have gone by... I'll leave it working tonight. Have a nice evening, and thanks for the help.
Good afternoon.

I have developed an application to efficiently schedule chemotherapy patients at our hospital. The application takes into account several resource constraints (available chairs, available nurses, nurse coverage assignment to chairs) as well as the chair time and nursing time required for a regimen.

The algorithm for packing appointments respects each constraint and typically schedules a day of treatments (30-60) within 9-10 seconds on my workstation, down from 27 seconds initially. I would like to get it below 5 seconds if possible.

I think what's slowing it down is simply the number of rows and joins. The algorithm creates a scheduling matrix with one row per 5 minute timeslot, per unit, per nurse assigned to the unit. That translates to 3,280 rows for the days I have designed in development (each day can change).

To determine the available slots, the algorithm finds the earliest slot that has an available chair and a count of the required concurrent intervals afterwards. So a 60 minute regimen requires 12 concurrent rows. This is accomplished by joining the table on itself. A second query is run for the same range, but with respect to the nurse time and an available nurse. Finally, those two are joined against each other. Effectively, it is:

Select *
From (
    Select *
    From matrix m1, matrix m2
    Where m1.xxxxx = m2.xxxxx
) chair,
(
    Select *
    From matrix m1, matrix m2
    Where m1.xxxxx = m2.xxxxx
) nurse
Where chair.id = nurse.id

With matrix having 3,280 rows. Ugh.

I have tried various indexes and clustering approaches with little success. Any ideas?

Thanks,

Matthew Hartman
Programmer/Analyst
Information Management, ICP
Kingston General Hospital
(613) 549-6666 x4294
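P.S. To make the "count of concurrent intervals" part concrete, the chair-side query has roughly this shape (a sketch only -- slot_time, unit and chair_free are stand-ins for the real column names):

    -- Earliest slot that starts a run of 12 consecutive free
    -- 5-minute slots (i.e. room for a 60-minute regimen).
    SELECT m1.unit, m1.slot_time
    FROM matrix m1
    JOIN matrix m2
      ON m2.unit = m1.unit
     AND m2.slot_time >= m1.slot_time
     AND m2.slot_time <  m1.slot_time + interval '60 minutes'
    WHERE m1.chair_free
      AND m2.chair_free
    GROUP BY m1.unit, m1.slot_time
    HAVING count(*) = 12        -- every interval in the window is free
    ORDER BY m1.slot_time
    LIMIT 1;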
On the DB side of things, you will want to make sure that you're caching as much as possible - putting a front-end like memcached could help. I assume you have indexes on the appropriate tables? What does the EXPLAIN ANALYZE on that query look like?

Not necessarily a "postgres" solution, but I'd think this type of problem would work really, really well inside of, say, a mixed integer or integer solver ... something like glpk or cbc. You'd need to reformulate the problem, but we've built applications using these tools which can crunch through multiple billions of combinations in under 1 or 2 seconds. (Of course, you still need to store the results, and feed the input, using a database of some kind.)

--
Anthony Presley

On Tue, 2009-06-16 at 14:35 -0400, Hartman, Matthew wrote:
> Good afternoon.
>
> I have developed an application to efficiently schedule chemotherapy
> patients at our hospital. [...]
>
> I have tried various indexes and clustering approaches with little
> success. Any ideas?
OK, here are the last rows of the VACUUM ANALYZE VERBOSE:

INFO: free space map contains 154679 pages in 39 relations
DETAIL: A total of 126176 page slots are in use (including overhead).
126176 page slots are required to track all free space.
Current limits are: 160000 page slots, 5000 relations, using 1476 kB.

The query was executed successfully, but with no result, in 1332269 ms.

And I attach the complete EXPLAIN ANALYZE of the complex query.

Giving more detail about the tables involved in the query may not be so easy, as there are a lot of them. The joins are made between columns that are the primary key in one table and indexed in the other. All the WHERE clauses are on indexed columns (perhaps there are too many indexes...).

Thanks a lot.
Alberto Dalmaso wrote:
[...]
> in the explanation I'll see that the db use nasted loop
[...]

Sorry for the remark off topic, but I *love* the term "nasted loop". It should not go to oblivion unnoticed.

Yours,
Laurenz Albe
Matthew Hartman wrote:
> To determine the available slots, the algorithm finds the earliest slot
> that has an available chair and a count of the required concurrent
> intervals afterwards. [...]
>
> I have tried various indexes and clustering approaches with little
> success. Any ideas?

I don't understand your data model well enough to understand the query, so I can only give you general hints (which you probably already know):

- Frequently the biggest performance gains can be reached by a (painful) redesign. Can you change the table structure in a way that makes this query less expensive?
- You have an index on matrix.xxxxx, right?
- Can you reduce the row count of the two subqueries by adding additional conditions that weed out rows that can be excluded right away?
- Maybe you can gain a little by changing the "select *" to "select id" in both subqueries and adding an additional join with matrix that adds the relevant columns at the end (see the sketch below). I don't know the executor, so I don't know if that will help, but it would be a simple thing to test in an experiment.

Yours,
Laurenz Albe
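P.S. A minimal sketch of that last point, against the simplified query from your mail:

    -- The self-joined subquery only carries ids; the wide columns are
    -- fetched once, by a single extra join at the end.
    SELECT m.*
    FROM (
        SELECT m1.id
        FROM matrix m1, matrix m2
        WHERE m1.xxxxx = m2.xxxxx
    ) AS chair
    JOIN matrix m ON m.id = chair.id;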
On Wed, Jun 17, 2009 at 8:33 AM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
> I don't understand your data model well enough to understand
> the query, so I can only give you general hints (which you probably
> already know):

He is effectively joining the same table 4 times in a for loop to get the result; this is veeery inefficient. Imagine:

for(x)
 for(x)
  for(x)
   for(x)
    ...

where x is the number of rows in the matrix table. Not scared yet?

--
GJ
On Tue, Jun 16, 2009 at 2:35 PM, Hartman, Matthew <Matthew.Hartman@krcc.on.ca> wrote:
> Good afternoon.
>
> I have developed an application to efficiently schedule chemotherapy
> patients at our hospital. [...]
>
> I have tried various indexes and clustering approaches with little
> success. Any ideas?

How far in advance do you schedule? As far as necessary? How many chairs are there? How many nurses are there? This is a tricky (read: interesting) problem.

merlin
Alberto Dalmaso <dalmaso@clesius.it> wrote:
> OK, here are the last rows of the VACUUM ANALYZE VERBOSE:
>
> INFO: free space map contains 154679 pages in 39 relations
> DETAIL: A total of 126176 page slots are in use (including overhead).
> 126176 page slots are required to track all free space.
> Current limits are: 160000 page slots, 5000 relations, using 1476 kB.

No indication of bloat there. You could afford to free some RAM by reducing the max_fsm_relations setting. (You have 39 relations but are reserving RAM to keep track of free space in 5000 relations.)

> and I attach the complete EXPLAIN ANALYZE of the complex query.

I'll see what I can glean from that when I get some time.

> All the WHERE clauses are on indexed columns (perhaps there are too
> many indexes...)

That's not usually a problem. The other thing I was hoping to see, which I don't think you've sent, is an EXPLAIN ANALYZE of the same query with the settings which you have found which cause it to pick a faster plan. As I understand it, that runs pretty fast, so hopefully that's a quick one for you to produce.

-Kevin
Thanks for the replies everyone. I'll try to answer them all in this one email. I will send another email immediately after this with additional details about the query.

> - Frequently the biggest performance gains can be reached by
> a (painful) redesign. Can you change the table structure in a way
> that makes this query less expensive?

I have considered redesigning the algorithm to accommodate this. As I've said, there's one row per five minute time slot. Instead, I could represent an interval of time with a row. For example, a "start_time" of "08:00" with an "end_time" of "12:00", or perhaps an interval "duration" of "4 hours". The difficulty becomes managing separate time requirements (nurse vs unit) for each time slot, and inserting/updating new rows as pieces of those time slots or intervals are used up. Having a row per five minute interval avoids those complications so far. Still, I'd start with 32 rows and increase the number, never reaching 3,280.. :)

> - You have an index on matrix.xxxxx, right?

I have tried indexes on each common join criterion. Usually it's "time,unit", "time,nurse", or "time,unit_scheduled", "time,nurse_scheduled" (the latter two being booleans). In the first two cases it's made a difference of less than a second. In the last two, the time actually increases if I add "analyze" statements after updates are made.

> - Can you reduce the row count of the two subqueries by adding
> additional conditions that weed out rows that can be excluded
> right away?

I use some additional conditions. I'll paste the meat of the query below.

> - Maybe you can gain a little by changing the "select *" to
> "select id" in both subqueries and adding an additional join
> with matrix that adds the relevant columns in the end.

I wrote the "select *" as simplified, but really, it returns the primary key for that row.

> how far in advance do you schedule? As far as necessary?

It's done on a per-day basis, each day taking 8-12 seconds or so on my workstation. We typically schedule patients as much as three to six months in advance. The query already pulls data to a temporary table to avoid having to manage a massive number of rows.

> How many chairs are there? How many nurses are there? This is a
> tricky (read: interesting) problem.

In my current template there are 17 chairs and 7 nurses. Chairs are grouped into pods of 2-4 chairs. Nurses cover one to many pods, allowing for a primary nurse per pod as well as floater nurses that cover multiple pods.

Matthew Hartman
Programmer/Analyst
Information Management, ICP
Kingston General Hospital
(613) 549-6666 x4294

-----Original Message-----
From: Merlin Moncure [mailto:mmoncure@gmail.com]
Sent: Wednesday, June 17, 2009 9:09 AM
To: Hartman, Matthew
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Speeding up a query.

On Tue, Jun 16, 2009 at 2:35 PM, Hartman, Matthew <Matthew.Hartman@krcc.on.ca> wrote:
> Good afternoon.
>
> I have developed an application to efficiently schedule chemotherapy
> patients at our hospital. [...]

how far in advance do you schedule? As far as necessary? How many chairs are there? How many nurses are there? This is a tricky (read: interesting) problem.

merlin
What I sent is the quick execution; with other parameters this query simply doesn't come to an end. It is the little query that, using the defaults with all the planner methods on, becomes really quick, while with the other settings (with some planner methods switched off) it becomes very slow.

I can't believe it: with these settings

set enable_hashjoin = 'on';
set enable_nestloop = 'on';
set enable_seqscan = 'on';
set enable_sort = 'on';
set from_collapse_limit = 8;
set join_collapse_limit = 3;

select * from v_fsa_2007_estrazione;

finally ends in an acceptable time (156 sec). What does it mean to use join_collapse_limit = 3 (which is really a lot fewer objects than that query uses)? I'm executing an EXPLAIN ANALYZE in this new situation...

Is it possible that such a configuration creates performance problems for other queries (join_collapse_limit is set to a really low value)? I'll make some tests in this direction.
Sorry, I missed this response. I'm entirely new to PostgreSQL and have yet to figure out how to use EXPLAIN ANALYZE on a function. I think I realize where the problem is though (the loop), I simply do not know how to fix it ;).

glpk and cbc, thanks, I'll look into those. You're right, the very nature of using a loop suggests that another tool might be more appropriate.

Matthew Hartman
Programmer/Analyst
Information Management, ICP
Kingston General Hospital
(613) 549-6666 x4294

-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Anthony Presley
Sent: Tuesday, June 16, 2009 3:37 PM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Speeding up a query.

On the DB side of things, you will want to make sure that you're caching as much as possible - putting a front-end like memcached could help. I assume you have indexes on the appropriate tables? What does the EXPLAIN ANALYZE on that query look like?

Not necessarily a "postgres" solution, but I'd think this type of problem would work really, really well inside of, say, a mixed integer or integer solver ... something like glpk or cbc. [...]
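P.S. As far as I can tell, the most you can do from the outside is time the function call as a whole (a sketch; "schedule_day" is a stand-in for my real function):

    -- Shows a single Function Scan node with the total time; it does
    -- not reveal the plans of the queries inside the function body.
    EXPLAIN ANALYZE SELECT * FROM schedule_day('2009-06-17');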
Alberto Dalmaso <dalmaso@clesius.it> wrote:
> What does it mean to use join_collapse_limit = 3?

http://www.postgresql.org/docs/8.3/interactive/runtime-config-query.html#RUNTIME-CONFIG-QUERY-OTHER

-Kevin
P.S.: to understand what the query has to do (and 80% of the views have to do the same thing): a lot of the time is spent pivoting a table with a structure like

identifier, description_of_value, numeric_value

which has to be transformed into

identifier, description_1, description_2, ..., description_n

where n is not a fixed number (it changes depending on the type of calculation that was used to generate the rows in the table).

Perhaps this information could help. Thanks everybody.
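In SQL terms, each description becomes its own join against the same table, something like this (a simplified sketch of the pattern; "numeric_value" is a stand-in column name, the node values are taken from the plans above):

    SELECT d.id_domanda,
           e1.numeric_value AS contributo,
           e2.numeric_value AS esito
           -- ... one extra join per description, up to n
    FROM domande d
    JOIN c_elaout e1 ON e1.id_domanda = d.id_domanda AND e1.node = 'contributo'
    JOIN c_elaout e2 ON e2.id_domanda = d.id_domanda AND e2.node = 'esito';

which is why the plans above contain so many copies of c_elain and c_elaout.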
Alberto Dalmaso <dalmaso@clesius.it> wrote:
> P.S.: to understand what the query has to do (and 80% of the views
> have to do the same thing): a lot of the time is spent pivoting a
> table with a structure like
> identifier, description_of_value, numeric_value
> which has to be transformed into
> identifier, description_1, description_2, ..., description_n
> where n is not a fixed number (it changes depending on the type of
> calculation that was used to generate the rows in the table).
>
> Perhaps this information could help.

What would help more is the actual query, if that can be shared. It leaves a lot less to the imagination than descriptions of it. There are a couple things which have been requested which would help pin down the reason the optimizer is not getting to a good plan, so that it can be allowed to do a good job. As Tom said, this would be a much more productive focus than casting about for ways to force it to do what you think is the best thing. (Maybe, given the chance, it can come up with a plan which runs in seconds, rather than over the 24 minutes you've gotten.)

With all the optimizer options on, and the from_collapse_limit and join_collapse_limit values both set to 100, run an EXPLAIN (no ANALYZE) on your big problem query. Let us know how long the EXPLAIN runs. If it gets any errors, copy and paste all available information. (General descriptions aren't likely to get us very far.) Since EXPLAIN without ANALYZE only *plans* the query, but doesn't run it, it should not take long to do this.

If there are any views or custom functions involved, showing those along with the query source would be good. If we get this information, we have a much better chance to find the real problem and get it fixed.

-Kevin
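P.S. Concretely, the session I have in mind looks something like this (a sketch; substitute your real view or query):

    SET enable_hashjoin = on;
    SET enable_nestloop = on;
    SET enable_seqscan = on;
    SET enable_sort = on;
    SET from_collapse_limit = 100;
    SET join_collapse_limit = 100;
    EXPLAIN SELECT * FROM v_fsa_2007_estrazione;  -- plans only, does not execute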
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > With all the optimizer options on, and the from_collapse_limit and > join_collapse_limit values both set to 100, run an EXPLAIN (no > ANALYZE) on your big problem query. Let us know how long the EXPLAIN > runs. If it gets any errors, copy and paste all available > information. (General descriptions aren't likely to get us very far.) > Since EXPLAIN without ANALYZE only *plans* the query, but doesn't run > it, it should not take long to do this. One issue here is that with the collapse limits cranked up to more than geqo_threshold, he's going to be coping with GEQO's partially-random plan selection; so whatever he reports might or might not be especially reflective of day-to-day results. I'm tempted to ask that he also push up geqo_threshold. It's possible that that *will* send the planning time to the moon; but it would certainly be worth trying, to find out what plan is produced. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> With all the optimizer options on, and the from_collapse_limit and
>> join_collapse_limit values both set to 100, run an EXPLAIN (no
>> ANALYZE) on your big problem query. [...]
>
> One issue here is that with the collapse limits cranked up to more
> than geqo_threshold, he's going to be coping with GEQO's partially-
> random plan selection; so whatever he reports might or might not be
> especially reflective of day-to-day results. I'm tempted to ask
> that he also push up geqo_threshold.

In an earlier post[1] he said that he had geqo turned off. It does pay to be explicit, though; I'd hate to assume it's off if he's been changing things. Alberto, please ensure that you still have geqo off when you run the test I suggested. Also, I see that I didn't explicitly say that you should send the EXPLAIN output, but that's what would be helpful.

> It's possible that that *will* send the planning time to the moon;
> but it would certainly be worth trying, to find out what plan is
> produced.

Agreed. What plan is produced, and how long that takes. (And whether he gets an out of memory error.) I figured it was best to get a clear answer to those before moving on....

-Kevin

[1] http://archives.postgresql.org/pgsql-performance/2009-06/msg00186.php
On Tue, 2009-06-16 at 14:35 -0400, Hartman, Matthew wrote:
> The algorithm for packing appointments respects each constraint and
> typically schedules a day of treatments (30-60) within 9-10 seconds on
> my workstation, down from 27 seconds initially. I would like to get it
> below 5 seconds if possible.
>
> I think what's slowing it down is simply the number of rows and joins.
> The algorithm creates a scheduling matrix with one row per 5 minute
> timeslot, per unit, per nurse assigned to the unit. That translates to
> 3,280 rows for the days I have designed in development (each day can
> change).

ISTM the efficiency of your algorithm is geometrically related to the number of time slots into which appointments might fit. So reduce the number of possible time slots...

Assign the slot (randomly/hash/round robin) to either the morning or the afternoon and then run exactly the same queries, just with half the number of time slots. That should reduce your execution time to roughly one quarter per half-day without using multiple CPUs. Then run twice, once for morning, once for afternoon. You could parallelise this and run both at the same time on different CPUs, if the extra work is worthwhile, but it seems not, judging from your requirements.

Another way would be to arrange all appointments that need an odd number of timeslots into pairs, so that you have at most one appointment that needs an odd number of timeslots. Then schedule appointments on 10 minute boundaries, rounding up their timeslot requirement. (The single odd-timeslot appointment will always waste 1 timeslot.) A sketch of the rounding is below.

Hope that helps.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
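P.S. The rounding step itself is trivial (a sketch; the table and column names are made up):

    -- A 25-minute regimen needs ceil(25/10) = 3 ten-minute slots
    -- instead of 5 five-minute ones; the matrix shrinks accordingly.
    SELECT id, ceil(duration_minutes / 10.0) AS ten_min_slots_needed
    FROM appointments;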
> From: Simon Riggs [mailto:simon@2ndQuadrant.com]
> Sent: Tuesday, July 07, 2009 5:39 AM
>
> Another way would be to arrange all appointments that need an odd number
> of timeslots into pairs, so that you have at most one appointment that
> needs an odd number of timeslots. Then schedule appointments on 10 minute
> boundaries, rounding up their timeslot requirement. (The single
> odd-timeslot appointment will always waste 1 timeslot.)

Now THAT is an interesting idea. I'll have to play with this in my head a bit (during really boring meetings) and get back to you. Thanks!

Matthew Hartman
Programmer/Analyst
Information Management, ICP
Kingston General Hospital