Thread: Retry: Is this possible / slow performance?
Hi all, A retry of the question asked before. All tables freshly vacuumed an analized. Two queries: one with "set enable_seqscan = on" , the other with "set enable_seqscan = off". The first query lasts 59403ms, the second query 31 ms ( the desc order variant has the same large difference: 122494 ms vs. 1297 ms). (for thequery plans see below). Can I, without changing the SQL (because it is generated by a tool) or explicitely setting "set enable_seqscan = off" forthis query, trick PostgreSQL in taking the fast variant of the queryplan? TIA Groeten, Joost Kraaijeveld Askesis B.V. Molukkenstraat 14 6524NB Nijmegen tel: 024-3888063 / 06-51855277 fax: 024-3608416 e-mail: J.Kraaijeveld@Askesis.nl web: www.askesis.nl ------------------------------- Query 1 begin; set enable_seqscan = on; declare SQL_CUR01 cursor for SELECT A.ordernummer, B.klantnummer FROM "orders" A LEFT OUTER JOIN "klt_alg" B ON A.Klantnummer=B.Klantnummer ORDER BY A.klantnummer; fetch 100 in SQL_CUR01; commit; QUERY PLAN Sort (cost=259968.77..262729.72 rows=1104380 width=12) Sort Key: a.klantnummer, a.ordernummer -> Hash Left Join (cost=42818.43..126847.70 rows=1104380 width=12) Hash Cond: ("outer".klantnummer = "inner".klantnummer) -> Seq Scan on orders a (cost=0.00..46530.79 rows=1104379 width=8) -> Hash (cost=40635.14..40635.14 rows=368914 width=4) -> Seq Scan on klt_alg b (cost=0.00..40635.14 rows=368914 width=4) Actual running time: 59403 ms. ------------------------------- Query 2 begin; set enable_seqscan = off; declare SQL_CUR01 cursor for SELECT A.ordernummer, B.klantnummer FROM "orders" A LEFT OUTER JOIN "klt_alg" B ON A.Klantnummer=B.Klantnummer ORDER BY A.klantnummer; fetch 100 in SQL_CUR01; commit; QUERY PLAN Merge Left Join (cost=0.00..2586604.86 rows=1104380 width=12) Merge Cond: ("outer".klantnummer = "inner".klantnummer) -> Index Scan using orders_klantnummer on orders a (cost=0.00..2435790.17 rows=1104379 width=8) -> Index Scan using klt_alg_klantnummer on klt_alg b (cost=0.00..44909.11 rows=368914 width=4) Actual running time: 31 ms.
"Joost Kraaijeveld" <J.Kraaijeveld@Askesis.nl> writes: > Two queries: one with "set enable_seqscan = on" , the other with "set enable_seqscan = off". The first query lasts 59403ms, the second query 31 ms ( the desc order variant has the same large difference: 122494 ms vs. 1297 ms). (for thequery plans see below). The reason for the difference is that the mergejoin plan has a much lower startup cost than the hash plan, and since you're only fetching 100 rows the startup cost is dominant. IIRC the planner does make some allowance for this effect when preparing a DECLARE CURSOR plan (ie, it puts some weight on startup cost rather than considering only total cost) ... but it's not so optimistic as to assume that you only want 100 out of an estimated 1 million+ result rows. The best solution is probably to put a LIMIT into the DECLARE CURSOR, so that the planner can see how much you intend to fetch. regards, tom lane
Does the planner also take into account that the Hash Join will need a huge temporary space which will exist for the whole length of the cursor existence (which may be quite long if he intends to fetch everything), whereas the Merge Join should need very little space as it is sending the rows as it fetches them using the Indexes ? On Mon, 07 Feb 2005 12:03:56 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Joost Kraaijeveld" <J.Kraaijeveld@Askesis.nl> writes: >> Two queries: one with "set enable_seqscan = on" , the other with "set >> enable_seqscan = off". The first query lasts 59403 ms, the second query >> 31 ms ( the desc order variant has the same large difference: 122494 ms >> vs. 1297 ms). (for the query plans see below). > > The reason for the difference is that the mergejoin plan has a much > lower startup cost than the hash plan, and since you're only fetching > 100 rows the startup cost is dominant. IIRC the planner does make some > allowance for this effect when preparing a DECLARE CURSOR plan (ie, > it puts some weight on startup cost rather than considering only total > cost) ... but it's not so optimistic as to assume that you only want 100 > out of an estimated 1 million+ result rows. > > The best solution is probably to put a LIMIT into the DECLARE CURSOR, > so that the planner can see how much you intend to fetch. > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 7: don't forget to increase your free space map settings >
>> The best solution is probably to put a LIMIT into the DECLARE CURSOR, >> so that the planner can see how much you intend to fetch. I assume that this limits the resultset to a LIMIT. That is not what I was hoping for. I was hoping for a way to scrolllthrought the whole tables with orders. I have tested, and if one really wants the whole table the query with "set enable_seqscan = on" lasts 137 secs, the querywith "set enable_seqscan = off" lasts 473 secs, so (alas), the planner is right. I sure would like to have ISAM like behaviour once in a while. Groeten, Joost Kraaijeveld Askesis B.V. Molukkenstraat 14 6524NB Nijmegen tel: 024-3888063 / 06-51855277 fax: 024-3608416 e-mail: J.Kraaijeveld@Askesis.nl web: www.askesis.nl
> >> The best solution is probably to put a LIMIT into the DECLARE CURSOR, > >> so that the planner can see how much you intend to fetch. > I assume that this limits the resultset to a LIMIT. That is not what I was > hoping for. I was hoping for a way to scrolll throught the whole tables > with orders. > > I have tested, and if one really wants the whole table the query with "set > enable_seqscan = on" lasts 137 secs, the query with "set enable_seqscan = > off" lasts 473 secs, so (alas), the planner is right. > > I sure would like to have ISAM like behaviour once in a while. Then stop using cursors. A few months back I detailed the relative merits of using Cursors v. Queries to provide ISAM like functionality and Queries win hands down. Right now I am using pg as an ISAM backend for a relatively old and large COBOL ERP via a C++ ISAM driver, for which a publicly available version of the source will be available Real Soon Now :-). Merlin