Thread: Turning the PLANNER off
I'm looking at different ways of optimizing queries with a large number of joins. I write the same query in a number of different ways and compare the running times. Now the problem is I do not want the optimizer changing the queries. So I explicit state the order of the joins in the FROM clause. I also turn off everything I can except for one type of join (say hash join), and I've turned off geqo. But I find that the PLANNER still takes an enormous amount of time for some queries. It doesn't look like the Planner is actually optimizing (changing) anything, but just in case, I was wondering if there was a way to turn off the PLANNER. Note, when I say an enormous amount of time, I mean at least double the time the EXECUTOR takes to actually answer the query. Thanks for your help, Ben McMahan ps. here is a small example of what my queries look like (so you can see if there is something else it might be deciding on): SELECT DISTINCT c0.x1 , c1.x2 , c0.x3 , c0.x4 , c2.x5 FROM r1 c4 (x4,x2,x5) JOIN (r0 c3 (x2,x3,x5) JOIN (r2 c2 (x3,x1,x5) JOIN (r1 c1 (x4,x1,x2) JOIN r1 c0 (x1,x3,x4) ON ( c0.x4 = c1.x4 AND c0.x1 = c1.x1 )) ON ( c0.x3 = c2.x3 AND c0.x1 = c2.x1 )) ON ( c1.x2 = c3.x2 AND c0.x3 = c3.x3 AND c2.x5 = c3.x5 )) ON ( c0.x4 = c4.x4 AND c1.x2 = c4.x2 AND c2.x5 = c4.x5 ); A quick reminder, FROM r1 c4 (x4,x2,x5) just renames a table r1 into c4 where it also renames the columns to x4, x2, and x5 respectively.
That is a good question. The planner does more than just analyse the query. It generates the Plan used by the executor, so that can't be removed. It is always a pain when the optimizer/planner takes longer than the executor. We do have PREPARE/EXECUTE in 7.3beta for you to use. --------------------------------------------------------------------------- Ben McMahan wrote: > I'm looking at different ways of optimizing queries with a large number of > joins. I write the same query in a number of different ways and compare > the running times. Now the problem is I do not want the optimizer > changing the queries. So I explicit state the order of the joins in the > FROM clause. I also turn off everything I can except for one type of join > (say hash join), and I've turned off geqo. But I find that the PLANNER > still takes an enormous amount of time for some queries. It doesn't look > like the Planner is actually optimizing (changing) anything, but just in > case, I was wondering if there was a way to turn off the PLANNER. > > Note, when I say an enormous amount of time, I mean at least double the > time the EXECUTOR takes to actually answer the query. > > Thanks for your help, > > Ben McMahan > > ps. here is a small example of what my queries look like (so you can see > if there is something else it might be deciding on): > > SELECT DISTINCT c0.x1 , c1.x2 , c0.x3 , c0.x4 , c2.x5 > FROM r1 c4 (x4,x2,x5) JOIN (r0 c3 (x2,x3,x5) JOIN (r2 c2 (x3,x1,x5) JOIN (r1 c1 > (x4,x1,x2) JOIN r1 c0 (x1,x3,x4) > ON ( c0.x4 = c1.x4 AND c0.x1 = c1.x1 )) > ON ( c0.x3 = c2.x3 AND c0.x1 = c2.x1 )) > ON ( c1.x2 = c3.x2 AND c0.x3 = c3.x3 AND c2.x5 = c3.x5 )) > ON ( c0.x4 = c4.x4 AND c1.x2 = c4.x2 AND c2.x5 = c4.x5 ); > > A quick reminder, FROM r1 c4 (x4,x2,x5) just renames a table r1 into c4 > where it also renames the columns to x4, x2, and x5 respectively. > > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Does PREPARE turn select * from mytable into select mytable.field1,mytable.field2 from mynamespace.mytable ? I was looking for this functionality for one of my projects so I'm curious. On Monday 28 October 2002 06:55 pm, (Via wrote: > That is a good question. The planner does more than just analyse the > query. It generates the Plan used by the executor, so that can't be > removed. > > It is always a pain when the optimizer/planner takes longer than the > executor. We do have PREPARE/EXECUTE in 7.3beta for you to use. > > > --------------------------------------------------------------------------- > > Ben McMahan wrote: > > I'm looking at different ways of optimizing queries with a large number > > of joins. I write the same query in a number of different ways and > > compare the running times. Now the problem is I do not want the > > optimizer changing the queries. So I explicit state the order of the > > joins in the FROM clause. I also turn off everything I can except for > > one type of join (say hash join), and I've turned off geqo. But I find > > that the PLANNER still takes an enormous amount of time for some queries. > > It doesn't look like the Planner is actually optimizing (changing) > > anything, but just in case, I was wondering if there was a way to turn > > off the PLANNER. > > > > Note, when I say an enormous amount of time, I mean at least double the > > time the EXECUTOR takes to actually answer the query. > > > > Thanks for your help, > > > > Ben McMahan > > > > ps. here is a small example of what my queries look like (so you can see > > if there is something else it might be deciding on): > > > > SELECT DISTINCT c0.x1 , c1.x2 , c0.x3 , c0.x4 , c2.x5 > > FROM r1 c4 (x4,x2,x5) JOIN (r0 c3 (x2,x3,x5) JOIN (r2 c2 (x3,x1,x5) JOIN > > (r1 c1 (x4,x1,x2) JOIN r1 c0 (x1,x3,x4) > > ON ( c0.x4 = c1.x4 AND c0.x1 = c1.x1 )) > > ON ( c0.x3 = c2.x3 AND c0.x1 = c2.x1 )) > > ON ( c1.x2 = c3.x2 AND c0.x3 = c3.x3 AND c2.x5 = c3.x5 )) > > ON ( c0.x4 = c4.x4 AND c1.x2 = c4.x2 AND c2.x5 = c4.x5 ); > > > > A quick reminder, FROM r1 c4 (x4,x2,x5) just renames a table r1 into c4 > > where it also renames the columns to x4, x2, and x5 respectively. > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 3: if posting/reading through Usenet, please send an appropriate > > subscribe-nomail command to majordomo@postgresql.org so that your > > message can get through to the mailing list cleanly
David Walker wrote: > Does PREPARE turn > select * from mytable > into > select mytable.field1,mytable.field2 from mynamespace.mytable > ? > > I was looking for this functionality for one of my projects so I'm curious. PREPARE turns the query into an internal format used by the executor. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Ah, so Ben finally got around to posting here. Ben's a CS Grad student here at Rice. His (current) project involves taking some interesting results from constraint satisfaction and implementing them on a database: one of the CS faculty has demonstrated that one class of highly joined DB queries maps to a solved problem in constraint satisfaction. The end goal would be an optimizer module or setting that recognizes this class of query, and spits out a mathematically optimized join order. So, in the interim, Ben's trying to do it by hand: preorder the joins and demonstrate that the 'best' order is in fact the best. Then move on to looking into integrating this, if possible: part of the problem is recognizing the structure of the query, of course. Right now, the planner is getting in the way - although he can extract the needed timing info, he's wasting CPU cycles planning things that don't need it, limiting the number of cases he can try. As mentioned before, there's currently no interface to feed in a Plan, so he's out of luck. Is there, programmatically, a way to do it? Serialize a plan tree to a file, and feed it in latter, purely for development purposes. How painful would that be? Should I send him in to see if he can implement one quickly, or are there dragons hiding in there? Ross On Mon, Oct 28, 2002 at 07:55:02PM -0500, Bruce Momjian wrote: > > That is a good question. The planner does more than just analyse the > query. It generates the Plan used by the executor, so that can't be > removed. > > It is always a pain when the optimizer/planner takes longer than the > executor. We do have PREPARE/EXECUTE in 7.3beta for you to use. > > > --------------------------------------------------------------------------- > > Ben McMahan wrote: > > I'm looking at different ways of optimizing queries with a large number of > > joins. I write the same query in a number of different ways and compare > > the running times. Now the problem is I do not want the optimizer > > changing the queries. So I explicit state the order of the joins in the > > FROM clause. I also turn off everything I can except for one type of join > > (say hash join), and I've turned off geqo. But I find that the PLANNER > > still takes an enormous amount of time for some queries. It doesn't look > > like the Planner is actually optimizing (changing) anything, but just in > > case, I was wondering if there was a way to turn off the PLANNER. > > > > Note, when I say an enormous amount of time, I mean at least double the > > time the EXECUTOR takes to actually answer the query. > > > > Thanks for your help, > > > > Ben McMahan > > > > ps. here is a small example of what my queries look like (so you can see > > if there is something else it might be deciding on): > > > > SELECT DISTINCT c0.x1 , c1.x2 , c0.x3 , c0.x4 , c2.x5 > > FROM r1 c4 (x4,x2,x5) JOIN (r0 c3 (x2,x3,x5) JOIN (r2 c2 (x3,x1,x5) JOIN (r1 c1 > > (x4,x1,x2) JOIN r1 c0 (x1,x3,x4) > > ON ( c0.x4 = c1.x4 AND c0.x1 = c1.x1 )) > > ON ( c0.x3 = c2.x3 AND c0.x1 = c2.x1 )) > > ON ( c1.x2 = c3.x2 AND c0.x3 = c3.x3 AND c2.x5 = c3.x5 )) > > ON ( c0.x4 = c4.x4 AND c1.x2 = c4.x2 AND c2.x5 = c4.x5 ); > > > > A quick reminder, FROM r1 c4 (x4,x2,x5) just renames a table r1 into c4 > > where it also renames the columns to x4, x2, and x5 respectively. > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 3: if posting/reading through Usenet, please send an appropriate > > subscribe-nomail command to majordomo@postgresql.org so that your > > message can get through to the mailing list cleanly > > > > -- > Bruce Momjian | http://candle.pha.pa.us > pgman@candle.pha.pa.us | (610) 359-1001 > + If your life is a hard drive, | 13 Roberts Road > + Christ can be your backup. | Newtown Square, Pennsylvania 19073 > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html
"Ross J. Reedstrom" <reedstrm@rice.edu> writes: > So, in the interim, Ben's trying to do it by hand: preorder the joins > and demonstrate that the 'best' order is in fact the best. Then move on > to looking into integrating this, if possible: part of the problem is > recognizing the structure of the query, of course. Right now, the planner > is getting in the way - although he can extract the needed timing info, > he's wasting CPU cycles planning things that don't need it, limiting > the number of cases he can try. What's the basis for your assertion that it's "planning things that don't need it"? Given a JOIN-constrained query I do not believe the planner will look at any cases other than the intended join order. > As mentioned before, there's currently no interface to feed in a Plan, > so he's out of luck. Is there, programmatically, a way to do it? > Serialize a plan tree to a file, and feed it in latter, purely for > development purposes. How painful would that be? Should I send him in to > see if he can implement one quickly, or are there dragons hiding in there? He can do whatever he wants, as long as he has no illusions about getting it accepted back into the sources ;-). What would probably be more useful is to do some profiling to understand why the planner is taking longer than he wants even with a JOIN-constrained query. I should think this would be pretty quick. regards, tom lane
On Wed, Oct 30, 2002 at 10:59:39PM -0500, Tom Lane wrote: > > What's the basis for your assertion that it's "planning things that > don't need it"? Given a JOIN-constrained query I do not believe the > planner will look at any cases other than the intended join order. Well, that was a loose choice of words - let's say the planner seems to be taking awfully long to build an execution tree with only one choice available. > He can do whatever he wants, as long as he has no illusions about > getting it accepted back into the sources ;-). Understood - this would be a hacking tool only. > What would probably be more useful is to do some profiling to understand > why the planner is taking longer than he wants even with a > JOIN-constrained query. I should think this would be pretty quick. Yup, that targets the same question as above - anything 'uneeded' actually happening in the planner? I'll send him off with this suggestion. Ross