Thread: Speedups
Hi, just to let anyone know: I did some analyzing and searched for areas that could gain more speedups for 6.4. First I had something like an optimizer cache in mind (planner remembers parsetree and if a subsequent parsetree only differs in const values, substitute consts by params and reuse saved plans instead of creating a new plan all the time). But this is what I got for the complete regression test (only queries that went through the planner counted): Parsing and rule rewriting 14 % Optimizer and planning 6 % Query execution 80 % ------ Total time in backend 100 % It clearly shows that there's no need to speedup the optimizer. The parser and the executor are the ones that consume the time. Making the planner/optimizer smarter resulting better plans faster to execute is the way. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #======================================== jwieck@debis.com (Jan Wieck) #
> Parsing and rule rewriting 14 % > Optimizer and planning 6 % > Query execution 80 % > ------ > Total time in backend 100 % > Nice analysis. Certainly looks like Query Execution is the way to go. profiling has shown quite a lot to help us. Usually it is not the executor itself, but the subsystems it calls. -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 + If your life is a hard drive, | (610) 353-9879(w) + Christ can be your backup. | (610) 853-3000(h)
Jan Wieck wrote: > > Hi, > > just to let anyone know: > > I did some analyzing and searched for areas that could gain > more speedups for 6.4. First I had something like an > optimizer cache in mind (planner remembers parsetree and if a > subsequent parsetree only differs in const values, substitute > consts by params and reuse saved plans instead of creating a > new plan all the time). > > But this is what I got for the complete regression test (only > queries that went through the planner counted): > > Parsing and rule rewriting 14 % > Optimizer and planning 6 % > Query execution 80 % > ------ > Total time in backend 100 % > > It clearly shows that there's no need to speedup the > optimizer. The parser and the executor are the ones that > consume the time. Making the planner/optimizer smarter > resulting better plans faster to execute is the way. This may sound like an obvious question, but if a user defines a query, do we save the query plan? This would reduce the communications between the client and server (a small gain), and allow the server to start executing the query as soon as it recognized the name of the stored query and parsed the arguments. Ocie Mitchell
On Wed, 4 Mar 1998 ocie@paracel.com wrote: > This may sound like an obvious question, but if a user defines a > query, do we save the query plan? This would reduce the > communications between the client and server (a small gain), and allow > the server to start executing the query as soon as it recognized the > name of the stored query and parsed the arguments. Not sure ofhand, but it would be useful for JDBC's PreparedStatement and CallableStatement classes -- Peter T Mount petermount@earthling.net or pmount@maidast.demon.co.uk Main Homepage: http://www.demon.co.uk/finder Work Homepage: http://www.maidstone.gov.uk Work EMail: peter@maidstone.gov.uk
Peter T Mount wrote: > > On Wed, 4 Mar 1998 ocie@paracel.com wrote: > > > This may sound like an obvious question, but if a user defines a > > query, do we save the query plan? This would reduce the > > communications between the client and server (a small gain), and allow > > the server to start executing the query as soon as it recognized the > > name of the stored query and parsed the arguments. > > Not sure ofhand, but it would be useful for JDBC's PreparedStatement and > CallableStatement classes We can implement it very easy, and fast. Execution plan may be reused many times. Is this feature in standard ? What is proposed syntax if not ? Vadim
Vadim B. Mikheev wrote: > > Peter T Mount wrote: > > > > On Wed, 4 Mar 1998 ocie@paracel.com wrote: > > > > > This may sound like an obvious question, but if a user defines a > > > query, do we save the query plan? This would reduce the > > > communications between the client and server (a small gain), and allow > > > the server to start executing the query as soon as it recognized the > > > name of the stored query and parsed the arguments. > > > > Not sure ofhand, but it would be useful for JDBC's PreparedStatement and > > CallableStatement classes > > We can implement it very easy, and fast. Execution plan may be reused > many times. Is this feature in standard ? > What is proposed syntax if not ? I don't think it is so much a question of syntax as it is a question of what we do in the backend. Suppose I create a stored query in SQL. We already store the SQL source for this in the database, right? So when it comes time to execute the query, we take this SQL and execute it as if the user had entered it directly. What I am proposing would be to basically store the compiled query plan as well. I do see a couple sticky points: We would need some information about which variables are to be substituted into this query plan, but this should be fairly straightforward. Some querys may not respond well to this, for example, if a table had an index on an integer field f1, this would probably be the best way to satisfy a select where f1<10. But if this were in a query as f1<x, then a sufficiently high value of x might make this not such a good way to run the query. I haven't looked into this, but I would assume that the optimizer relies on the specific values in such cases. We need to be able to handle changes to the structures and contents of the tables. If the query plan is built and we add 10000 rows to a table it references, the query should probably be recompiled. We could probably do this at vacuum time. There is also a small chance that a table or index that the query plan was using is dropped. We could automatically rebuild the query if the table was created after the query was compiled. Boy, to look at this, you'd think I had already built one of these :) I haven't but I'm willing to give it a shot. Ocie
ocie@paracel.com wrote: > > > > Not sure ofhand, but it would be useful for JDBC's PreparedStatement and > > > CallableStatement classes > > > > We can implement it very easy, and fast. Execution plan may be reused > > many times. Is this feature in standard ? > > What is proposed syntax if not ? > > I do see a couple sticky points: > > We would need some information about which variables are to be > substituted into this query plan, but this should be fairly > straightforward. Parser, Planner/Optimizer and Executor are able to handle parameters! No problems with this. > Some querys may not respond well to this, for example, if a table had > an index on an integer field f1, this would probably be the best way > to satisfy a select where f1<10. But if this were in a query as f1<x, > then a sufficiently high value of x might make this not such a good > way to run the query. I haven't looked into this, but I would assume > that the optimizer relies on the specific values in such cases. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Unfortunately, no! We have to add this feature of 'course. I don't know how we could deal with pre-compiled plans after this :( May be, we could prepare/store not single plan, but some number of possible plans. > We need to be able to handle changes to the structures and contents of > the tables. If the query plan is built and we add 10000 rows to a > table it references, the query should probably be recompiled. We > could probably do this at vacuum time. There is also a small chance > that a table or index that the query plan was using is dropped. We > could automatically rebuild the query if the table was created after > the query was compiled. We could mark stored plans as durty in such cases to force re-compiling when an application tries to use this plan. Vadim
Vadim wrote: > > ocie@paracel.com wrote: > > > > > > Not sure ofhand, but it would be useful for JDBC's PreparedStatement and > > > > CallableStatement classes > > > > > > We can implement it very easy, and fast. Execution plan may be reused > > > many times. Is this feature in standard ? > > > What is proposed syntax if not ? > > > > I do see a couple sticky points: > > > > We would need some information about which variables are to be > > substituted into this query plan, but this should be fairly > > straightforward. > > Parser, Planner/Optimizer and Executor are able to handle parameters! > No problems with this. Nice discussion - especially when looking at what I initially posted. I assume you think about using SPI's saved plan feature for it. Right? > > > Some querys may not respond well to this, for example, if a table had > > an index on an integer field f1, this would probably be the best way > > to satisfy a select where f1<10. But if this were in a query as f1<x, > > then a sufficiently high value of x might make this not such a good > > way to run the query. I haven't looked into this, but I would assume > > that the optimizer relies on the specific values in such cases. > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > Unfortunately, no! > We have to add this feature of 'course. > I don't know how we could deal with pre-compiled plans after this :( > May be, we could prepare/store not single plan, but some number of > possible plans. That's something I thought about when I used the SPI functions when I built PL/Tcl. Since the saved plan will be executed via SPI_execp(), we could change some details there. Currently SPI_prepare() and SPI_saveplan() return the plan itself. But they could also return a little control struct that contains the plan and other information. Since I don't expect someone uses these plans for something else than calling SPI_execp(), it wouldn't break anything. SPI_execp() can do some timing calculations. For each execution of a plan it collects the runtime in microseconds (gettimeofday()). After the 5th or 10th call, it builds an average and remembers that permanently. For all subsequent calls it calculates the average time of the last 10 calls and if that gets much higher than the initial average it wouldn't hurt to silently prepare and save the plan again. Using averages lowers the problem that differences in the parameters can cause the runtime differences. Another possible reason for the runtime differences is the overall workload of the server. This could be very high during the initial average calculation. So I think it could be smart to rebuild the plan after e.g. 1000 calls ignoring any runtimes. > > > We need to be able to handle changes to the structures and contents of > > the tables. If the query plan is built and we add 10000 rows to a > > table it references, the query should probably be recompiled. We > > could probably do this at vacuum time. There is also a small chance > > that a table or index that the query plan was using is dropped. We > > could automatically rebuild the query if the table was created after > > the query was compiled. > > We could mark stored plans as durty in such cases to force re-compiling > when an application tries to use this plan. Yep. SPI must remember all prepared and saved plans (and forget about only prepared ones at transaction end). Things like dropping an index or modifying a table structure cause invalidations in the relcache, syscache and catcache (even if another backend did it in some cases). I think it must be possible to tell SPI from there that something happened and which relations are affected. If a plans rangetable contains the affected relation, the plan is marked durty. Things like functions, operators and aggregates are also objects that might change (drop/recreate function -> funcnode in plan get's unusable). I think the best would be that SPI_prepare() set's up a collection of Oid's that cause plan invalidation in the control structure. These are the Oid's of ALL objects (relations, indices, functions etc.) used in the plan. Then a call to SPI_invalidate(Oid) from the cache invalidation handlers doesn't have to walk through the plan itself. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #======================================== jwieck@debis.com (Jan Wieck) #
> I don't think it is so much a question of syntax as it is a question > of what we do in the backend. Suppose I create a stored query in SQL. > We already store the SQL source for this in the database, right? So > when it comes time to execute the query, we take this SQL and execute > it as if the user had entered it directly. What I am proposing would > be to basically store the compiled query plan as well. > > I do see a couple sticky points: > > We would need some information about which variables are to be > substituted into this query plan, but this should be fairly > straightforward. > > Some querys may not respond well to this, for example, if a table had > an index on an integer field f1, this would probably be the best way > to satisfy a select where f1<10. But if this were in a query as f1<x, > then a sufficiently high value of x might make this not such a good > way to run the query. I haven't looked into this, but I would assume > that the optimizer relies on the specific values in such cases. I have thought about this. If we take a query string, remove all quoted constants and numeric constants, we can automatically split apart the query from the parameters. We can then look up the non-parameter query in our cache, and if it matches, replace the new contants with the old and run the query. -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 + If your life is a hard drive, | (610) 353-9879(w) + Christ can be your backup. | (610) 853-3000(h)
On Thu, 5 Mar 1998, Vadim B. Mikheev wrote: > ocie@paracel.com wrote: > > > > > > Not sure ofhand, but it would be useful for JDBC's PreparedStatement and > > > > CallableStatement classes > > > > > > We can implement it very easy, and fast. Execution plan may be reused > > > many times. Is this feature in standard ? > > > What is proposed syntax if not ? > > > > I do see a couple sticky points: > > > > We would need some information about which variables are to be > > substituted into this query plan, but this should be fairly > > straightforward. > > Parser, Planner/Optimizer and Executor are able to handle parameters! > No problems with this. > > > Some querys may not respond well to this, for example, if a table had > > an index on an integer field f1, this would probably be the best way > > to satisfy a select where f1<10. But if this were in a query as f1<x, > > then a sufficiently high value of x might make this not such a good > > way to run the query. I haven't looked into this, but I would assume > > that the optimizer relies on the specific values in such cases. > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > Unfortunately, no! > We have to add this feature of 'course. > I don't know how we could deal with pre-compiled plans after this :( > May be, we could prepare/store not single plan, but some number of > possible plans. For inserts & updates I would have thought that the plan would be almost identical (correct me here if I'm wrong here), and this is an area where Java really crawls at the moment. Try some 40k inserts in JDBC... and go out for the night. If we can improve things, it would be a big bonus. -- Peter T Mount petermount@earthling.net or pmount@maidast.demon.co.uk Main Homepage: http://www.demon.co.uk/finder Work Homepage: http://www.maidstone.gov.uk Work EMail: peter@maidstone.gov.uk