Thread: Re: [PATCHES] GUC parameter cursors_tuple_fraction
On Thu, 2008-04-03 at 16:45 +0200, Hell, Robert wrote: > This patch adds a GUC parameter for tuple_fraction of cursors (discussed > earlier here: > http://archives.postgresql.org/pgsql-performance/2008-04/msg00018.php). > By setting this parameter the planner's favor to use fast-start plans > for cursors can be affected. I think this patch looks OK coding-wise, but not tested, yet. If we did apply this patch it would need significantly more documentation, probably examples and the like. But I think writing that docs would open up a can of worms, hence copying -hackers. Other RDBMS allow users to specify whether they want fast-start or all-data plans. We should discuss whether we want to set the fraction directly or whether we should have 2 (or more) specific settings such as "fast" and "all". Also, if we did have this parameter then I don't think it should be included in postgresql.conf. I don't see any need to change the default setting for *all* cursors, but I can see the need to change the cursor fraction for *one* specific query. Which raises wider issues. * We could add to DECLARE syntax that says something like OPTIMIZE FOR FIRST ROWS or OPTIMIZE FOR ALL ROWS. But our policy AIUI is that we do not want to further decorate SQL Standard commands. * We've said here http://www.postgresql.org/docs/faqs.TODO.html that we "Don't want hints". If that's what we really think, then this patch must surely be rejected because its a hint... That isn't my view. I *now* think we do need hints of various kinds. Decorating queries with *all* necessary information is not always good, but there are some kinds of information that *do* belong on specific queries. The cursor fraction is a great example of information that really does live on a specific query. But in a wider sense, I think support of hints is actually the only way long term of making large applications work within a reasonable timeframe and cost. If we change information at the database object level in order to correct one issue, we are likely to find that more problems are raised elsewhere. Same thing is true of altering optimiser cost models. Few users can wait 2 years while we solve the problem and fix it permanently, or even a few days while they resolve the inner workings of the planner and work out how to re-write it. I had spoken strongly against hints for general use in Postgres previously. Many attendees on recent PostgreSQL performance courses have successfully argued in favour of hints and as a result my viewpoint is now changed. Though we need a central no-hints-allowed GUC for those cases where application programmers need restraining... I think we need Hints. (And not this patch, sorry about that, Robert). (This could well lead to me losing work doing performance tuning, though I believe its the wish of the majority that we should support hints). -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Simon Riggs wrote: > * We've said here http://www.postgresql.org/docs/faqs.TODO.html that we > "Don't want hints". If that's what we really think, then this patch must > surely be rejected because its a hint... That isn't my view. I *now* > think we do need hints of various kinds. cursors_tuple_fraction or OPTIMIZE FOR xxx ROWS isn't the kind of hints we've said "no" to in the past. We don't want hints that work-around planner deficiencies, for example where we get the row count of a node completely wrong. This is different. This is about telling how the application is going to use the result set. It's relevant even assuming that the planner got the estimates spot on. Which plan is the best depends on whether the application can start processing the data as it comes in, or whether it's loading it all in memory first, for example. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
"Heikki Linnakangas" <heikki@enterprisedb.com> writes: > Simon Riggs wrote: >> * We've said here http://www.postgresql.org/docs/faqs.TODO.html that we >> "Don't want hints". If that's what we really think, then this patch must >> surely be rejected because its a hint... That isn't my view. I *now* >> think we do need hints of various kinds. > cursors_tuple_fraction or OPTIMIZE FOR xxx ROWS isn't the kind of hints > we've said "no" to in the past. More to the point, I think what we've generally meant by "hints" is nonstandard decoration on individual SQL commands (either explicit syntax or one of those interpret-some-comments kluges). Simon is reading the policy in such a way that it would forbid all the planner cost parameters, which is surely not what is intended. I see this as being basically another cost parameter, and as such I don't think it needs more documentation than any of those have. (Now admittedly you could argue that they could all use a ton more documentation than they now have, but it's not reasonable to insist on just this one meeting a different standard.) regards, tom lane
On Fri, 2008-05-02 at 12:01 -0400, Tom Lane wrote: > "Heikki Linnakangas" <heikki@enterprisedb.com> writes: > > Simon Riggs wrote: > >> * We've said here http://www.postgresql.org/docs/faqs.TODO.html that we > >> "Don't want hints". If that's what we really think, then this patch must > >> surely be rejected because its a hint... That isn't my view. I *now* > >> think we do need hints of various kinds. > > > cursors_tuple_fraction or OPTIMIZE FOR xxx ROWS isn't the kind of hints > > we've said "no" to in the past. > > More to the point, I think what we've generally meant by "hints" is > nonstandard decoration on individual SQL commands (either explicit > syntax or one of those interpret-some-comments kluges). Yes, that is definitely an Oracle compatibility thought. > Simon is > reading the policy in such a way that it would forbid all the planner > cost parameters, which is surely not what is intended. So we're allowed to influence the behaviour of the planner, but just not by touching the individual statements. OK. Can we allow a statement like SET index_weighting = '{{my_index, 0.1},{another_index, 0.5}}' That would allow us to tell a specific SQL statement that it should use a cost weighting of 0.1 * normal cost for the "my_index" index (etc). SET enable_seqscan = off; is a blunt instrument that can sometimes achieve the same thing, but insufficiently exact to be really useful. Many people use that (Sun, in their first published PostgreSQL benchmark...) We/I want to make the planner even better, but the above is roughly what people want while they're waiting for us to get the planner right. > I see this as being basically another cost parameter, and as such > I don't think it needs more documentation than any of those have. > (Now admittedly you could argue that they could all use a ton more > documentation than they now have, but it's not reasonable to insist > on just this one meeting a different standard.) OK, seems fair. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
On Fri, 2008-05-02 at 16:17 +0100, Heikki Linnakangas wrote: > Simon Riggs wrote: > > * We've said here http://www.postgresql.org/docs/faqs.TODO.html that we > > "Don't want hints". If that's what we really think, then this patch must > > surely be rejected because its a hint... That isn't my view. I *now* > > think we do need hints of various kinds. > > cursors_tuple_fraction or OPTIMIZE FOR xxx ROWS isn't the kind of hints > we've said "no" to in the past. We don't want hints that work-around > planner deficiencies, for example where we get the row count of a node > completely wrong. This is different. This is about telling how the > application is going to use the result set. It's relevant even assuming > that the planner got the estimates spot on. Yes, thats what I see. > Which plan is the best > depends on whether the application can start processing the data as it > comes in, or whether it's loading it all in memory first, for example. Agreed, which is why people want to tell us that also, when they know the answer in the context of their application. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
On Fri, 2008-05-02 at 12:01 -0400, Tom Lane wrote: > I see this as being basically another cost parameter, and as such > I don't think it needs more documentation than any of those have. > (Now admittedly you could argue that they could all use a ton more > documentation than they now have, but it's not reasonable to insist > on just this one meeting a different standard.) OK, if that's the view then the patch is ready for commit, AFAICS. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Simon Riggs <simon@2ndquadrant.com> writes: > OK, if that's the view then the patch is ready for commit, AFAICS. Use of the plural in the name seems a bit odd to me. Anyone have a problem with calling it "cursor_tuple_fraction" instead? regards, tom lane
You're right - that's just a typo in the subject of the post. It's called cursor_tuple_fraction in the submitted patch. Regards, Robert -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: Freitag, 02. Mai 2008 22:36 To: Simon Riggs Cc: Heikki Linnakangas; Hell, Robert; pgsql-patches@postgresql.org; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] [PATCHES] GUC parameter cursors_tuple_fraction Simon Riggs <simon@2ndquadrant.com> writes: > OK, if that's the view then the patch is ready for commit, AFAICS. Use of the plural in the name seems a bit odd to me. Anyone have a problem with calling it "cursor_tuple_fraction" instead? regards, tom lane
On Fri, 2008-05-02 at 16:36 -0400, Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > OK, if that's the view then the patch is ready for commit, AFAICS. > > Use of the plural in the name seems a bit odd to me. Anyone have a > problem with calling it "cursor_tuple_fraction" instead? Agreed. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
"Hell, Robert" <Robert.Hell@fabasoft.com> writes: > You're right - that's just a typo in the subject of the post. > It's called cursor_tuple_fraction in the submitted patch. Ah, I hadn't actually read the patch yet ;-). As penance for the noise, I will do so now. regards, tom lane
On Friday 02 May 2008 13:35:27 Simon Riggs wrote: > On Fri, 2008-05-02 at 12:01 -0400, Tom Lane wrote: > > "Heikki Linnakangas" <heikki@enterprisedb.com> writes: > > > Simon Riggs wrote: > > >> * We've said here http://www.postgresql.org/docs/faqs.TODO.html that > > >> we "Don't want hints". If that's what we really think, then this patch > > >> must surely be rejected because its a hint... That isn't my view. I > > >> *now* think we do need hints of various kinds. > > > > > > cursors_tuple_fraction or OPTIMIZE FOR xxx ROWS isn't the kind of hints > > > we've said "no" to in the past. > > > > More to the point, I think what we've generally meant by "hints" is > > nonstandard decoration on individual SQL commands (either explicit > > syntax or one of those interpret-some-comments kluges). > > Yes, that is definitely an Oracle compatibility thought. > > > Simon is > > reading the policy in such a way that it would forbid all the planner > > cost parameters, which is surely not what is intended. > > So we're allowed to influence the behaviour of the planner, but just not > by touching the individual statements. OK. > > Can we allow a statement like > > SET index_weighting = '{{my_index, 0.1},{another_index, 0.5}}' > > That would allow us to tell a specific SQL statement that it should use > a cost weighting of 0.1 * normal cost for the "my_index" index (etc). > SET enable_seqscan = off; is a blunt instrument that can sometimes > achieve the same thing, but insufficiently exact to be really useful. > Many people use that (Sun, in their first published PostgreSQL > benchmark...) > > We/I want to make the planner even better, but the above is roughly what > people want while they're waiting for us to get the planner right. > I think the above would be helpful, but even then I am not sure it goes far enough, since there might be cases where you need and index wieghted high for a specific join within the query, but low for a different join in that query. A further problem with this implementation would be that in general it would require that you issue a set, run your query, and then issue another set to put those weightings back to the defaults, which seems like an excessive amount of overhead. As much as people like to turn their nose to in-line query hints, the manifestation of deficiencies in the planner always manifiest themselves at the query level, so it makes it difficult to create a solid solution that operates somewhere else. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL