Thread: Re: [PATCHES] GUC parameter cursors_tuple_fraction

Re: [PATCHES] GUC parameter cursors_tuple_fraction

From: Simon Riggs
On Thu, 2008-04-03 at 16:45 +0200, Hell, Robert wrote:
> This patch adds a GUC parameter for tuple_fraction of cursors (discussed
> earlier here:
> http://archives.postgresql.org/pgsql-performance/2008-04/msg00018.php).
> By setting this parameter, the planner's preference for fast-start plans
> for cursors can be adjusted.

I think this patch looks OK coding-wise, but I have not tested it yet.

If we did apply this patch it would need significantly more
documentation, probably examples and the like. But I think writing those
docs would open up a can of worms, hence copying -hackers.

Other RDBMS allow users to specify whether they want fast-start or
all-data plans. We should discuss whether we want to set the fraction
directly or whether we should have 2 (or more) specific settings such as
"fast" and "all".

Also, if we did have this parameter then I don't think it should be
included in postgresql.conf. I don't see any need to change the default
setting for *all* cursors, but I can see the need to change the cursor
fraction for *one* specific query. Which raises wider issues.

* We could add to DECLARE syntax that says something like OPTIMIZE FOR
FIRST ROWS or OPTIMIZE FOR ALL ROWS. But our policy AIUI is that we do
not want to further decorate SQL Standard commands.

* We've said here http://www.postgresql.org/docs/faqs.TODO.html that we
"Don't want hints". If that's what we really think, then this patch must
surely be rejected because it's a hint... That isn't my view. I *now*
think we do need hints of various kinds.

Decorating queries with *all* necessary information is not always good,
but there are some kinds of information that *do* belong on specific
queries. The cursor fraction is a great example of information that
really does live on a specific query.

But in a wider sense, I think support of hints is actually the only way
long term of making large applications work within a reasonable
timeframe and cost. If we change information at the database object
level in order to correct one issue, we are likely to find that more
problems are raised elsewhere. Same thing is true of altering optimiser
cost models. Few users can wait 2 years while we solve the problem and
fix it permanently, or even a few days while they resolve the inner
workings of the planner and work out how to re-write it.

I had spoken strongly against hints for general use in Postgres
previously. Many attendees on recent PostgreSQL performance courses have
successfully argued in favour of hints and as a result my viewpoint is
now changed. Though we need a central no-hints-allowed GUC for those
cases where application programmers need restraining...

I think we need Hints. (And not this patch, sorry about that, Robert).

(This could well lead to me losing work doing performance tuning, though
I believe it's the wish of the majority that we should support hints).

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


Re: [PATCHES] GUC parameter cursors_tuple_fraction

From: "Heikki Linnakangas"
Simon Riggs wrote:
> * We've said here http://www.postgresql.org/docs/faqs.TODO.html that we
> "Don't want hints". If that's what we really think, then this patch must
> surely be rejected because it's a hint... That isn't my view. I *now*
> think we do need hints of various kinds.

cursors_tuple_fraction or OPTIMIZE FOR xxx ROWS isn't the kind of hints
we've said "no" to in the past. We don't want hints that work around
planner deficiencies, for example where we get the row count of a node
completely wrong. This is different. This is about telling how the
application is going to use the result set. It's relevant even assuming
that the planner got the estimates spot on. Which plan is the best
depends on whether the application can start processing the data as it
comes in, or whether it's loading it all in memory first, for example.
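
The trade-off described above can be sketched numerically. This is only an
illustration, not the planner's code: it assumes each candidate plan has a
startup cost and a total cost, and that the cost of fetching a fraction of
the rows is interpolated linearly between the two. The plan names and cost
figures are made up for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Plan:
    name: str
    startup_cost: float  # cost incurred before the first row is returned
    total_cost: float    # cost of returning every row


def fractional_cost(plan: Plan, fraction: float) -> float:
    """Estimated cost of fetching `fraction` of the rows (assumed linear)."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return plan.startup_cost + fraction * (plan.total_cost - plan.startup_cost)


def cheapest(plans: Iterable[Plan], fraction: float) -> Plan:
    """Pick the plan with the lowest estimated cost for the given fraction."""
    return min(plans, key=lambda p: fractional_cost(p, fraction))


# Hypothetical cost figures, chosen only to show the crossover.
index_scan = Plan("index scan (fast start)", startup_cost=0.5, total_cost=9000.0)
sort_plan = Plan("seq scan + sort (cheap overall)", startup_cost=4000.0, total_cost=4500.0)

for fraction in (0.1, 1.0):
    best = cheapest([index_scan, sort_plan], fraction)
    print(f"fraction={fraction}: {best.name}")
```

With these numbers the fast-start plan wins when the application is expected
to read only a tenth of the rows, and the plan with the lower total cost wins
when it reads everything, even though the estimates themselves are identical
in both cases.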

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: [PATCHES] GUC parameter cursors_tuple_fraction

From: Tom Lane
"Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> Simon Riggs wrote:
>> * We've said here http://www.postgresql.org/docs/faqs.TODO.html that we
>> "Don't want hints". If that's what we really think, then this patch must
>> surely be rejected because it's a hint... That isn't my view. I *now*
>> think we do need hints of various kinds.

> cursors_tuple_fraction or OPTIMIZE FOR xxx ROWS isn't the kind of hints
> we've said "no" to in the past.

More to the point, I think what we've generally meant by "hints" is
nonstandard decoration on individual SQL commands (either explicit
syntax or one of those interpret-some-comments kluges).  Simon is
reading the policy in such a way that it would forbid all the planner
cost parameters, which is surely not what is intended.

I see this as being basically another cost parameter, and as such
I don't think it needs more documentation than any of those have.
(Now admittedly you could argue that they could all use a ton more
documentation than they now have, but it's not reasonable to insist
on just this one meeting a different standard.)

            regards, tom lane

Re: [PATCHES] GUC parameter cursors_tuple_fraction

From: Simon Riggs
On Fri, 2008-05-02 at 12:01 -0400, Tom Lane wrote:
> "Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> > Simon Riggs wrote:
> >> * We've said here http://www.postgresql.org/docs/faqs.TODO.html that we
> >> "Don't want hints". If that's what we really think, then this patch must
> >> surely be rejected because it's a hint... That isn't my view. I *now*
> >> think we do need hints of various kinds.
>
> > cursors_tuple_fraction or OPTIMIZE FOR xxx ROWS isn't the kind of hints
> > we've said "no" to in the past.
>
> More to the point, I think what we've generally meant by "hints" is
> nonstandard decoration on individual SQL commands (either explicit
> syntax or one of those interpret-some-comments kluges).

Yes, that is definitely an Oracle compatibility thought.

> Simon is
> reading the policy in such a way that it would forbid all the planner
> cost parameters, which is surely not what is intended.

So we're allowed to influence the behaviour of the planner, but just not
by touching the individual statements. OK.

Can we allow a statement like

SET index_weighting = '{{my_index, 0.1},{another_index, 0.5}}'

That would allow us to tell a specific SQL statement that it should use
a cost weighting of 0.1 * normal cost for the "my_index" index (etc).
SET enable_seqscan = off; is a blunt instrument that can sometimes
achieve the same thing, but insufficiently exact to be really useful.
Many people use that (Sun, in their first published PostgreSQL
benchmark...)
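
To make the proposal concrete, here is a rough sketch of how such a setting
string might be interpreted. Note that index_weighting is only the
hypothetical parameter suggested above, not an existing PostgreSQL setting,
and the function names, the parsing rules, and the idea of simply multiplying
a "normal" cost by the weight are all assumptions made for illustration.

```python
import re
from typing import Dict

# Matches one "{index_name, weight}" pair inside the outer braces.
_PAIR = re.compile(r"\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*,\s*([0-9]*\.?[0-9]+)\s*\}")


def parse_index_weighting(setting: str) -> Dict[str, float]:
    """Parse '{{name, weight},{name, weight}}' into a {name: weight} mapping."""
    return {name: float(weight) for name, weight in _PAIR.findall(setting)}


def weighted_cost(index_name: str, normal_cost: float, weights: Dict[str, float]) -> float:
    """Scale an index's normal cost; indexes not listed keep a weight of 1.0."""
    return normal_cost * weights.get(index_name, 1.0)


weights = parse_index_weighting("{{my_index, 0.1},{another_index, 0.5}}")
print(weights)
print(weighted_cost("my_index", 200.0, weights))        # scaled by 0.1
print(weighted_cost("unlisted_index", 200.0, weights))  # unchanged
```

Unlike enable_seqscan = off, a mapping like this leaves every index it does
not mention at its normal cost, which is the extra precision being asked for.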

We/I want to make the planner even better, but the above is roughly what
people want while they're waiting for us to get the planner right.

> I see this as being basically another cost parameter, and as such
> I don't think it needs more documentation than any of those have.
> (Now admittedly you could argue that they could all use a ton more
> documentation than they now have, but it's not reasonable to insist
> on just this one meeting a different standard.)

OK, seems fair.

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


Re: [PATCHES] GUC parameter cursors_tuple_fraction

From: Simon Riggs
On Fri, 2008-05-02 at 16:17 +0100, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > * We've said here http://www.postgresql.org/docs/faqs.TODO.html that we
> > "Don't want hints". If that's what we really think, then this patch must
> > surely be rejected because it's a hint... That isn't my view. I *now*
> > think we do need hints of various kinds.
>
> cursors_tuple_fraction or OPTIMIZE FOR xxx ROWS isn't the kind of hints
> we've said "no" to in the past. We don't want hints that work around
> planner deficiencies, for example where we get the row count of a node
> completely wrong. This is different. This is about telling how the
> application is going to use the result set. It's relevant even assuming
> that the planner got the estimates spot on.

Yes, that's what I see.

> Which plan is the best
> depends on whether the application can start processing the data as it
> comes in, or whether it's loading it all in memory first, for example.

Agreed, which is why people want to tell us that also, when they know
the answer in the context of their application.

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


Re: [PATCHES] GUC parameter cursors_tuple_fraction

From: Simon Riggs
On Fri, 2008-05-02 at 12:01 -0400, Tom Lane wrote:

> I see this as being basically another cost parameter, and as such
> I don't think it needs more documentation than any of those have.
> (Now admittedly you could argue that they could all use a ton more
> documentation than they now have, but it's not reasonable to insist
> on just this one meeting a different standard.)

OK, if that's the view then the patch is ready for commit, AFAICS.

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


Re: [PATCHES] GUC parameter cursors_tuple_fraction

From: Tom Lane
Simon Riggs <simon@2ndquadrant.com> writes:
> OK, if that's the view then the patch is ready for commit, AFAICS.

Use of the plural in the name seems a bit odd to me.  Anyone have a
problem with calling it "cursor_tuple_fraction" instead?

            regards, tom lane

Re: [PATCHES] GUC parameter cursors_tuple_fraction

From: "Hell, Robert"
You're right - that's just a typo in the subject of the post.
It's called cursor_tuple_fraction in the submitted patch.

Regards,
Robert

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Freitag, 02. Mai 2008 22:36
To: Simon Riggs
Cc: Heikki Linnakangas; Hell, Robert; pgsql-patches@postgresql.org;
pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [PATCHES] GUC parameter cursors_tuple_fraction

Simon Riggs <simon@2ndquadrant.com> writes:
> OK, if that's the view then the patch is ready for commit, AFAICS.

Use of the plural in the name seems a bit odd to me.  Anyone have a
problem with calling it "cursor_tuple_fraction" instead?

            regards, tom lane

Re: [PATCHES] GUC parameter cursors_tuple_fraction

From: Simon Riggs
On Fri, 2008-05-02 at 16:36 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > OK, if that's the view then the patch is ready for commit, AFAICS.
>
> Use of the plural in the name seems a bit odd to me.  Anyone have a
> problem with calling it "cursor_tuple_fraction" instead?

Agreed.

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


Re: [PATCHES] GUC parameter cursors_tuple_fraction

From: Tom Lane
"Hell, Robert" <Robert.Hell@fabasoft.com> writes:
> You're right - that's just a typo in the subject of the post.
> It's called cursor_tuple_fraction in the submitted patch.

Ah, I hadn't actually read the patch yet ;-).  As penance for the noise,
I will do so now.

            regards, tom lane

Re: [PATCHES] GUC parameter cursors_tuple_fraction

From: Robert Treat
On Friday 02 May 2008 13:35:27 Simon Riggs wrote:
> On Fri, 2008-05-02 at 12:01 -0400, Tom Lane wrote:
> > "Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> > > Simon Riggs wrote:
> > >> * We've said here http://www.postgresql.org/docs/faqs.TODO.html that
> > >> we "Don't want hints". If that's what we really think, then this patch
> > >> must surely be rejected because it's a hint... That isn't my view. I
> > >> *now* think we do need hints of various kinds.
> > >
> > > cursors_tuple_fraction or OPTIMIZE FOR xxx ROWS isn't the kind of hints
> > > we've said "no" to in the past.
> >
> > More to the point, I think what we've generally meant by "hints" is
> > nonstandard decoration on individual SQL commands (either explicit
> > syntax or one of those interpret-some-comments kluges).
>
> Yes, that is definitely an Oracle compatibility thought.
>
> > Simon is
> > reading the policy in such a way that it would forbid all the planner
> > cost parameters, which is surely not what is intended.
>
> So we're allowed to influence the behaviour of the planner, but just not
> by touching the individual statements. OK.
>
> Can we allow a statement like
>
> SET index_weighting = '{{my_index, 0.1},{another_index, 0.5}}'
>
> That would allow us to tell a specific SQL statement that it should use
> a cost weighting of 0.1 * normal cost for the "my_index" index (etc).
> SET enable_seqscan = off; is a blunt instrument that can sometimes
> achieve the same thing, but insufficiently exact to be really useful.
> Many people use that (Sun, in their first published PostgreSQL
> benchmark...)
>
> We/I want to make the planner even better, but the above is roughly what
> people want while they're waiting for us to get the planner right.
>

I think the above would be helpful, but even then I am not sure it goes far
enough, since there might be cases where you need an index weighted high for
a specific join within the query, but low for a different join in that query.

A further problem with this implementation would be that in general it would
require that you issue a SET, run your query, and then issue another SET to
put those weightings back to the defaults, which seems like an excessive
amount of overhead. As much as people like to turn up their noses at in-line
query hints, deficiencies in the planner always manifest themselves at the
query level, which makes it difficult to create a solid solution that
operates somewhere else.

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL