Re: Cost of initiating cursors - Mailing list pgsql-general

From Matthew Churcher
Subject Re: Cost of initiating cursors
Date
Msg-id 1645159141.63958.1365699932278.JavaMail.root@realvnc.com
Whole thread Raw
In response to Re: Cost of initiating cursors  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Cost of initiating cursors
List pgsql-general
Thanks Tom,

That's very enlightening and I really appreciate you taking time to respond.

I've tried cursor_tuple_fraction values as low as 0.0000000001 which by my reckoning should be low enough and also 0
butthe planner is still selecting seqscan-and-sort.  

Kind regards,
Matthew Churcher


----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Matthew Churcher" <matthew.churcher@realvnc.com>
Cc: pgsql-general@postgresql.org
Sent: Thursday, 11 April, 2013 4:12:25 PM
Subject: Re: [GENERAL] Cost of initiating cursors

Matthew Churcher <matthew.churcher@realvnc.com> writes:
> For example:
>   FOR curr_foo
>   IN
>   SELECT foo FROM bar
>     WHERE wibble
>     ORDER BY wobble
>   LOOP
>      EXIT; -- always break out of loop
>   END LOOP;

> For some reason this is hugely expensive and slow regardless of the selected execution plan and available indexes.
TheWHERE and particularly the ORDER BY clause appear to be highly significant despite having appropriate indexes in
place. 

For a full-table query, the planner will frequently decide that a
seqscan-and-sort is cheaper than an indexscan that happens to produce
the right order (but involves a lot of random page access).  That's
problematic for this case because the whole scan and sort has to happen
before the first result row can be delivered.  Grabbing the remaining
rows would be quite cheap, but since you don't want them, the work
is wasted.

> It's the combination of the following behaviours I find particular perplexing:-
>   1.) Removing the WHERE and ORDER BY clauses results in a very fast query

No surprise.  The query devolves to a seqscan, and you stop after
getting the first row.

>   2.) Adding a LIMIT clause also results in a very fast query.

That would encourage the planner to use a fast-start plan, ie an
indexscan in the right order, which wins here since you stop after
fetching the first row.

The basic problem with the above FOR loop is that you're not giving the
planner any visibility into the fact that you just want the first (few)
rows, so it's going for a plan that will be more efficient for actually
performing the whole query it's been given.  Adding a LIMIT is a good
thing because it fixes that lack of communication.

> The configuration parameter 'cursor_tuple_fraction' is having no observable effect.

You did not say what values you'd tried, but a sufficiently small value
ought to force a similar result.  Whether such a value would be a wise
thing to set globally is a different question.  On the whole I think
adding a LIMIT isn't a bad idea here.

            regards, tom lane


pgsql-general by date:

Previous
From: Steve Atkins
Date:
Subject: Re: crash proof for semi-embedded system
Next
From: Tom Lane
Date:
Subject: Re: Cost of initiating cursors