Re: Cost of initiating cursors - Mailing list pgsql-general

From Tom Lane
Subject Re: Cost of initiating cursors
Msg-id 3217.1365693145@sss.pgh.pa.us
In response to Cost of initiating cursors  (Matthew Churcher <matthew.churcher@realvnc.com>)
List pgsql-general
Matthew Churcher <matthew.churcher@realvnc.com> writes:
> For example:
>   FOR curr_foo
>   IN
>   SELECT foo FROM bar
>     WHERE wibble
>     ORDER BY wobble
>   LOOP
>      EXIT; -- always break out of loop
>   END LOOP;

> For some reason this is hugely expensive and slow regardless of the selected execution plan and available indexes.
> The WHERE and particularly the ORDER BY clause appear to be highly significant despite having appropriate indexes in
> place.

For a full-table query, the planner will frequently decide that a
seqscan-and-sort is cheaper than an indexscan that happens to produce
the right order (but involves a lot of random page access).  That's
problematic for this case because the whole scan and sort has to happen
before the first result row can be delivered.  Grabbing the remaining
rows would be quite cheap, but since you don't want them, the work
is wasted.
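
One way to check which plan was chosen is to look at the startup cost that
EXPLAIN reports. The sketch below assumes a table shaped like the one in the
quoted example; the temp table definition, the index name bar_wobble_idx, and
the generated data are hypothetical stand-ins, not the poster's schema.

```sql
-- Hypothetical stand-in for the poster's table; column names follow the quoted example.
CREATE TEMP TABLE bar (foo integer, wibble boolean, wobble integer);
CREATE INDEX bar_wobble_idx ON bar (wobble);

-- Illustrative data only, so the planner has statistics to work with.
INSERT INTO bar
SELECT g, g % 2 = 0, (random() * 1000000)::int
FROM generate_series(1, 100000) AS g;
ANALYZE bar;

-- Each plan node is printed with cost=<startup>..<total>.
-- A Sort node on top of a Seq Scan has a startup cost close to its total cost,
-- because no row can be returned until the whole sort has finished.
EXPLAIN SELECT foo FROM bar WHERE wibble ORDER BY wobble;
```

Which plan actually appears depends on the table size, the statistics, and the
cost settings, so the output on a real system may differ from what this toy
table produces.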

> It's the combination of the following behaviours I find particularly perplexing:
>   1.) Removing the WHERE and ORDER BY clauses results in a very fast query

No surprise.  The query devolves to a seqscan, and you stop after
getting the first row.

>   2.) Adding a LIMIT clause also results in a very fast query.

That would encourage the planner to use a fast-start plan, i.e. an
indexscan in the right order, which wins here since you stop after
fetching the first row.

The basic problem with the above FOR loop is that you're not giving the
planner any visibility into the fact that you just want the first (few)
rows, so it's going for a plan that will be more efficient for actually
performing the whole query it's been given.  Adding a LIMIT is a good
thing because it fixes that lack of communication.
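
As a minimal sketch, the quoted loop with a LIMIT added could look like the
following. It assumes the table bar with columns foo, wibble (boolean) and
wobble from the quoted example already exists; the DO wrapper and the RAISE
NOTICE line are only there to make the fragment runnable on its own.

```sql
DO $$
DECLARE
    curr_foo record;
BEGIN
    FOR curr_foo IN
        SELECT foo FROM bar
        WHERE wibble
        ORDER BY wobble
        LIMIT 1   -- tells the planner that only the first row is wanted
    LOOP
        RAISE NOTICE 'first foo: %', curr_foo.foo;
        EXIT;     -- redundant with LIMIT 1, kept to mirror the original loop
    END LOOP;
END
$$;
```

If more than one row might be consumed before exiting, the LIMIT should be
set to the largest number of rows the loop could need.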

> The configuration parameter 'cursor_tuple_fraction' is having no observable effect.

You did not say what values you'd tried, but a sufficiently small value
ought to force a similar result.  Whether such a value would be a wise
thing to set globally is a different question.  On the whole I think
adding a LIMIT isn't a bad idea here.
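
For comparison, here is a sketch of lowering cursor_tuple_fraction for a
single transaction instead of globally. It uses an explicit SQL-level cursor;
the cursor name first_foo and the value 0.001 are illustrative assumptions,
and the table bar from the quoted example is assumed to exist.

```sql
BEGIN;

-- The default is 0.1; a smaller fraction biases the planner toward
-- fast-start plans for cursors. SET LOCAL reverts at end of transaction.
SET LOCAL cursor_tuple_fraction = 0.001;

DECLARE first_foo CURSOR FOR
    SELECT foo FROM bar
    WHERE wibble
    ORDER BY wobble;

FETCH 1 FROM first_foo;   -- take only the first row
CLOSE first_foo;

COMMIT;
```

Scoping the setting with SET LOCAL keeps it away from other queries in the
session that really do read their whole result.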

            regards, tom lane

