Thread: Counting records in a PL/pgsql cursor

Counting records in a PL/pgsql cursor

From
"Magnus Hagander"
Date:
Is there any way to count how many hits I got in a cursor in PL/pgsql?

I have a function that will "window" through the result of a (large)
query based on two parameters, but I also want to return the number of
hits to the client. Right now I'm looping through the entire cursor and
incrementing a local variable, which I later return (along with the
first <n> records in the resultset) to the client. But this seems
horribly inefficient... I'd just like to ask "how many rows are in this
cursor", is there a way to do that without looping through them all?


//Magnus

Re: Counting records in a PL/pgsql cursor

From
Richard Huxton
Date:
Magnus Hagander wrote:
> Is there any way to count how many hits I got in a cursor in PL/pgsql?
>
> I have a function that will "window" through the result of a (large)
> query based on two parameters, but I also want to return the number of
> hits to the client. Right now I'm looping through the entire cursor and
> incrementing a local variable, which I later return (along with the
> first <n> records in the resultset) to the client. But this seems
> horribly inefficient... I'd just like to ask "how many rows are in this
> cursor", is there a way to do that without looping through them all?

Not really - that's the point of a cursor, after all. If possible, the
planner should give you the first row before it has all of them, so it
might not even know.

--
   Richard Huxton
   Archonet Ltd

Re: Counting records in a PL/pgsql cursor

From
Martijn van Oosterhout
Date:
On Thu, Nov 02, 2006 at 10:43:58PM +0100, Magnus Hagander wrote:
> Is there any way to count how many hits I got in a cursor in PL/pgsql?
>
> I have a function that will "window" through the result of a (large)
> query based on two parameters, but I also want to return the number of
> hits to the client. Right now I'm looping through the entire cursor and
> incrementing a local variable, which I later return (along with the
> first <n> records in the resultset) to the client. But this seems
> horribly inefficient... I'd just like to ask "how many rows are in this
> cursor", is there a way to do that without looping through them all?

You can move to the end, look at the row number, then move to the
beginning. It will still need to materialise the entire resultset
though.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Attachment

Re: Counting records in a PL/pgsql cursor

From
"Magnus Hagander"
Date:
> > Is there any way to count how many hits I got in a cursor
> in PL/pgsql?
> >
> > I have a function that will "window" through the result of
> a (large)
> > query based on two parameters, but I also want to return
> the number of
> > hits to the client. Right now I'm looping through the entire cursor
> > and incrementing a local variable, which I later return (along with
> > the first <n> records in the resultset) to the client. But
> this seems
> > horribly inefficient... I'd just like to ask "how many rows are in
> > this cursor", is there a way to do that without looping
> through them all?
>
> You can move to the end, look at the row number, then move to
> the beginning. It will still need to materialise the entire
> resultset though.

How do I do that? remember this is a pL/pgsql cursor. From what I can
find at
http://www.postgresql.org/docs/8.1/static/plpgsql-cursors.html#PLPGSQL-C
URSOR-USING, I can only do FETCH to get the next row, or CLOSE.

I can deal with materializing the resultset, but I want to get away from
the loop-a-thousand-times-doing-plus-one...

//Magnus

Re: Counting records in a PL/pgsql cursor

From
"Merlin Moncure"
Date:
On 11/3/06, Magnus Hagander <mha@sollentuna.net> wrote:
> > > Is there any way to count how many hits I got in a cursor
> > in PL/pgsql?
> > >
> > > I have a function that will "window" through the result of
> > a (large)
> > > query based on two parameters, but I also want to return
> > the number of
> > > hits to the client. Right now I'm looping through the entire cursor
> > > and incrementing a local variable, which I later return (along with
> > > the first <n> records in the resultset) to the client. But
> > this seems
> > > horribly inefficient... I'd just like to ask "how many rows are in
> > > this cursor", is there a way to do that without looping
> > through them all?
> >
> > You can move to the end, look at the row number, then move to
> > the beginning. It will still need to materialise the entire
> > resultset though.
>
> How do I do that? remember this is a pL/pgsql cursor. From what I can
> find at
> http://www.postgresql.org/docs/8.1/static/plpgsql-cursors.html#PLPGSQL-C
> URSOR-USING, I can only do FETCH to get the next row, or CLOSE.
>
> I can deal with materializing the resultset, but I want to get away from
> the loop-a-thousand-times-doing-plus-one...

i dont think its possible.  note that you can make a refcursor inside
your plpgsql function and pass it to an sql function which can do sql
cursor operations on it -- i think :-)..haven't tried it yet.

merlin

Re: Counting records in a PL/pgsql cursor

From
Richard Troy
Date:
On Fri, 3 Nov 2006, Merlin Moncure wrote:
> >
> > I can deal with materializing the resultset, but I want to get away from
> > the loop-a-thousand-times-doing-plus-one...
>
> i dont think its possible.  note that you can make a refcursor inside
> your plpgsql function and pass it to an sql function which can do sql
> cursor operations on it -- i think :-)..haven't tried it yet.
>
> merlin

...If you know your application well enough, you might get away with doing
a select count() with the same where clause just before entering the
cursor. It _could_ of course be wrong, though! OTOH, it would be much
faster. If the only down-side is occasionally giving users an incorrect
count, then perhaps call it a "row estimate", and let them marvel at how
accurate the estimate is most of hte time!

Good luck,
Richard



--
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
rtroy@ScienceTools.com, http://ScienceTools.com/


Re: Counting records in a PL/pgsql cursor

From
"Merlin Moncure"
Date:
On 11/3/06, Richard Troy <rtroy@sciencetools.com> wrote:
> On Fri, 3 Nov 2006, Merlin Moncure wrote:
> > >
> > > I can deal with materializing the resultset, but I want to get away from
> > > the loop-a-thousand-times-doing-plus-one...
> >
> > i dont think its possible.  note that you can make a refcursor inside
> > your plpgsql function and pass it to an sql function which can do sql
> > cursor operations on it -- i think :-)..haven't tried it yet.
> >
> > merlin
>
> ...If you know your application well enough, you might get away with doing
> a select count() with the same where clause just before entering the
> cursor. It _could_ of course be wrong, though! OTOH, it would be much
> faster. If the only down-side is occasionally giving users an incorrect
> count, then perhaps call it a "row estimate", and let them marvel at how
> accurate the estimate is most of hte time!


you could guarantee correctness by doing serializable transations. or
by locking the resources in question.  however if the non-trivial
portions of the query can't be optimized out in a count(*), this is
pretty much a no-go cause you have to do everything twice...

merlin

Re: Counting records in a PL/pgsql cursor

From
"Magnus Hagander"
Date:
> > > I can deal with materializing the resultset, but I want
> to get away
> > > from the loop-a-thousand-times-doing-plus-one...
> >
> > i dont think its possible.  note that you can make a
> refcursor inside
> > your plpgsql function and pass it to an sql function which
> can do sql
> > cursor operations on it -- i think :-)..haven't tried it yet.
> >
> > merlin
>
> ...If you know your application well enough, you might get
> away with doing a select count() with the same where clause
> just before entering the cursor. It _could_ of course be
> wrong, though! OTOH, it would be much faster. If the only
> down-side is occasionally giving users an incorrect count,
> then perhaps call it a "row estimate", and let them marvel at
> how accurate the estimate is most of hte time!

Nope, the query is way too expensive to run it twice. (GIN scan over
well over half a million rows. It's faster to do the
get-as-cursor-then-loop-and-increment, I've measured that)

//Magnus

Re: Counting records in a PL/pgsql cursor

From
"redhog"
Date:
> Nope, the query is way too expensive to run it twice. (GIN scan over
> well over half a million rows. It's faster to do the
> get-as-cursor-then-loop-and-increment, I've measured that)

Can't you select into a temporary table and then do select count(*)
over that table, aswell as run your cursor over that temporary table?
Or is that way to space expensive?


Re: Counting records in a PL/pgsql cursor

From
"Magnus Hagander"
Date:
> > Nope, the query is way too expensive to run it twice. (GIN
> scan over
> > well over half a million rows. It's faster to do the
> > get-as-cursor-then-loop-and-increment, I've measured that)
>
> Can't you select into a temporary table and then do select
> count(*) over that table, aswell as run your cursor over that
> temporary table?
> Or is that way to space expensive?

I could, but just looping through the resultset is cheaper than that -
everything sticks in memory. The actual resultset isn't thatbig (max
1,000 rows), so there is no real risk to run out of memory for it inside
the functino.

//Magnus