Re: Returning the total number of rows as a separate column when using limit - Mailing list pgsql-sql

From Andreas Joseph Krogh
Subject Re: Returning the total number of rows as a separate column when using limit
Date
Msg-id 200711051733.10324.andreak@officenet.no
Whole thread Raw
In response to Re: Returning the total number of rows as a separate column when using limit  (Gregory Stark <stark@enterprisedb.com>)
Responses Re: Returning the total number of rows as a separate column when using limit  (Gregory Stark <stark@enterprisedb.com>)
List pgsql-sql
On Monday 05 November 2007 16:27:03 Gregory Stark wrote:
> > Andreas Joseph Krogh <andreak@officenet.no> writes:
> >> On Monday 05 November 2007 15:18:22 Tom Lane wrote:
> >>> That's only an estimate.  Since the query doesn't get executed to
> >>> completion thanks to the LIMIT, Postgres really has no idea whether
> >>> the estimate is accurate.
> >>
> >> Ok. The query is ORDER-ed, but you're saying that it doesn't matter and
> >> PG still doesn't have to know the total numbers even if it has to sort
> >> the result?
>
> At a guess you're displaying pages of information and want to display
> something like "displaying 1-10 of 150" ?

Exactly:-)

> Postgres is kind of lacking a solution for this problem. The last time I
> needed to do this I bit the bullet and ran the query twice, once with a
> "select count(*) from (...)" around it and once with "select * from (...)
> order by x offset n limit m" around it. The second time runs relatively
> quickly since all the raw data is in cache.

That's what I'm doing now. I run the query with "limit+1" as limit and if it
results in more than limit, I know there is more data and I run count(*) to
count them all. But count(*) cannot use indices in PG so it's limited in
speed anyway AFAICS.

> The "right" way to do this would probably be to have a temporary table
> which you populate in one step, perform the count(*) on in a second query,
> then select the page of data with the ordering in a third query. Then you
> can keep the data around for some limited amount of time in case the user
> accesses a second page. But this requires some infrastructure to keep track
> of the cached data and what query it corresponded to and determine when to
> replace it with new data or drop it.
>
> However Postgres temporary tables are fairly expensive and if you're
> creating them for every web access you're going to have to vacuum the
> system catalogs quite frequently. They're not really well suited for this
> task.
>
> Alternatively you could create a cursor and play with that. But I don't
> think that's a great solution either. (yet? I think cursors are getting
> more useful in Postgres, perhaps it will be eventually.)

I really hoped there was an "Oracle over()" equivalent way in PG. I understand
that Oracle's LIMIT-hack with "3 subselects and rownum between 1 AND 20" is
rather expensive compared to PG's implementation of LIMIT. Oralce keeps
snapshot-info in the index, so counting only involves the index AFAIK.

--
Andreas Joseph Krogh <andreak@officenet.no>
Senior Software Developer / Manager
------------------------+---------------------------------------------+
OfficeNet AS            | The most difficult thing in the world is to |
Karenslyst Allé 11      | know how to do a thing and to watch         |
PO. Box 529 Skøyen      | somebody else doing it wrong, without       |
0214 Oslo               | comment.                                    |
NORWAY                  |                                             |
Tlf:    +47 24 15 38 90 |                                             |
Fax:    +47 24 15 38 91 |                                             |
Mobile: +47 909  56 963 |                                             |
------------------------+---------------------------------------------+


pgsql-sql by date:

Previous
From: Tom Lane
Date:
Subject: Re: omitting redundant join predicate
Next
From: Gregory Stark
Date:
Subject: Re: Returning the total number of rows as a separate column when using limit