Thread: Pagination count strategies
Hi, in one of our systems, we added a kind of pagination feature, that shows N records of Total records. To do this, we added a "count(*) over() as Total" field in our queries in replacement of doing two queries, one for fetching the records, and other for getting the count. This improved the performance, but we are't happy with the results yet, by removing the count, the query takes 200ms vs 2000ms with it. We are thinking of removing the count, but if we do that, the system will lack an interesting feature. What strategy for showing the total number of records returned do you recommend?. Regards, -- Leonardo M. Ramé Medical IT - Griensu S.A. Av. Colón 636 - Piso 8 Of. A X5000EPT -- Córdoba Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19 Cel.: +54 9 (011) 40871877
On Thu, Apr 03, 2014 at 10:34:32AM -0300, Leonardo M. Ramé wrote: > > What strategy for showing the total number of records returned do you > recommend?. The best answer for this I've ever seen is to limit the number of rows you're counting (at least at first) to some reasonably small number -- say 5000. This is usually reasonably fast for a well-indexed query, and your pagination can say something like "First n of at least 5000 results", unless you have fewer than 5000 results, in which case you know the number (and the count returned quickly anyway). As you're displaying those first 5000 results, you can work in the background getting a more accurate number. This is more work for your application, but it provides a much better user experience (and you can delay getting the detailed number until the user pages through to the second page of results, so you don't count everything needlessly in case the user just uses the first page, which IME happens a lot). Note that even Google doesn't give you an accurate number -- they just say "about ten trillion" or whatever. Hope that's useful, A -- Andrew Sullivan ajs@crankycanuck.ca
On 3 April 2014 15:34, Leonardo M. Ramé <l.rame@griensu.com> wrote: > Hi, in one of our systems, we added a kind of pagination feature, that > shows N records of Total records. > > To do this, we added a "count(*) over() as Total" field in our queries > in replacement of doing two queries, one for fetching the records, and > other for getting the count. This improved the performance, but we are't > happy with the results yet, by removing the count, the query takes > 200ms vs 2000ms with it. > > We are thinking of removing the count, but if we do that, the system > will lack an interesting feature. > > What strategy for showing the total number of records returned do you > recommend?. Assuming your results are unique (and what would be the point of showing duplicate results?) in a specific order, it should be possible to (uniquely) identify the last record shown on a previous page and display n results from there on. To add a result counter for displaying purposes to that, since you need to remember the last displayed result anyway, you might as well store the relative record number of that result with it and count from there on. This works well if you only have a next/previous results link, but not so well when people can pick arbitrary page numbers. It can work relative to the current page regardless of which page the user chose to navigate to next, but you'll have to go through all the results in between that page and the current page... That should still be faster than always counting from the start though (and you can be smart about from which end you start counting). -- If you can't see the forest for the trees, Cut the trees and you'll see there is no forest.
Cache the total ?
On Thu, Apr 3, 2014 at 3:34 PM, Leonardo M. Ramé <l.rame@griensu.com> wrote:
Hi, in one of our systems, we added a kind of pagination feature, that
shows N records of Total records.
To do this, we added a "count(*) over() as Total" field in our queries
in replacement of doing two queries, one for fetching the records, and
other for getting the count. This improved the performance, but we are't
happy with the results yet, by removing the count, the query takes
200ms vs 2000ms with it.
We are thinking of removing the count, but if we do that, the system
will lack an interesting feature.
What strategy for showing the total number of records returned do you
recommend?.
Regards,
--
Leonardo M. Ramé
Medical IT - Griensu S.A.
Av. Colón 636 - Piso 8 Of. A
X5000EPT -- Córdoba
Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19
Cel.: +54 9 (011) 40871877
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On 03/04/14 15:34, Leonardo M. Ramé wrote: > Hi, in one of our systems, we added a kind of pagination feature, that > shows N records of Total records. > > To do this, we added a "count(*) over() as Total" field in our queries > in replacement of doing two queries, one for fetching the records, and > other for getting the count. This improved the performance, but we are't > happy with the results yet, by removing the count, the query takes > 200ms vs 2000ms with it. > > We are thinking of removing the count, but if we do that, the system > will lack an interesting feature. > > What strategy for showing the total number of records returned do you > recommend?. If you need only an estimated number and if your planner statistics are up to date, you can use the planner. Here is my implementation of the explain function. The COMMENT below shows how to use it: CREATE OR REPLACE FUNCTION explain(VARIADIC TEXT[]) RETURNS JSON AS $$ DECLARE tmp TEXT; BEGIN EXECUTE 'EXPLAIN (' || array_to_string(array_append($1[2:array_upper($1, 1)], 'FORMAT JSON'), ', ') || ') ' || $1[1] INTO tmp; RETURN tmp::JSON; END; $$ LANGUAGE plpgsql VOLATILE; COMMENT ON FUNCTION explain(VARIADIC TEXT[]) IS $def$ This function is a SQL interface to the planner. It returns the plan (result of EXPLAIN) of the query passed as TEXT string as the first parameter as JSON object. The remaining parameters are EXPLAIN-modifiers, like ANALYZE or BUFFERS. The function can be used to store plans in the database. Another interesting usage is when you need only an estimated row count for a query. You can use SELECT count(*) ... This gives you an exact number but is usually slow. If your planner statistics are up to date and the query is not too complicated, the planner usually gives a good estimate and is much faster. SELECT explain('SELECT 1 FROM tb WHERE id>80000000') ->0->'Plan'->'Plan Rows'; $def$; Torsten
On 2014-04-03 10:00:18 -0400, Andrew Sullivan wrote: > On Thu, Apr 03, 2014 at 10:34:32AM -0300, Leonardo M. Ramé wrote: > > > > What strategy for showing the total number of records returned do you > > recommend?. > > The best answer for this I've ever seen is to limit the number of rows > you're counting (at least at first) to some reasonably small number -- > say 5000. This is usually reasonably fast for a well-indexed query, > and your pagination can say something like "First n of at least 5000 > results", unless you have fewer than 5000 results, in which case you > know the number (and the count returned quickly anyway). As you're > displaying those first 5000 results, you can work in the background > getting a more accurate number. This is more work for your > application, but it provides a much better user experience (and you > can delay getting the detailed number until the user pages through to > the second page of results, so you don't count everything needlessly > in case the user just uses the first page, which IME happens a lot). > Note that even Google doesn't give you an accurate number -- they just > say "about ten trillion" or whatever. > > Hope that's useful, > > A > Sounds nice, is it possible to modify my "count(*) over()" to what you suggest?. -- Leonardo M. Ramé Medical IT - Griensu S.A. Av. Colón 636 - Piso 8 Of. A X5000EPT -- Córdoba Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19 Cel.: +54 9 (011) 40871877
On 2014-04-03 17:19:56 +0200, Torsten Förtsch wrote: > On 03/04/14 15:34, Leonardo M. Ramé wrote: > > Hi, in one of our systems, we added a kind of pagination feature, that > > shows N records of Total records. > > > > To do this, we added a "count(*) over() as Total" field in our queries > > in replacement of doing two queries, one for fetching the records, and > > other for getting the count. This improved the performance, but we are't > > happy with the results yet, by removing the count, the query takes > > 200ms vs 2000ms with it. > > > > We are thinking of removing the count, but if we do that, the system > > will lack an interesting feature. > > > > What strategy for showing the total number of records returned do you > > recommend?. > > If you need only an estimated number and if your planner statistics are > up to date, you can use the planner. > > Here is my implementation of the explain function. The COMMENT below > shows how to use it: > > CREATE OR REPLACE FUNCTION explain(VARIADIC TEXT[]) > RETURNS JSON AS $$ > DECLARE > tmp TEXT; > BEGIN > EXECUTE 'EXPLAIN (' > || array_to_string(array_append($1[2:array_upper($1, 1)], > 'FORMAT JSON'), ', ') > || ') ' > || $1[1] INTO tmp; > RETURN tmp::JSON; > END; > $$ LANGUAGE plpgsql VOLATILE; > > COMMENT ON FUNCTION explain(VARIADIC TEXT[]) > IS $def$ > This function is a SQL interface to the planner. It returns the plan > (result of EXPLAIN) of the query passed as TEXT string as the first > parameter as JSON object. > > The remaining parameters are EXPLAIN-modifiers, like ANALYZE or > BUFFERS. > > The function can be used to store plans in the database. > > Another interesting usage is when you need only an estimated row > count for a query. You can use > > SELECT count(*) ... > > This gives you an exact number but is usually slow. If your planner > statistics are up to date and the query is not too complicated, the > planner usually gives a good estimate and is much faster. > > SELECT explain('SELECT 1 FROM tb WHERE id>80000000') > ->0->'Plan'->'Plan Rows'; > $def$; > > > Torsten Nice!, do you know if this will work on 8.4?. -- Leonardo M. Ramé Medical IT - Griensu S.A. Av. Colón 636 - Piso 8 Of. A X5000EPT -- Córdoba Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19 Cel.: +54 9 (011) 40871877
On Thu, Apr 03, 2014 at 12:44:23PM -0300, Leonardo M. Ramé wrote: > Sounds nice, is it possible to modify my "count(*) over()" to what you > suggest?. I think the window_definition inside over() can contain a LIMIT, can't it? I didn't check just now, but I can't think any reason why not. (ISTR when I did this in the past we didn't have window functions, so I simulated it another way.) A -- Andrew Sullivan ajs@crankycanuck.ca
On 03/04/14 17:44, Leonardo M. Ramé wrote: > Nice!, do you know if this will work on 8.4?. > no way