Thread: Using window functions to get the unpaginated count for paginated queries
Hi all,
Recently, during a performance improvement sweep for an application at my company, one of the hotspots that was discovered was pagination.
In order to display the correct pagination links on the page, the pagination library we used (most pagination libraries for that matter) ran the query with OFFSET and LIMIT to get the paginated results, and then re-ran the query without the OFFSET and LIMIT and wrapped them in a SELECT COUNT(*) FROM main_query to get the total number of rows.
In an attempt to optimize this, we used a window function as follows:
Given a query that looked as follows:
SELECT a,b,c
FROM table
WHERE clauses
OFFSET x LIMIT y;
add total_entries_count column as follows:
SELECT COUNT(table.id) OVER () AS total_entries_count, a,b,c
FROM table
WHERE clauses
OFFSET x LIMIT y;
This calculates the total number of unpaginated rows correctly, without affecting the runtime of the query. At least as far as I can tell.
The questions I have are:
1) Are there any adverse effects that the above window function can have?
2) Are there any cases where the count would return incorrectly?
3) In general, is this an appropriate use-case for using window functions?
Thanks,
Clemens
Re: Using window functions to get the unpaginated count for paginated queries
From
"Albe Laurenz"
Date:
Clemens Park wrote: > Recently, during a performance improvement sweep for an application at my company, one of the hotspots > that was discovered was pagination. > > In order to display the correct pagination links on the page, the pagination library we used (most > pagination libraries for that matter) ran the query with OFFSET and LIMIT to get the paginated > results, and then re-ran the query without the OFFSET and LIMIT and wrapped them in a SELECT COUNT(*) > FROM main_query to get the total number of rows. > > In an attempt to optimize this, we used a window function as follows: > > Given a query that looked as follows: > > SELECT a,b,c > FROM table > WHERE clauses > OFFSET x LIMIT y; > > add total_entries_count column as follows: > > SELECT COUNT(table.id) OVER () AS total_entries_count, a,b,c > FROM table > WHERE clauses > OFFSET x LIMIT y; > > This calculates the total number of unpaginated rows correctly, without affecting the runtime of the > query. At least as far as I can tell. It can affect the runtime considerably. I created a 10^6 row test table and tried: test=> EXPLAIN ANALYZE SELECT id, val FROM large OFFSET 100 LIMIT 10; QUERY PLAN ------------------------------------------------------------------------ ------------------------------------------ Limit (cost=1.49..1.64 rows=10 width=12) (actual time=0.177..0.195 rows=10 loops=1) -> Seq Scan on large (cost=0.00..14902.00 rows=1000000 width=12) (actual time=0.028..0.114 rows=110 loops=1) Total runtime: 0.251 ms (3 rows) test=> EXPLAIN ANALYZE SELECT id, val, COUNT(*) OVER () AS total_entries_count FROM large OFFSET 100 LIMIT 10; QUERY PLAN ------------------------------------------------------------------------ ------------------------------------------------------ Limit (cost=2.74..3.01 rows=10 width=12) (actual time=1893.606..1893.625 rows=10 loops=1) -> WindowAgg (cost=0.00..27402.00 rows=1000000 width=12) (actual time=1893.435..1893.559 rows=110 loops=1) -> Seq Scan on large (cost=0.00..14902.00 rows=1000000 width=12) (actual time=0.025..647.182 rows=1000000 loops=1) Total runtime: 1915.255 ms (4 rows) That is because the second query will have to scan all rows, while the first one can stop scanning after 110 rows. > The questions I have are: > > 1) Are there any adverse effects that the above window function can have? I can only think of the performance degradation mentioned above. > 2) Are there any cases where the count would return incorrectly? No. > 3) In general, is this an appropriate use-case for using window functions? I think it is. Maybe you can do better if you don't retrieve the total count of rows for every set of rows you select. Yours, Laurenz Albe
Re: Using window functions to get the unpaginated count for paginated queries
From
Igor Romanchenko
Date:
1) Are there any adverse effects that the above window function can have?
It can cause severe performance degradation, as mentioned before.
2) Are there any cases where the count would return incorrectly?
It could return incorrect result if there are some rows with table.id = NULL . count(table_field) returns the number of rows, where table_field is not NULL.
3) In general, is this an appropriate use-case for using window functions?
It does the job => it is an appropriate use-case for using window functions.
If this query causes performance degradation and you do not need the exact count of rows, it is better to use something from http://wiki.postgresql.org/wiki/Count_estimate or google for "fast postgresql count".
Re: Using window functions to get the unpaginated count for paginated queries
From
Clemens Park
Date:
Thanks for the reply everyone.
In my case, it looks like there is no real drawback then, since what used to happen is:
SELECT a,b,c
FROM table
WHERE clauses
OFFSET x LIMIT y;
followed by:
SELECT COUNT(*) FROM (
SELECT a,b,c
FROM table
WHERE clauses
);
(notice the lack of OFFSET and LIMIT)
and both of them were replaced with:
SELECT COUNT(table.id) OVER () AS total_entries_count, a,b,c
FROM table
WHERE clauses
OFFSET x LIMIT y;
On Wed, Nov 14, 2012 at 5:11 AM, Igor Romanchenko <igor.a.romanchenko@gmail.com> wrote:
1) Are there any adverse effects that the above window function can have?It can cause severe performance degradation, as mentioned before.2) Are there any cases where the count would return incorrectly?It could return incorrect result if there are some rows with table.id = NULL . count(table_field) returns the number of rows, where table_field is not NULL.3) In general, is this an appropriate use-case for using window functions?It does the job => it is an appropriate use-case for using window functions.If this query causes performance degradation and you do not need the exact count of rows, it is better to use something from http://wiki.postgresql.org/wiki/Count_estimate or google for "fast postgresql count".