Home > mailing lists

Using GROUPING SETS with more than one set disables predicate pushdown? - Mailing list pgsql-performance

From	Zakhar Shapurau
Subject	Using GROUPING SETS with more than one set disables predicate pushdown?
Date	November 21, 2017 14:04:17
Msg-id	1511262257.3219920.1179611992.260EC9D9@webmail.messagingengine.com Whole thread Raw
Responses	Re: Using GROUPING SETS with more than one set disables predicate pushdown?
List	pgsql-performance

Tree view

Hi!
First of all, thanks for the great work! PostgreSQL is amazing, and
community is super helpful.

I found an unexpected behaviour in PostgreSQL, and was advised to post
it to the performance mailing list on IRC. 

Using GROUPING SETS with more than one set disables predicate pushdown?

Version:
PostgreSQL 9.6.6 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609, 64-bit

Seems like when GROUPING SETS with at least two sets are used in the
subquery, planner
can not push WHERE clauses inside.

Here are two queries that (I think) are equivalent, but produce very
different execution
plans leading to bad performance on real data - and in effect,
making it impossible to abstract away non-trivial grouping logic into a
view.

It might as well be that queries are not really equivalent, but I don't
see how.

Same problem happens even if grouping sets are the same - like `GROUPING
SETS ((), ())`.

CREATE TEMPORARY TABLE test_gs (   x INT,   y INT,   z INT,   PRIMARY KEY (x, y, z)
);

EXPLAIN
SELECT   x,   y,   avg(z) AS mean
FROM test_gs
WHERE x = 1
GROUP BY x, GROUPING SETS ((y), ());
                                      QUERY PLAN
-----------------------------------------------------------------------------------------GroupAggregate
(cost=0.15..8.65rows=20 width=40)  Group Key: x, y  Group Key: x  ->  Index Only Scan using test_gs_pkey on test_gs
(cost=0.15..8.33 rows=10 width=12)        Index Cond: (x = 1)
 
(5 rows)



EXPLAIN
SELECT x, y, mean
FROM (        SELECT            x,            y,            avg(z) AS mean        FROM test_gs        GROUP BY x,
GROUPINGSETS ((y), ())    ) AS g
 
WHERE x = 1;
                                        QUERY PLAN
--------------------------------------------------------------------------------------------GroupAggregate
(cost=0.15..62.10rows=404 width=40)  Group Key: test_gs.x, test_gs.y  Group Key: test_gs.x  Filter: (test_gs.x = 1)  ->
Index Only Scan using test_gs_pkey on test_gs  (cost=0.15..41.75  rows=2040 width=12)
 
(5 rows)


The issue here is that the second query is not using index to filter on
x = 1 , instead it reads all the tuples from an index and applies the
filter.

Here is also a description in gist:
https://gist.github.com/zeveshe/cf92c9d2a6b14518af3180113e767ae7

Thanks a lot!

--  Zakhar Shapurau zak@zvs.no
+47 407 54 397

pgsql-performance by date:

From: Matthew Hall
Date: 21 November 2017, 11:05:09
Subject: Re: insert and query performance on big string table with pg_trgm

From: "Henrik Cednert (Filmlance)"
Date: 21 November 2017, 17:28:43
Subject: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade

Using GROUPING SETS with more than one set disables predicate pushdown? - Mailing list pgsql-performance

Previous

Next