Pull up aggregate subquery - Mailing list pgsql-hackers

From Hitoshi Harada
Subject Pull up aggregate subquery
Msg-id BANLkTimW-EqS_9hk5AYj14R8U=uQoc6tuA@mail.gmail.com
Responses Re: Pull up aggregate subquery  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I sometimes wonder if we could pull up aggregate subqueries in the optimizer.

My typical problem is:

Consider two relations: M of medium size and L of large size. L has a
column referencing M's primary key. Say,

    create table size_m as
        select i as id, repeat(i::text, i % 100) as val
        from generate_series(1, 20000) i;
    create table size_l as
        select i as id, m.id as m_id, repeat(i::text, i % 100) as val
        from generate_series(1, 100000) i, size_m m
        where i.i / 10 = m.id;

Now, suppose you want to query M under some condition, joined with an
aggregate over L grouped by M's primary key.

    select m_id, sum_len
    from size_m m
    inner join (select m_id, sum(length(val)) as sum_len
                from size_l group by m_id) l on m.id = l.m_id
    where val = '1';

The generated plan is:

                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=36116.92..38339.67 rows=50 width=235) (actual time=440.679..1039.964 rows=1 loops=1)
   Join Filter: (m.id = size_l.m_id)
   ->  Seq Scan on size_m m  (cost=0.00..897.00 rows=1 width=227) (actual time=0.021..16.698 rows=1 loops=1)
         Filter: (val = '1'::text)
   ->  GroupAggregate  (cost=36116.92..37217.09 rows=10026 width=248) (actual time=440.651..1013.704 rows=10000 loops=1)
         ->  Sort  (cost=36116.92..36366.90 rows=99991 width=248) (actual time=440.619..593.062 rows=99991 loops=1)
               Sort Key: size_l.m_id
               Sort Method: external sort  Disk: 25736kB
               ->  Seq Scan on size_l  (cost=0.00..4565.91 rows=99991 width=248) (actual time=0.011..138.243 rows=99991 loops=1)
 Total runtime: 1044.039 ms
(10 rows)

Note that most of the aggregate's result is discarded by the join with M,
because the filter on M.val leaves only a small output from M. If we
could filter M first, and then filter L by M's result before running the
aggregate, the response could get dramatically faster.

If you filter by M.id instead of M.val, the optimizer is smart enough to
push the condition down to L, so that L is filtered before the aggregate.

    select m_id, sum_len
    from size_m m
    inner join (select m_id, sum(length(val)) as sum_len
                from size_l group by m_id) l on m.id = l.m_id
    where id = 1;

                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..5713.02 rows=2 width=235) (actual time=72.121..82.364 rows=1 loops=1)
   ->  Seq Scan on size_m m  (cost=0.00..897.00 rows=1 width=227) (actual time=0.028..10.252 rows=1 loops=1)
         Filter: (id = 1)
   ->  GroupAggregate  (cost=0.00..4815.98 rows=2 width=248) (actual time=72.065..72.067 rows=1 loops=1)
         ->  Seq Scan on size_l  (cost=0.00..4815.89 rows=10 width=248) (actual time=0.051..71.968 rows=10 loops=1)
               Filter: (m_id = 1)
 Total runtime: 82.525 ms
(7 rows)

This seems to be the benefit of EquivalenceClasses: 1 = M.id = L.m_id is
implied, and the optimizer adds the implicit constant filter L.m_id = 1
to the plan tree. In contrast, in the former case M.val = '1' doesn't
imply any condition on L.m_id. That's fair enough.

However, I think we can filter L by L.m_id before the aggregate, because
L.m_id appears in the GROUP BY clause as well as in the join condition,
and M.id is unique in M. In such cases, the query can be transformed
into something like:
    GroupAggregate
      -> NestLoop (L.m_id = M.id)
        -> SeqScan L
        -> SeqScan M (filter: M.val = '1')
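The shape above can be sanity-checked outside the planner. The following
is just a sketch, using SQLite in place of PostgreSQL and small toy data
rather than the tables above: it confirms that the original form
(aggregate all of L, then join) and the pulled-up form (join the filtered
M first, aggregate last) return the same rows when the join key m.id is
unique.

```python
# Sketch: verify the pull-up is answer-preserving when m.id is unique.
# SQLite and toy data stand in for the PostgreSQL tables in this post.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table size_m (id integer primary key, val text)")
conn.execute("create table size_l (id integer, m_id integer, val text)")
conn.executemany("insert into size_m values (?, ?)",
                 [(i, str(i) * max(i % 5, 1)) for i in range(1, 21)])
conn.executemany("insert into size_l values (?, ?, ?)",
                 [(i, i // 10, str(i) * (i % 5)) for i in range(10, 200)])

# Original form: aggregate the whole of L, then join with the filtered M.
original = conn.execute("""
    select m_id, sum_len
    from size_m m
    inner join (select m_id, sum(length(val)) as sum_len
                from size_l group by m_id) l on m.id = l.m_id
    where m.val = '1'
""").fetchall()

# Pulled-up form: join the filtered M first, aggregate last.
pulled_up = conn.execute("""
    select l.m_id, sum(length(l.val)) as sum_len
    from size_m m
    inner join size_l l on m.id = l.m_id
    where m.val = '1'
    group by l.m_id
""").fetchall()

assert original == pulled_up
print(original)
```

Because each L row matches at most one M row (m.id is unique), joining
before grouping neither duplicates nor drops any row within a group, so
the sums come out identical.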
This transformation can be done by pulling up the aggregate Query in
pull_up_subqueries(). Currently the optimizer doesn't pull up any
subquery that contains an aggregate, but as shown above, in some cases
we can. Attached is a WIP proof-of-concept patch that does it. I know it
breaks general queries, but it performs the transformation described
above. I suppose the missing piece is the condition for "when" to pull
up an aggregate; the "how" is done in the patch.

db1=# explain select m_id, sum_len from size_m m inner join (select m_id, sum(length(val)) as sum_len from size_l group by m_id) l on m.id = l.m_id where val = '1';
                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=6712.96..6713.16 rows=10 width=471) (actual time=125.496..125.499 rows=1 loops=1)
   ->  Sort  (cost=6712.96..6712.99 rows=10 width=471) (actual time=125.228..125.288 rows=10 loops=1)
         Sort Key: size_l.m_id
         Sort Method: quicksort  Memory: 25kB
         ->  Nested Loop  (cost=0.00..6712.80 rows=10 width=471) (actual time=0.142..125.089 rows=10 loops=1)
               Join Filter: (m.id = size_l.m_id)
               ->  Seq Scan on size_m m  (cost=0.00..897.00 rows=1 width=227) (actual time=0.037..8.956 rows=1 loops=1)
                     Filter: (val = '1'::text)
               ->  Seq Scan on size_l  (cost=0.00..4565.91 rows=99991 width=248) (actual time=0.060..65.910 rows=99991 loops=1)
 Total runtime: 125.658 ms
(10 rows)

(You may notice that after the transformation a non-group/non-aggregate
TargetEntry appears in the Agg node, but there's actually no check
against that in the optimizer or executor. Maybe thanks to functional
dependency.)
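The functional-dependency point is that when the grouping column is M's
primary key, any other M column has exactly one value per group, so
carrying it through the Agg node is safe. A small illustration of that
per-group uniqueness follows; note SQLite (used here only so the snippet
runs) permits bare columns unconditionally, so this shows the query
shape rather than PostgreSQL's actual functional-dependency check.

```python
# Illustration: m.val is not in GROUP BY, but grouping by the primary key
# m.id makes it single-valued per group, so the result is well defined.
# SQLite stands in for PostgreSQL; it allows bare columns unconditionally.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table size_m (id integer primary key, val text)")
conn.execute("create table size_l (id integer, m_id integer, val text)")
conn.executemany("insert into size_m values (?, ?)",
                 [(i, "v%d" % i) for i in range(1, 4)])
conn.executemany("insert into size_l values (?, ?, ?)",
                 [(i, 1 + i % 3, "x" * i) for i in range(1, 10)])

# m.val rides along ungrouped; each group sees exactly one m.val.
rows = conn.execute("""
    select m.id, m.val, sum(length(l.val))
    from size_m m join size_l l on m.id = l.m_id
    group by m.id
""").fetchall()
print(sorted(rows))
```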

Comments?

Regards,

--
Hitoshi Harada

