[HACKERS] [PATCH] Overestimated filter cost and its mitigation - Mailing list pgsql-hackers

From Yuto Hayamizu
Subject [HACKERS] [PATCH] Overestimated filter cost and its mitigation
Date
Msg-id CANE+7D8G=1dAbhqy4HJrEtutNpxRfn8-6-uhEXq-=705kaWVvA@mail.gmail.com
Responses Re: [HACKERS] [PATCH] Overestimated filter cost and its mitigation  (David Fetter <david@fetter.org>)
Re: [HACKERS] [PATCH] Overestimated filter cost and its mitigation  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
Hi hackers,

Currently, the cost of a filter with multiple clauses is estimated by
summing the estimated cost of each clause.  As long as a filter
consists of simple clauses and its cost is fairly small, this works
fine.  However, when some clauses are heavy (like subqueries or
user-defined functions) and most tuples are filtered out by the other,
simpler clauses, the total cost is likely to be overestimated.  The
attached patch mitigates this overestimation by using selectivity
estimates of subsets of the clauses in a filter.

Suppose that a scan node has three qual clauses A, B, and C.  Current
postgres estimates the per-tuple cost of the filter to be:
   cost(A) + cost(B) + cost(C)

The basic idea of the attached patch is to charge each clause only for
the fraction of tuples expected to survive the preceding clauses:
   cost(A) + clauselist_selectivity({A}) * cost(B) + clauselist_selectivity({A, B}) * cost(C)
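The difference between the two formulas can be sketched numerically.  The
costs and selectivities below are made-up illustrations (the real patch
calls clauselist_selectivity() on the clause prefix, which also handles
correlated clauses; here independent selectivities are simply multiplied):

```python
def filter_cost_current(clauses):
    # Current model: every clause is charged for every tuple.
    return sum(cost for cost, _sel in clauses)

def filter_cost_proposed(clauses):
    # Proposed model: clause i is charged only for the fraction of
    # tuples surviving clauses 1..i-1 (here approximated by multiplying
    # the individual selectivities, standing in for clauselist_selectivity).
    total = 0.0
    surviving = 1.0
    for cost, sel in clauses:
        total += surviving * cost
        surviving *= sel
    return total

# Hypothetical (cost, selectivity) for clauses A, B, C;
# C models an expensive subquery clause.
clauses = [(0.01, 0.001), (0.01, 0.1), (21393.50, 0.5)]
print(filter_cost_current(clauses))   # charges the subquery to every tuple
print(filter_cost_proposed(clauses))  # charges it only to survivors of A and B
```

With these numbers the current model is dominated by the expensive clause
(about 21,393.52 per tuple), while the proposed model charges it only to
the 0.01% of tuples that pass A and B (about 2.15 per tuple).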


We first noticed this overestimation during performance analysis of
our real applications.  To reproduce the situation, we created
a similar query with pgbench data.

The database was prepared by the following commands:
    pgbench --scale=1000 -i pgb5
    pgbench  -c 10 -t 100000 pgb5

and the query is:
    select e.tid,
            e.aid,
            e.bid,
            e.mtime
       from pgbench_history e
      where e.tid between 1 and 1000
        and (select count(*)
               from pgbench_history f
              where f.mtime < e.mtime
                and f.bid = e.bid
              group by f.bid) > 100
        and e.aid between 1 and 100000;


== Query plan with current upstream ==

Seq Scan on pgbench_history e  (cost=0.00..21393523889.00 rows=28 width=20) (actual time=2391.683..21775.191 rows=85 loops=1)
  Filter: ((tid >= 1) AND (tid <= 1000) AND (aid >= 1) AND (aid <= 100000) AND ((SubPlan 1) > 100))
  Rows Removed by Filter: 999915
  SubPlan 1
    ->  GroupAggregate  (cost=0.00..21393.50 rows=283 width=12) (actual time=229.050..229.050 rows=1 loops=94)
          Group Key: f.bid
          ->  Seq Scan on pgbench_history f  (cost=0.00..21389.00 rows=333 width=4) (actual time=5.036..228.851 rows=529 loops=94)
                Filter: ((mtime < e.mtime) AND (bid = e.bid))
                Rows Removed by Filter: 999471

Most of the total cost of 21,393,523,889 comes from:
  (cost of SubPlan 1) * (# of rows in pgbench_history) = 21,393.50 * 1,000,000 = 21,393,000,000

but in the actual run, SubPlan 1 was executed only 94 times, so the
total cost was overestimated by a factor of more than 10,000.


== Query plan with patched upstream ==

Seq Scan on pgbench_history e  (cost=0.00..1687169.88 rows=28 width=20) (actual time=2388.802..21721.498 rows=85 loops=1)
  Filter: ((tid >= 1) AND (tid <= 1000) AND (aid >= 1) AND (aid <= 100000) AND ((SubPlan 1) > 100))
  Rows Removed by Filter: 999915
  SubPlan 1
    ->  GroupAggregate  (cost=0.00..19726.83 rows=283 width=12) (actual time=228.507..228.507 rows=1 loops=94)
          Group Key: f.bid
          ->  Seq Scan on pgbench_history f  (cost=0.00..19722.33 rows=333 width=4) (actual time=5.378..228.316 rows=529 loops=94)
                Filter: ((mtime < e.mtime) AND (bid = e.bid))
                Rows Removed by Filter: 999471

With the patched version of upstream, the total cost is greatly
decreased.  In the cost estimation, only 84.4 tuples of
pgbench_history are expected to be evaluated against SubPlan 1.  This
is reasonably close to the actual number, 94, and the total cost seems
much more reasonable than with current upstream.
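A back-of-the-envelope reconstruction of the patched estimate (the
selectivity below is inferred from the 84.4 expected evaluations stated
above, not taken from the planner itself):

```python
rows = 1_000_000                  # tuples in pgbench_history
sel_cheap_clauses = 84.4 / rows   # inferred clauselist_selectivity of the tid/aid quals
subplan_cost = 19726.83           # per-evaluation cost of SubPlan 1 (patched plan)

# The SubPlan is now charged only for tuples surviving the cheap clauses.
expected_evals = rows * sel_cheap_clauses
subplan_total = expected_evals * subplan_cost
print(expected_evals)   # 84.4
print(subplan_total)    # ~1.66 million, the bulk of the 1,687,169.88 total
```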

Any thoughts?

Regards,
Yuto Hayamizu

