Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not? - Mailing list pgsql-hackers

From Andy Fan
Subject Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?
Date
Msg-id CAKU4AWpFTa2cW3grXeRedoHD5cd0U9itUhou6CRYBcJTpU4kEQ@mail.gmail.com
Whole thread Raw
In response to Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?
Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?
List pgsql-hackers


On Thu, Mar 3, 2022 at 1:29 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Mar 2, 2022 at 11:09 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > So the questions in my mind here are all
> > about whether we can detect this stuff cheaply and whether anybody
> > wants to do the work to make it happen, not whether we'd get a benefit
> > in the cases where it kicks in.
>
> Right, my worries are mostly about the first point.

OK, cool.

I have finished the PoC for planning timing improvement and joinrel rows estimation. 
the design considers the requirement we can enforce any corrective quals during 
execution (rather than must execute the RestirctInfo which user provides), but
nothing is coded for that part so far. 

Copy the commit message here for easy discussion.

== Planning timing part ==
Patch 1:   expand the duties of check_mergejoinable to check non-equal btree
    operators as well to support the EC Filter function.  A new field
    named btreeineqopfamilies is added in RestictInfo and it is set
    with the same round syscache search for check_mergejoinable. Because
    of this, check_mergejoinable is renamed to check_btreeable.
    The bad part of this is it only works for opclause so far.

Patch 2:   Introduce ec_filters in EquivalenceClass struct, the semantics is that the quals can
    be applied to any EquivalenceMember in this EC. Later this information is used
    to generate new RestrictInfo and was distributed to related RelOptInfo very
    soon.  There are 3 major steps here:

    a). In distribute_qual_to_rels to gather the ineq quallist.
    b). After deconstruct_jointree, distribute_filter_quals_to_eclass distributes
    these ineq-quallist to the related EC's ef_filters.
    c). generate_base_implied_equalities_no_const scan the ec_filters and distribute
    the restrictinfo to related RelOptInfo.

Patch 3:     Reduce some planning cost for deriving qual for EC filter feature.
    Mainly changes include:
    1. Check if the qual is simple enough by checking rinfo->right_relids and
    info->right_relids, save the pull_varnos of rinfo->clause calls.
    2. check contain_volatile_functions against RestrictInfo, so that
    the result can be shared with following calls.
    3. By employing the RestictInfo->btreeineqfamility which is calculating.
    with the same round of calculating  RestrictInfo->mergeopfamilies. In this
    way we save some calls for syscache.
    4. Calculating the opfamility and amstrategy at
    distribute_filter_quals_to_eclass and cache the results in EquivalenceFilter.
    if no suitable opfamility and amstrategy are found, bypass the qual immediately
    and at last using the cached value generate_base_implied_equalities_no_const.

    After this change, there is a testcase changed unexpectedly in equivclass.out
    (compared with Patch-2 expectation file.)

    create user regress_user_ectest;
    grant select on ec0 to regress_user_ectest;
    grant select on ec1 to regress_user_ectest;

    set session authorization regress_user_ectest;

    -- with RLS active, the non-leakproof a.ff = 43 clause is not treated
    -- as a suitable source for an EquivalenceClass; currently, this is true
    -- even though the RLS clause has nothing to do directly with the EC
    explain (costs off)
    regression->   select * from ec0 a, ec1 b
    regression->   where a.ff = b.ff and a.ff = 43::bigint::int8alias1;

    The b.ff = 43 has disappeared from ec1 b. But since it even didn't shown
    before the EC filter, so I'm not sure my changes here make something wrong,
    maybe fix an issue by accident?

== Join Rel size estimation part ==

I have revist the strategy before,  the main reasons are 1).  we should consider every
qual *equally*.  2).  In the past, I just wanted to get the same result as ec filters doesn't
happen,  but later I found that even if there is no ec filter, we still have some estimation error
clearly.  for example:

create table ec_t1000 (a int);
insert into ec_t1000 select i from generate_series(1, 1000)i;
create table ec_t110 (a int);
insert into ec_t110 select i from generate_series(1, 110) i;
create table ec_t200 (a int);
insert into ec_t200 select i from generate_series(1, 200) i;
analyze ec_t1000, ec_t110, ec_t200;


query 1: explain select * from ec_t1000, ec_t110 where ec_t1000.a = ec_t110.a and ec_t1000.a > 100; -- (0.9)

query 2: explain select * from ec_t1000, ec_t110 where ec_t1000.a = ec_t110.a and ec_t110.a > 100; --  (0.1)

query 3: explain select * from ec_t1000, ec_t110 , ec_t200 where ec_t1000.a = ec_t110.a and ec_t110.a = ec_t200.a and ec_t1000.a > 100;

query 4: explain select * from ec_t1000, ec_t110 , ec_t200 where ec_t1000.a = ec_t110.a and ec_t110.a = ec_t200.a and ec_t110.a > 100;

query 5: explain select * from ec_t1000, ec_t110 , ec_t200 where ec_t1000.a = ec_t110.a and ec_t110.a = ec_t200.a and ec_t200.a > 100;

we can see query 1 is the same as query 2,  and query 3/4/5 should be the same as well. The fact
is not. Here is the result on the current master and patched version. 

| Query Id | Real rows | Est. Rows at master | Est. rows with patched |
|----------+-----------+---------------------+------------------------|
|        1 |        10 |                  99 |                     10 |
|        2 |        10 |                  10 |                     10 |
|        3 |        10 |                  20 |                     11 |
|        4 |        10 |                   2 |                     11 |
|        5 |        10 |                  11 |                     11 |


Patch 4:  Prepare the code for CorrectiveQual structure.
    Just refactor the method for 2-level loop in
    generate_base_implied_equalities_no_const, no other things are changed.

Patch 5:   struct CorrectiveQuals is as simple as a List of RestrictInfo, the properties
of it are:  a). only one restrictinfo on this group should be counted for any joinrel
size estimation. b). at least 1 restrictinfo in this group should be executed during
execution. In this  commit, only the size estimation issue is tried.

    PlannerInfo.correlative_quals is added to manage all the CorrectiveQuals at
    subquery level. RelOptInfo.cqual_indexes is a List * to indicate a which
    CorrectiveQuals this relation is related to. This is designed for easy to check if
    the both sides of joinrel correlated to the same CorrectiveQuals. The reason why 
    "List *" will be explained later.

    The overall design of handing the joinrel size estimation is:
    a). At the base relation level, we just count everything with the correlative
    quals. b). During any level joinrel size estimation, we just keep 1 side's
    cqual (short for corrective qual) selectivity by eliminating the other one. so
    the size estimation for a mergeable join selectivity becomes to:

    rows = R1.rows X r2.rows X 1 / Max (ndistval_of_colA,  ndistinval_of_colB) X 1 /
    Selectivity(R1's CorrectiveQual).

    r1.rows X 1 / Selectivity(R1's CorrectiveQual) eliminates the impact of
    CorrectiveQual on R1.  After this,  the JoinRel of (R1, R2) still be impacted by
    this CorrectiveQual but "just once" in this level.  Later if JoinRel(R1, R2) needs
    to join with R3,  and R3 is impacted by this CorectiveQuals as well. We
    need to keep one and eliminate the other one as above again.

    The algorithm for which Selectivity should be eliminated and which one should be
    kept is:

    When we join 2 inner_rel and outer_rel with a mergeable join restrictinfo, if
    both sides is impacted with the same CorrectiveQual, we first choose which "side"
    to eliminating based on which side of the restrictinfo has a higher distinct
    value. The reason for this is more or less because we used "Max"(ndistinctValT1,
    ndistinctValT2). After deciding which "side" to eliminate, the real eliminating
    selectivity is RelOptInfo->cqual_selectivity[n],  The left one still takes effect
  and is noted in the joinrel->cqual_selectivitity[n]. 

    Introduction of RelOptInfo->cqual_selectivity:

     The number of elements in cqual_selecitity equals
     the length of cqual_indexes. The semantics is which
     Selectivity in the corresponding CorectiveQuals's qual
     list is taking effect. At any time, only 1 Qual
     Selectivity is counted for any-level of joinrel size estimation. 

    In reality, it is possible to have many CorrectiveQuals, but for design
    discussion, the current implementation only takes care of the 1 CorrectiveQuals.
    This would be helpful for PoC/review/discussion.

    Some flow for the key data:

    1. root->corrective_quals is initialized at
    generate_base_implied_equalities_no_const stage. we create a CorrectiveQual in
    this list for each ec_filter and fill the RestrictInfo part for it. At
    the same time, we note that which RelOptInfo (cqual_indexes) is related to this cqual. 

    2. RelOptInfo->cqual_selecitity for baserel is set at the end of set_rel_size,
    at this time, the selectivity for every RestrictInfo is calculated, we can just
    fetch the cached value.  As for joinrel, it is maintained in
    calc_join_cqual_selectivity, this function would return the Selectivity to
    eliminate and set the above value.

    Limitation in this PoC:
    1. Only support 1 CorrectiveQual in root->correlative_quals
    2. Only tested with INNER_JOIN.
    3. Inherited tables are not supported.

I find it is hard to explain things clearly without the code. Any feedback is welcome. 

--
Best Regards
Andy Fan
Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Optionally automatically disable logical replication subscriptions on error
Next
From: Tomas Vondra
Date:
Subject: Re: WIP: WAL prefetch (another approach)