Re: UniqueKey v2 - Mailing list pgsql-hackers
From | Antonin Houska |
---|---|
Subject | Re: UniqueKey v2 |
Date | |
Msg-id | 21674.1717510459@antos Whole thread Raw |
In response to | Re: UniqueKey v2 (Andy Fan <zhihuifan1213@163.com>) |
List | pgsql-hackers |
Andy Fan <zhihuifan1213@163.com> wrote: > Antonin Houska <ah@cybertec.at> writes: > > >> Could you make the reason clearer for adding 'List *opfamily_lists;' > >> into UniqueKey? You said "This is needed to create ECs in the parent > >> query if the upper relation represents a subquery." but I didn't get the > >> it. Since we need to maintain the UniqueKey in the many places, I'd like > >> to keep it as simple as possbile. Of course, anything essentical should > >> be added for sure. > > > > If unique keys are generated for a subquery output, they also need to be > > created for the corresponding relation in the upper query ("sub" in the > > following example): > > OK. > > > > select * from tab1 left join (select * from tab2) sub; > > > > However, to create an unique key for "sub", you need an EC for each expression > > of the key. > > OK. > > And to create an EC, you in turn need the list of operator > > families. > > I'm thinking if we need to "create" any EC. Can you find out a user case > where the outer EC is missed and the UniqueKey is still interesting? I > don't have an example now. > > convert_subquery_pathkeys has a similar sistuation and has the following > codes: > > outer_ec = > get_eclass_for_sort_expr(root, > (Expr *) outer_var, > sub_eclass->ec_opfamilies, > sub_member->em_datatype, > sub_eclass->ec_collation, > 0, > rel->relids, > NULL, > false); > > /* > * If we don't find a matching EC, sub-pathkey isn't > * interesting to the outer query > */ > if (outer_ec) > best_pathkey = > make_canonical_pathkey(root, > outer_ec, > sub_pathkey->pk_opfamily, > sub_pathkey->pk_strategy, > sub_pathkey->pk_nulls_first); > } I think that convert_subquery_pathkeys() just does not try that hard to achieve its goal. The example where it's important to create the EC in the outer query is what I added to the subselect.sql regression test in the 0004- diff in [1]: create table tabx as select * from generate_series(1,100) idx; create table taby as select * from generate_series(1,100) idy; create unique index on taby using btree (idy); create view view_barrier with (security_barrier=true) as select * from taby; analyze tabx, taby; explain (costs off, verbose on) select * from tabx x left join view_barrier y on idy = idx; If you modify find_ec_position_matching_expr() to return -1 instead of creating the EC, you will get this plan Hash Left Join Output: x.idx, taby.idy Hash Cond: (x.idx = taby.idy) -> Seq Scan on public.tabx x Output: x.idx -> Hash Output: taby.idy -> Seq Scan on public.taby Output: taby.idy instead of this Hash Left Join Output: x.idx, taby.idy Inner Unique: true Hash Cond: (x.idx = taby.idy) -> Seq Scan on public.tabx x Output: x.idx -> Hash Output: taby.idy -> Seq Scan on public.taby Output: taby.idy > >> > * uniquekey_useful_for_merging() > >> > > >> > How does uniqueness relate to merge join? In README.uniquekey you seem to > >> > point out that a single row is always sorted, but I don't think this > >> > function is related to that fact. (Instead, I'd expect that pathkeys are > >> > added to all paths for a single-row relation, but I'm not sure you do that > >> > in the current version of the patch.) > >> > >> The merging is for "mergejoinable join clauses", see function > >> eclass_useful_for_merging. Usually I think it as operator "t1.a = t2.a"; > > > > My question is: why is the uniqueness important specifically to merge join? I > > understand that join evaluation can be more efficient if we know that one > > input relation is unique (i.e. we only scan that relation until we find the > > first match), but this is not specific to merge join. > > So the answer is the "merging" in uniquekey_useful_for_merging() has > nothing with merge join. I don't understand. The function comment does mention merge join: /* * uniquekey_useful_for_merging * Check if the uniquekey is useful for mergejoins above the given relation. * * similar with pathkeys_useful_for_merging. */ static bool uniquekey_useful_for_merging(PlannerInfo *root, UniqueKey * ukey, RelOptInfo *rel) [1] https://www.postgresql.org/message-id/7971.1713526758%40antos -- Antonin Houska Web: https://www.cybertec-postgresql.com
pgsql-hackers by date: