> On Wed, Mar 04, 2020 at 11:32:00AM +1300, David Rowley wrote:
>
> I've been looking over v32 of the patch and have a few comments
> regarding the planner changes.
Thanks for the commentaries!
> I think the changes in create_distinct_paths() need more work. The
> way I think this should work is that create_distinct_paths() gets to
> know exactly nothing about what path types support the elimination of
> duplicate values. The Path should carry the UniqueKeys so that can be
> determined. In create_distinct_paths() you should just be able to make
> use of those paths, which should already have been created when
> creating index paths for the rel due to PlannerInfo's query_uniquekeys
> having been set.
Just for me to clarify. The idea is to "move" information about what
path types support skipping into UniqueKeys (derived from PlannerInfo's
query_uniquekeys), but other checks (e.g. if index am supports that)
still perform in create_distinct_paths?
> Also, should create_grouping_paths() be getting the same code?
> Jesper's UniqueKey patch seems to set query_uniquekeys when there's a
> GROUP BY with no aggregates. So it looks like he has intended that
> something like:
>
> SELECT x FROM t GROUP BY x;
>
> should work the same way as
>
> SELECT DISTINCT x FROM t;
>
> but the 0002 patch does not make this work. Has that just been overlooked?
I believe it wasn't overlooked in 0002 patch, but rather added just in
case in 0001. I guess there are no theoretical problems in implementing
it, but since we wanted to keep scope of the patch under control and
concentrate on the existing functionality it probably makes sense to
plan it as one of the next steps?
> There's also some weird looking assumptions that an EquivalenceMember
> can only be a Var in create_distinct_paths(). I think you're only
> saved from crashing there because a ProjectionPath will be created
> atop of the IndexPath to evaluate expressions, in which case you're
> not seeing the IndexPath. This results in the optimisation not
> working in cases like:
>
> postgres=# create table t (a int); create index on t ((a+1)); explain
> select distinct a+1 from t;
> CREATE TABLE
> CREATE INDEX
> QUERY PLAN
> -----------------------------------------------------------
> HashAggregate (cost=48.25..50.75 rows=200 width=4)
> Group Key: (a + 1)
> -> Seq Scan on t (cost=0.00..41.88 rows=2550 width=4)
Yes, I need to fix it.
> Using unique paths as I mentioned above should see that fixed.
I'm a bit confused about this statement, how exactly unique paths should
fix this?