Re: "could not find pathkey item to sort" for TPC-DS queries 94-96 - Mailing list pgsql-hackers

From James Coleman
Subject Re: "could not find pathkey item to sort" for TPC-DS queries 94-96
Date
Msg-id CAAaqYe-DJ4zazbNzyE0kKv8Q61+0GgHzY9Sbo8k=M7aUokvVhQ@mail.gmail.com
Whole thread Raw
In response to Re: "could not find pathkey item to sort" for TPC-DS queries 94-96  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, Apr 14, 2021 at 8:21 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Apr 14, 2021 at 5:43 PM James Coleman <jtc331@gmail.com> wrote:
> > The query in question is:
> > select  count(*)
> >         from store_sales
> >             ,household_demographics
> >             ,time_dim, store
> >         where ss_sold_time_sk = time_dim.t_time_sk
> >             and ss_hdemo_sk = household_demographics.hd_demo_sk
> >             and ss_store_sk = s_store_sk
> >             and time_dim.t_hour = 15
> >             and time_dim.t_minute >= 30
> >             and household_demographics.hd_dep_count = 7
> >             and store.s_store_name = 'ese'
> >         order by count(*)
> >         limit 100;
> >
> > From debugging output it looks like this is the plan being chosen
> > (cheapest total path):
> >         Gather(store_sales household_demographics time_dim) rows=60626
> > cost=3145.73..699910.15
> >                 HashJoin(store_sales household_demographics time_dim)
> > rows=25261 cost=2145.73..692847.55
> >                   clauses: store_sales.ss_hdemo_sk =
> > household_demographics.hd_demo_sk
> >                         HashJoin(store_sales time_dim) rows=252609
> > cost=1989.73..692028.08
> >                           clauses: store_sales.ss_sold_time_sk =
> > time_dim.t_time_sk
> >                                 SeqScan(store_sales) rows=11998564
> > cost=0.00..658540.64
> >                                 SeqScan(time_dim) rows=1070
> > cost=0.00..1976.35
> >                         SeqScan(household_demographics) rows=720
> > cost=0.00..147.00
>
> This doesn't really make sense to me given the strack trace in the OP.
> That seems to go Limit -> Sort -> Agg -> NestLoop -> NestLoop ->
> NestLoop -> GatherMerge -> Sort. If the plan were as you have it here,
> there would be no Sort and no Gather Merge, so where would be getting
> a failure related to pathkeys?

Also I just realized why this didn't make sense -- I'm not sure what
the above path is. It'd gotten logged as part of the debug options I
have configured, but it must be 1.) incomplete (perhaps at a lower
level of path generation) and/or not the final path selected.

My apologies for the confusion.

James



pgsql-hackers by date:

Previous
From: James Coleman
Date:
Subject: Re: "could not find pathkey item to sort" for TPC-DS queries 94-96
Next
From: Michael Paquier
Date:
Subject: Re: psql - add SHOW_ALL_RESULTS option