Re: "could not find pathkey item to sort" for TPC-DS queries 94-96 - Mailing list pgsql-hackers

From James Coleman
Subject Re: "could not find pathkey item to sort" for TPC-DS queries 94-96
Date
Msg-id CAAaqYe_NU4hO9COoJdcXWqjtH=dGMknYdsSdJjZ=JOHPTea-Nw@mail.gmail.com
Whole thread Raw
In response to Re: "could not find pathkey item to sort" for TPC-DS queries 94-96  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: "could not find pathkey item to sort" for TPC-DS queries 94-96  (Robert Haas <robertmhaas@gmail.com>)
Re: "could not find pathkey item to sort" for TPC-DS queries 94-96  (James Coleman <jtc331@gmail.com>)
List pgsql-hackers
On Mon, Apr 12, 2021 at 8:37 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 4/12/21 2:24 PM, Luc Vlaming wrote:
> > Hi,
> >
> > When trying to run on master (but afaik also PG-13) TPC-DS queries 94,
> > 95 and 96 on a SF10 I get the error "could not find pathkey item to sort".
> > When I disable enable_gathermerge the problem goes away and then the
> > plan for query 94 looks like below. I tried figuring out what the
> > problem is but to be honest I would need some pointers as the code that
> > tries to matching equivalence members in prepare_sort_from_pathkeys is
> > something i'm really not familiar with.
> >
>
> Could be related to incremental sort, which allowed some gather merge
> paths that were impossible before. We had a couple issues related to
> that fixed in November, IIRC.
>
> > To reproduce you can either ingest and test using the toolkit I used too
> > (see https://github.com/swarm64/s64da-benchmark-toolkit/), or
> > alternatively just use the schema (see
> > https://github.com/swarm64/s64da-benchmark-toolkit/tree/master/benchmarks/tpcds/schemas/psql_native)
> >
>
> Thanks, I'll see if I can reproduce that with your schema.
>
>
> regards
>
> --
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

The query in question is:

select  count(*)
        from store_sales
            ,household_demographics
            ,time_dim, store
        where ss_sold_time_sk = time_dim.t_time_sk
            and ss_hdemo_sk = household_demographics.hd_demo_sk
            and ss_store_sk = s_store_sk
            and time_dim.t_hour = 15
            and time_dim.t_minute >= 30
            and household_demographics.hd_dep_count = 7
            and store.s_store_name = 'ese'
        order by count(*)
        limit 100;

From debugging output it looks like this is the plan being chosen
(cheapest total path):
        Gather(store_sales household_demographics time_dim) rows=60626
cost=3145.73..699910.15
                HashJoin(store_sales household_demographics time_dim)
rows=25261 cost=2145.73..692847.55
                  clauses: store_sales.ss_hdemo_sk =
household_demographics.hd_demo_sk
                        HashJoin(store_sales time_dim) rows=252609
cost=1989.73..692028.08
                          clauses: store_sales.ss_sold_time_sk =
time_dim.t_time_sk
                                SeqScan(store_sales) rows=11998564
cost=0.00..658540.64
                                SeqScan(time_dim) rows=1070
cost=0.00..1976.35
                        SeqScan(household_demographics) rows=720
cost=0.00..147.00

prepare_sort_from_pathkeys fails to find a pathkey because
tlist_member_ignore_relabel returns null -- which seemed weird because
the sortexpr is an Aggref (in a single member equivalence class) and
the tlist contains a single member that's also an Aggref. It turns out
that the only difference between the two Aggrefs is that the tlist
entry has "aggsplit = AGGSPLIT_INITIAL_SERIAL" while the sortexpr has
aggsplit = AGGSPLIT_SIMPLE.

That's as far as I've gotten so far, but I figured I'd get that info
out to see if it means anything obvious to anyone else.

James



pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: File truncation within PostgresNode::issues_sql_like() wrong on Windows
Next
From: Vik Fearing
Date:
Subject: Re: Converting contrib SQL functions to new style