Re: "could not find pathkey item to sort" for TPC-DS queries 94-96 - Mailing list pgsql-hackers
From | James Coleman |
---|---|
Subject | Re: "could not find pathkey item to sort" for TPC-DS queries 94-96 |
Date | |
Msg-id | CAAaqYe8hst0zS4FDc-gS++u=NRkdKYEn5-zi8FLjDp2+DELBdw@mail.gmail.com Whole thread Raw |
In response to | Re: "could not find pathkey item to sort" for TPC-DS queries 94-96 (James Coleman <jtc331@gmail.com>) |
Responses |
Re: "could not find pathkey item to sort" for TPC-DS queries 94-96
|
List | pgsql-hackers |
On Wed, Apr 14, 2021 at 5:42 PM James Coleman <jtc331@gmail.com> wrote: > > On Mon, Apr 12, 2021 at 8:37 AM Tomas Vondra > <tomas.vondra@enterprisedb.com> wrote: > > > > On 4/12/21 2:24 PM, Luc Vlaming wrote: > > > Hi, > > > > > > When trying to run on master (but afaik also PG-13) TPC-DS queries 94, > > > 95 and 96 on a SF10 I get the error "could not find pathkey item to sort". > > > When I disable enable_gathermerge the problem goes away and then the > > > plan for query 94 looks like below. I tried figuring out what the > > > problem is but to be honest I would need some pointers as the code that > > > tries to matching equivalence members in prepare_sort_from_pathkeys is > > > something i'm really not familiar with. > > > > > > > Could be related to incremental sort, which allowed some gather merge > > paths that were impossible before. We had a couple issues related to > > that fixed in November, IIRC. > > > > > To reproduce you can either ingest and test using the toolkit I used too > > > (see https://github.com/swarm64/s64da-benchmark-toolkit/), or > > > alternatively just use the schema (see > > > https://github.com/swarm64/s64da-benchmark-toolkit/tree/master/benchmarks/tpcds/schemas/psql_native) > > > > > > > Thanks, I'll see if I can reproduce that with your schema. > > > > > > regards > > > > -- > > Tomas Vondra > > EnterpriseDB: http://www.enterprisedb.com > > The Enterprise PostgreSQL Company > > The query in question is: > > select count(*) > from store_sales > ,household_demographics > ,time_dim, store > where ss_sold_time_sk = time_dim.t_time_sk > and ss_hdemo_sk = household_demographics.hd_demo_sk > and ss_store_sk = s_store_sk > and time_dim.t_hour = 15 > and time_dim.t_minute >= 30 > and household_demographics.hd_dep_count = 7 > and store.s_store_name = 'ese' > order by count(*) > limit 100; > > From debugging output it looks like this is the plan being chosen > (cheapest total path): > Gather(store_sales household_demographics time_dim) rows=60626 > cost=3145.73..699910.15 > HashJoin(store_sales household_demographics time_dim) > rows=25261 cost=2145.73..692847.55 > clauses: store_sales.ss_hdemo_sk = > household_demographics.hd_demo_sk > HashJoin(store_sales time_dim) rows=252609 > cost=1989.73..692028.08 > clauses: store_sales.ss_sold_time_sk = > time_dim.t_time_sk > SeqScan(store_sales) rows=11998564 > cost=0.00..658540.64 > SeqScan(time_dim) rows=1070 > cost=0.00..1976.35 > SeqScan(household_demographics) rows=720 > cost=0.00..147.00 > > prepare_sort_from_pathkeys fails to find a pathkey because > tlist_member_ignore_relabel returns null -- which seemed weird because > the sortexpr is an Aggref (in a single member equivalence class) and > the tlist contains a single member that's also an Aggref. It turns out > that the only difference between the two Aggrefs is that the tlist > entry has "aggsplit = AGGSPLIT_INITIAL_SERIAL" while the sortexpr has > aggsplit = AGGSPLIT_SIMPLE. > > That's as far as I've gotten so far, but I figured I'd get that info > out to see if it means anything obvious to anyone else. This really goes back to [1] where we fixed a similar issue by making find_em_expr_usable_for_sorting_rel parallel the rules in prepare_sort_from_pathkeys. Most of those conditions got copied, and the case we were trying to handle is the fact that prepare_sort_from_pathkeys can generate a target list entry under those conditions if one doesn't exist. However there's a further restriction there I don't remember looking at: it uses pull_var_clause and tlist_member_ignore_relabel to ensure that all of the vars that feed into the sort expression are found in the target list. As I understand it, that is: it will build a target list entry for something like "md5(column)" if "column" (and that was one of our test cases for the previous fix) is in the target list already. But there's an additional detail here: the call to pull_var_clause requests aggregates, window functions, and placeholders be treated as vars. That means for our Aggref case it would require that the two Aggrefs be fully equal, so the differing aggsplit member would cause a target list entry not to be built, hence our error here. I've attached a quick and dirty patch that encodes that final rule from prepare_sort_from_pathkeys into find_em_expr_usable_for_sorting_rel. I can't help but think that there's a cleaner way to do with this with less code duplication, but hindering that is that prepare_sort_from_pathkeys is working with a TargetList while find_em_expr_usable_for_sorting_rel is working with a list of expressions. James 1: https://www.postgresql.org/message-id/CAAaqYe9C3f6A_tZCRfr9Dm7hPpgGwpp4i-K_%3DNS9GWXuNiFANg%40mail.gmail.com
Attachment
pgsql-hackers by date: