Re: fix cost subqueryscan wrong parallel cost - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: fix cost subqueryscan wrong parallel cost
Date
Msg-id CAKFQuwaHhrLgvNVAUQWDMaTKg=V=4vi2P1bSFG6x__eMhZKZcQ@mail.gmail.com
Whole thread Raw
In response to Re: fix cost subqueryscan wrong parallel cost  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Fri, Apr 29, 2022 at 12:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> The fact that (baserel.rows > path->subpath->rows) here seems like a
> straight bug: there are no filters involved in this case but in the
> presence of filters baserel->rows should be strictly (<=
> path->subpath->rows), right?

No, because the subpath's rowcount has been derated to represent the
number of rows any one worker is expected to process.  Since the
SubqueryScan is below the Gather, it has to represent that number
of rows too.  Like I said, this design is pretty confusing; though
I do not have a better alternative to suggest.

Anyway, after chewing on this for awhile, it strikes me that the
sanest way to proceed is for cost_subqueryscan to compute its rows
estimate as the subpath's rows estimate times the selectivity of
the subqueryscan's quals (if any).

This is what Robert was getting at, and I followed-up on.

The question I ended up at is why doesn't baserel->rows already produce the value you now propose to calculate directly within cost_subquery

set_baserel_size_estimates (multiplies rel->tuples - which is derated - by selectivity, sets rel->rows)
set_subquery_size_estimates
    rel->subroot = subquery_planner(...)
          // my expectation is that sub_final_rel->cheapest_total_path->rows is the derated number of rows; 
          // the fact you can reach the derated amount later by using path->subpath->rows seems to affirm this.
    sets rel->tuples from sub_final_rel->cheapest_total_path->rows)
set_subquery_pathlist (executes the sizing call stack above, then proceeds to create_subqueryscan_path which in turn calls cost_subquery)
set_rel_size
...


* By analogy to other sorts of relation-scan nodes.  But those don't
have any subpath that they could consult instead.  This analogy is
really a bit faulty, since SubqueryScan isn't a relation scan node
in any ordinary meaning of that term.

I did observe this copy-and-paste dynamic; I take it this is why we cannot or would not just change all of the baserel->rows usages to path->subpath->rows.

So perhaps we should do it more like the attached, which produces
this plan for the UNION case:

 
The fact this changes row counts in a costing function is bothersome - it would be nice to go the other direction and remove the if block.  More generally, path->path.rows should be read-only by the time we get to costing. But I'm not out to start a revolution here either.  But it feels like we are just papering over a bug in how baserel->rows is computed; per my analysis above.

In short, the solution seems like it should, and in fact does here, fix the observed problem.  I'm fine with that.

David J.

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: [RFC] building postgres with meson -v8
Next
From: Tomas Vondra
Date:
Subject: Re: bogus: logical replication rows/cols combinations