On Wed, Jul 1, 2020 at 11:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Etsuro Fujita <etsuro.fujita@gmail.com> writes:
> > On Wed, Jul 1, 2020 at 7:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> + baserel->tuples = Max(baserel->tuples, baserel->rows);
>
> > for consistency, this should be
> > baserel->tuples = clamp_row_est(baserel->rows / sel);
> > where sel is the selectivity of the baserestrictinfo clauses?
>
> If we had the selectivity available, maybe so, but we don't.
> (And even less so if we put this logic in the core code.)
>
> Short of sending a whole second query to the remote server, it's
> not clear to me how we could get the full table size (or equivalently
> the target query's selectivity for that table). The best we realistically
> can do is to adopt pg_class.reltuples if there's been an ANALYZE of
> the foreign table. That case already works (and this proposal doesn't
> break it). The problem is what to do when pg_class.reltuples is zero
> or otherwise badly out-of-date.

In estimate_path_cost_size(), if use_remote_estimate is true, we
adjust the rows estimate returned from the remote server by factoring
in the selectivity of the locally-checked quals. I thought that what I
proposed above would be more consistent with that.
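
To make the comparison concrete, here is a rough standalone sketch of the
two ways of setting baserel->tuples discussed above. It is not
postgres_fdw code: clamp_row_est_sketch() is a simplified stand-in for the
planner's clamp_row_est(), the other two function names are made up for
illustration, and "sel" is assumed to be available and in (0, 1], which
is exactly the part Tom points out we don't have.

```c
/*
 * Simplified stand-in for the planner's clamp_row_est(): round to a whole
 * number of rows and never return less than one row.  Illustrative only;
 * the real function lives in the core planner code.
 */
static double
clamp_row_est_sketch(double nrows)
{
    if (nrows <= 1.0)
        return 1.0;
    if (nrows >= 1e15)
        return nrows;           /* too large for the integer cast below */
    return (double) (long long) (nrows + 0.5);
}

/*
 * Option A (the quoted patch line): only make sure that tuples is not
 * smaller than rows.
 */
static double
tuples_by_max(double tuples, double rows)
{
    return (tuples > rows) ? tuples : rows;
}

/*
 * Option B (the suggestion above): back out the table size from the row
 * estimate and the selectivity of the baserestrictinfo clauses.
 * Assumes 0 < sel <= 1.
 */
static double
tuples_by_selectivity(double rows, double sel)
{
    return clamp_row_est_sketch(rows / sel);
}
```

For example, with reltuples = 0, rows = 100 and sel = 0.25, option A
yields tuples = 100, while option B yields tuples = 400.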

Best regards,
Etsuro Fujita