Re: num_sa_scans in genericcostestimate - Mailing list pgsql-hackers

From Tom Lane
Subject Re: num_sa_scans in genericcostestimate
Date
Msg-id 4108679.1662664638@sss.pgh.pa.us
Whole thread Raw
In response to num_sa_scans in genericcostestimate  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
Jeff Janes <jeff.janes@gmail.com> writes:
> When costing a btree index scan, num_sa_scans gets computed twice, once in
> btcostestmeate and once in genericcostestimate.  But the computations are
> different.  It looks like the generic one includes all =ANY in any column
> in the index, while the bt one includes only =ANY which or on columns for
> which all the preceding index columns are tested for equality.

I think this is correct.  As per the comments in btcostestimate:

     * For a btree scan, only leading '=' quals plus inequality quals for the
     * immediately next attribute contribute to index selectivity (these are
     * the "boundary quals" that determine the starting and stopping points of
     * the index scan).  Additional quals can suppress visits to the heap, so
     * it's OK to count them in indexSelectivity, but they should not count
     * for estimating numIndexTuples.  So we must examine the given indexquals
     * to find out which ones count as boundary quals.  ...

and further down

                /* count number of SA scans induced by indexBoundQuals only */
                if (alength > 1)
                    num_sa_scans *= alength;

This num_sa_scans value computed by btcostestimate is (or should be)
only used in calculations related to numIndexTuples, whereas the one
in genericcostestimate should be used for calculations related to the
overall number of heap tuples returned by the indexscan.  Maybe there
is someplace that is using the wrong one, but it's not a bug that they
are different.

> The context for this is that I was looking at cases where btree indexes
> were not using all the columns they could, but rather shoving some of the
> conditions down into a Filter unnecessarily/unhelpfully.  This change
> doesn't fix that, but it does seem to be moving in the right direction.

If it helps, it's only accidental, because this patch is surely wrong.
We *should* be distinguishing these numbers.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Jacob Champion
Date:
Subject: Re: [PoC] Let libpq reject unexpected authentication requests
Next
From: Tom Lane
Date:
Subject: Re: num_sa_scans in genericcostestimate