Re: Use of additional index columns in rows filtering - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Use of additional index columns in rows filtering
Date
Msg-id CAH2-WzkM-9wdayo9vHta10QdZ1QuUuS5Gch7mtfBJtO_AeGStg@mail.gmail.com
In response to Re: Use of additional index columns in rows filtering  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: Use of additional index columns in rows filtering
List pgsql-hackers
On Tue, Aug 8, 2023 at 1:49 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> So we expect 1250 rows. If that was accurate, the index scan would have
> to do 1250 heap fetches. It's just luck the index scan doesn't need to
> do that. I don't think there's a chance to improve this costing - if the
> inputs are this off, it can't do anything.

Well, that depends. If we can find a way to make the bitmap index scan
capable of doing something like the same trick through other means, in
some other patch, then this particular problem (involving a simple
inequality) just goes away. There may be other cases that look a
little similar, with a more complicated expression, where it just
isn't reasonable to expect a bitmap index scan to compete. Ideally,
bitmap index scans will only be at a huge disadvantage when it just
makes sense, due to the particulars of the expression.

I'm not trying to make this your problem. I'm just trying to establish
the general nature of the problem.

> Also, I think this is related to the earlier discussion about maybe
> costing it according to the worst case - i.e. as if we still needed to
> fetch the same number of heap tuples as before. Which will inevitably
> lead to similar issues, with worse plans looking cheaper.

Not in those cases where it just doesn't come up, because we can
totally avoid visibility checks. As I said, securing that guarantee
has the potential to make the costing a lot more reliable/easier to
implement.

> That is certainly true - I'm trying to keep the scope somewhat close to
> the original goal. Obviously, there may be additional things the patch
> really needs to consider, but I'm not sure this is one of those cases
> (perhaps I just don't understand what the issue is - the example seems
> like a run-of-the-mill case of poor estimate / costing).

I'm not trying to impose any particular interpretation here. It's
early in the cycle, and my questions are mostly exploratory. I'm still
trying to develop my own understanding of the trade-offs in this area.

--
Peter Geoghegan


