RE: Improve EXPLAIN output for multicolumn B-Tree Index - Mailing list pgsql-hackers

From
Subject RE: Improve EXPLAIN output for multicolumn B-Tree Index
Date
Msg-id TYWPR01MB10982D8D6B11B8E3F4ADDCEC2B1D62@TYWPR01MB10982.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Improve EXPLAIN output for multicolumn B-Tree Index  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
List pgsql-hackers
> On Mon, 24 Jun 2024 at 14:42, Jelte Fennema-Nio <postgres@jeltef.nl> wrote:
> >
> > On Mon, 24 Jun 2024 at 13:02, Matthias van de Meent
> > <boekewurm+postgres@gmail.com> wrote:
> > > It does not really behave similar: index scan keys (such as the
> > > id3=101 scankey) don't require visibility checks in the btree code,
> > > while the Filter condition _does_ require a visibility check, and
> > > delegates the check to the table AM if the scan isn't Index-Only, or
> > > if the VM didn't show all-visible during the check.
> >
> > Any chance you could point me in the right direction for the
> > code/docs/comment about this? I'd like to learn a bit more about why
> > that is the case, because I didn't realize visibility checks worked
> > differently for index scan keys and Filter keys.
> 
> This can be derived by combining how Filter works (it only filters the returned live tuples)
> and how Index-Only scans work (return the index tuple, unless !ALL_VISIBLE, in which
> case the heap tuple is projected). There have been several threads more or less
> recently that also touch this topic and closely related topics, e.g. [0][1].

Thanks! I could understand what is difference between INCLUDE based filter and index filter.

> > > As you can see, there's a huge difference in performance. Putting
> > > both non-bound and "normal" filter clauses in the same Filter clause
> > > will make it more difficult to explain performance issues based on
> > > only the explain output.
> >
> > Fair enough, that's of course the main point of this patch in the
> > first place: being able to better interpret the explain plan when you
> > don't have access to the schema. Still I think Filter is the correct
> > keyword for both, so how about we make it less confusing by making the
> > current "Filter" more specific by calling it something like "Non-key
> > Filter" or "INCLUDE Filter" and then call the other something like
> > "Index Filter" or "Secondary Bound Filter".
> 
> I'm not sure how debuggable explain plans are without access to the schema, especially
> when VERBOSE isn't configured, so I would be hesitant to accept that as an argument
> here.

IMHO, it's nice to be able to understand the differences between each
FILTER even without the VERBOSE option. (+1 for Jelte Fennema-Nio's idea)

Even without access to the schema, it would be possible to quickly know if
the plan is not as expected, and I believe there are virtually no disadvantages
to having multiple "XXX FILTER" outputs.

If it's better to output such information only with the VERBOSE option, 
What do you think about the following idea?
* When the VERBOSE option is not specified, output as "Filter" in all cases
* When the VERBOSE option is specified, output as "Non-key Filter", "INCLUDE Filter" 
  and "Index Filter".

In addition, I think it would be good to mention the differences between each filter in 
the documentation.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION

pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: Backporting BackgroundPsql
Next
From: vignesh C
Date:
Subject: Re: pg_createsubscriber: drop pre-existing subscriptions from the converted node