Re: REVIEW: EXPLAIN and nfiltered - Mailing list pgsql-hackers

From Tom Lane
Subject Re: REVIEW: EXPLAIN and nfiltered
Date
Msg-id 20606.1295556472@sss.pgh.pa.us
In response to Re: REVIEW: EXPLAIN and nfiltered  (hubert depesz lubaczewski <depesz@depesz.com>)
Responses Re: REVIEW: EXPLAIN and nfiltered  (Robert Haas <robertmhaas@gmail.com>)
Re: REVIEW: EXPLAIN and nfiltered  (Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi>)
List pgsql-hackers
hubert depesz lubaczewski <depesz@depesz.com> writes:
> On Thu, Jan 20, 2011 at 02:48:59PM -0500, Stephen Frost wrote:
>> He also mentioned that he didn't feel it was terribly complicated or
>> that it'd be difficult to update for this.  Looking over the code, it's
>> got a simple regex for matching that line which would have to be
>> updated, but I don't think it'd require much more than that.

> i'll be happy to update the Pg::Explain to handle new elements of
> textual plans, so if this would be of concern - please don't treat
> "compatibility with explain.depesz.com" as your responsibility/problem.

The point isn't whether it'd be "terribly difficult" to update client
side EXPLAIN-parsing code ... it's whether we should break it in the
first place.  I don't find the proposed format so remarkably
well-designed that it's worth creating compatibility problems for.

The main functional problem I see with this format is that it assumes
there is one and only one filter step associated with every plan node.
That is just plain wrong.  Many don't have any, and there are important
cases where there are two.  I'm thinking in particular that it might be
useful to distinguish the effects of the recheck and the filter
conditions of a bitmap heap scan.  Maybe it'd also be interesting to
separate the join and non-join filter clauses of a join node, though
I'm less sure about the usefulness of that.

So the line I'm thinking we should pursue is to visually associate the
new counter with the filter condition, either like
Filter Cond: (x > 42)  (nfiltered = 123)

or
Filter Cond: (x > 42)
Rows Filtered: 123

The latter is less ambiguous, but takes more vertical space.  The former
is very unlikely to break any client code, because I doubt there is any
that inquires into the details of what a filter condition expression
really means.  The latter *might* break code depending on how much
it assumes about the number of detail lines attached to a plan node
... but as Robert pointed out, we've added new detail lines before.
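
To make the client-side concern concrete, here is a minimal sketch of a
tolerant parser for the two layouts above.  The "Filter Cond:",
"nfiltered" and "Rows Filtered:" labels are copied from the proposals in
this message, not from any released EXPLAIN output, and the function
name parse_filter_lines is hypothetical.  It attaches each counter to
the most recent filter condition, so a node with zero, one, or two
conditions is handled the same way.

```python
import re

# Both patterns follow the *proposed* layouts in this message; they are
# assumptions for illustration, not actual EXPLAIN output.
INLINE = re.compile(
    r"^\s*Filter Cond: (?P<cond>.*?)(?:\s+\(nfiltered = (?P<n>\d+)\))?\s*$"
)
SEPARATE = re.compile(r"^\s*Rows Filtered: (?P<n>\d+)\s*$")


def parse_filter_lines(lines):
    """Return a list of (condition, count_or_None) pairs for one plan node."""
    results = []
    for line in lines:
        m = SEPARATE.match(line)
        if m and results:
            # A separate counter line belongs to the preceding condition.
            cond, _ = results[-1]
            results[-1] = (cond, int(m.group("n")))
            continue
        m = INLINE.match(line)
        if m:
            n = m.group("n")
            results.append((m.group("cond"), int(n) if n is not None else None))
        # Any other detail line is ignored rather than treated as an error.
    return results


if __name__ == "__main__":
    print(parse_filter_lines(["Filter Cond: (x > 42)  (nfiltered = 123)"]))
    print(parse_filter_lines(["Filter Cond: (x > 42)", "Rows Filtered: 123"]))
```

A parser written this way ignores detail lines it does not recognize, so
it would survive either layout; one that counts the detail lines per
node is the kind that the second layout could break.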

BTW, is it just me, or is the terminology "number filtered" pretty
confusing/ambiguous in itself?  It doesn't seem at all clear to me
whether that's the number of rows passed by the filter condition or
the number of rows rejected.  Perhaps "nremoved" would be clearer.
        regards, tom lane

