Re: Making clausesel.c Smarter - Mailing list pgsql-hackers

From David Rowley
Subject Re: Making clausesel.c Smarter
Date
Msg-id CAKJS1f_M-KWEkaoHRsxXWfEUoq6hSfuOzY6-EF+D9QhkZ1d_OA@mail.gmail.com
Whole thread Raw
In response to Re: Making clausesel.c Smarter  (Claudio Freire <klaussfreire@gmail.com>)
Responses Re: Making clausesel.c Smarter  (Andres Freund <andres@anarazel.de>)
Re: Making clausesel.c Smarter  (Claudio Freire <klaussfreire@gmail.com>)
List pgsql-hackers
On 1 April 2017 at 16:42, Claudio Freire <klaussfreire@gmail.com> wrote:
>
> I thought to take a quick look at this patch. I'll probably take a
> deeper look later, but some initial comments:
>
> -typedef struct RangeQueryClause
> +typedef struct CachedSelectivityClause
>  {
> -    struct RangeQueryClause *next;        /* next in linked list */
> +    struct CachedSelectivityClause *next;        /* next in linked list */
>      Node       *var;            /* The common variable of the clauses */
> -    bool        have_lobound;    /* found a low-bound clause yet? */
> -    bool        have_hibound;    /* found a high-bound clause yet? */
> +    int            selmask;        /* Bitmask of which sel types are stored */
>      Selectivity lobound;        /* Selectivity of a var > something clause */
>      Selectivity hibound;        /* Selectivity of a var < something clause */
> -} RangeQueryClause;
>
>
> As seems customary in other code, perhaps you should define some
> HAS_X() macros for dealing with the selmask.
>
> Something like
>
> #SELMASK_HAS_LOBOUND(selmask) (((selmask) & CACHESEL_LOBOUND) != 0)
> etc..

Thanks for looking at the patch.

The problem with that is that some of the tests are doing more than
just checking a single flag is enabled.

Consider:

if ((cslist->selmask & (CACHESEL_LOBOUND | CACHESEL_HIBOUND)) ==
(CACHESEL_LOBOUND | CACHESEL_HIBOUND))

Of course, those could be Macro'd too, but it seems I might need to be
inventive with the names, or the meaning might not be all that clear.

It does not seem overly complex and it is isolated to a single file,
so perhaps it might be OK as is?

>
> +static void addCachedSelectivityRangeVar(CachedSelectivityClause
> **cslist, Node *var,
>                 bool varonleft, bool isLTsel, Selectivity s2);
>
> You're changing clause -> var throughout the code when dealing with
> nodes, but looking at their use, they hold neither. All those
> addCachedSelectivity functions are usually passed expression nodes. If
> you're renaming, perhaps you should rename to "expr".

hmm, this is true. I kind of followed the lead of the name of the
variable in the old RangeQueryClause struct. I have changed the names
of these to be expr in the attached, but I didn't change the name of
the Var in the new CachedSelectivityClause struct. It seems like a
change not related to this patch.

Do you think I should change that too?

>
> On Fri, Feb 24, 2017 at 7:00 AM, David Rowley
> <david.rowley@2ndquadrant.com> wrote:
> > Now one thing I was unsure of in the patch was this "Other clauses"
> > concept that I've come up with. In the patch we have:
> >
> > default:
> > addCachedSelectivityOtherClause(&cslist, var, s2);
> > break;
> >
> > I was unsure if this should only apply to F_EQSEL and F_NEQSEL. My
> > imagination here won't stretch far enough to imagine a OpExpr which
> > passes with a NULL operand. If it did my logic is broken, and if
> > that's the case then I think limiting "Others" to mean F_EQSEL and
> > F_NEQSEL would be the workaround.
>
> While not specifically applicable only to "Others", something needs
> consideration here:
>
> User-defined functions can be nonstrict. An OpExpr involving such a
> user-defined function could possibly pass on null.

Good point. I overlooked this.

> I would suggest avoiding making the assumption that it doesn't unless
> the operator is strict.
>
> One could argue that such an operator should provide its own
> selectivity estimator, but the strictness check shouldn't be too
> difficult to add, and I think it's a better choice.
>
> So you'd have to check that:
>
> default:
> if (op_strict(expr->opno) && func_strict(expr->opfuncid))
>     addCachedSelectivityOtherClause(&cslist, var, s2);
> else
>     s1 = s1 * s2;
> break;
>

I've changed it to something along those lines. I don't think the
func_strict here is required, though, so I've gone with just
op_strict().

Updated patch attached.

Thanks for reviewing it.


-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

pgsql-hackers by date:

Previous
From: Tatsuo Ishii
Date:
Subject: Re: Statement timeout behavior in extended queries
Next
From: Ashutosh Bapat
Date:
Subject: Re: pg_partman 3.0.0 - real-world usage of nativepartitioning and a case for native default