Re: [sqlsmith] Crash in mcv_get_match_bitmap - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: [sqlsmith] Crash in mcv_get_match_bitmap
Date
Msg-id 20190719164635.4wuojvl3hdnu2kng@development
Whole thread Raw
In response to Re: [sqlsmith] Crash in mcv_get_match_bitmap  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [sqlsmith] Crash in mcv_get_match_bitmap
List pgsql-hackers
On Thu, Jul 18, 2019 at 11:16:08AM -0400, Tom Lane wrote:
>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> I've pushed the fixes listed in the previous message, with the exception
>> of the collation part, because I had some doubts about that.
>
>Sorry for being slow on this.
>
>> Now, for the collation part - after some more thought and looking at code
>> I think the fix I shared before is OK. It uses per-column collations
>> consistently both when building the stats and estimating clauses, which
>> makes it consistent. I was not quite sure if using Var->varcollid is the
>> same thing as pg_attribute.attcollation, but it seems it is - at least for
>> Vars pointing to regular columns (which for extended stats should always
>> be the case).
>
>I think you are right, but it could use some comments in the code.
>The important point here is that if we coerce a Var's collation to
>something else, that will be represented as a separate CollateExpr
>(which will be a RelabelType by the time it gets here, I believe).
>We don't just replace varcollid, the way eval_const_expressions will
>do to a Const.
>

OK, thanks. I've added a comment about that into mcv_get_match_bitmap (not
all the details about RelabelType, because that gets stripped while
examining the opexpr, but generally about the collations).

>
>While I'm looking at the code --- I don't find this at all convincing:
>
>                            /*
>                             * We don't care about isgt in equality, because
>                             * it does not matter whether it's (var op const)
>                             * or (const op var).
>                             */
>                            match = DatumGetBool(FunctionCall2Coll(&opproc,
>                                                                   DEFAULT_COLLATION_OID,
>                                                                   cst->constvalue,
>                                                                   item->values[idx]));
>
>It *will* matter if the operator is cross-type.  I think there is no
>good reason to have different code paths for the equality and inequality
>cases --- just look at isgt and swap or don't swap the arguments.
>
>BTW, "isgt" seems like a completely misleading name for that variable.
>AFAICS, what that is actually telling is whether the var is on the left
>or right side of the OpExpr.  Everywhere else in the planner with a
>need for that uses "bool varonleft", and I think you should do likewise
>here (though note that that isgt, as coded, is more like "varonright").
>

Yes, you're right in both cases. I've fixed the first issue with equality
by simply using the same code as for all operators (which means we don't
need to check the oprrest at all in this code, making it simpler).

And I've reworked the examine_opclause_expression() to use varonleft
naming - I agree it's a much better name. This was one of the remnants of
the code I initially copied from somewhere in selfuncs.c and massaged
it until it did what I needed.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: sepgsql seems rather thoroughly broken on Fedora 30
Next
From: Andres Freund
Date:
Subject: Re: partition routing layering in nodeModifyTable.c