Re: [sqlsmith] Crash in mcv_get_match_bitmap - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: [sqlsmith] Crash in mcv_get_match_bitmap
Date
Msg-id 20190718112001.gtnquw24f4qzxia5@development
Whole thread Raw
In response to Re: [sqlsmith] Crash in mcv_get_match_bitmap  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: [sqlsmith] Crash in mcv_get_match_bitmap
List pgsql-hackers
Hi,

I've pushed the fixes listed in the previous message, with the exception
of the collation part, because I had some doubts about that.

1) handling of NULL values in Cons / MCV items

The handling of NULL elements was actually a bit more broken, because it
also was not quite correct for NULL values in the MCV items. The code
treated this as a mismatch, but then skipped the rest of the evaluation
only for AND-clauses (because then 'false' is final). But for OR-clauses
it happily proceeded to call the proc, etc. It was not hard to cause a
crash with statistics on varlena columns.

I've fixed this and added a simple regression test to check this. It
however shows the stats_ext suite needs some improvements - until now it
only had AND-clauses. Now it has one simple OR-clause test, but it needs
more of that - and perhaps some combinations mixing AND/OR. I've tried
adding an copy of each existing query, with AND replaced by OR, and that
works fine (no crashes, estimates seem OK). But it's quite heavy-handed
way to create regression tests, so I'll look into this in PG13 cycle.


2) collations

Now, for the collation part - after some more thought and looking at code
I think the fix I shared before is OK. It uses per-column collations
consistently both when building the stats and estimating clauses, which
makes it consistent. I was not quite sure if using Var->varcollid is the
same thing as pg_attribute.attcollation, but it seems it is - at least for
Vars pointing to regular columns (which for extended stats should always
be the case).

And we reset stats whenever the attribute type changes (which includes
change of collation for the column), so I think it's OK. To be precise, we
only reset MCV list in that case - we keep mvdistinct and dependencies,
but that's probably OK because those don't store values and we won't
run any functions on them.

So I think the attached patch is OK, but I'd welcome some feedback.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Attachment

pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: POC: Cleaning up orphaned files using undo logs
Next
From: Amit Kapila
Date:
Subject: Re: POC: Cleaning up orphaned files using undo logs