Re: Failed assertion clauses != NIL - Mailing list pgsql-bugs

From Tomas Vondra
Subject Re: Failed assertion clauses != NIL
Date
Msg-id 20191126174131.p3zv5t2oeqkinx4d@development
Whole thread Raw
In response to Re: Failed assertion clauses != NIL  (Dean Rasheed <dean.a.rasheed@gmail.com>)
Responses Re: Failed assertion clauses != NIL
List pgsql-bugs
On Tue, Nov 26, 2019 at 10:56:33AM +0000, Dean Rasheed wrote:
>On Sun, 24 Nov 2019 at 23:40, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>
>> Hi,
>>
>> Attached are two patched, related to this bug report.
>>
>>
>> 0001 - Fix choose_best_statistics to check clauses individually
>> ---------------------------------------------------------------
>>
>> This modifies the choose_best_statistics function to properly check
>> which clauses are actually covered by each statistic object, and only
>> use attnums from those.
>>
>> The patch ended up pretty small, because we already have all the
>> necessary info (per-clause attnums) precalculated. Which means this
>> should not be much more expensive than before.
>>
>> The main drawback is that this does change signature of a function
>> defined in statistics.h - we have to pass more info (per-clause bitmaps
>> and info which clauses are already estimated). Which means ABI break.
>>
>> I'm not sure how likely it is that external code is calling this
>> function, but the probability is non-zero. So maybe even if the patch is
>> fairly small, in backbranches we should use the simple fix with just
>> returning if the list is NIL.
>>
>
>On a quick read-through that algorithm makes a lot more sense. It
>seems pretty unlikely that anyone would be using
>choose_best_statistics() anywhere else, so I think maybe it's fine to
>change that in back-branches.
>
>A couple of comments:
>
>1). I think you should pass estimatedClauses to
>choose_best_statistics() as a pure input parameter (i.e., remove one
>"*"), since it doesn't (and must not) modify that set.

Yeah, good point.

>In fact, on closer inspection, I don't think you need to pass it to
>choose_best_statistics() at all, since its callers already check
>clauses against estimatedClauses. Therefore, in
>choose_best_statistics(), incompatible and already-estimated clauses
>both appear the same (as NULL/empty attribute sets), and therefore the
>estimatedClauses check will never be tripped.
>

Right, but I'm thinking about the patch that allows applying multiple
statistics. With that applied, this changes to a while loop - and we'll
either have to rebuild the list_attnums or pass the bitmap.

>2). The new parameter "clauses" should probably be called something
>like "clause_attnums" or some such, to better reflect what it actually
>is. And it should be documented that NULL values represent
>incompatible/already-estimated clauses.
>

Yes, agreed.

>3). In statext_mcv_clauselist_selectivity(), the check for at least 2
>attributes is no longer needed, because choose_best_statistics() now
>does that, so there's also no need to compute clauses_attnums there.
>

Good point. I'll replace that with an assert.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



pgsql-bugs by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: BUG #16125: Crash of PostgreSQL's wal sender during logicalreplication
Next
From: Tomas Vondra
Date:
Subject: Re: BUG #16125: Crash of PostgreSQL's wal sender during logicalreplication