Re: Using multiple extended statistics for estimates - Mailing list pgsql-hackers
From | Kyotaro Horiguchi |
---|---|
Subject | Re: Using multiple extended statistics for estimates |
Date | |
Msg-id | 20191107.133820.1542779476322302789.horikyota.ntt@gmail.com Whole thread Raw |
In response to | Re: Using multiple extended statistics for estimates (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Responses |
Re: Using multiple extended statistics for estimates
|
List | pgsql-hackers |
Hello. At Wed, 6 Nov 2019 20:58:49 +0100, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in > >Here is a slightly polished v2 of the patch, the main difference being > >that computing clause_attnums was moved to a separate function. > > > > This time with the attachment ;-) This patch is a kind of straight-forward, which repeats what the previous statext_mcv_clauselist_selectivity did as long as remaining clauses matches any of MV-MCVs. Almost no regression in the cases where zero or just one MV-MCV applies to the given clause list. It applies cleanly on the current master and seems working as expected. I have some comments. Could we have description in the documentation on what multiple MV-MCVs are used in a query? And don't we need some regression tests? +/* + * statext_mcv_clause_attnums + * Recalculate attnums from compatible but not-yet-estimated clauses. It returns attnums collected from multiple clause*s*. Is the name OK with "clause_attnums"? The comment says as if it checks the compatibility of each clause but the work is done in the caller side. I'm not sure such strictness is required, but it might be better that the comment represents what exactly the function does. + */ +static Bitmapset * +statext_mcv_clause_attnums(int nclauses, Bitmapset **estimatedclauses, + Bitmapset **list_attnums) The last two parameters are in the same type in notation but in different actual types.. that is, one is a pointer to Bitmapset*, and another is an array of Bitmaptset*. The code in the function itself suggests that, but it would be helpful if a brief explanation of the parameters is seen in the function comment. + /* + * Recompute attnums in the remaining clauses (we simply use the bitmaps + * computed earlier, so that we don't have to inspect the clauses again). + */ + clauses_attnums = statext_mcv_clause_attnums(list_length(clauses), Couldn't we avoid calling this function twice with the same parameters at the first round in the loop? + foreach(l, clauses) { - stat_clauses = lappend(stat_clauses, (Node *) lfirst(l)); - *estimatedclauses = bms_add_member(*estimatedclauses, listidx); + /* + * If the clause is compatible with the selected statistics, mark it + * as estimated and add it to the list to estimate. + */ + if (list_attnums[listidx] != NULL && + bms_is_subset(list_attnums[listidx], stat->keys)) + { + stat_clauses = lappend(stat_clauses, (Node *) lfirst(l)); + *estimatedclauses = bms_add_member(*estimatedclauses, listidx); + } The loop runs through all clauses every time. I agree that that is better than using a copy of the clauses to avoid to step on already estimated clauses, but maybe we need an Assertion that the listidx is not a part of estimatedclauses to make sure no clauses are not estimated twice. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
pgsql-hackers by date: