Thread: [sqlsmith] Crash in mcv_get_match_bitmap
Hi, running sqlsmith on the regression database of REL_12_STABLE at ff597b656f yielded a crash in mcv_get_match_bitmap. I can reproduce it with the following query on the regression database: select filler1 from mcv_lists where a is not null and (select 42) <= c; Backtrace below. regards, Andreas Program received signal SIGSEGV, Segmentation fault. pg_detoast_datum (datum=0x0) at fmgr.c:1741 (gdb) bt #0 pg_detoast_datum (datum=0x0) at fmgr.c:1741 #1 0x000055b2bbeb2656 in numeric_le (fcinfo=0x7ffceeb2cb90) at numeric.c:2139 #2 0x000055b2bbf3cdca in FunctionCall2Coll (flinfo=flinfo@entry=0x7ffceeb2cc30, collation=collation@entry=100, arg1=<optimized out>, arg2=<optimized out>) at fmgr.c:1162 #3 0x000055b2bbdd7aec in mcv_get_match_bitmap (root=0x55b2bd2acff0, clauses=<optimized out>, keys=0x55b2bd2c4e38, mcvlist=0x55b2bd2c44e0, is_or=false) at mcv.c:1638 #4 0x000055b2bbdda581 in mcv_clauselist_selectivity (root=root@entry=0x55b2bd2acff0, stat=stat@entry=0x55b2bd2c4e00, clauses=clauses@entry=0x55b2bd2c5298, varRelid=varRelid@entry=0, jointype=jointype@entry=JOIN_INNER, sjinfo=sjinfo@entry=0x0, rel=0x55b2bd2c4158, basesel=0x7ffceeb2cd70, totalsel=0x7ffceeb2cd78) at mcv.c:1876 #5 0x000055b2bbdd6064 in statext_mcv_clauselist_selectivity (estimatedclauses=0x7ffceeb2cde8, rel=0x55b2bd2c4158, sjinfo=<optimized out>, jointype=<optimized out>, varRelid=<optimized out>, clauses=0x55b2bd2c4e00, root=<optimized out>) at extended_stats.c:1146 #6 statext_clauselist_selectivity (root=root@entry=0x55b2bd2acff0, clauses=clauses@entry=0x55b2bd2c5010, varRelid=varRelid@entry=0, jointype=jointype@entry=JOIN_INNER, sjinfo=sjinfo@entry=0x0, rel=0x55b2bd2c4158, estimatedclauses=0x7ffceeb2cde8) at extended_stats.c:1177 #7 0x000055b2bbd27372 in clauselist_selectivity (root=root@entry=0x55b2bd2acff0, clauses=0x55b2bd2c5010, varRelid=varRelid@entry=0, jointype=jointype@entry=JOIN_INNER, sjinfo=sjinfo@entry=0x0) at clausesel.c:94 #8 0x000055b2bbd2d788 in set_baserel_size_estimates (root=root@entry=0x55b2bd2acff0, rel=rel@entry=0x55b2bd2c4158) at costsize.c:4411 #9 0x000055b2bbd24658 in set_plain_rel_size (rte=0x55b2bd20cf00, rel=0x55b2bd2c4158, root=0x55b2bd2acff0) at allpaths.c:583 #10 set_rel_size (root=root@entry=0x55b2bd2acff0, rel=rel@entry=0x55b2bd2c4158, rti=rti@entry=1, rte=rte@entry=0x55b2bd20cf00) at allpaths.c:412 #11 0x000055b2bbd264a0 in set_base_rel_sizes (root=<optimized out>) at allpaths.c:323 #12 make_one_rel (root=root@entry=0x55b2bd2acff0, joinlist=joinlist@entry=0x55b2bd2c49c0) at allpaths.c:185 #13 0x000055b2bbd482f8 in query_planner (root=root@entry=0x55b2bd2acff0, qp_callback=qp_callback@entry=0x55b2bbd48ed0 <standard_qp_callback>, qp_extra=qp_extra@entry=0x7ffceeb2d070) at planmain.c:271 #14 0x000055b2bbd4cb32 in grouping_planner (root=<optimized out>, inheritance_update=false, tuple_fraction=<optimized out>) at planner.c:2048 #15 0x000055b2bbd4f900 in subquery_planner (glob=glob@entry=0x55b2bd2b1c88, parse=parse@entry=0x55b2bd20cd88, parent_root=parent_root@entry=0x0, hasRecursion=hasRecursion@entry=false, tuple_fraction=tuple_fraction@entry=0) at planner.c:1012 #16 0x000055b2bbd509c6 in standard_planner (parse=0x55b2bd20cd88, cursorOptions=256, boundParams=<optimized out>) at planner.c:406 #17 0x000055b2bbe13b89 in pg_plan_query (querytree=querytree@entry=0x55b2bd20cd88, cursorOptions=cursorOptions@entry=256, boundParams=boundParams@entry=0x0) at postgres.c:878 [...]
Andreas Seltenreich <seltenreich@gmx.de> writes: > running sqlsmith on the regression database of REL_12_STABLE at > ff597b656f yielded a crash in mcv_get_match_bitmap. I can reproduce it > with the following query on the regression database: > select filler1 from mcv_lists where a is not null and (select 42) <= c; Seems to be the same problem I just complained of in the other thread: mcv_get_match_bitmap has an untenable assumption that "is a pseudoconstant" means "is a Const". I notice it's also assuming that the Const must be non-null. It's not really easy to crash it that way, because if you just write "null <= c" that'll get reduced to constant-NULL earlier. But I bet there's a way. regards, tom lane
On Wed, Jul 10, 2019 at 04:57:54PM -0400, Tom Lane wrote: >Andreas Seltenreich <seltenreich@gmx.de> writes: >> running sqlsmith on the regression database of REL_12_STABLE at >> ff597b656f yielded a crash in mcv_get_match_bitmap. I can reproduce it >> with the following query on the regression database: >> select filler1 from mcv_lists where a is not null and (select 42) <= c; > >Seems to be the same problem I just complained of in the other >thread: mcv_get_match_bitmap has an untenable assumption that >"is a pseudoconstant" means "is a Const". > Yeah, that's a bug. Will fix (not sure how yet). BTW which other thread? I don't see any other threads mentioning this function. >I notice it's also assuming that the Const must be non-null. >It's not really easy to crash it that way, because if you just >write "null <= c" that'll get reduced to constant-NULL earlier. >But I bet there's a way. > Hmmm, I did not think of that. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > BTW which other thread? I don't see any other threads mentioning this > function. https://www.postgresql.org/message-id/flat/CA%2Bu7OA65%2BjEFb_TyV5g%2BKq%2BonyJ2skMOPzgTgFH%2BqgLwszRqvw%40mail.gmail.com regards, tom lane
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > Yeah, that's a bug. Will fix (not sure how yet). You could do worse than replace this: ok = (NumRelids(clause) == 1) && (is_pseudo_constant_clause(lsecond(expr->args)) || (varonleft = false, is_pseudo_constant_clause(linitial(expr->args)))); with something like if (IsA(linitial(expr->args), Var) && IsA(lsecond(expr->args), Const)) ok = true, varonleft = true; else if (IsA(linitial(expr->args), Const) && IsA(lsecond(expr->args), Var)) ok = true, varonleft = false; Or possibly get rid of varonleft as such, and merge extraction of the "var" and "cst" variables into this test. BTW, I bet passing a unary-argument OpExpr also makes this code unhappy. regards, tom lane
On Wed, Jul 10, 2019 at 05:45:24PM -0400, Tom Lane wrote: >Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >> Yeah, that's a bug. Will fix (not sure how yet). > >You could do worse than replace this: > > ok = (NumRelids(clause) == 1) && > (is_pseudo_constant_clause(lsecond(expr->args)) || > (varonleft = false, > is_pseudo_constant_clause(linitial(expr->args)))); > >with something like > > if (IsA(linitial(expr->args), Var) && > IsA(lsecond(expr->args), Const)) > ok = true, varonleft = true; > else if (IsA(linitial(expr->args), Const) && > IsA(lsecond(expr->args), Var)) > ok = true, varonleft = false; > >Or possibly get rid of varonleft as such, and merge extraction of the >"var" and "cst" variables into this test. > OK, thanks for the suggestion. I probably also need to look at the "is compatible" test in extended_stats.c which also looks at the clauses. It may not crash as it does not attempt to extract the const values etc. but it likely needs to be in sync with this part. >BTW, I bet passing a unary-argument OpExpr also makes this code >unhappy. Whooops :-( regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Oh ... while we're piling on here, it just sunk into me that mcv_get_match_bitmap is deciding what the semantics of an operator are by seeing what it's using for a selectivity estimator. That is just absolutely, completely wrong. For starters, it means that the whole mechanism fails for any operator that wants to use a specialized estimator --- hardly an unreasonable thing to do. For another, it's going to be pretty unreliable for extensions, because I do not think they're all careful about using the right estimator --- a lot of 'em probably still haven't adapted to the introduction of separate <= / >= estimators, for instance. The right way to determine operator semantics is to look to see whether they are in a btree opclass. That's what the rest of the planner does, and there is no good reason for the mcv code to do it some other way. regards, tom lane
On Wed, Jul 10, 2019 at 11:26:09PM +0200, Tomas Vondra wrote: > Yeah, that's a bug. Will fix (not sure how yet). Please note that I have added an open item for it. -- Michael
Attachment
On Thu, Jul 11, 2019 at 09:55:01AM +0900, Michael Paquier wrote: >On Wed, Jul 10, 2019 at 11:26:09PM +0200, Tomas Vondra wrote: >> Yeah, that's a bug. Will fix (not sure how yet). > >Please note that I have added an open item for it. Thanks. -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Jul 10, 2019 at 06:48:16PM -0400, Tom Lane wrote: >Oh ... while we're piling on here, it just sunk into me that >mcv_get_match_bitmap is deciding what the semantics of an operator >are by seeing what it's using for a selectivity estimator. >That is just absolutely, completely wrong. For starters, it >means that the whole mechanism fails for any operator that wants >to use a specialized estimator --- hardly an unreasonable thing >to do. For another, it's going to be pretty unreliable for >extensions, because I do not think they're all careful about using >the right estimator --- a lot of 'em probably still haven't adapted >to the introduction of separate <= / >= estimators, for instance. > >The right way to determine operator semantics is to look to see >whether they are in a btree opclass. That's what the rest of the >planner does, and there is no good reason for the mcv code to >do it some other way. > Hmmm, but that will mean we're unable to estimate operators that are not part of a btree opclass. Which is a bit annoying, because that would also affect equalities (and thus functional dependencies), in which case the type may easily have just hash opclass or something. But maybe I just don't understand how the btree opclass detection works and it'd be fine for sensibly defined operators. Can you point me to code doing this elsewhere in the planner? We may need to do something about code in pg10, because functional dependencies this too (although just with F_EQSEL). But maybe we should leave pg10 alone, we got exactly 0 reports about it until now anyway. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Jul 11, 2019 at 05:08:22PM +0200, Tomas Vondra wrote: >On Wed, Jul 10, 2019 at 06:48:16PM -0400, Tom Lane wrote: >>Oh ... while we're piling on here, it just sunk into me that >>mcv_get_match_bitmap is deciding what the semantics of an operator >>are by seeing what it's using for a selectivity estimator. >>That is just absolutely, completely wrong. For starters, it >>means that the whole mechanism fails for any operator that wants >>to use a specialized estimator --- hardly an unreasonable thing >>to do. For another, it's going to be pretty unreliable for >>extensions, because I do not think they're all careful about using >>the right estimator --- a lot of 'em probably still haven't adapted >>to the introduction of separate <= / >= estimators, for instance. >> >>The right way to determine operator semantics is to look to see >>whether they are in a btree opclass. That's what the rest of the >>planner does, and there is no good reason for the mcv code to >>do it some other way. >> > >Hmmm, but that will mean we're unable to estimate operators that are not >part of a btree opclass. Which is a bit annoying, because that would also >affect equalities (and thus functional dependencies), in which case the >type may easily have just hash opclass or something. > >But maybe I just don't understand how the btree opclass detection works >and it'd be fine for sensibly defined operators. Can you point me to code >doing this elsewhere in the planner? > >We may need to do something about code in pg10, because functional >dependencies this too (although just with F_EQSEL). But maybe we should >leave pg10 alone, we got exactly 0 reports about it until now anyway. > Here are WIP patches addressing two of the issues: 1) determining operator semantics by matching them to btree opclasses 2) deconstructing OpExpr into Var/Const nodes I'd appreciate some feedback particularly on (1). For the two other issues mentioned in this thread: a) I don't think unary-argument OpExpr are an issue, because this is checked when determining which clauses are compatible (and the code only allows the case with 2 arguments). b) Const with constisnull=true - I'm not yet sure what to do about this. The easiest fix would be to deem those clauses incompatible, but that seems a bit too harsh. The right thing is probably passing the NULL to the operator proc (but that means we can't use FunctionCall). Now, looking at this code, I wonder if using DEFAULT_COLLATION_OID when calling the operator is the right thing. We're using type->typcollation when building the stats, so maybe we should do the same thing here. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > On Thu, Jul 11, 2019 at 05:08:22PM +0200, Tomas Vondra wrote: >> On Wed, Jul 10, 2019 at 06:48:16PM -0400, Tom Lane wrote: >>> The right way to determine operator semantics is to look to see >>> whether they are in a btree opclass. That's what the rest of the >>> planner does, and there is no good reason for the mcv code to >>> do it some other way. >> Hmmm, but that will mean we're unable to estimate operators that are not >> part of a btree opclass. Which is a bit annoying, because that would also >> affect equalities (and thus functional dependencies), in which case the >> type may easily have just hash opclass or something. After thinking about this more, I may have been analogizing to the wrong code. It's necessary to use opclass properties when we're reasoning about operators in a way that *must* be correct, for instance to conclude that a partition can be pruned from a query. But this code is just doing selectivity estimation, so the correctness standards are a lot lower. In particular I see that the long-established range-query-detection code in clauselist_selectivity is looking for operators that have F_SCALARLTSEL etc. as restriction estimators (in fact I'm guessing you lifted parts of the mcv code from that, cause it looks pretty similar). So if we've gotten away with that so far over there, there's probably no reason not to do likewise here. I am a little troubled by the point I made about operators possibly wanting to have a more-specific estimator than scalarltsel, but that seems like an issue to solve some other time; and if we do change that logic then clauselist_selectivity needs to change too. > Here are WIP patches addressing two of the issues: > 1) determining operator semantics by matching them to btree opclasses Per above, I'm sort of inclined to drop this, unless you feel better about doing it like this than the existing way. > 2) deconstructing OpExpr into Var/Const nodes deconstruct_opexpr is still failing to verify that the Var is a Var. I'd try something like leftop = linitial(expr->args); while (IsA(leftop, RelabelType)) leftop = ((RelabelType *) leftop)->arg; // and similarly for rightop if (IsA(leftop, Var) && IsA(rightop, Const)) // return appropriate results else if (IsA(leftop, Const) && IsA(rightop, Var)) // return appropriate results else // fail Also, I think deconstruct_opexpr is far too generic a name for what this is actually doing. It'd be okay as a static function name perhaps, but not if you're going to expose it globally. > a) I don't think unary-argument OpExpr are an issue, because this is > checked when determining which clauses are compatible (and the code only > allows the case with 2 arguments). OK. > b) Const with constisnull=true - I'm not yet sure what to do about this. > The easiest fix would be to deem those clauses incompatible, but that > seems a bit too harsh. The right thing is probably passing the NULL to > the operator proc (but that means we can't use FunctionCall). No, because most of the functions in question are strict and will just crash on null inputs. Perhaps you could just deem that cases involving a null Const don't match what you're looking for. > Now, looking at this code, I wonder if using DEFAULT_COLLATION_OID when > calling the operator is the right thing. We're using type->typcollation > when building the stats, so maybe we should do the same thing here. Yeah, I was wondering that too. But really you should be using the column's collation not the type's default collation. See commit 5e0928005. regards, tom lane
On Sat, Jul 13, 2019 at 11:39:55AM -0400, Tom Lane wrote: >Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >> On Thu, Jul 11, 2019 at 05:08:22PM +0200, Tomas Vondra wrote: >>> On Wed, Jul 10, 2019 at 06:48:16PM -0400, Tom Lane wrote: >>>> The right way to determine operator semantics is to look to see >>>> whether they are in a btree opclass. That's what the rest of the >>>> planner does, and there is no good reason for the mcv code to >>>> do it some other way. > >>> Hmmm, but that will mean we're unable to estimate operators that are not >>> part of a btree opclass. Which is a bit annoying, because that would also >>> affect equalities (and thus functional dependencies), in which case the >>> type may easily have just hash opclass or something. > >After thinking about this more, I may have been analogizing to the wrong >code. It's necessary to use opclass properties when we're reasoning about >operators in a way that *must* be correct, for instance to conclude that >a partition can be pruned from a query. But this code is just doing >selectivity estimation, so the correctness standards are a lot lower. >In particular I see that the long-established range-query-detection >code in clauselist_selectivity is looking for operators that have >F_SCALARLTSEL etc. as restriction estimators (in fact I'm guessing you >lifted parts of the mcv code from that, cause it looks pretty similar). >So if we've gotten away with that so far over there, there's probably >no reason not to do likewise here. > >I am a little troubled by the point I made about operators possibly >wanting to have a more-specific estimator than scalarltsel, but that >seems like an issue to solve some other time; and if we do change that >logic then clauselist_selectivity needs to change too. > >> Here are WIP patches addressing two of the issues: > >> 1) determining operator semantics by matching them to btree opclasses > >Per above, I'm sort of inclined to drop this, unless you feel better >about doing it like this than the existing way. > OK. TBH I don't have a very strong opinion on this - I always disliked how we rely on the estimator OIDs in this code, and relying on btree opclasses seems somewhat more principled. But I'm not sure I understand all the implications of such change (and I have some concerns about it too, per my last message), so I'd revisit that in PG13. >> 2) deconstructing OpExpr into Var/Const nodes > >deconstruct_opexpr is still failing to verify that the Var is a Var. >I'd try something like > > leftop = linitial(expr->args); > while (IsA(leftop, RelabelType)) > leftop = ((RelabelType *) leftop)->arg; > // and similarly for rightop > if (IsA(leftop, Var) && IsA(rightop, Const)) > // return appropriate results > else if (IsA(leftop, Const) && IsA(rightop, Var)) > // return appropriate results > else > // fail > Ah, right. The RelabelType might be on top of something that's not Var. >Also, I think deconstruct_opexpr is far too generic a name for what >this is actually doing. It'd be okay as a static function name >perhaps, but not if you're going to expose it globally. > I agree. I can't quite make it static, because it's used from multiple places, but I'll move it to extended_stats_internal.h (and I'll see if I can think of a better name too). >> a) I don't think unary-argument OpExpr are an issue, because this is >> checked when determining which clauses are compatible (and the code only >> allows the case with 2 arguments). > >OK. > >> b) Const with constisnull=true - I'm not yet sure what to do about this. >> The easiest fix would be to deem those clauses incompatible, but that >> seems a bit too harsh. The right thing is probably passing the NULL to >> the operator proc (but that means we can't use FunctionCall). > >No, because most of the functions in question are strict and will just >crash on null inputs. Perhaps you could just deem that cases involving >a null Const don't match what you're looking for. > Makes sense, I'll do that. >> Now, looking at this code, I wonder if using DEFAULT_COLLATION_OID when >> calling the operator is the right thing. We're using type->typcollation >> when building the stats, so maybe we should do the same thing here. > >Yeah, I was wondering that too. But really you should be using the >column's collation not the type's default collation. See commit >5e0928005. > OK, thanks. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
OK, attached is a sequence of WIP fixes for the issues discussed here. 1) using column-specific collations (instead of type/default ones) The collations patch is pretty simple, but I'm not sure it actually does the right thing, particularly during estimation where it uses collation from the Var node (varcollid). But looking at 5e0928005, this should use the same collation as when building the extended statistics (which we get from the per-column stats, as stored in pg_statistic.stacoll#). But we don't actually store collations for extended statistics, so we can either modify pg_statistic_ext_data and store it there, or lookup the per-column statistic info during estimation, and use that. I kinda think the first option is the right one, but that'd mean yet another catversion bump. OTOH 5e0928005 actually did modify the extended statistics (mvdistinct and dependencies) to use type->typcollation during building, so maybe we want to use the default type collation for some reason? 2) proper extraction of Var/Const from opclauses This is the primary issue discussed in this thread - I've renamed the function to examine_opclause_expression() so that it kinda resembles examine_variable() and I've moved it to the "internal" header file. We still need it from two places so it can't be static, but hopefully this naming is acceptable. 3) handling of NULL values (Const and MCV items) Aside from the issue that Const may represent NULL, I've realized the code might do the wrong thing for NULL in the MCV item itself. It did treat it as mismatch and update the bitmap, but it might have invoke the operator procedure anyway (depending on whether it's AND/OR clause, what's the current value in the bitmap, etc.). This patch should fix both issues by treating them as mismatch, and skipping the proc call. 4) refactoring of the bitmap updates This is not a bug per se, but while working on (3) I've realized the code updating the bitmap is quite repetitive and it does stuff like if (is_or) bitmap[i] = Max(bitmap[i], match) else bitmap[i] = Min(bitmap[i], match) over and over on various places. This moves this into a separate static function, which I think makes it way more readable. Also, it replaces the Min/Max with a plain boolean operators (the patch originally used three states, not true/false, hence the Min/Max). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Hi, I've pushed the fixes listed in the previous message, with the exception of the collation part, because I had some doubts about that. 1) handling of NULL values in Cons / MCV items The handling of NULL elements was actually a bit more broken, because it also was not quite correct for NULL values in the MCV items. The code treated this as a mismatch, but then skipped the rest of the evaluation only for AND-clauses (because then 'false' is final). But for OR-clauses it happily proceeded to call the proc, etc. It was not hard to cause a crash with statistics on varlena columns. I've fixed this and added a simple regression test to check this. It however shows the stats_ext suite needs some improvements - until now it only had AND-clauses. Now it has one simple OR-clause test, but it needs more of that - and perhaps some combinations mixing AND/OR. I've tried adding an copy of each existing query, with AND replaced by OR, and that works fine (no crashes, estimates seem OK). But it's quite heavy-handed way to create regression tests, so I'll look into this in PG13 cycle. 2) collations Now, for the collation part - after some more thought and looking at code I think the fix I shared before is OK. It uses per-column collations consistently both when building the stats and estimating clauses, which makes it consistent. I was not quite sure if using Var->varcollid is the same thing as pg_attribute.attcollation, but it seems it is - at least for Vars pointing to regular columns (which for extended stats should always be the case). And we reset stats whenever the attribute type changes (which includes change of collation for the column), so I think it's OK. To be precise, we only reset MCV list in that case - we keep mvdistinct and dependencies, but that's probably OK because those don't store values and we won't run any functions on them. So I think the attached patch is OK, but I'd welcome some feedback. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > I've pushed the fixes listed in the previous message, with the exception > of the collation part, because I had some doubts about that. Sorry for being slow on this. > Now, for the collation part - after some more thought and looking at code > I think the fix I shared before is OK. It uses per-column collations > consistently both when building the stats and estimating clauses, which > makes it consistent. I was not quite sure if using Var->varcollid is the > same thing as pg_attribute.attcollation, but it seems it is - at least for > Vars pointing to regular columns (which for extended stats should always > be the case). I think you are right, but it could use some comments in the code. The important point here is that if we coerce a Var's collation to something else, that will be represented as a separate CollateExpr (which will be a RelabelType by the time it gets here, I believe). We don't just replace varcollid, the way eval_const_expressions will do to a Const. While I'm looking at the code --- I don't find this at all convincing: /* * We don't care about isgt in equality, because * it does not matter whether it's (var op const) * or (const op var). */ match = DatumGetBool(FunctionCall2Coll(&opproc, DEFAULT_COLLATION_OID, cst->constvalue, item->values[idx])); It *will* matter if the operator is cross-type. I think there is no good reason to have different code paths for the equality and inequality cases --- just look at isgt and swap or don't swap the arguments. BTW, "isgt" seems like a completely misleading name for that variable. AFAICS, what that is actually telling is whether the var is on the left or right side of the OpExpr. Everywhere else in the planner with a need for that uses "bool varonleft", and I think you should do likewise here (though note that that isgt, as coded, is more like "varonright"). regards, tom lane
On Thu, Jul 18, 2019 at 11:16:08AM -0400, Tom Lane wrote: >Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >> I've pushed the fixes listed in the previous message, with the exception >> of the collation part, because I had some doubts about that. > >Sorry for being slow on this. > >> Now, for the collation part - after some more thought and looking at code >> I think the fix I shared before is OK. It uses per-column collations >> consistently both when building the stats and estimating clauses, which >> makes it consistent. I was not quite sure if using Var->varcollid is the >> same thing as pg_attribute.attcollation, but it seems it is - at least for >> Vars pointing to regular columns (which for extended stats should always >> be the case). > >I think you are right, but it could use some comments in the code. >The important point here is that if we coerce a Var's collation to >something else, that will be represented as a separate CollateExpr >(which will be a RelabelType by the time it gets here, I believe). >We don't just replace varcollid, the way eval_const_expressions will >do to a Const. > OK, thanks. I've added a comment about that into mcv_get_match_bitmap (not all the details about RelabelType, because that gets stripped while examining the opexpr, but generally about the collations). > >While I'm looking at the code --- I don't find this at all convincing: > > /* > * We don't care about isgt in equality, because > * it does not matter whether it's (var op const) > * or (const op var). > */ > match = DatumGetBool(FunctionCall2Coll(&opproc, > DEFAULT_COLLATION_OID, > cst->constvalue, > item->values[idx])); > >It *will* matter if the operator is cross-type. I think there is no >good reason to have different code paths for the equality and inequality >cases --- just look at isgt and swap or don't swap the arguments. > >BTW, "isgt" seems like a completely misleading name for that variable. >AFAICS, what that is actually telling is whether the var is on the left >or right side of the OpExpr. Everywhere else in the planner with a >need for that uses "bool varonleft", and I think you should do likewise >here (though note that that isgt, as coded, is more like "varonright"). > Yes, you're right in both cases. I've fixed the first issue with equality by simply using the same code as for all operators (which means we don't need to check the oprrest at all in this code, making it simpler). And I've reworked the examine_opclause_expression() to use varonleft naming - I agree it's a much better name. This was one of the remnants of the code I initially copied from somewhere in selfuncs.c and massaged it until it did what I needed. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > [ mcv fixes ] These patches look OK to me. regards, tom lane
On Fri, Jul 19, 2019 at 02:37:19PM -0400, Tom Lane wrote: >Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >> [ mcv fixes ] > >These patches look OK to me. > OK, thanks. Pushed. -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services