Re: Explicit deterministic COLLATE fails with pattern matching operations on column with non-deterministic collation - Mailing list pgsql-bugs
From | Tom Lane |
---|---|
Subject | Re: Explicit deterministic COLLATE fails with pattern matching operations on column with non-deterministic collation |
Date | |
Msg-id | 666679.1591138428@sss.pgh.pa.us |
In response to | Re: Explicit deterministic COLLATE fails with pattern matching operations on column with non-deterministic collation (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Explicit deterministic COLLATE fails with pattern matching operations on column with non-deterministic collation |
List | pgsql-bugs |
I wrote:
> I guess the path of least resistance is to change the selectivity
> functions to use the query's collation; then, if you get an error
> here you would have done so at runtime anyway. The problem of
> inconsistency with the histogram collation will be real for
> ineq_histogram_selectivity; but we had a variant of that before,
> in that always using DEFAULT_COLLATION_OID would give answers
> that were wrong for a query using a different collation.

I worked on this for awhile and came up with the attached patchset.

0001 does about the minimum required to avoid this failure, by passing the query's collation not stacoll to operators and selectivity functions invoked during selectivity estimation. Unfortunately, it doesn't seem like we could sanely back-patch this, because it requires adding parameters to several globally-visible functions. The odds that some external code is calling those functions seem too high to risk an ABI break. So, while I'd like to squeeze this into v13, we still need to think about what to do for v12.

0002 addresses the mentioned problem with ineq_histogram_selectivity by having that function actually verify that the query operator and collation match what the pg_statistic histogram was generated with. If they don't match, all is not lost. What we can do is just sequentially apply the query's operator and comparison constant to each histogram entry, and take the fraction of matches as our selectivity estimate. This is more or less the same insight we have used in generic_restriction_selectivity: the histogram is a pretty decent sample of the column, even if its ordering is not quite what you want.

0002 also deletes a hack I had put in get_attstatsslot() to insert a dummy value into sslot->stacoll. That hack isn't necessary any longer (because indeed we aren't using sslot->stacoll's value anywhere as of 0001), and it breaks the verification check that 0002 wants to add to ineq_histogram_selectivity, which depends on stacoll being truthful. I also adjusted get_variable_range() to deal with collations more honestly.

When I went to test 0002, I found out that it broke some test cases in privileges.sql, and the reason was rather interesting. What those cases are relying on is getting a highly accurate selectivity estimate for a user-defined operator, for which the only thing the planner knows for sure is that it uses scalarltsel as the restriction estimator. Despite this lack of knowledge, the existing code just blithely uses the histogram as though it is *precisely* applicable to the user-defined operator. (Which it is, since that operator is just a wrapper around regular "<" ... but the system has no business assuming that.) So with the patch, the case exercises the new code path that just counts matches, and that gives us only 1/default_statistics_target resolution in the selectivity estimate; which is not enough to get the expected plan to be selected. I worked around this for the moment by cranking up default_statistics_target while running the ANALYZE in that test script, but I wonder if we should instead tweak those test cases to be more robust.

I think the combination of 0001+0002 really moves the goalposts a long way in terms of having honest stats estimation for non-default collations, so I'd like to sneak it into v13. As for v12, about the only alternatives I can think of are:

1. Do nothing, reasoning that if nobody noticed for a year, this situation is enough of a corner case that we can leave it unfixed. Obviously that's pretty unsatisfying.

2. Change all the stats functions to pass DEFAULT_COLLATION_OID when invoking operator functions. This is not too attractive either because it essentially reverts 5e0928005; in fact, to avoid breaking things completely we'd likely have to revert the part of that commit that taught ANALYZE to collect stats using column collations instead of DEFAULT_COLLATION_OID. Then we get into questions like what about 6b0faf723 --- it's going to be a mess.

3. Hack things up so that the core code renames all these exposed functions to, say, ineq_histogram_selectivity_ext() and so on, allowing the additional arguments to exist, but the old names would still be there as ABI compatibility wrappers. This might produce slightly funny results for external code calling the wrappers, since the wrappers would have to assume DEFAULT_COLLATION_OID, but it'd avoid an ABI break at least. I don't want to propagate such a thing into HEAD, so this would leave us with unsightly differences between v12 and earlier/later branches -- but there aren't *that* many places involved. (I'd envision this approach as back-porting 0001 but not 0002. For one reason, there's noplace for a wrapper to get the additional operator OID needed for ineq_histogram_selectivity_ext. For another, the results for the privilege test suggest that 0002 might have surprising effects on user-defined operators, so back patching it might draw more complaints.)

Alternatives #2 and #3 would result in (different) changes in the selectivity estimates v12 produces when considering columns with non-default collations and/or queries using collations that don't match the relevant columns. So that might be an argument for doing nothing in v12; people tend not to like it when minor releases cause unexpected plan changes. Also, #2 is probably strictly worse than #3 on this score, since it'd move such estimates away from reality not towards it.

Thoughts?
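To make the counting fallback described for 0002 concrete, here is a minimal standalone sketch. It is not code from the patch: match_fraction, lt_bytewise, and lt_ascii_nocase are names made up for this illustration, and plain C strings with ordinary function pointers stand in for Datums and fmgr calls. The idea shown is only that a histogram sorted under one ordering can still be treated as a sample and probed with a comparison that uses a different ordering.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: nonzero when "value OP constant" holds. */
typedef int (*cmp_op) (const char *value, const char *constant);

/* Stand-in for "<" under a bytewise ordering (think "C" collation). */
int
lt_bytewise(const char *value, const char *constant)
{
	return strcmp(value, constant) < 0;
}

/* Stand-in for "<" under a case-insensitive ordering (ASCII only). */
int
lt_ascii_nocase(const char *value, const char *constant)
{
	for (;; value++, constant++)
	{
		int			a = tolower((unsigned char) *value);
		int			b = tolower((unsigned char) *constant);

		if (a != b)
			return a < b;
		if (a == 0)
			return 0;			/* equal strings are not "less than" */
	}
}

/*
 * Treat the histogram as a plain sample: apply the query's operator to
 * every entry and return the fraction that matches.  Returns -1.0 when
 * there are too few entries to be useful (assumed convention for this
 * sketch, echoing "no histogram").
 */
double
match_fraction(const char *const *hist, int nvalues,
			   cmp_op op, const char *constant)
{
	int			nmatch = 0;

	assert(hist != NULL && op != NULL && constant != NULL);
	if (nvalues <= 1)
		return -1.0;
	for (int i = 0; i < nvalues; i++)
	{
		if (op(hist[i], constant))
			nmatch++;
	}
	return (double) nmatch / (double) nvalues;
}
```

With the four-entry sample {"Apple", "Banana", "apple", "banana"} and the constant "b", the bytewise operator matches three entries and the case-insensitive one matches two, so the same histogram gives different estimates depending on the comparison applied. The result can only move in steps of 1/nvalues, which is the coarse resolution that tripped up the privileges.sql test.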
			regards, tom lane

diff --git a/contrib/ltree/ltree_op.c b/contrib/ltree/ltree_op.c
index 4ac2ed5e54..778dbf1e98 100644
--- a/contrib/ltree/ltree_op.c
+++ b/contrib/ltree/ltree_op.c
@@ -582,7 +582,7 @@ ltreeparentsel(PG_FUNCTION_ARGS)
 	double		selec;
 
 	/* Use generic restriction selectivity logic, with default 0.001. */
-	selec = generic_restriction_selectivity(root, operator,
+	selec = generic_restriction_selectivity(root, operator, InvalidOid,
 											args, varRelid,
 											0.001);
 
diff --git a/src/backend/utils/adt/like_support.c b/src/backend/utils/adt/like_support.c
index 286e000d4e..ae5c8f084e 100644
--- a/src/backend/utils/adt/like_support.c
+++ b/src/backend/utils/adt/like_support.c
@@ -92,6 +92,7 @@ static Pattern_Prefix_Status pattern_fixed_prefix(Const *patt,
 static Selectivity prefix_selectivity(PlannerInfo *root,
 									  VariableStatData *vardata,
 									  Oid eqopr, Oid ltopr, Oid geopr,
+									  Oid collation,
 									  Const *prefixcon);
 static Selectivity like_selectivity(const char *patt, int pattlen,
 									bool case_insensitive);
@@ -534,12 +535,6 @@ patternsel_common(PlannerInfo *root,
 	 * something binary-compatible but different.) We can use it to identify
 	 * the comparison operators and the required type of the comparison
 	 * constant, much as in match_pattern_prefix().
-	 *
-	 * NOTE: this logic does not consider collations. Ideally we'd force use
-	 * of "C" collation, but since ANALYZE only generates statistics for the
-	 * column's specified collation, we have little choice but to use those.
-	 * But our results are so approximate anyway that it probably hardly
-	 * matters.
 	 */
 	vartype = vardata.vartype;
 
@@ -622,7 +617,7 @@ patternsel_common(PlannerInfo *root,
 		/*
 		 * Pattern specifies an exact match, so estimate as for '='
 		 */
-		result = var_eq_const(&vardata, eqopr, prefix->constvalue,
+		result = var_eq_const(&vardata, eqopr, collation, prefix->constvalue,
 							  false, true, false);
 	}
 	else
@@ -654,7 +649,8 @@ patternsel_common(PlannerInfo *root,
 			opfuncid = get_opcode(oprid);
 			fmgr_info(opfuncid, &opproc);
 
-			selec = histogram_selectivity(&vardata, &opproc, constval, true,
+			selec = histogram_selectivity(&vardata, &opproc, collation,
+										  constval, true,
 										  10, 1, &hist_size);
 
 			/* If not at least 100 entries, use the heuristic method */
@@ -666,6 +662,7 @@ patternsel_common(PlannerInfo *root,
 				if (pstatus == Pattern_Prefix_Partial)
 					prefixsel = prefix_selectivity(root, &vardata,
 												   eqopr, ltopr, geopr,
+												   collation,
 												   prefix);
 				else
 					prefixsel = 1.0;
@@ -698,7 +695,8 @@ patternsel_common(PlannerInfo *root,
 	 * directly to the result selectivity. Also add up the total fraction
 	 * represented by MCV entries.
 	 */
-	mcv_selec = mcv_selectivity(&vardata, &opproc, constval, true,
+	mcv_selec = mcv_selectivity(&vardata, &opproc, collation,
+								constval, true,
 								&sumcommon);
 
 	/*
@@ -1196,7 +1194,7 @@ pattern_fixed_prefix(Const *patt, Pattern_Type ptype, Oid collation,
  * population represented by the histogram --- the caller must fold this
  * together with info about MCVs and NULLs.
  *
- * We use the specified btree comparison operators to do the estimation.
+ * We use the given comparison operators and collation to do the estimation.
  * The given variable and Const must be of the associated datatype(s).
  *
  * XXX Note: we make use of the upper bound to estimate operator selectivity
@@ -1207,11 +1205,11 @@ pattern_fixed_prefix(Const *patt, Pattern_Type ptype, Oid collation,
 static Selectivity
 prefix_selectivity(PlannerInfo *root, VariableStatData *vardata,
 				   Oid eqopr, Oid ltopr, Oid geopr,
+				   Oid collation,
 				   Const *prefixcon)
 {
 	Selectivity prefixsel;
 	FmgrInfo	opproc;
-	AttStatsSlot sslot;
 	Const	   *greaterstrcon;
 	Selectivity eq_sel;
 
@@ -1220,6 +1218,7 @@ prefix_selectivity(PlannerInfo *root, VariableStatData *vardata,
 
 	prefixsel = ineq_histogram_selectivity(root, vardata,
 										   &opproc, true, true,
+										   collation,
 										   prefixcon->constvalue,
 										   prefixcon->consttype);
 
@@ -1229,27 +1228,18 @@ prefix_selectivity(PlannerInfo *root, VariableStatData *vardata,
 		return DEFAULT_MATCH_SEL;
 	}
 
-	/*-------
-	 * If we can create a string larger than the prefix, say
-	 * "x < greaterstr". We try to generate the string referencing the
-	 * collation of the var's statistics, but if that's not available,
-	 * use DEFAULT_COLLATION_OID.
-	 *-------
+	/*
+	 * If we can create a string larger than the prefix, say "x < greaterstr".
 	 */
-	if (HeapTupleIsValid(vardata->statsTuple) &&
-		get_attstatsslot(&sslot, vardata->statsTuple,
-						 STATISTIC_KIND_HISTOGRAM, InvalidOid, 0))
-		 /* sslot.stacoll is set up */ ;
-	else
-		sslot.stacoll = DEFAULT_COLLATION_OID;
 	fmgr_info(get_opcode(ltopr), &opproc);
-	greaterstrcon = make_greater_string(prefixcon, &opproc, sslot.stacoll);
+	greaterstrcon = make_greater_string(prefixcon, &opproc, collation);
 	if (greaterstrcon)
 	{
 		Selectivity topsel;
 
 		topsel = ineq_histogram_selectivity(root, vardata,
 											&opproc, false, false,
+											collation,
 											greaterstrcon->constvalue,
 											greaterstrcon->consttype);
 
@@ -1278,7 +1268,7 @@ prefix_selectivity(PlannerInfo *root, VariableStatData *vardata,
 	 * probably off the end of the histogram, and thus we probably got a very
 	 * small estimate from the >= condition; so we still need to clamp.
 	 */
-	eq_sel = var_eq_const(vardata, eqopr, prefixcon->constvalue,
+	eq_sel = var_eq_const(vardata, eqopr, collation, prefixcon->constvalue,
 						  false, true, false);
 
 	prefixsel = Max(prefixsel, eq_sel);
diff --git a/src/backend/utils/adt/network_selfuncs.c b/src/backend/utils/adt/network_selfuncs.c
index 863efd3d76..955e0ee87f 100644
--- a/src/backend/utils/adt/network_selfuncs.c
+++ b/src/backend/utils/adt/network_selfuncs.c
@@ -137,7 +137,8 @@ networksel(PG_FUNCTION_ARGS)
 	 * by MCV entries.
 	 */
 	fmgr_info(get_opcode(operator), &proc);
-	mcv_selec = mcv_selectivity(&vardata, &proc, constvalue, varonleft,
+	mcv_selec = mcv_selectivity(&vardata, &proc, InvalidOid,
+								constvalue, varonleft,
 								&sumcommon);
 
 	/*
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index cfb05682bc..2332277307 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -88,11 +88,7 @@
  * (if any) is passed using the standard fmgr mechanism, so that the estimator
  * function can fetch it with PG_GET_COLLATION(). Note, however, that all
  * statistics in pg_statistic are currently built using the relevant column's
- * collation. Thus, in most cases where we are looking at statistics, we
- * should ignore the operator collation and use the stats entry's collation.
- * We expect that the error induced by doing this is usually not large enough
- * to justify complicating matters. In any case, doing otherwise would yield
- * entirely garbage results for ordered stats data such as histograms.
+ * collation.
  *----------
  */
 
@@ -149,14 +145,14 @@ get_relation_stats_hook_type get_relation_stats_hook = NULL;
 get_index_stats_hook_type get_index_stats_hook = NULL;
 
 static double eqsel_internal(PG_FUNCTION_ARGS, bool negate);
-static double eqjoinsel_inner(Oid opfuncoid,
+static double eqjoinsel_inner(Oid opfuncoid, Oid collation,
 							  VariableStatData *vardata1, VariableStatData *vardata2,
 							  double nd1, double nd2,
 							  bool isdefault1, bool isdefault2,
 							  AttStatsSlot *sslot1, AttStatsSlot *sslot2,
 							  Form_pg_statistic stats1, Form_pg_statistic stats2,
 							  bool have_mcvs1, bool have_mcvs2);
-static double eqjoinsel_semi(Oid opfuncoid,
+static double eqjoinsel_semi(Oid opfuncoid, Oid collation,
 							 VariableStatData *vardata1, VariableStatData *vardata2,
 							 double nd1, double nd2,
 							 bool isdefault1, bool isdefault2,
@@ -194,10 +190,11 @@ static double convert_timevalue_to_scalar(Datum value, Oid typid,
 static void examine_simple_variable(PlannerInfo *root, Var *var,
 									VariableStatData *vardata);
 static bool get_variable_range(PlannerInfo *root, VariableStatData *vardata,
-							   Oid sortop, Datum *min, Datum *max);
+							   Oid sortop, Oid collation,
+							   Datum *min, Datum *max);
 static bool get_actual_variable_range(PlannerInfo *root,
 									  VariableStatData *vardata,
-									  Oid sortop,
+									  Oid sortop, Oid collation,
 									  Datum *min, Datum *max);
 static bool get_actual_variable_endpoint(Relation heapRel,
 										 Relation indexRel,
@@ -235,6 +232,7 @@ eqsel_internal(PG_FUNCTION_ARGS, bool negate)
 	Oid			operator = PG_GETARG_OID(1);
 	List	   *args = (List *) PG_GETARG_POINTER(2);
 	int			varRelid = PG_GETARG_INT32(3);
+	Oid			collation = PG_GET_COLLATION();
 	VariableStatData vardata;
 	Node	   *other;
 	bool		varonleft;
@@ -268,12 +266,12 @@ eqsel_internal(PG_FUNCTION_ARGS, bool negate)
 	 * in the query.)
 	 */
 	if (IsA(other, Const))
-		selec = var_eq_const(&vardata, operator,
+		selec = var_eq_const(&vardata, operator, collation,
 							 ((Const *) other)->constvalue,
 							 ((Const *) other)->constisnull,
 							 varonleft, negate);
 	else
-		selec = var_eq_non_const(&vardata, operator, other,
+		selec = var_eq_non_const(&vardata, operator, collation, other,
 								 varonleft, negate);
 
 	ReleaseVariableStats(vardata);
@@ -287,7 +285,7 @@ eqsel_internal(PG_FUNCTION_ARGS, bool negate)
  * This is exported so that some other estimation functions can use it.
  */
 double
-var_eq_const(VariableStatData *vardata, Oid operator,
+var_eq_const(VariableStatData *vardata, Oid operator, Oid collation,
 			 Datum constval, bool constisnull,
 			 bool varonleft, bool negate)
 {
@@ -356,7 +354,7 @@ var_eq_const(VariableStatData *vardata, Oid operator,
 				 * eqproc returns NULL, though really equality functions should
 				 * never do that.
 				 */
-				InitFunctionCallInfoData(*fcinfo, &eqproc, 2, sslot.stacoll,
+				InitFunctionCallInfoData(*fcinfo, &eqproc, 2, collation,
 										 NULL, NULL);
 				fcinfo->args[0].isnull = false;
 				fcinfo->args[1].isnull = false;
@@ -458,7 +456,7 @@ var_eq_const(VariableStatData *vardata, Oid operator,
  * This is exported so that some other estimation functions can use it.
  */
 double
-var_eq_non_const(VariableStatData *vardata, Oid operator,
+var_eq_non_const(VariableStatData *vardata, Oid operator, Oid collation,
 				 Node *other,
 				 bool varonleft, bool negate)
 {
@@ -573,6 +571,7 @@ neqsel(PG_FUNCTION_ARGS)
  */
 static double
 scalarineqsel(PlannerInfo *root, Oid operator, bool isgt, bool iseq,
+			  Oid collation,
 			  VariableStatData *vardata, Datum constval, Oid consttype)
 {
 	Form_pg_statistic stats;
@@ -672,7 +671,7 @@ scalarineqsel(PlannerInfo *root, Oid operator, bool isgt, bool iseq,
 	 * to the result selectivity. Also add up the total fraction represented
 	 * by MCV entries.
 	 */
-	mcv_selec = mcv_selectivity(vardata, &opproc, constval, true,
+	mcv_selec = mcv_selectivity(vardata, &opproc, collation, constval, true,
 								&sumcommon);
 
 	/*
@@ -681,6 +680,7 @@ scalarineqsel(PlannerInfo *root, Oid operator, bool isgt, bool iseq,
 	 */
 	hist_selec = ineq_histogram_selectivity(root, vardata,
 											&opproc, isgt, iseq,
+											collation,
 											constval, consttype);
 
 	/*
@@ -722,7 +722,7 @@ scalarineqsel(PlannerInfo *root, Oid operator, bool isgt, bool iseq,
  * if there is no MCV list.
  */
 double
-mcv_selectivity(VariableStatData *vardata, FmgrInfo *opproc,
+mcv_selectivity(VariableStatData *vardata, FmgrInfo *opproc, Oid collation,
 				Datum constval, bool varonleft,
 				double *sumcommonp)
 {
@@ -749,7 +749,7 @@ mcv_selectivity(VariableStatData *vardata, FmgrInfo *opproc,
 		 * operators that can return NULL. A small side benefit is to not
 		 * need to re-initialize the fcinfo struct from scratch each time.
 		 */
-		InitFunctionCallInfoData(*fcinfo, opproc, 2, sslot.stacoll,
+		InitFunctionCallInfoData(*fcinfo, opproc, 2, collation,
 								 NULL, NULL);
 		fcinfo->args[0].isnull = false;
 		fcinfo->args[1].isnull = false;
@@ -813,7 +813,8 @@ mcv_selectivity(VariableStatData *vardata, FmgrInfo *opproc,
  * prudent to clamp the result range, ie, disbelieve exact 0 or 1 outputs.
  */
 double
-histogram_selectivity(VariableStatData *vardata, FmgrInfo *opproc,
+histogram_selectivity(VariableStatData *vardata,
+					  FmgrInfo *opproc, Oid collation,
 					  Datum constval, bool varonleft,
 					  int min_hist_size, int n_skip,
 					  int *hist_size)
@@ -846,7 +847,7 @@ histogram_selectivity(VariableStatData *vardata, FmgrInfo *opproc,
 			 * is to not need to re-initialize the fcinfo struct from scratch
 			 * each time.
 			 */
-			InitFunctionCallInfoData(*fcinfo, opproc, 2, sslot.stacoll,
+			InitFunctionCallInfoData(*fcinfo, opproc, 2, collation,
 									 NULL, NULL);
 			fcinfo->args[0].isnull = false;
 			fcinfo->args[1].isnull = false;
@@ -903,7 +904,7 @@ histogram_selectivity(VariableStatData *vardata, FmgrInfo *opproc,
  * Otherwise, fall back to the default selectivity provided by the caller.
  */
 double
-generic_restriction_selectivity(PlannerInfo *root, Oid oproid,
+generic_restriction_selectivity(PlannerInfo *root, Oid oproid, Oid collation,
 								List *args, int varRelid,
 								double default_selectivity)
 {
@@ -946,7 +947,8 @@ generic_restriction_selectivity(PlannerInfo *root, Oid oproid,
 			/*
 			 * Calculate the selectivity for the column's most common values.
 			 */
-			mcvsel = mcv_selectivity(&vardata, &opproc, constval, varonleft,
+			mcvsel = mcv_selectivity(&vardata, &opproc, collation,
+									 constval, varonleft,
 									 &mcvsum);
 
 			/*
@@ -955,7 +957,7 @@ generic_restriction_selectivity(PlannerInfo *root, Oid oproid,
 			 * population. Otherwise use the default selectivity for the non-MCV
 			 * population.
 			 */
-			selec = histogram_selectivity(&vardata, &opproc,
+			selec = histogram_selectivity(&vardata, &opproc, collation,
 										  constval, varonleft,
 										  10, 1, &hist_size);
 			if (selec < 0)
@@ -1029,6 +1031,7 @@ double
 ineq_histogram_selectivity(PlannerInfo *root,
 						   VariableStatData *vardata,
 						   FmgrInfo *opproc, bool isgt, bool iseq,
+						   Oid collation,
 						   Datum constval, Oid consttype)
 {
 	double		hist_selec;
@@ -1042,9 +1045,11 @@ ineq_histogram_selectivity(PlannerInfo *root,
 	 * column type. However, to make that work we will need to figure out
 	 * which staop to search for --- it's not necessarily the one we have at
 	 * hand! (For example, we might have a '<=' operator rather than the '<'
-	 * operator that will appear in staop.) For now, assume that whatever
-	 * appears in pg_statistic is sorted the same way our operator sorts, or
-	 * the reverse way if isgt is true.
+	 * operator that will appear in staop.) The collation might not agree
+	 * either. For now, just assume that whatever appears in pg_statistic is
+	 * sorted the same way our operator sorts, or the reverse way if isgt is
+	 * true. This could result in a bogus estimate, but it still seems better
+	 * than falling back to the default estimate.
 	 */
 	if (HeapTupleIsValid(vardata->statsTuple) &&
 		statistic_proc_security_check(vardata, opproc->fn_oid) &&
@@ -1090,6 +1095,7 @@ ineq_histogram_selectivity(PlannerInfo *root,
 					have_end = get_actual_variable_range(root,
 														 vardata,
 														 sslot.staop,
+														 collation,
 														 &sslot.values[0],
 														 &sslot.values[1]);
 
@@ -1107,17 +1113,19 @@ ineq_histogram_selectivity(PlannerInfo *root,
 						have_end = get_actual_variable_range(root,
 															 vardata,
 															 sslot.staop,
+															 collation,
 															 &sslot.values[0],
 															 NULL);
 					else if (probe == sslot.nvalues - 1 && sslot.nvalues > 2)
 						have_end = get_actual_variable_range(root,
 															 vardata,
 															 sslot.staop,
+															 collation,
 															 NULL,
 															 &sslot.values[probe]);
 
 					ltcmp = DatumGetBool(FunctionCall2Coll(opproc,
-														   sslot.stacoll,
+														   collation,
 														   sslot.values[probe],
 														   constval));
 					if (isgt)
@@ -1202,7 +1210,7 @@ ineq_histogram_selectivity(PlannerInfo *root,
 					 * values to a uniform comparison scale, and do a linear
 					 * interpolation within this bin.
 					 */
-					if (convert_to_scalar(constval, consttype, sslot.stacoll,
+					if (convert_to_scalar(constval, consttype, collation,
 										  &val,
 										  sslot.values[i - 1], sslot.values[i],
 										  vardata->vartype,
@@ -1342,6 +1350,7 @@ scalarineqsel_wrapper(PG_FUNCTION_ARGS, bool isgt, bool iseq)
 	Oid			operator = PG_GETARG_OID(1);
 	List	   *args = (List *) PG_GETARG_POINTER(2);
 	int			varRelid = PG_GETARG_INT32(3);
+	Oid			collation = PG_GET_COLLATION();
 	VariableStatData vardata;
 	Node	   *other;
 	bool		varonleft;
@@ -1394,7 +1403,7 @@ scalarineqsel_wrapper(PG_FUNCTION_ARGS, bool isgt, bool iseq)
 	}
 
 	/* The rest of the work is done by scalarineqsel(). */
-	selec = scalarineqsel(root, operator, isgt, iseq,
+	selec = scalarineqsel(root, operator, isgt, iseq, collation,
 						  &vardata, constval, consttype);
 
 	ReleaseVariableStats(vardata);
@@ -1459,7 +1468,7 @@ boolvarsel(PlannerInfo *root, Node *arg, int varRelid)
 		 * A boolean variable V is equivalent to the clause V = 't', so we
 		 * compute the selectivity as if that is what we have.
 		 */
-		selec = var_eq_const(&vardata, BooleanEqualOperator,
+		selec = var_eq_const(&vardata, BooleanEqualOperator, InvalidOid,
 							 BoolGetDatum(true), false, true, false);
 	}
 	else
@@ -2185,6 +2194,7 @@ eqjoinsel(PG_FUNCTION_ARGS)
 	JoinType	jointype = (JoinType) PG_GETARG_INT16(3);
 #endif
 	SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) PG_GETARG_POINTER(4);
+	Oid			collation = PG_GET_COLLATION();
 	double		selec;
 	double		selec_inner;
 	VariableStatData vardata1;
@@ -2235,7 +2245,7 @@ eqjoinsel(PG_FUNCTION_ARGS)
 	}
 
 	/* We need to compute the inner-join selectivity in all cases */
-	selec_inner = eqjoinsel_inner(opfuncoid,
+	selec_inner = eqjoinsel_inner(opfuncoid, collation,
 								  &vardata1, &vardata2,
 								  nd1, nd2,
 								  isdefault1, isdefault2,
@@ -2262,7 +2272,7 @@ eqjoinsel(PG_FUNCTION_ARGS)
 			inner_rel = find_join_input_rel(root, sjinfo->min_righthand);
 
 			if (!join_is_reversed)
-				selec = eqjoinsel_semi(opfuncoid,
+				selec = eqjoinsel_semi(opfuncoid, collation,
 									   &vardata1, &vardata2,
 									   nd1, nd2,
 									   isdefault1, isdefault2,
@@ -2275,7 +2285,7 @@ eqjoinsel(PG_FUNCTION_ARGS)
 				Oid			commop = get_commutator(operator);
 				Oid			commopfuncoid = OidIsValid(commop) ? get_opcode(commop) : InvalidOid;
 
-				selec = eqjoinsel_semi(commopfuncoid,
+				selec = eqjoinsel_semi(commopfuncoid, collation,
 									   &vardata2, &vardata1,
 									   nd2, nd1,
 									   isdefault2, isdefault1,
@@ -2323,7 +2333,7 @@ eqjoinsel(PG_FUNCTION_ARGS)
  * that it's worth trying to distinguish them here.
  */
 static double
-eqjoinsel_inner(Oid opfuncoid,
+eqjoinsel_inner(Oid opfuncoid, Oid collation,
 				VariableStatData *vardata1, VariableStatData *vardata2,
 				double nd1, double nd2,
 				bool isdefault1, bool isdefault2,
@@ -2373,7 +2383,7 @@ eqjoinsel_inner(Oid opfuncoid,
 		 * returns NULL, though really equality functions should never do
 		 * that.
 		 */
-		InitFunctionCallInfoData(*fcinfo, &eqproc, 2, sslot1->stacoll,
+		InitFunctionCallInfoData(*fcinfo, &eqproc, 2, collation,
 								 NULL, NULL);
 		fcinfo->args[0].isnull = false;
 		fcinfo->args[1].isnull = false;
@@ -2520,7 +2530,7 @@ eqjoinsel_inner(Oid opfuncoid,
  * Unlike eqjoinsel_inner, we have to cope with opfuncoid being InvalidOid.
  */
 static double
-eqjoinsel_semi(Oid opfuncoid,
+eqjoinsel_semi(Oid opfuncoid, Oid collation,
 			   VariableStatData *vardata1, VariableStatData *vardata2,
 			   double nd1, double nd2,
 			   bool isdefault1, bool isdefault2,
@@ -2603,7 +2613,7 @@ eqjoinsel_semi(Oid opfuncoid,
 		 * returns NULL, though really equality functions should never do
 		 * that.
 		 */
-		InitFunctionCallInfoData(*fcinfo, &eqproc, 2, sslot1->stacoll,
+		InitFunctionCallInfoData(*fcinfo, &eqproc, 2, collation,
 								 NULL, NULL);
 		fcinfo->args[0].isnull = false;
 		fcinfo->args[1].isnull = false;
@@ -2851,6 +2861,7 @@ mergejoinscansel(PlannerInfo *root, Node *clause,
 	Oid			op_lefttype;
 	Oid			op_righttype;
 	Oid			opno,
+				collation,
 				lsortop,
 				rsortop,
 				lstatop,
@@ -2875,6 +2886,7 @@ mergejoinscansel(PlannerInfo *root, Node *clause,
 	if (!is_opclause(clause))
 		return;					/* shouldn't happen */
 	opno = ((OpExpr *) clause)->opno;
+	collation = ((OpExpr *) clause)->inputcollid;
 	left = get_leftop((Expr *) clause);
 	right = get_rightop((Expr *) clause);
 	if (!right)
@@ -3008,20 +3020,20 @@ mergejoinscansel(PlannerInfo *root, Node *clause,
 	/* Try to get ranges of both inputs */
 	if (!isgt)
 	{
-		if (!get_variable_range(root, &leftvar, lstatop,
+		if (!get_variable_range(root, &leftvar, lstatop, collation,
 								&leftmin, &leftmax))
 			goto fail;			/* no range available from stats */
-		if (!get_variable_range(root, &rightvar, rstatop,
+		if (!get_variable_range(root, &rightvar, rstatop, collation,
 								&rightmin, &rightmax))
 			goto fail;			/* no range available from stats */
 	}
 	else
 	{
 		/* need to swap the max and min */
-		if (!get_variable_range(root, &leftvar, lstatop,
+		if (!get_variable_range(root, &leftvar, lstatop, collation,
 								&leftmax, &leftmin))
 			goto fail;			/* no range available from stats */
-		if (!get_variable_range(root, &rightvar, rstatop,
+		if (!get_variable_range(root, &rightvar, rstatop, collation,
 								&rightmax, &rightmin))
 			goto fail;			/* no range available from stats */
 	}
@@ -3031,13 +3043,13 @@ mergejoinscansel(PlannerInfo *root, Node *clause,
 	 * fraction that's <= the right-side maximum value. But only believe
 	 * non-default estimates, else stick with our 1.0.
 	 */
-	selec = scalarineqsel(root, leop, isgt, true, &leftvar,
+	selec = scalarineqsel(root, leop, isgt, true, collation, &leftvar,
 						  rightmax, op_righttype);
 	if (selec != DEFAULT_INEQ_SEL)
 		*leftend = selec;
 
 	/* And similarly for the right variable. */
-	selec = scalarineqsel(root, revleop, isgt, true, &rightvar,
+	selec = scalarineqsel(root, revleop, isgt, true, collation, &rightvar,
 						  leftmax, op_lefttype);
 	if (selec != DEFAULT_INEQ_SEL)
 		*rightend = selec;
@@ -3061,13 +3073,13 @@ mergejoinscansel(PlannerInfo *root, Node *clause,
 	 * minimum value. But only believe non-default estimates, else stick with
 	 * our own default.
 	 */
-	selec = scalarineqsel(root, ltop, isgt, false, &leftvar,
+	selec = scalarineqsel(root, ltop, isgt, false, collation, &leftvar,
 						  rightmin, op_righttype);
 	if (selec != DEFAULT_INEQ_SEL)
 		*leftstart = selec;
 
 	/* And similarly for the right variable. */
-	selec = scalarineqsel(root, revltop, isgt, false, &rightvar,
+	selec = scalarineqsel(root, revltop, isgt, false, collation, &rightvar,
 						  leftmin, op_lefttype);
 	if (selec != DEFAULT_INEQ_SEL)
 		*rightstart = selec;
@@ -3147,10 +3159,11 @@ matchingsel(PG_FUNCTION_ARGS)
 	Oid			operator = PG_GETARG_OID(1);
 	List	   *args = (List *) PG_GETARG_POINTER(2);
 	int			varRelid = PG_GETARG_INT32(3);
+	Oid			collation = PG_GET_COLLATION();
 	double		selec;
 
 	/* Use generic restriction selectivity logic. */
-	selec = generic_restriction_selectivity(root, operator,
+	selec = generic_restriction_selectivity(root, operator, collation,
 											args, varRelid,
 											DEFAULT_MATCHING_SEL);
 
@@ -5337,9 +5350,11 @@ get_variable_numdistinct(VariableStatData *vardata, bool *isdefault)
  *
  * sortop is the "<" comparison operator to use. This should generally
  * be "<" not ">", as only the former is likely to be found in pg_statistic.
+ * The collation must be specified too.
  */
 static bool
-get_variable_range(PlannerInfo *root, VariableStatData *vardata, Oid sortop,
+get_variable_range(PlannerInfo *root, VariableStatData *vardata,
+				   Oid sortop, Oid collation,
 				   Datum *min, Datum *max)
 {
 	Datum		tmin = 0;
@@ -5359,7 +5374,7 @@ get_variable_range(PlannerInfo *root, VariableStatData *vardata, Oid sortop,
 	 * before enabling this.
 	 */
 #ifdef NOT_USED
-	if (get_actual_variable_range(root, vardata, sortop, min, max))
+	if (get_actual_variable_range(root, vardata, sortop, collation, min, max))
 		return true;
 #endif
 
@@ -5387,7 +5402,7 @@ get_variable_range(PlannerInfo *root, VariableStatData *vardata, Oid sortop,
 	 *
 	 * If there is a histogram that is sorted with some other operator than
 	 * the one we want, fail --- this suggests that there is data we can't
-	 * use.
+	 * use. XXX consider collation too.
 	 */
 	if (get_attstatsslot(&sslot, vardata->statsTuple,
 						 STATISTIC_KIND_HISTOGRAM, sortop,
@@ -5434,14 +5449,14 @@ get_variable_range(PlannerInfo *root, VariableStatData *vardata, Oid sortop,
 				continue;
 			}
 			if (DatumGetBool(FunctionCall2Coll(&opproc,
-											   sslot.stacoll,
+											   collation,
 											   sslot.values[i], tmin)))
 			{
 				tmin = sslot.values[i];
 				tmin_is_mcv = true;
 			}
 			if (DatumGetBool(FunctionCall2Coll(&opproc,
-											   sslot.stacoll,
+											   collation,
 											   tmax, sslot.values[i])))
 			{
 				tmax = sslot.values[i];
@@ -5471,10 +5486,11 @@ get_variable_range(PlannerInfo *root, VariableStatData *vardata, Oid sortop,
  * If no data available, return false.
  *
  * sortop is the "<" comparison operator to use.
+ * collation is the required collation.
  */
 static bool
 get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata,
-						  Oid sortop,
+						  Oid sortop, Oid collation,
 						  Datum *min, Datum *max)
 {
 	bool		have_data = false;
@@ -5514,9 +5530,11 @@ get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata,
 			continue;
 
 		/*
-		 * The first index column must match the desired variable and sort
-		 * operator --- but we can use a descending-order index.
+		 * The first index column must match the desired variable, sortop, and
+		 * collation --- but we can use a descending-order index.
 		 */
+		if (collation != index->indexcollations[0])
+			continue;			/* test first 'cause it's cheapest */
 		if (!match_index_to_operand(vardata->var, 0, index))
 			continue;
 		switch (get_op_opfamily_strategy(sortop, index->sortopfamily[0]))
diff --git a/src/include/utils/selfuncs.h b/src/include/utils/selfuncs.h
index 9690b4e486..15d2289024 100644
--- a/src/include/utils/selfuncs.h
+++ b/src/include/utils/selfuncs.h
@@ -144,24 +144,30 @@ extern void get_join_variables(PlannerInfo *root, List *args,
 							   bool *join_is_reversed);
 extern double get_variable_numdistinct(VariableStatData *vardata,
 									   bool *isdefault);
-extern double mcv_selectivity(VariableStatData *vardata, FmgrInfo *opproc,
+extern double mcv_selectivity(VariableStatData *vardata,
+							  FmgrInfo *opproc, Oid collation,
 							  Datum constval, bool varonleft,
 							  double *sumcommonp);
-extern double histogram_selectivity(VariableStatData *vardata, FmgrInfo *opproc,
+extern double histogram_selectivity(VariableStatData *vardata,
+									FmgrInfo *opproc, Oid collation,
 									Datum constval, bool varonleft,
 									int min_hist_size, int n_skip,
 									int *hist_size);
-extern double generic_restriction_selectivity(PlannerInfo *root, Oid oproid,
+extern double generic_restriction_selectivity(PlannerInfo *root,
+											  Oid oproid, Oid collation,
 											  List *args, int varRelid,
 											  double default_selectivity);
 extern double ineq_histogram_selectivity(PlannerInfo *root,
 										 VariableStatData *vardata,
 										 FmgrInfo *opproc, bool isgt, bool iseq,
+										 Oid collation,
 										 Datum constval, Oid consttype);
-extern double var_eq_const(VariableStatData *vardata, Oid oproid,
+extern double var_eq_const(VariableStatData *vardata,
+						   Oid oproid, Oid collation,
 						   Datum constval, bool constisnull,
 						   bool varonleft, bool negate);
-extern double var_eq_non_const(VariableStatData *vardata, Oid oproid,
+extern double var_eq_non_const(VariableStatData *vardata,
+							   Oid oproid, Oid collation,
 							   Node *other,
 							   bool varonleft, bool negate);
 
diff --git a/src/backend/utils/adt/like_support.c b/src/backend/utils/adt/like_support.c
index ae5c8f084e..bcfbaa1c3d 100644
--- a/src/backend/utils/adt/like_support.c
+++ b/src/backend/utils/adt/like_support.c
@@ -1217,7 +1217,7 @@ prefix_selectivity(PlannerInfo *root, VariableStatData *vardata,
 	fmgr_info(get_opcode(geopr), &opproc);
 
 	prefixsel = ineq_histogram_selectivity(root, vardata,
-										   &opproc, true, true,
+										   geopr, &opproc, true, true,
 										   collation,
 										   prefixcon->constvalue,
 										   prefixcon->consttype);
@@ -1238,7 +1238,7 @@ prefix_selectivity(PlannerInfo *root, VariableStatData *vardata,
 		Selectivity topsel;
 
 		topsel = ineq_histogram_selectivity(root, vardata,
-											&opproc, false, false,
+											ltopr, &opproc, false, false,
 											collation,
 											greaterstrcon->constvalue,
 											greaterstrcon->consttype);
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 2332277307..208744cd3a 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -192,6 +192,10 @@ static void examine_simple_variable(PlannerInfo *root, Var *var,
 static bool get_variable_range(PlannerInfo *root, VariableStatData *vardata,
 							   Oid sortop, Oid collation,
 							   Datum *min, Datum *max);
+static void get_stats_slot_range(AttStatsSlot *sslot,
+								 Oid opfuncoid, FmgrInfo *opproc,
+								 Oid collation, int16 typLen, bool typByVal,
+								 Datum *min, Datum *max, bool *p_have_data);
 static bool get_actual_variable_range(PlannerInfo *root,
 									  VariableStatData *vardata,
 									  Oid sortop, Oid collation,
@@ -679,7 +683,7 @@ scalarineqsel(PlannerInfo *root, Oid operator, bool isgt, bool iseq,
 	 * compute the resulting contribution to selectivity.
 	 */
 	hist_selec = ineq_histogram_selectivity(root, vardata,
-											&opproc, isgt, iseq,
+											operator, &opproc, isgt, iseq,
 											collation,
 											constval, consttype);
 
@@ -1019,6 +1023,9 @@ generic_restriction_selectivity(PlannerInfo *root, Oid oproid, Oid collation,
  * satisfies the inequality condition, ie, VAR < (or <=, >, >=) CONST.
  * The isgt and iseq flags distinguish which of the four cases apply.
  *
+ * While opproc could be looked up from the operator OID, common callers
+ * also need to call it separately, so we make the caller pass both.
+ *
  * Returns -1 if there is no histogram (valid results will always be >= 0).
  *
  * Note that the result disregards both the most-common-values (if any) and
@@ -1030,7 +1037,7 @@ generic_restriction_selectivity(PlannerInfo *root, Oid oproid, Oid collation,
 double
 ineq_histogram_selectivity(PlannerInfo *root,
                            VariableStatData *vardata,
-                           FmgrInfo *opproc, bool isgt, bool iseq,
+                           Oid opoid, FmgrInfo *opproc, bool isgt, bool iseq,
                            Oid collation,
                            Datum constval, Oid consttype)
 {
@@ -1057,7 +1064,9 @@ ineq_histogram_selectivity(PlannerInfo *root,
                          STATISTIC_KIND_HISTOGRAM, InvalidOid,
                          ATTSTATSSLOT_VALUES))
     {
-        if (sslot.nvalues > 1)
+        if (sslot.nvalues > 1 &&
+            sslot.stacoll == collation &&
+            comparison_ops_are_compatible(sslot.staop, opoid))
         {
             /*
              * Use binary search to find the desired location, namely the
@@ -1332,6 +1341,49 @@ ineq_histogram_selectivity(PlannerInfo *root,
                     hist_selec = 1.0 - cutoff;
             }
         }
+        else if (sslot.nvalues > 1)
+        {
+            /*
+             * If we get here, we have a histogram but it's not sorted the way
+             * we want. Do a brute-force search to see how many of the
+             * entries satisfy the comparison condition, and take that
+             * fraction as our estimate. (This is identical to the inner loop
+             * of histogram_selectivity; maybe share code?)
+             */
+            LOCAL_FCINFO(fcinfo, 2);
+            int         nmatch = 0;
+
+            InitFunctionCallInfoData(*fcinfo, opproc, 2, collation,
+                                     NULL, NULL);
+            fcinfo->args[0].isnull = false;
+            fcinfo->args[1].isnull = false;
+            fcinfo->args[1].value = constval;
+            for (int i = 0; i < sslot.nvalues; i++)
+            {
+                Datum       fresult;
+
+                fcinfo->args[0].value = sslot.values[i];
+                fcinfo->isnull = false;
+                fresult = FunctionCallInvoke(fcinfo);
+                if (!fcinfo->isnull && DatumGetBool(fresult))
+                    nmatch++;
+            }
+            hist_selec = ((double) nmatch) / ((double) sslot.nvalues);
+
+            /*
+             * As above, clamp to a hundredth of the histogram resolution.
+             * This case is surely even less trustworthy than the normal one,
+             * so we shouldn't believe exact 0 or 1 selectivity.
+             */
+            {
+                double      cutoff = 0.01 / (double) (sslot.nvalues - 1);
+
+                if (hist_selec < cutoff)
+                    hist_selec = cutoff;
+                else if (hist_selec > 1.0 - cutoff)
+                    hist_selec = 1.0 - cutoff;
+            }
+        }

         free_attstatsslot(&sslot);
     }
@@ -5363,8 +5415,8 @@ get_variable_range(PlannerInfo *root, VariableStatData *vardata,
     int16       typLen;
     bool        typByVal;
     Oid         opfuncoid;
+    FmgrInfo    opproc;
     AttStatsSlot sslot;
-    int         i;

     /*
      * XXX It's very tempting to try to use the actual column min and max, if
@@ -5395,20 +5447,19 @@ get_variable_range(PlannerInfo *root, VariableStatData *vardata,
                                 (opfuncoid = get_opcode(sortop))))
         return false;

+    opproc.fn_oid = InvalidOid; /* mark this as not looked up yet */
+
     get_typlenbyval(vardata->atttype, &typLen, &typByVal);

     /*
-     * If there is a histogram, grab the first and last values.
-     *
-     * If there is a histogram that is sorted with some other operator than
-     * the one we want, fail --- this suggests that there is data we can't
-     * use. XXX consider collation too.
+     * If there is a histogram with the ordering we want, grab the first and
+     * last values.
      */
     if (get_attstatsslot(&sslot, vardata->statsTuple,
                          STATISTIC_KIND_HISTOGRAM, sortop,
                          ATTSTATSSLOT_VALUES))
     {
-        if (sslot.nvalues > 0)
+        if (sslot.stacoll == collation && sslot.nvalues > 0)
         {
             tmin = datumCopy(sslot.values[0], typByVal, typLen);
             tmax = datumCopy(sslot.values[sslot.nvalues - 1], typByVal, typLen);
@@ -5416,57 +5467,36 @@ get_variable_range(PlannerInfo *root, VariableStatData *vardata,
         }
         free_attstatsslot(&sslot);
     }
-    else if (get_attstatsslot(&sslot, vardata->statsTuple,
-                              STATISTIC_KIND_HISTOGRAM, InvalidOid,
-                              0))
+
+    /*
+     * Otherwise, if there is a histogram with some other ordering, scan it
+     * and get the min and max values according to the ordering we want. This
+     * of course may not find values that are really extremal according to our
+     * ordering, but it beats ignoring available data.
+     */
+    if (!have_data &&
+        get_attstatsslot(&sslot, vardata->statsTuple,
+                         STATISTIC_KIND_HISTOGRAM, InvalidOid,
+                         ATTSTATSSLOT_VALUES))
     {
+        get_stats_slot_range(&sslot, opfuncoid, &opproc,
+                             collation, typLen, typByVal,
+                             &tmin, &tmax, &have_data);
         free_attstatsslot(&sslot);
-        return false;
     }

     /*
      * If we have most-common-values info, look for extreme MCVs. This is
      * needed even if we also have a histogram, since the histogram excludes
-     * the MCVs. However, usually the MCVs will not be the extreme values, so
-     * avoid unnecessary data copying.
+     * the MCVs.
      */
     if (get_attstatsslot(&sslot, vardata->statsTuple,
                          STATISTIC_KIND_MCV, InvalidOid,
                          ATTSTATSSLOT_VALUES))
     {
-        bool        tmin_is_mcv = false;
-        bool        tmax_is_mcv = false;
-        FmgrInfo    opproc;
-
-        fmgr_info(opfuncoid, &opproc);
-
-        for (i = 0; i < sslot.nvalues; i++)
-        {
-            if (!have_data)
-            {
-                tmin = tmax = sslot.values[i];
-                tmin_is_mcv = tmax_is_mcv = have_data = true;
-                continue;
-            }
-            if (DatumGetBool(FunctionCall2Coll(&opproc,
-                                               collation,
-                                               sslot.values[i], tmin)))
-            {
-                tmin = sslot.values[i];
-                tmin_is_mcv = true;
-            }
-            if (DatumGetBool(FunctionCall2Coll(&opproc,
-                                               collation,
-                                               tmax, sslot.values[i])))
-            {
-                tmax = sslot.values[i];
-                tmax_is_mcv = true;
-            }
-        }
-        if (tmin_is_mcv)
-            tmin = datumCopy(tmin, typByVal, typLen);
-        if (tmax_is_mcv)
-            tmax = datumCopy(tmax, typByVal, typLen);
+        get_stats_slot_range(&sslot, opfuncoid, &opproc,
+                             collation, typLen, typByVal,
+                             &tmin, &tmax, &have_data);
         free_attstatsslot(&sslot);
     }

@@ -5475,6 +5505,61 @@ get_variable_range(PlannerInfo *root, VariableStatData *vardata,
     return have_data;
 }

+/*
+ * get_stats_slot_range: scan sslot for min/max values
+ *
+ * Subroutine for get_variable_range.
+ */
+static void
+get_stats_slot_range(AttStatsSlot *sslot, Oid opfuncoid, FmgrInfo *opproc,
+                     Oid collation, int16 typLen, bool typByVal,
+                     Datum *min, Datum *max, bool *p_have_data)
+{
+    Datum       tmin = *min;
+    Datum       tmax = *max;
+    bool        have_data = *p_have_data;
+    bool        found_tmin = false;
+    bool        found_tmax = false;
+
+    /* Look up the comparison function, if we didn't already do so */
+    if (opproc->fn_oid != opfuncoid)
+        fmgr_info(opfuncoid, opproc);
+
+    /* Scan all the slot's values */
+    for (int i = 0; i < sslot->nvalues; i++)
+    {
+        if (!have_data)
+        {
+            tmin = tmax = sslot->values[i];
+            found_tmin = found_tmax = true;
+            *p_have_data = have_data = true;
+            continue;
+        }
+        if (DatumGetBool(FunctionCall2Coll(opproc,
+                                           collation,
+                                           sslot->values[i], tmin)))
+        {
+            tmin = sslot->values[i];
+            found_tmin = true;
+        }
+        if (DatumGetBool(FunctionCall2Coll(opproc,
+                                           collation,
+                                           tmax, sslot->values[i])))
+        {
+            tmax = sslot->values[i];
+            found_tmax = true;
+        }
+    }
+
+    /*
+     * Copy the slot's values, if we found new extreme values.
+     */
+    if (found_tmin)
+        *min = datumCopy(tmin, typByVal, typLen);
+    if (found_tmax)
+        *max = datumCopy(tmax, typByVal, typLen);
+}
+
 /*
  * get_actual_variable_range
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 63d1263502..f3bf413829 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -731,6 +731,55 @@ equality_ops_are_compatible(Oid opno1, Oid opno2)
     return result;
 }

+/*
+ * comparison_ops_are_compatible
+ *      Return true if the two given comparison operators have compatible
+ *      semantics.
+ *
+ * This is trivially true if they are the same operator. Otherwise,
+ * we look to see if they can be found in the same btree opfamily.
+ * For example, '<' and '>=' ops match if they belong to the same family.
+ *
+ * (This is identical to equality_ops_are_compatible(), except that we
+ * don't bother to examine hash opclasses.)
+ */
+bool
+comparison_ops_are_compatible(Oid opno1, Oid opno2)
+{
+    bool        result;
+    CatCList   *catlist;
+    int         i;
+
+    /* Easy if they're the same operator */
+    if (opno1 == opno2)
+        return true;
+
+    /*
+     * We search through all the pg_amop entries for opno1.
+     */
+    catlist = SearchSysCacheList1(AMOPOPID, ObjectIdGetDatum(opno1));
+
+    result = false;
+    for (i = 0; i < catlist->n_members; i++)
+    {
+        HeapTuple   op_tuple = &catlist->members[i]->tuple;
+        Form_pg_amop op_form = (Form_pg_amop) GETSTRUCT(op_tuple);
+
+        if (op_form->amopmethod == BTREE_AM_OID)
+        {
+            if (op_in_opfamily(opno2, op_form->amopfamily))
+            {
+                result = true;
+                break;
+            }
+        }
+    }
+
+    ReleaseSysCacheList(catlist);
+
+    return result;
+}
+
 /* ---------- AMPROC CACHES ---------- */

@@ -3028,19 +3077,6 @@ get_attstatsslot(AttStatsSlot *sslot, HeapTuple statstuple,
     sslot->staop = (&stats->staop1)[i];
     sslot->stacoll = (&stats->stacoll1)[i];

-    /*
-     * XXX Hopefully-temporary hack: if stacoll isn't set, inject the default
-     * collation. This won't matter for non-collation-aware datatypes. For
-     * those that are, this covers cases where stacoll has not been set. In
-     * the short term we need this because some code paths involving type NAME
-     * do not pass any collation to prefix_selectivity and related functions.
-     * Even when that's been fixed, it's likely that some add-on typanalyze
-     * functions won't get the word right away about filling stacoll during
-     * ANALYZE, so we'll probably need this for awhile.
-     */
-    if (sslot->stacoll == InvalidOid)
-        sslot->stacoll = DEFAULT_COLLATION_OID;
-
     if (flags & ATTSTATSSLOT_VALUES)
     {
         val = SysCacheGetAttr(STATRELATTINH, statstuple,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 91aed1f5a5..fecfe1f4f6 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -82,6 +82,7 @@ extern bool get_op_hash_functions(Oid opno,
                                   RegProcedure *lhs_procno, RegProcedure *rhs_procno);
 extern List *get_op_btree_interpretation(Oid opno);
 extern bool equality_ops_are_compatible(Oid opno1, Oid opno2);
+extern bool comparison_ops_are_compatible(Oid opno1, Oid opno2);
 extern Oid  get_opfamily_proc(Oid opfamily, Oid lefttype, Oid righttype,
                               int16 procnum);
 extern char *get_attname(Oid relid, AttrNumber attnum, bool missing_ok);
diff --git a/src/include/utils/selfuncs.h b/src/include/utils/selfuncs.h
index 15d2289024..7ac4a06391 100644
--- a/src/include/utils/selfuncs.h
+++ b/src/include/utils/selfuncs.h
@@ -159,7 +159,8 @@ extern double generic_restriction_selectivity(PlannerInfo *root,
                                               double default_selectivity);
 extern double ineq_histogram_selectivity(PlannerInfo *root,
                                          VariableStatData *vardata,
-                                         FmgrInfo *opproc, bool isgt, bool iseq,
+                                         Oid opoid, FmgrInfo *opproc,
+                                         bool isgt, bool iseq,
                                          Oid collation,
                                          Datum constval, Oid consttype);
 extern double var_eq_const(VariableStatData *vardata,
diff --git a/src/test/regress/expected/privileges.out b/src/test/regress/expected/privileges.out
index c2d037b614..7caf0c9b6b 100644
--- a/src/test/regress/expected/privileges.out
+++ b/src/test/regress/expected/privileges.out
@@ -191,7 +191,10 @@ CREATE TABLE atest12 as
   SELECT x AS a, 10001 - x AS b FROM generate_series(1,10000) x;
 CREATE INDEX ON atest12 (a);
 CREATE INDEX ON atest12 (abs(a));
+-- results below depend on having quite accurate stats for atest12
+SET default_statistics_target = 10000;
 VACUUM ANALYZE atest12;
+RESET default_statistics_target;
 CREATE FUNCTION leak(integer,integer) RETURNS boolean AS
   $$begin return $1 < $2; end$$
   LANGUAGE plpgsql immutable;
diff --git a/src/test/regress/sql/privileges.sql b/src/test/regress/sql/privileges.sql
index 2ba69617dc..0ab5245b1e 100644
--- a/src/test/regress/sql/privileges.sql
+++ b/src/test/regress/sql/privileges.sql
@@ -136,7 +136,10 @@ CREATE TABLE atest12 as
   SELECT x AS a, 10001 - x AS b FROM generate_series(1,10000) x;
 CREATE INDEX ON atest12 (a);
 CREATE INDEX ON atest12 (abs(a));
+-- results below depend on having quite accurate stats for atest12
+SET default_statistics_target = 10000;
 VACUUM ANALYZE atest12;
+RESET default_statistics_target;
 CREATE FUNCTION leak(integer,integer) RETURNS boolean AS
   $$begin return $1 < $2; end$$
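
For anyone who wants the gist of the new fallback branch in ineq_histogram_selectivity without the fmgr plumbing, it can be sketched roughly as below. This is only a standalone illustration, not PostgreSQL code: it assumes plain int histogram entries, and the names brute_force_hist_selectivity, cmp_fn and int_less_than are made up here to stand in for the operator call and the AttStatsSlot. The idea is the one in the patch: ignore the histogram's ordering, count the entries that satisfy the comparison, and clamp the fraction so it is never exactly 0 or 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the query's operator: returns true when
 * "entry OP constant" holds.  In the patch this is an fmgr call made
 * with the query's collation.
 */
typedef bool (*cmp_fn) (int entry, int constant);

static bool
int_less_than(int entry, int constant)
{
    return entry < constant;
}

/*
 * Treat the histogram as an unordered sample of the column: the
 * estimate is the fraction of entries that satisfy the comparison.
 * The result is then clamped to [cutoff, 1 - cutoff], where cutoff is
 * a hundredth of the histogram resolution, as in the patch.
 */
static double
brute_force_hist_selectivity(const int *hist, size_t nvalues,
                             cmp_fn op, int constant)
{
    size_t      nmatch = 0;
    double      selec;
    double      cutoff;

    /* The patch only takes this path when there are at least 2 entries */
    assert(nvalues > 1);

    for (size_t i = 0; i < nvalues; i++)
    {
        if (op(hist[i], constant))
            nmatch++;
    }
    selec = (double) nmatch / (double) nvalues;

    /* Don't believe an exact 0 or 1 from such a rough estimate */
    cutoff = 0.01 / (double) (nvalues - 1);
    if (selec < cutoff)
        selec = cutoff;
    else if (selec > 1.0 - cutoff)
        selec = 1.0 - cutoff;

    return selec;
}
```

With the five unsorted entries {50, 10, 40, 20, 30} and the condition "entry < 25", two entries match, so the estimate is two fifths; a constant below or above every entry yields the clamped bounds rather than 0 or 1.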