Re: BUG #16865: Regression: GIN Negated prefix search returns results that contain the search term - Mailing list pgsql-bugs
From | Tom Lane |
---|---|
Subject | Re: BUG #16865: Regression: GIN Negated prefix search returns results that contain the search term |
Date | |
Msg-id | 2288864.1613431608@sss.pgh.pa.us Whole thread Raw |
In response to | BUG #16865: Regression: GIN Negated prefix search returns results that contain the search term (PG Bug reporting form <noreply@postgresql.org>) |
Responses |
Re: BUG #16865: Regression: GIN Negated prefix search returns results that contain the search term
(Tom Lane <tgl@sss.pgh.pa.us>)
|
List | pgsql-bugs |
=?UTF-8?Q?Dimitri_N=C3=BCscheler?= <dimitri.nuescheler@gmail.com> writes: > Meanwhile I managed to anonymize the data. I put it in this archive > https://www.violetsky.ch/postgres-issue-anonymized.tar.gz Thanks. I've reproduced the issue and it boils down to doing the wrong thing when there are GIN_MAYBE values in the input data for checkcondition_gin, specifically when the bitmap holding the original GIN index results has become lossy. We correctly get a TS_MAYBE result out of the calculation in TS_execute_recurse, but then TS_execute figures it can throw that detail away and just return TS_YES. I think that this problem existed before 2f2007fbb, but we were masking it by always forcing rechecks. Anyway, this seems to put the final nail in the coffin of the idea that it's sufficient for TS_execute to return a boolean. At least the tsginidx.c callers really need to see the 3-way result. Hence I propose the attached patch. It fixes the given test case, but I wonder if you can try it and see if you see any remaining problems. I wasted quite a bit of time trying to devise a test case that is short enough to be reasonable to include in our regression tests, without success ... you need quite a bit of data to make the GIN bitmap become lossy. So no test case here, but I'm not super comfortable with that. regards, tom lane diff --git a/src/backend/utils/adt/tsginidx.c b/src/backend/utils/adt/tsginidx.c index 6c913baaba..8eed376708 100644 --- a/src/backend/utils/adt/tsginidx.c +++ b/src/backend/utils/adt/tsginidx.c @@ -246,10 +246,22 @@ gin_tsquery_consistent(PG_FUNCTION_ARGS) gcv.map_item_operand = (int *) (extra_data[0]); gcv.need_recheck = recheck; - res = TS_execute(GETQUERY(query), - &gcv, - TS_EXEC_PHRASE_NO_POS, - checkcondition_gin); + switch (TS_execute_ternary(GETQUERY(query), + &gcv, + TS_EXEC_PHRASE_NO_POS, + checkcondition_gin)) + { + case TS_NO: + res = false; + break; + case TS_YES: + res = true; + break; + case TS_MAYBE: + res = true; + *recheck = true; + break; + } } PG_RETURN_BOOL(res); @@ -284,11 +296,12 @@ gin_tsquery_triconsistent(PG_FUNCTION_ARGS) gcv.map_item_operand = (int *) (extra_data[0]); gcv.need_recheck = &recheck; - if (TS_execute(GETQUERY(query), - &gcv, - TS_EXEC_PHRASE_NO_POS, - checkcondition_gin)) - res = recheck ? GIN_MAYBE : GIN_TRUE; + res = TS_execute_ternary(GETQUERY(query), + &gcv, + TS_EXEC_PHRASE_NO_POS, + checkcondition_gin); + if (res == GIN_TRUE && recheck) + res = GIN_MAYBE; } PG_RETURN_GIN_TERNARY_VALUE(res); diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c index 2939fb5c21..9236ebcc8f 100644 --- a/src/backend/utils/adt/tsvector_op.c +++ b/src/backend/utils/adt/tsvector_op.c @@ -1854,6 +1854,18 @@ TS_execute(QueryItem *curitem, void *arg, uint32 flags, return TS_execute_recurse(curitem, arg, flags, chkcond) != TS_NO; } +/* + * Evaluate tsquery boolean expression. + * + * This is the same as TS_execute except that TS_MAYBE is returned as-is. + */ +TSTernaryValue +TS_execute_ternary(QueryItem *curitem, void *arg, uint32 flags, + TSExecuteCallback chkcond) +{ + return TS_execute_recurse(curitem, arg, flags, chkcond); +} + /* * TS_execute recursion for operators above any phrase operator. Here we do * not need to worry about lexeme positions. As soon as we hit an OP_PHRASE diff --git a/src/include/tsearch/ts_utils.h b/src/include/tsearch/ts_utils.h index 69a9ba8524..4266560151 100644 --- a/src/include/tsearch/ts_utils.h +++ b/src/include/tsearch/ts_utils.h @@ -199,6 +199,9 @@ typedef TSTernaryValue (*TSExecuteCallback) (void *arg, QueryOperand *val, extern bool TS_execute(QueryItem *curitem, void *arg, uint32 flags, TSExecuteCallback chkcond); +extern TSTernaryValue TS_execute_ternary(QueryItem *curitem, void *arg, + uint32 flags, + TSExecuteCallback chkcond); extern bool tsquery_requires_match(QueryItem *curitem); /*
pgsql-bugs by date: