Re: Avoid full GIN index scan when possible - Mailing list pgsql-hackers
From | Nikita Glukhov |
---|---|
Subject | Re: Avoid full GIN index scan when possible |
Date | |
Msg-id | b53614eb-6f9f-8c5c-9df8-f703b0b102b6@postgrespro.ru Whole thread Raw |
In response to | Re: Avoid full GIN index scan when possible (Alexander Korotkov <a.korotkov@postgrespro.ru>) |
Responses |
Re: Avoid full GIN index scan when possible
|
List | pgsql-hackers |
On 26.12.2019 4:59, Alexander Korotkov wrote:
There simply should be inverted condition in the assertion:I've tried to add patch #4 to comparison, but I've catch assertion failure.TRAP: FailedAssertion("key->includeNonMatching", File: "ginget.c", Line: 1340)
Assert(!key->includeNonMatching);
I have looked at v9 patch, and here is my review: 1. I agree with NULL-flag handling simplifications in ginNewScanKey(), ginScanKeyAddHiddenEntry() extraction. 2. I also agree that usage of nrequired/nadditional in keyGetItem() is a more natural solution to implement exclusion keys than my previous attempt of doing that in scanGetKey(). But there are some questions: Can we avoid referencing excludeOnly flag keyGetItem() by replacing these references with !nrequired? Maybe it would be better to move the whole block of keyGetItem() code starting from the first loop over required keys and ending before the loop over additional keys inside 'if (key->nrequired) { ... }'? Can we avoid introducing excludeOnly flag by reusing searchMode and/or by moving the initialization of nrequired/nadditional into ginNewScanKey()? 3. The following two times repeated NULL-filtering check looks too complicated and needs to be refactored somehow: - res = key->triConsistentFn(key); + if (key->excludeOnly && + key->nuserentries < key->nentries && + key->scanEntry[key->nuserentries]->queryCategory == GIN_CAT_NULL_KEY && + key->entryRes[key->nuserentries] == GIN_TRUE) + res = GIN_FALSE; + else + res = key->triConsistentFn(key); For example, a special consistentFn() can be introduced for such NOT_NULL scankeys. Or even a hidden separate one-entry scankey with a trivial consistentFn() can be added instead of adding hidden entry. 4. forcedRecheck flag that was previously used for discarded empty ALL scankeys is removed now. 0-entry exclusion keys can appear instead, and their consistentFn() simply returns constant value. Could this lead to tangible overhead in some cases (in comparison to forcedRecheck flag)? 5. A hidden GIN_CAT_EMPTY_QUERY is added only for the first empty ALL-scankey, NULLs in other columns are filtered out with GIN_CAT_NULL_KEY. This looks like asymmetric, and it leads to accelerations is some cases and slowdowns in others (depending on NULL fractions and their correlations in columns). The following test shows a significant performance regression of v9: insert into t select array[i], NULL, NULL from generate_series(1, 1000000) i; | Query time, ms WHERE condition | master | v8 | v9 ---------------------------------------+--------+--------+---------a @> '{}' | 224 | 213 | 212a @> '{}' and b @> '{}' | 52 | 57 | 255a @> '{}' and b @> '{}' and c @> '{}' | 51 | 58 | 290 In the older version of the patch I tried to do the similar things (initialize only one NOT_NULL entry for the first column), but refused to do this in v8. So, to avoid slowdowns relative to master, I can offer simply to add GIN_CAT_EMPTY_QUERY entry for each column with empty ALL-keys if there are no normal keys. -- Nikita Glukhov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
pgsql-hackers by date: