Re: BUG #6278: Index scans on '>' condition on field with many NULLS - Mailing list pgsql-bugs
From | Tom Lane
---|---
Subject | Re: BUG #6278: Index scans on '>' condition on field with many NULLS
Date |
Msg-id | 28958.1320079428@sss.pgh.pa.us
In response to | Re: BUG #6278: Index scans on '>' condition on field with many NULLS (Tom Lane <tgl@sss.pgh.pa.us>)
Responses | Re: BUG #6278: Index scans on '>' condition on field with many NULLS
List | pgsql-bugs
I wrote:
> I poked at this a bit last night.  The reason it's happening is that the
> ">" key is only marked SK_BT_REQBKWD, not SK_BT_REQFWD, so _bt_checkkeys
> doesn't think it can stop when it hits the NULLs.  Right at the moment
> it seems like we could mark that key with both flags, which leads to the
> conclusion that two flags are unnecessary and we could get by with only
> one direction-independent flag.

OK, I swapped some of this knowledge back in.  The above idea would work in simple cases with non-redundant scan keys, but not in general.  Consider an indexscan on x with

	WHERE x > 4

We'll start scanning at the x>4 boundary point and run forward.  The scankey condition can only fail once we reach NULL index entries, at which point we can stop, so in this case marking this key SK_BT_REQFWD would be OK.

However, consider something like

	WHERE x > 4 AND x > 10

where _bt_preprocess_keys is unable to determine which key is more restrictive.  (This could only happen if the comparison constants are of different types and we don't have a suitable cross-type comparison operator in the opfamily.  None of the standard opfamilies have such omissions, which is why there's no regression test covering this.)  In such a case, _bt_first arbitrarily picks one of the keys to do the initial positioning with.  If it picks x>4, then the x>10 scankey will fail until we reach index entries > 10 ... but we can't abandon the scan just because x>10 is failing.  This is why we have to have direction-sensitive required-key flags.  (If we were to start backing up, failure of either of the redundant keys would indeed be justification for stopping.)

We could possibly do something with marking only non-redundant >/>= scankeys as SK_BT_REQFWD, and conversely for </<= keys.  But this looks like it'd be a pain to manage in _bt_preprocess_keys, and given the difficulty of testing cases where it matters, I'm not excited about going that direction.

What I'm currently thinking is that we could hack _bt_checkkeys to stop the scan when it reaches NULLs if the current scankey is not an ISNULL test and is marked *either* SK_BT_REQFWD or SK_BT_REQBKWD.  In such a case we know that the scankey cannot be satisfied by a NULL, so it will fail until we reach the next range of this index column, i.e., move to the next value of the next higher-order column (if any).  And that value can't pass the scankeys, because it must have an equality constraint, else this key wouldn't be marked required.

BTW, I had thought that Maksym could work around this with something like

	WHERE value > 0.9999 and value < 1.1

or

	WHERE value > 0.9999 and value is not null

but it turns out that neither of those works.  The second key will be marked SK_BT_REQFWD, but because _bt_checkkeys stops testing as soon as any key has failed, it doesn't reach the second key and doesn't realize that there's a failing SK_BT_REQFWD key available.  That's a little bit annoying, but the only clear fix would be to keep testing keys after we already know the index entry has failed, and I think that's probably going to be a net loss.  If we fix the NULL case to accept either flag, then the only case where this will matter is redundant scan keys, and that's not a case that I think we should optimize at the cost of pessimizing normal cases.

A patch along these lines should be pretty localized.  Has anyone got an opinion on whether to back-patch or not?  This seems like a performance bug, but given the lack of prior complaints, maybe we shouldn't take any risks for it.

			regards, tom lane
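[Editorial illustration, not part of the original message.]  For concreteness, here is a minimal sketch of the scenario discussed above; the table and column names are hypothetical, and the data layout is only meant to mimic "a few interesting values plus many NULLs".  Because an ascending btree stores NULLs after all non-NULL values by default, a forward scan for value > 0.9999 ends up adjacent to the block of NULL entries, and on a server without the change proposed above it keeps walking through them instead of stopping.

	-- hypothetical reproduction of the reported behaviour
	CREATE TABLE t (value float8);
	INSERT INTO t SELECT NULL::float8 FROM generate_series(1, 1000000);
	INSERT INTO t SELECT 1.0 FROM generate_series(1, 10);
	CREATE INDEX t_value_idx ON t (value);
	ANALYZE t;

	-- The ">" qual can never be satisfied by a NULL, yet without the fix the
	-- key is only marked SK_BT_REQBKWD, so the forward scan does not stop
	-- when it reaches the NULL entries at the end of the index:
	EXPLAIN ANALYZE SELECT * FROM t WHERE value > 0.9999;

	-- Neither rewrite below helps, because _bt_checkkeys stops testing keys
	-- as soon as the first key has failed and never sees the second,
	-- SK_BT_REQFWD-marked key:
	EXPLAIN ANALYZE SELECT * FROM t WHERE value > 0.9999 AND value < 1.1;
	EXPLAIN ANALYZE SELECT * FROM t WHERE value > 0.9999 AND value IS NOT NULL;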