Thread: Re: Remove redundant NULL check in clause_selectivity_ext() function
Ilia Evdokimov <ilya.evdokimov@tantorlabs.com> writes: > In the `clause_selectivity_ext()` function, there’s a comment regarding > a NULL clause check that asks, "can this still happen?". I decided to > investigate whether this scenario is still valid. > Here’s what I found after reviewing the relevant cases: > - approx_tuple_count(): The function iterates over a loop of other clauses. > - get_foreign_key_join_selectivity(): The function is invoked within an > `if (clause)` condition. > - consider_new_or_clause(): The clause is generated by > `make_restrictinfo()`, which never returns NULL. > - clauselist_selectivity_ext(): This function either checks > `list_length(clauses) == 1` before being called, or it is called within > a loop of clauses. That list_length check doesn't prove anything about whether the list element is NULL, though. While I suspect that you may be right that the case doesn't occur anymore (if it ever did), I'm inclined to leave this test in place. It's cheap enough compared to what the rest of the function will do, and more importantly we can't assume that all interesting call sites are within Postgres core. There are definitely extensions calling clauselist_selectivity and related functions. It's possible that some of them rely on clause_selectivity not crashing on a NULL. Certainly, such an assumption could be argued to be a bug they should fix; but I'm disinclined to make them jump through that hoop for a vanishingly small performance improvement. Also, there are boatloads of other places where the planner has possibly-redundant checks for null clause pointers. It's likely that some of the other ones are more performance-critical than this. But I wouldn't be in favor of retail removal of the others either. Maybe with a more holistic approach we could remove a whole lot of them and make a measurable improvement; but it would require some careful thought about what invariants we want to assume. There's not really any design principles about this right now, and where we've ended up is that most functions operating on expression trees assume they've got to defend against NULL inputs. To remove those checks, we'd need a clear understanding of where caller-side checks need to be added instead. regards, tom lane
On 19.8.24 18:02, Tom Lane wrote:
Ilia Evdokimov <ilya.evdokimov@tantorlabs.com> writes:In the `clause_selectivity_ext()` function, there’s a comment regarding a NULL clause check that asks, "can this still happen?". I decided to investigate whether this scenario is still valid.Here’s what I found after reviewing the relevant cases:- approx_tuple_count(): The function iterates over a loop of other clauses. - get_foreign_key_join_selectivity(): The function is invoked within an `if (clause)` condition. - consider_new_or_clause(): The clause is generated by `make_restrictinfo()`, which never returns NULL. - clauselist_selectivity_ext(): This function either checks `list_length(clauses) == 1` before being called, or it is called within a loop of clauses.That list_length check doesn't prove anything about whether the list element is NULL, though. While I suspect that you may be right that the case doesn't occur anymore (if it ever did), I'm inclined to leave this test in place. It's cheap enough compared to what the rest of the function will do, and more importantly we can't assume that all interesting call sites are within Postgres core. There are definitely extensions calling clauselist_selectivity and related functions. It's possible that some of them rely on clause_selectivity not crashing on a NULL. Certainly, such an assumption could be argued to be a bug they should fix; but I'm disinclined to make them jump through that hoop for a vanishingly small performance improvement. Also, there are boatloads of other places where the planner has possibly-redundant checks for null clause pointers. It's likely that some of the other ones are more performance-critical than this. But I wouldn't be in favor of retail removal of the others either. Maybe with a more holistic approach we could remove a whole lot of them and make a measurable improvement; but it would require some careful thought about what invariants we want to assume. There's not really any design principles about this right now, and where we've ended up is that most functions operating on expression trees assume they've got to defend against NULL inputs. To remove those checks, we'd need a clear understanding of where caller-side checks need to be added instead. regards, tom lane
Let's assume that this check needs to remain, and the length check doesn't guarantee anything. However, I'm a bit concerned that there's a NULL check here, but it's missing in the
clauselist_selectivity_ext()
function. For the reasons mentioned above, I would suggest the following: either we perform the NULL check in both places, or we don't perform it in either. -- Regards, Ilia Evdokimov, Tantor Labs LCC.
On Tue, 20 Aug 2024 at 03:48, Ilia Evdokimov <ilya.evdokimov@tantorlabs.com> wrote: > Let's assume that this check needs to remain, and the length check doesn't guarantee anything. However, I'm a bit concernedthat there's a NULL check here, but it's missing in the clauselist_selectivity_ext() function. For the reasons mentionedabove, I would suggest the following: either we perform the NULL check in both places, or we don't perform it ineither. I don't follow this comparison. clauselist_selectivity_ext() is perfectly capable of accepting a NIL List of clauses. I agree with Tom that it's unlikely to be worth the risk removing the NULL check from clause_selectivity_ext(). From my point of view, the risk-to-reward ratio is nowhere near the level of being worth it. There'd just be no way to measure any sort of speedup from this change as there are far too many other things going on during planning. This one is a drop in the ocean. However, I'd like to encourage you to look for other places that might have a more meaningful impact on performance. For those, it's best to come armed with a benchmark and results that demonstrate the speedup along with your justification as to why you think the change is worthwhile. We've not received the former and you've not convinced two committers with your attempt on the latter. I suggest marking the CF entry for this patch as rejected. David