Re: BRIN index creation on geometry column causes crash - Mailing list pgsql-bugs
From | Tomas Vondra |
---|---|
Subject | Re: BRIN index creation on geometry column causes crash |
Date | |
Msg-id | b32735cc-317f-4da6-b76b-e30398f4e94e@vondra.me Whole thread Raw |
List | pgsql-bugs |
On 2/12/25 11:55, Álvaro Herrera wrote: > On 2025-Jan-03, Tomas Vondra wrote: > >> Reproduced, but I belive this is actually a long-standing PostGIS bug. >> The exact place where it crashes is here: >> >> /* Finally, merge B to A. */ >> finfo = inclusion_get_procinfo(bdesc, attno, PROCNUM_MERGE); >> Assert(finfo != NULL); >> result = FunctionCall2Coll(finfo, colloid, ...); >> >> With asserts, it fails on the assert, i.e. finfo is NULL. This means the >> opfamily is missing the "MERGE" support procedure, which is however >> required (from the very beginning of BRIN in 9.5). > > Hmm, I completely agree that this is a PostGIS bug, but at the same time > I think it's wrong for Postgres not to verify that the support function > exists, so that we can throw an error instead of crashing. I think the > most expedient way to implement this is to add a "missing_ok" argument > to the various "foo_get_procinfo" functions, and make to pass it true > only in cases where the usage of the return value already admit that it > could be a null pointer (meaning it's really an optional support proc). > > This should avoid further crashes with invalidly defined opclasses; see > attached POC patch. The user is still in a bad place, because for > instance if it's autovacuum that tries to do the range merging, then > vacuuming will fail but people may not notice for months unless they're > paying careful attention to the server logs. Crashing is good (tm) > because now a bunch of people are aware that the opclass is broken. > Yeah, I'm not opposed to doing this. I don't know what's the right level of paranoia when dealing with user-defined stuff. But if I get this right, this is merely a mitigation turning a crash into ERROR, it would not help with detecting the opclass brokenness any earlier. Until the parallel builds it was very rare to need the MERGE. > My next question is why didn't brinvalidate detect this problem. Is it > just that nobody ran 'amvalidate' on the opclass, or is it that > brinvalidate does not know that it needs to require the merge proc? But > I'll leave that for later. > I don't know. Either no one tried running it, or maybe it wasn't quite clear which of the reported issues are serious and which can be ignored. I certainly wasn't quite clear to me when I tried running it, because all the messages are INFO, it doesn't say which functions are missing, and at least some of it seemed to be due to cross-type operators. See this for details: https://lists.osgeo.org/pipermail/postgis-devel/2025-January/030457.html > > Now, in writing this patch I noticed that both brin_bloom.c and > brin_minmax_multi.c seem to have cargo-culted the idea that support > procs can be optional; both of them have a single support proc > (PROCNUM_HASH and PROCNUM_DISTANCE, respectively) which is not optional; > so the "extra_proc_missing" stuff seems unnecessary for them. And > therefore the new missing_ok support I introduce in this patch would > also be unnecessary. As far as I can tell, there are no ABI concerns > about removing opaque->extra_proc_missing[] from both these opclass > scaffolds (and reintroducing it later, if we determine that we do need > optional support procs in those scaffolds after all.) > Yeah, copy-paste is my basic coding tool, so it's quite possible the bloom and minmax_multi could do stuff in a simpler way. I can't think of a reason why would removing extra_proc_missing[] break ABI, but I don't quite see the need to change this in backbranches. And if we want to stop doing unnecessary stuff, maybe it'd be enough to just remove the code, but leave extra_proc_missing in the struct. regards -- Tomas Vondra
pgsql-bugs by date: