Thread: BUG #18692: Segmentation fault when extending a varchar column with a gist index with custom signal length
BUG #18692: Segmentation fault when extending a varchar column with a gist index with custom signal length
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 18692 Logged by: Nicolas Maus Email address: nicolas.maus@bertelsmann.de PostgreSQL version: 16.4 Operating system: SLES 15.5 Description: When extending a varchar column with a gist index with a custom signal length the Postgres server crashes with a segmentation fault. Tested on 16.4 and 14.13 on SLES 15.5 How to reproduce: create extension if not exists pg_trgm; create table contains_trgm (trgm_column varchar(200)); -- important: this bug only occurs when a custom signal length (siglen=x) is provided create index gist_index on contains_trgm using gist (trgm_column public.gist_trgm_ops (siglen=32)); -- will trigger segmentation fault (new column size must be bigger than old one): alter table contains_trgm alter column trgm_column type varchar(768); Log output: 2024-11-06 14:55:09.806 CET : LOG: 00000: server process (PID 130312) was terminated by signal 11: Segmentation fault 2024-11-06 14:55:09.806 CET : DETAIL: Failed process was running: alter table contains_trgm alter column trgm_column type varchar(768); 2024-11-06 14:55:09.806 CET : LOCATION: LogChildExit, postmaster.c:3689 2024-11-06 14:55:09.807 CET : LOG: 00000: terminating any other active server processes 2024-11-06 14:55:09.807 CET : LOCATION: HandleChildCrash, postmaster.c:3490 2024-11-06 14:55:09.823 CET 172.28.110.149(56320) trigram_test ggdbpadm: FATAL: 57P03: the database system is in recovery mode 2024-11-06 14:55:09.823 CET 172.28.110.149(56320) trigram_test ggdbpadm: LOCATION: ProcessStartupPacket, postmaster.c:2358 2024-11-06 14:55:09.875 CET : LOG: 00000: all server processes terminated; reinitializing 2024-11-06 14:55:09.875 CET : LOCATION: PostmasterStateMachine, postmaster.c:3950 2024-11-06 14:55:09.947 CET : LOG: 00000: database system was interrupted; last known up at 2024-11-06 14:52:20 CET 2024-11-06 14:55:09.947 CET : LOCATION: StartupXLOG, xlog.c:5113 2024-11-06 14:55:10.064 CET : LOG: 00000: database system was not properly shut down; automatic recovery in progress 2024-11-06 14:55:10.064 CET : LOCATION: InitWalRecovery, xlogrecovery.c:926 2024-11-06 14:55:10.072 CET : LOG: 00000: redo starts at 0/8C45FFF8 2024-11-06 14:55:10.072 CET : LOCATION: PerformWalRecovery, xlogrecovery.c:1689 2024-11-06 14:55:10.110 CET : LOG: 00000: invalid record length at 0/8C8BFA20: expected at least 24, got 0 2024-11-06 14:55:10.110 CET : LOCATION: ReadRecord, xlogrecovery.c:3137 2024-11-06 14:55:10.113 CET : LOG: 00000: redo done at 0/8C8BF908 system usage: CPU: user: 0.00 s, system: 0.01 s, elapsed: 0.04 s 2024-11-06 14:55:10.113 CET : LOCATION: PerformWalRecovery, xlogrecovery.c:1827 2024-11-06 14:55:10.119 CET : LOG: 00000: checkpoint starting: end-of-recovery immediate wait 2024-11-06 14:55:10.119 CET : LOCATION: LogCheckpointStart, xlog.c:6251 2024-11-06 14:55:10.181 CET : LOG: 00000: checkpoint complete: wrote 957 buffers (5.8%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.041 s, sync=0.011 s, total=0.063 s; sync files=303, longest=0.005 s, average=0.001 s; distance=4478 kB, estimate=4478 kB; lsn=0/8C8BFA20, redo lsn=0/8C8BFA20 2024-11-06 14:55:10.181 CET : LOCATION: LogCheckpointEnd, xlog.c:6351 2024-11-06 14:55:10.337 CET : LOG: 00000: database system is ready to accept connections 2024-11-06 14:55:10.337 CET : LOCATION: process_pm_child_exit, postmaster.c:3110
Re: BUG #18692: Segmentation fault when extending a varchar column with a gist index with custom signal length
From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes: > When extending a varchar column with a gist index with a custom signal > length the Postgres server crashes with a segmentation fault. What this shows is that CompareOpclassOptions has never been tested at all except with null inputs. It's trying to call array_eq without providing an FmgrInfo. regards, tom lane
Re: BUG #18692: Segmentation fault when extending a varchar column with a gist index with custom signal length
From
Alexander Korotkov
Date:
On Wed, Nov 6, 2024 at 7:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > PG Bug reporting form <noreply@postgresql.org> writes: > > When extending a varchar column with a gist index with a custom signal > > length the Postgres server crashes with a segmentation fault. > > What this shows is that CompareOpclassOptions has never been tested > at all except with null inputs. It's trying to call array_eq > without providing an FmgrInfo. Thank you for inviting me to the thread. I'm on and will provide a fix shortly. ------ Regards, Alexander Korotkov Supabase
Re: BUG #18692: Segmentation fault when extending a varchar column with a gist index with custom signal length
From
Tom Lane
Date:
Alexander Korotkov <aekorotkov@gmail.com> writes: > On Thu, Nov 7, 2024 at 8:47 AM Tender Wang <tndrwang@gmail.com> wrote: >> Based on Tom's analysis, I provide a POC patch. I'm not sure if it is right >> to use DEFAULT_COLLATION_OID in the patch. > Thank you. But I'm not sure about DEFAULT_COLLATION_OID. I don't quite trust that either. But since we only care about equality, wouldn't it be OK to use C_COLLATION_OID? > Therefore, we can compare two text[] just with datumIsEqual(). > Attached patch implements this. I think this is nonsense. What about toasted datums, or even just short-header ones? The one coming from an on-disk tuple is pretty likely to be short-header for plausible sizes of the options, but the one we just constructed in memory will not be. regards, tom lane
Re: BUG #18692: Segmentation fault when extending a varchar column with a gist index with custom signal length
From
Tender Wang
Date:
Alexander Korotkov <aekorotkov@gmail.com> 于2024年11月8日周五 10:22写道:
On Fri, Nov 8, 2024 at 4:05 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <aekorotkov@gmail.com> writes:
> > On Thu, Nov 7, 2024 at 8:47 AM Tender Wang <tndrwang@gmail.com> wrote:
> >> Based on Tom's analysis, I provide a POC patch. I'm not sure if it is right
> >> to use DEFAULT_COLLATION_OID in the patch.
>
> > Thank you. But I'm not sure about DEFAULT_COLLATION_OID.
>
> I don't quite trust that either. But since we only care about
> equality, wouldn't it be OK to use C_COLLATION_OID?
>
> > Therefore, we can compare two text[] just with datumIsEqual().
> > Attached patch implements this.
>
> I think this is nonsense. What about toasted datums, or even
> just short-header ones? The one coming from an on-disk tuple
> is pretty likely to be short-header for plausible sizes of
> the options, but the one we just constructed in memory will
> not be.
You are correct. I quickly skim trough the sources and didn't find a
function which compares detoasted contents of Datums excluding
headers. So, yes, comparison using C-collation seems the most
reasonable option.
No objection from me.
Thanks,
Tender Wang
Re: BUG #18692: Segmentation fault when extending a varchar column with a gist index with custom signal length
From
Tom Lane
Date:
Alexander Korotkov <aekorotkov@gmail.com> writes: > On Fri, Nov 8, 2024 at 4:05 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I think this is nonsense. What about toasted datums, or even >> just short-header ones? > You are correct. I quickly skim trough the sources and didn't find a > function which compares detoasted contents of Datums excluding > headers. Right, there isn't one AFAIK. For context, datumIsEqual() is mainly meant to support cases such as comparison of Const nodes in equal(), where a) false match is unacceptable, but false no-match will at worst cause some performance loss; b) the two inputs have probably come through similar paths (eg, both are fresh from the parser, or both were on-disk), so that any toasting effects will be the same. Also c) for many of the callers, it'd be unsafe to perform disk access to allow accurate comparison of out-of-line values. Problem (a) would be all right for this case, and (c) too, but not so much (b). You can certainly imagine writing a more aggressive routine that can cope with detoasting, but I don't think we've yet found a case where that seemed worth the trouble. Partial solutions such as "cope with short-header but not compressed values" have even narrower possible use-cases. > So, yes, comparison using C-collation seems the most > reasonable option. WFM. But I'm concerned that you couldn't build a test case where the comparison fails. Seems like we need one, in view of this bug having escaped detection for so long. regards, tom lane
Re: BUG #18692: Segmentation fault when extending a varchar column with a gist index with custom signal length
From
Alexander Korotkov
Date:
On Fri, Nov 8, 2024 at 7:10 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Alexander Korotkov <aekorotkov@gmail.com> writes: > > So, yes, comparison using C-collation seems the most > > reasonable option. > > WFM. But I'm concerned that you couldn't build a test case > where the comparison fails. Seems like we need one, in view > of this bug having escaped detection for so long. AFAICS, the only use case of CheckIndexCompatible() is ALTER TYPE. In that case we only serialize and deserialize opclass options without any involvement of opclass or data type. So, with the current usage scenario CompareOpclassOptions() is only a paranoid check, but CheckIndexCompatible() has to do it due to its general semantic. Thus, if we have to build a test case where CompareOpclassOptions() fails, I guess we need to write a new module for src/test/modules, which would call CheckIndexCompatible() under different circumstances. ------ Regards, Alexander Korotkov Supabase
Re: BUG #18692: Segmentation fault when extending a varchar column with a gist index with custom signal length
From
Tom Lane
Date:
Alexander Korotkov <aekorotkov@gmail.com> writes: > On Fri, Nov 8, 2024 at 7:10 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> WFM. But I'm concerned that you couldn't build a test case >> where the comparison fails. Seems like we need one, in view >> of this bug having escaped detection for so long. > AFAICS, the only use case of CheckIndexCompatible() is ALTER TYPE. In > that case we only serialize and deserialize opclass options without > any involvement of opclass or data type. So, with the current usage > scenario CompareOpclassOptions() is only a paranoid check, but > CheckIndexCompatible() has to do it due to its general semantic. OK. Well, let's settle for having a test case that at least runs the code, even if only the success path is taken. With the release freeze bearing down on us, do you want to wait till after the releases to fix this? It seems non-urgent given how long it took for anyone to notice. regards, tom lane
Re: BUG #18692: Segmentation fault when extending a varchar column with a gist index with custom signal length
From
Alexander Korotkov
Date:
On Fri, Nov 8, 2024 at 4:49 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Alexander Korotkov <aekorotkov@gmail.com> writes: > > On Fri, Nov 8, 2024 at 7:10 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> WFM. But I'm concerned that you couldn't build a test case > >> where the comparison fails. Seems like we need one, in view > >> of this bug having escaped detection for so long. > > > AFAICS, the only use case of CheckIndexCompatible() is ALTER TYPE. In > > that case we only serialize and deserialize opclass options without > > any involvement of opclass or data type. So, with the current usage > > scenario CompareOpclassOptions() is only a paranoid check, but > > CheckIndexCompatible() has to do it due to its general semantic. > > OK. Well, let's settle for having a test case that at least runs > the code, even if only the success path is taken. > > With the release freeze bearing down on us, do you want to wait > till after the releases to fix this? It seems non-urgent given > how long it took for anyone to notice. Thank you for noticing this. I also think there is no urgency. And I will wait till the next release. ------ Regards, Alexander Korotkov Supabase