Thread: BUG #17926: Segfault in SELECT
The following bug has been logged on the website: Bug reference: 17926 Logged by: Erik Oomen Email address: erik@nib4.nl PostgreSQL version: 15.2 Operating system: Debian11 bullseye arm64 Description: We are facing an issue where a SELECT can cause a segfault, unfortunately the backtrace is not useful: Reading symbols from /lib/aarch64-linux-gnu/libffi.so.7... Reading symbols from /root/.cache/debuginfod_client/b07109d62d007af8c6bffe3d76561e03711a1a6c/debuginfo... 0x0000ffff88d1311c in __GI_epoll_pwait (epfd=4, events=0xaaab109dd330, maxevents=1, timeout=timeout@entry=-1, set=set@entry=0x0) at ../sysdeps/unix/sysv/linux/epoll_pwait.c:42 Download failed: Invalid argument. Continuing without source file ./misc/../sysdeps/unix/sysv/linux/epoll_pwait.c. 42 ../sysdeps/unix/sysv/linux/epoll_pwait.c: No such file or directory. (gdb) c Continuing. Program received signal SIGSEGV, Segmentation fault. 0x00010000017fedec in ?? () (gdb) bt #0 0x00010000017fedec in ?? () #1 0x0000ffff017febd0 in ?? () #2 0x0000aaab10af7fa8 in ?? () Backtrace stopped: previous frame inner to this frame (corrupt stack?) The query uses a 3 partitions of a table where all fields have an brin index. This query will segfault: SELECT count(*) FROM dw WHERE (dw.ts >= '2022-09-01' AND dw.ts <= '2022-10-30') AND dw.source='type3' AND dw.customer='123.456'; The query should return 0 because the customer does not exist. Removing the dw.customer or dw.source constraint will make the segfault not occur. Also using 'set enable_bitmapscan to off' will not trigger the segfault. Details: PostgreSQL 15.2 (Debian 15.2-1.pgdg110+1) on aarch64-unknown-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
> On 8 May 2023, at 22:12, PG Bug reporting form <noreply@postgresql.org> wrote: > The query uses a 3 partitions of a table where all fields have an brin > index. One index per column or a multi-colum index? > This query will segfault: > SELECT count(*) > FROM dw > WHERE (dw.ts >= '2022-09-01' AND dw.ts <= '2022-10-30') > AND dw.source='type3' AND dw.customer='123.456'; > > The query should return 0 because the customer does not exist. > Removing the dw.customer or dw.source constraint will make the segfault not > occur. > Also using 'set enable_bitmapscan to off' will not trigger the segfault. I was unable to reproduce this, can you share the table schema with the index creations (or a version of the schema which reproduce the problem). Have you tried recreating the indexes, in case there is corruption, or tried it on a table which has the same schema but no index? -- Daniel Gustafsson
> Op 9 mei 2023, om 10:38 heeft Daniel Gustafsson <daniel@yesql.se> het volgende geschreven: > >> On 8 May 2023, at 22:12, PG Bug reporting form <noreply@postgresql.org> wrote: > >> The query uses a 3 partitions of a table where all fields have an brin >> index. > > One index per column or a multi-colum index? One index per column. However, if I add a multicolumn index (on ts and source) it will not crash. > >> This query will segfault: >> SELECT count(*) >> FROM dw >> WHERE (dw.ts >= '2022-09-01' AND dw.ts <= '2022-10-30') >> AND dw.source='type3' AND dw.customer='123.456'; >> >> The query should return 0 because the customer does not exist. >> Removing the dw.customer or dw.source constraint will make the segfault not >> occur. >> Also using 'set enable_bitmapscan to off' will not trigger the segfault. > > I was unable to reproduce this, can you share the table schema with the index > creations (or a version of the schema which reproduce the problem). > create table if not exists dw ( activity_id bigserial, activity text not null, ts timestamptz not null, customer text, impact int default 0, feature_1 text, feature_2 text, feature_3 jsonb, link text, source text, source_id text, environment text ) partition by range(ts); CREATE INDEX IF NOT EXISTS index_activity ON dw USING brin(activity); CREATE INDEX IF NOT EXISTS index_customer ON dw USING brin(customer); CREATE INDEX IF NOT EXISTS index_source ON dw USING brin(source); CREATE INDEX IF NOT EXISTS index_source_id ON dw USING brin(source_id); CREATE INDEX IF NOT EXISTS index_ts ON dw USING brin(ts); CREATE INDEX IF NOT EXISTS index_environment ON dw USING brin(environment); > Have you tried recreating the indexes, in case there is corruption, or tried it > on a table which has the same schema but no index? It still segfaults after recreating the indexes, Without index no segfault. > > -- > Daniel Gustafsson > > Erik.
On 2023-May-08, PG Bug reporting form wrote: > Program received signal SIGSEGV, Segmentation fault. > 0x00010000017fedec in ?? () > (gdb) bt > #0 0x00010000017fedec in ?? () > #1 0x0000ffff017febd0 in ?? () > #2 0x0000aaab10af7fa8 in ?? () > Backtrace stopped: previous frame inner to this frame (corrupt stack?) > > The query uses a 3 partitions of a table where all fields have an brin > index. > This query will segfault: > SELECT count(*) > FROM dw > WHERE (dw.ts >= '2022-09-01' AND dw.ts <= '2022-10-30') > AND dw.source='type3' AND dw.customer='123.456'; > > The query should return 0 because the customer does not exist. > Removing the dw.customer or dw.source constraint will make the segfault not > occur. > Also using 'set enable_bitmapscan to off' will not trigger the segfault. I'm not sure how to go about debugging this problem -- without a stack trace, we don't know where to look. I tried to reproduce it without success, but I didn't try to add any data. Which partitions do you have, and how much data? Does EXPLAIN (without ANALYZE) work, and if so what does it report? Would it be possible for you to run the query under 'rr record'? There are some instructions here: https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD#Recording_Postgres_using_rr_Record_and_Replay_Framework One possibility is that the index structure is corrupted in some way. pageinspect's functions be useful, but I suppose you'd have to scan the whole index in order to find what went wrong. We don't have any tooling for that ... I guess amcheck support would be a nice addition. I guess the other option is that things are failing at optimizer or executor setup time. Having one index per column is unusual. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "Las navajas y los monos deben estar siempre distantes" (Germán Poo)