Re: BUG #17704: Monitoring Queries trigger 'Signal 7: Bus Error' After Creating Hash Partitioned Table - Mailing list pgsql-bugs

From Andres Freund
Subject Re: BUG #17704: Monitoring Queries trigger 'Signal 7: Bus Error' After Creating Hash Partitioned Table
Date
Msg-id 6A41AC1A-18D8-4678-975D-2438722F08FC@anarazel.de
Whole thread Raw
In response to BUG #17704: Monitoring Queries trigger 'Signal 7: Bus Error' After Creating Hash Partitioned Table  (PG Bug reporting form <noreply@postgresql.org>)
Responses Re: Re: BUG #17704: Monitoring Queries trigger 'Signal 7: Bus Error' After Creating Hash Partitioned Table
List pgsql-bugs
Hi,

On December 1, 2022 1:23:06 PM PST, PG Bug reporting form <noreply@postgresql.org> wrote:
>The following bug has been logged on the website:
>
>Bug reference:      17704
>Logged by:          Garrett Everding
>Email address:      garrett@rave.io
>PostgreSQL version: 14.6
>Operating system:   Ubuntu 22.04
>Description:
>
>Hi,
>
>I'm not sure if this is the right place to post this and if this _isn't_ a
>bug and is instead a misconfiguration I can repost somewhere else.

Misconfiguration shouldn't trigger this. It's a bug somewhere, but not yet obvious where (could be one of your
extensions).


>Today I rolled out a new hash partitioned table with 1024 partitions to our
>production DB (PostgreSQL 14.6 (Ubuntu 14.6-1.pgdg22.04+1) on
>aarch64-unknown-linux-gnu, compiled by gcc (Ubuntu 11.3.0-1ubuntu1~22.04)
>11.3.0, 64-bit). Previously we had been using pganalyze2 and pgmetrics with
>no issue. Starting with the deployment of the new partitioned table anytime
>pgmetrics or pganalyze2 would run we would get a SIGBUS and postgresql would
>crash.
>
>```
>2022-12-01 17:48:29 UTC [915447]: user=,db=,app=,client= LOG:  server
>process (PID 1341581) was terminated by signal 7: Bus error
>2022-12-01 17:48:29 UTC [915447]: user=,db=,app=,client= DETAIL:  Failed
>process was running: SET lock_timeout TO '100ms';SET statement_timeout TO
>'5s';
>        select
>          (extract(epoch from now()) * 1e9)::int8 as epoch_ns,
>          quote_ident(table_schema)||'.'||quote_ident(table_name) as
>tag_table,
>          md5((array_agg((c.*)::text order by ordinal_position))::text)
>        from (
>         SELECT current_database()::information_schema.sql_identifier AS
>table_catalog,
>            nc.nspname::information_schema.sql_identifier AS table_schema,
>            c.relname::information_schema.sql_identifier AS table_name,
>            a.attname::information_schema.sql_identifier AS column_name,
>            a.attnum::information_schema.cardinal_number AS
>ordinal_position,
>            pg_get_expr(ad.adbin,
>ad.adrelid)::information_schema.character_data AS column_default,
>                CASE
>                    WHEN a.attnotnull OR t.typtype = 'd'::"char" AND
>t.typnotnull THEN 'NO'::text
>                    ELSE 'YES'::text
>                END::information_schema.yes_or_no AS is_nullable,
>                CASE
>                    WHEN t.typtype = 'd'::"char" THEN
>                    CASE
>                        WHEN bt.typelem <> 0::oid AND bt.typlen
>2022-12-01 17:48:29 UTC [915447]: user=,db=,app=,client= LOG:  terminating
>any other active server processes
>```
>2022-12-01 18:21:43 UTC [915447]: user=,db=,app=,client= LOG:  server
>process (PID 1347091) was terminated by signal 7: Bus error
>2022-12-01 18:21:43 UTC [915447]: user=,db=,app=,client= DETAIL:  Failed
>process was running:
>        SELECT
>          current_database() AS db, schemaname, tablename, reltuples::bigint
>AS tups, relpages::bigint AS pages, otta,
>          ROUND(CASE WHEN otta=0 OR sml.relpages=0 OR sml.relpages=otta THEN
>0.0 ELSE sml.relpages/otta::numeric END,1) AS tbloat,
>          CASE WHEN relpages < otta THEN 0 ELSE relpages::bigint - otta END
>AS wastedpages,
>          CASE WHEN relpages < otta THEN 0 ELSE
>bs*(sml.relpages-otta)::bigint END AS wastedbytes,
>          CASE WHEN relpages < otta THEN '0 bytes'::text ELSE
>(bs*(relpages-otta))::bigint::text || ' bytes' END AS wastedsize,
>          iname, ituples::bigint AS itups, ipages::bigint AS ipages,
>iotta,
>          ROUND(CASE WHEN iotta=0 OR ipages=0 OR ipages=iotta THEN 0.0 ELSE
>ipages/iotta::numeric END,1) AS ibloat,
>          CASE WHEN ipages < iotta THEN 0 ELSE ipages::bigint - iotta END AS
>wastedipages,
>          CASE WHEN ipages < iotta THEN 0 ELSE (bs*(ipages-iotta))::bigint
>END AS wastedibytes,
>          CASE WHEN ipages < iotta THEN '0 bytes' ELSE
>(bs*(ipages-iotta))::bigint::text || ' bytes' END AS wastedisize,
>          CASE WHEN relpages < ott
>2022-12-01 18:21:43 UTC [915447]: user=,db=,app=,client= LOG:  terminating
>any other active server processes
>```

Unfortunately that's not pointing anywhere obvious. We'll need a backtrace for the crash at least.

I'm on my phone rn, so no link. But there's an article on the wiki with information about collecting a backtrace.

Regards,

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.



pgsql-bugs by date:

Previous
From: PG Bug reporting form
Date:
Subject: BUG #17704: Monitoring Queries trigger 'Signal 7: Bus Error' After Creating Hash Partitioned Table
Next
From: Tom Lane
Date:
Subject: Re: Re: BUG #17704: Monitoring Queries trigger 'Signal 7: Bus Error' After Creating Hash Partitioned Table