Why is lorikeet so unstable in v14 branch only? - Mailing list pgsql-hackers

From Tom Lane
Subject Why is lorikeet so unstable in v14 branch only?
Msg-id 136102.1648320427@sss.pgh.pa.us
List pgsql-hackers
I chanced to notice that buildfarm member lorikeet has been
failing an awful lot lately in the v14 branch, but hardly
at all in other branches.  Here's a log extract from its
latest run [1]:

2022-03-26 06:31:47.245 EDT [623eeb93.d202:131] pg_regress/inherit LOG:  statement: create table mlparted_tab (a int, b char,c text) partition by list (a);
2022-03-26 06:31:47.247 EDT [623eeb93.d202:132] pg_regress/inherit LOG:  statement: create table mlparted_tab_part1 partition of mlparted_tab for values in (1);
2022-03-26 06:31:47.254 EDT [623eeb93.d203:60] pg_regress/vacuum LOG:  statement: VACUUM FULL pg_class;
2022-03-26 06:31:47.258 EDT [623eeb92.d201:90] pg_regress/typed_table LOG:  statement: SELECT a.attname,
      pg_catalog.format_type(a.atttypid, a.atttypmod),
      (SELECT pg_catalog.pg_get_expr(d.adbin, d.adrelid, true)
       FROM pg_catalog.pg_attrdef d
       WHERE d.adrelid = a.attrelid AND d.adnum = a.attnum AND a.atthasdef),
      a.attnotnull,
      (SELECT c.collname FROM pg_catalog.pg_collation c, pg_catalog.pg_type t
       WHERE c.oid = a.attcollation AND t.oid = a.atttypid AND a.attcollation <> t.typcollation) AS attcollation,
      a.attidentity,
      a.attgenerated
    FROM pg_catalog.pg_attribute a
    WHERE a.attrelid = '21770' AND a.attnum > 0 AND NOT a.attisdropped
    ORDER BY a.attnum;
*** starting debugger for pid 53762, tid 10536
2022-03-26 06:32:02.158 EDT [623eeb6c.d0c2:4] LOG:  server process (PID 53762) exited with exit code 127
2022-03-26 06:32:02.158 EDT [623eeb6c.d0c2:5] DETAIL:  Failed process was running: create table mlparted_tab_part1 partition of mlparted_tab for values in (1);
2022-03-26 06:32:02.158 EDT [623eeb6c.d0c2:6] LOG:  terminating any other active server processes

The failures are not all exactly like this one, but they're mostly in
CREATE TABLE operations near this one.  I speculate what is happening
is that the "VACUUM FULL pg_class" is triggering some misbehavior in
concurrent partitioned-table creation.  The lack of failures in other
branches could be due to changes in the relative timing of the "vacuum"
and "inherit" test scripts.
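
As a cross-check on which session crashed: the bracketed field in each log line looks like the %c session ID from log_line_prefix, which PostgreSQL documents as the backend start time and the process ID, both in hex, separated by a dot. Assuming that is how lorikeet's prefix is configured, a small sketch along these lines decodes it (decode_session_id is a made-up helper name, not anything in the tree):

```python
from datetime import datetime, timezone


def decode_session_id(session_id: str) -> tuple:
    """Split a %c-style session ID into (backend start time in UTC, PID).

    Assumes the "<hex start time>.<hex pid>" layout described for the %c
    escape of log_line_prefix.
    """
    start_hex, pid_hex = session_id.split(".")
    started = datetime.fromtimestamp(int(start_hex, 16), tz=timezone.utc)
    return started, int(pid_hex, 16)


# Session IDs copied from the log extract above.
for sid in ("623eeb93.d202", "623eeb93.d203", "623eeb92.d201"):
    started, pid = decode_session_id(sid)
    print(sid, started.isoformat(), pid)
```

The pg_regress/inherit session, 623eeb93.d202, decodes to PID 53762, which is the PID the postmaster reports as having exited with code 127.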

Any chance we could get a stack trace from one of these crashes?

            regards, tom lane

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lorikeet&dt=2022-03-26%2010%3A17%3A22


