Thread: Buildfarm animal grassquit is failing

Buildfarm animal grassquit is failing

From: Tom Lane
Date: 2023-07-25 20:10:06 -0400
grassquit has been failing to run the regression tests for the last
few days, since [1]:

# +++ regress check in src/test/regress +++
# using temp instance on port 6880 with PID 766305
ERROR:  stack depth limit exceeded
HINT:  Increase the configuration parameter "max_stack_depth" (currently 2048kB), after ensuring the platform's stack depth limit is adequate.
ERROR:  stack depth limit exceeded
HINT:  Increase the configuration parameter "max_stack_depth" (currently 2048kB), after ensuring the platform's stack depth limit is adequate.
# command failed: "psql" -X -q -c "CREATE DATABASE \\"regression\\" TEMPLATE=template0 LOCALE='C'" -c "ALTER DATABASE \\"regression\\" SET lc_messages TO 'C';ALTER DATABASE \\"regression\\" SET lc_monetary TO 'C';ALTER DATABASE \\"regression\\" SET lc_numeric TO 'C';ALTER DATABASE \\"regression\\" SET lc_time TO 'C';ALTER DATABASE \\"regression\\" SET bytea_output TO 'hex';ALTER DATABASE \\"regression\\" SET timezone_abbreviations TO 'Default';" "postgres"

This seems to be happening in all branches, so I doubt it
has anything to do with recent PG commits.  Instead, I notice
that grassquit seems to have been updated to a newer Debian
release: 6.1.0-6-amd64 works, 6.3.0-2-amd64 doesn't.

            regards, tom lane

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grassquit&dt=2023-07-23%2011%3A05%3A50



Re: Buildfarm animal grassquit is failing

From: Andres Freund
Hi,

On 2023-07-25 20:10:06 -0400, Tom Lane wrote:
> grassquit has been failing to run the regression tests for the last
> few days, since [1]:
> 
> # +++ regress check in src/test/regress +++
> # using temp instance on port 6880 with PID 766305
> ERROR:  stack depth limit exceeded
> HINT:  Increase the configuration parameter "max_stack_depth" (currently 2048kB), after ensuring the platform's stack depth limit is adequate.
> ERROR:  stack depth limit exceeded
> HINT:  Increase the configuration parameter "max_stack_depth" (currently 2048kB), after ensuring the platform's stack depth limit is adequate.
> # command failed: "psql" -X -q -c "CREATE DATABASE \\"regression\\" TEMPLATE=template0 LOCALE='C'" -c "ALTER DATABASE \\"regression\\" SET lc_messages TO 'C';ALTER DATABASE \\"regression\\" SET lc_monetary TO 'C';ALTER DATABASE \\"regression\\" SET lc_numeric TO 'C';ALTER DATABASE \\"regression\\" SET lc_time TO 'C';ALTER DATABASE \\"regression\\" SET bytea_output TO 'hex';ALTER DATABASE \\"regression\\" SET timezone_abbreviations TO 'Default';" "postgres"
> 
> This seems to be happening in all branches, so I doubt it
> has anything to do with recent PG commits.  Instead, I notice
> that grassquit seems to have been updated to a newer Debian
> release: 6.1.0-6-amd64 works, 6.3.0-2-amd64 doesn't.

Indeed, the automated updates had been stuck, and I fixed that a few days
ago. I missed grassquit's failures unfortunately.

I think I know what the issue is - I have seen these failures locally. Newer
versions of gcc and clang (or libasan?) enable "stack use after return" checks
by default - for some reason that implies using a shadow stack *sometimes*,
which, not so surprisingly, can confuse our stack depth checking terribly.
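
To illustrate why a relocated frame would break the check, here is a minimal sketch in C. It assumes the depth check works by comparing the address of a local variable against a reference address recorded at startup. The names depth_exceeded_at, depth_exceeded and MAX_STACK_DEPTH_BYTES are hypothetical and this is a simplified model, not PostgreSQL's actual code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative limit matching the 2048kB shown in the HINT above. */
#define MAX_STACK_DEPTH_BYTES ((size_t) 2048 * 1024)

/*
 * Hypothetical helper: report whether the distance between a reference
 * address recorded at startup and the address of a current local variable
 * exceeds the limit. Addresses are passed as integers so that the
 * arithmetic is well defined.
 */
bool
depth_exceeded_at(uintptr_t base, uintptr_t here, size_t limit)
{
    uintptr_t distance = (here > base) ? here - base : base - here;

    return distance > limit;
}

/*
 * Hypothetical wrapper: use the address of a local variable as a proxy for
 * the current stack position. If the sanitizer places this local in a
 * separately allocated frame, its address is unrelated to the real stack
 * and the computed distance is meaningless.
 */
bool
depth_exceeded(uintptr_t base)
{
    char here;

    return depth_exceeded_at(base, (uintptr_t) &here, MAX_STACK_DEPTH_BYTES);
}
```

In this model a local that lives a few kilobytes from the reference address passes the check, while one that has been moved to some distant allocation looks like megabytes of stack use even though the real stack is nearly empty.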

I've been meaning to look into what we could do to fix that, but I haven't
found the cycles...

For now I added :detect_stack_use_after_return=0 to ASAN_OPTIONS, which I
think should fix the issue.
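
For reference, the same option can also be supplied from inside a program rather than through the environment. The sketch below assumes the AddressSanitizer runtime's __asan_default_options hook, which the runtime consults for default options when the program defines it. It is an illustration of the setting only, not what was changed on grassquit.

```c
/*
 * Sketch: default AddressSanitizer options compiled into the binary.
 * The returned string uses the same syntax as the ASAN_OPTIONS
 * environment variable. Values set in ASAN_OPTIONS are expected to
 * take precedence over these defaults.
 */
const char *
__asan_default_options(void)
{
    return "detect_stack_use_after_return=0";
}
```

When the program is built without the sanitizer this is just an unused ordinary function, so it should be harmless to leave in place.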

Greetings,

Andres Freund