Hi,

On 2023-07-25 20:10:06 -0400, Tom Lane wrote:
> grassquit has been failing to run the regression tests for the last
> few days, since [1]:
>
> # +++ regress check in src/test/regress +++
> # using temp instance on port 6880 with PID 766305
> ERROR: stack depth limit exceeded
> HINT: Increase the configuration parameter "max_stack_depth" (currently 2048kB), after ensuring the platform's stack depth limit is adequate.
> ERROR: stack depth limit exceeded
> HINT: Increase the configuration parameter "max_stack_depth" (currently 2048kB), after ensuring the platform's stack depth limit is adequate.
> # command failed: "psql" -X -q -c "CREATE DATABASE \\"regression\\" TEMPLATE=template0 LOCALE='C'" -c "ALTER DATABASE \\"regression\\" SET lc_messages TO 'C';ALTER DATABASE \\"regression\\" SET lc_monetary TO 'C';ALTER DATABASE \\"regression\\" SET lc_numeric TO 'C';ALTER DATABASE \\"regression\\" SET lc_time TO 'C';ALTER DATABASE \\"regression\\" SET bytea_output TO 'hex';ALTER DATABASE \\"regression\\" SET timezone_abbreviations TO 'Default';" "postgres"
>
> This seems to be happening in all branches, so I doubt it
> has anything to do with recent PG commits. Instead, I notice
> that grassquit seems to have been updated to a newer Debian
> release: 6.1.0-6-amd64 works, 6.3.0-2-amd64 doesn't.

Indeed, the automated updates had been stuck, and I fixed that a few days
ago. Unfortunately, I missed grassquit's failures.

I think I know what the issue is - I have seen it locally. Newer versions of
gcc and clang (or libasan?) default to enabling "stack use after return"
checks - for some reason that implies using a shadow stack *sometimes*, which,
not so surprisingly, can confuse our stack depth checking terribly.
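
To illustrate why that is confusing, here is a minimal sketch of an
address-based depth check. It is not the actual PostgreSQL code; the names
stack_distance() and stack_is_too_deep() are made up for this example. The
idea is just that depth is estimated as the distance between a reference
address taken near the start of the stack and the address of a current local
variable. If the sanitizer places that local somewhere other than the real
stack, the computed distance no longer says anything about real stack usage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: distance in bytes between a reference address
 * ("base", taken near the start of the stack) and the address of a local
 * variable in the current frame ("here"). Taking the absolute difference
 * makes it independent of the direction in which the stack grows.
 */
size_t
stack_distance(const void *base, const void *here)
{
    uintptr_t b = (uintptr_t) base;
    uintptr_t h = (uintptr_t) here;

    assert(base != NULL && here != NULL);

    return (b > h) ? (size_t) (b - h) : (size_t) (h - b);
}

/*
 * Hypothetical check: report "too deep" once the estimated distance exceeds
 * the configured limit. This is only meaningful while "here" really points
 * into the same stack as "base". If locals are moved off the real stack
 * (assumed here to be what the stack-use-after-return checks do), the
 * distance can be arbitrarily large and the check fires spuriously.
 */
int
stack_is_too_deep(const void *base, const void *here, size_t limit)
{
    return stack_distance(base, here) > limit;
}
```

A caller would record the address of a local in main() as "base" and pass the
address of a local in the current function as "here", with the limit
corresponding to max_stack_depth.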

I've been meaning to look into what we could do to fix that, but I haven't
found the cycles...

For now I added :detect_stack_use_after_return=0 to ASAN_OPTIONS, which I
think should fix the issue.

Greetings,

Andres Freund