BUG #15367: Crash in pg_fe_scram_free when using foreign tables - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #15367: Crash in pg_fe_scram_free when using foreign tables
Msg-id 153626613985.23143.4743626885618266803@wrigleys.postgresql.org
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      15367
Logged by:          Jeremy Evans
Email address:      code@jeremyevans.net
PostgreSQL version: 10.5
Operating system:   OpenBSD 6.3 amd64
Description:

One of the postgres processes in our production environment crashed with the
following backtrace:

(gdb) bt
#0  thrkill () at -:3
#1  0x000004a14ca47fbe in _libc_abort () at /usr/src/lib/libc/stdlib/abort.c:51
#2  0x000004a14c9fd029 in wrterror (d=Variable "d" is not available.
) at /usr/src/lib/libc/stdlib/malloc.c:288
#3  0x000004a14c9fd34f in ofree (argpool=Variable "argpool" is not available.
) at /usr/src/lib/libc/stdlib/malloc.c:1298
#4  0x000004a14c9fd109 in free (ptr=0x4a115aec398) at /usr/src/lib/libc/stdlib/malloc.c:1416
#5  0x000004a09a7301e8 in pg_fe_scram_free () from /usr/local/lib/libpq.so.6.10
#6  0x000004a09a73561f in pqPacketSend () from /usr/local/lib/libpq.so.6.10
#7  0x000004a09a731507 in PQfinish () from /usr/local/lib/libpq.so.6.10
#8  0x000004a074dc5549 in GetConnection () from /usr/local/lib/postgresql/postgres_fdw.so
#9  0x000004a074dbc02a in postgresBeginForeignScan () from /usr/local/lib/postgresql/postgres_fdw.so
#10 0x0000049e62eab412 in ExecInitForeignScan () from /usr/local/bin/postgres
#11 0x0000049e62e89ace in ExecInitNode () from /usr/local/bin/postgres
#12 0x0000049e62e9aae8 in ExecInitHashJoin () from /usr/local/bin/postgres
#13 0x0000049e62e89b3e in ExecInitNode () from /usr/local/bin/postgres
#14 0x0000049e62ea7a06 in ExecInitSort () from /usr/local/bin/postgres
#15 0x0000049e62e89aee in ExecInitNode () from /usr/local/bin/postgres
#16 0x0000049e62e85379 in standard_ExecutorStart () from /usr/local/bin/postgres
#17 0x0000049e62fc4ffb in PortalStart () from /usr/local/bin/postgres
#18 0x0000049e62fc43d5 in exec_simple_query () from /usr/local/bin/postgres
#19 0x0000049e62fc250b in PostgresMain () from /usr/local/bin/postgres
#20 0x0000049e62f473c5 in PostmasterMain () from /usr/local/bin/postgres
#21 0x0000049e62ec986b in main () from /usr/local/bin/postgres

The PostgreSQL logfile showed:

postgres(64978) in free(): bogus pointer (double free?) 0x4a115aec398
2018-09-06 12:01:52.202 PDT [45953] LOG:  server process (PID 64978) was terminated by signal 6: Abort trap
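
For illustration only: as the malloc message itself suggests, one possible
cause of this abort is the same pointer being passed to free() twice. The
sketch below shows a cleanup routine that is safe to call more than once,
because the owner clears its pointer after freeing. All names here
(demo_conn, demo_auth_state, demo_auth_init, demo_auth_free) are hypothetical
and are not libpq's actual structures or code; this is not a diagnosis of the
crash above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for per-connection authentication state. */
typedef struct demo_auth_state {
    char *client_nonce;
    char *server_nonce;
} demo_auth_state;

/* Hypothetical owner of the state, analogous to a connection object. */
typedef struct demo_conn {
    demo_auth_state *auth;
} demo_conn;

/* Return a heap-allocated copy of s, or NULL if allocation fails. */
static char *demo_copy(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Allocate the state and attach it to conn. Returns 0 on success, -1 on failure. */
int demo_auth_init(demo_conn *conn)
{
    demo_auth_state *st = calloc(1, sizeof(*st));

    if (st == NULL)
        return -1;
    st->client_nonce = demo_copy("client-nonce");
    st->server_nonce = demo_copy("server-nonce");
    if (st->client_nonce == NULL || st->server_nonce == NULL) {
        /* free(NULL) is a no-op, so partial failure is handled uniformly. */
        free(st->client_nonce);
        free(st->server_nonce);
        free(st);
        return -1;
    }
    conn->auth = st;
    return 0;
}

/* Free the state and clear the owner's pointer so a second call does nothing. */
void demo_auth_free(demo_conn *conn)
{
    if (conn == NULL || conn->auth == NULL)
        return;
    free(conn->auth->client_nonce);
    free(conn->auth->server_nonce);
    free(conn->auth);
    conn->auth = NULL;
}
```

With this pattern, calling demo_auth_free() on two different teardown paths
for the same connection is harmless, since the second call sees a NULL
pointer and returns immediately.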

We are using foreign tables to other databases in the same cluster, and all
databases are set to require SCRAM authentication.  We've been using SCRAM
authentication for about 2 months, and foreign tables for a few years
without problems.

PostgreSQL version: PostgreSQL 10.5 on x86_64-unknown-openbsd6.3, compiled
by OpenBSD clang version 5.0.1 (tags/RELEASE_501/final) (based on LLVM
5.0.1), 64-bit

PostgreSQL was installed via OpenBSD ports, using the 6.3-stable branch.

PostgreSQL configuration file changes:

 DateStyle                  | ISO, MDY           | configuration file
 default_text_search_config | pg_catalog.english | configuration file
 dynamic_shared_memory_type | posix              | configuration file
 lc_messages                | C                  | configuration file
 lc_monetary                | C                  | configuration file
 lc_numeric                 | C                  | configuration file
 lc_time                    | C                  | configuration file
 listen_addresses           | *                  | configuration file
 log_timezone               | US/Pacific         | configuration file
 max_connections            | 100                | configuration file
 max_stack_depth            | 2MB                | environment variable
 password_encryption        | scram-sha-256      | configuration file
 shared_buffers             | 128MB              | configuration file
 ssl                        | on                 | configuration file
 TimeZone                   | US/Pacific         | configuration file

Operating system: OpenBSD 6.3 amd64, with OpenBSD syspatches up to 016:
OpenBSD bethpage.bsa.ca.gov 6.3 GENERIC.MP#8 amd64

As the backtrace shows, the error occurred while closing a connection, so
it appears to be unrelated to any specific query.  The backtrace also
shows this is due to the use of foreign tables, and thus likely
independent of what client program was used to connect.  In case it does
matter, the client was using libpq via the ruby-pg library, and no
connection pool or load balancer was being used.

There was nothing unusual in the PostgreSQL logs other than the crash and
the typical crash recovery.

Hardware is an HP DL380p Gen8 server with 256GB of ECC memory and two 800GB
SSDs in hardware RAID-1, with two Xeon E5-2670 CPUs. Given the nature of the
crash, this is unlikely to be disk/RAID related.  General memory corruption
is also unlikely as the memory is ECC.

If necessary I can build a debug version of PostgreSQL and try using that in
production so I can get a better backtrace if it crashes again. However,
considering that the crash is rare in my environment, it's unlikely I will
be able to produce a better backtrace for the error quickly.

