Thread: auto_explain causes cluster crash if pg_ctl reload is used (not pg_ctl restart)

Tried on both
PostgreSQL 9.0.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc
(GCC) 4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit
and
PostgreSQL 9.0.3 on x86_64-unknown-linux-gnu, compiled by GCC gcc
(GCC) 4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit

shared_preload_libraries = 'pg_stat_statements,auto_explain'
custom_variable_classes = 'auto_explain'
auto_explain.log_min_duration = '10s'
auto_explain.log_analyze = true
auto_explain.log_buffers = true

I was testing to see if any of the settings above got applied after
issuing a "pg_ctl reload" rather than a restart, and I was surprised
to see that I could crash my db cluster. I realize that the docs say
to issue a restart, but the crash seems a tad user-unfriendly.

Any other details I should provide?

Re: auto_explain causes cluster crash if pg_ctl reload is used (not pg_ctl restart)

From: Heikki Linnakangas
On 25.10.2011 18:42, bricklen wrote:
> Tried on both
> PostgreSQL 9.0.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc
> (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit
> and
> PostgreSQL 9.0.3 on x86_64-unknown-linux-gnu, compiled by GCC gcc
> (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit
>
> shared_preload_libraries = 'pg_stat_statements,auto_explain'
> custom_variable_classes = 'auto_explain'
> auto_explain.log_min_duration = '10s'
> auto_explain.log_analyze = true
> auto_explain.log_buffers = true
>
> I was testing to see if any of the settings above got applied after
> issuing a "pg_ctl reload" rather than a restart, and I was surprised
> to see that I could crash my db cluster. I realize that the docs say
> to issue a restart, but the crash seems a tad user-unfriendly.
>
> Any other details I should provide?

A backtrace from the core dump would help a lot. Can you get one?
More precise instructions on how to reproduce this would also be nice.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
On Tue, Oct 25, 2011 at 9:22 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> On 25.10.2011 18:42, bricklen wrote:
>>
>> Tried on both
>> PostgreSQL 9.0.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc
>> (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit
>> and
>> PostgreSQL 9.0.3 on x86_64-unknown-linux-gnu, compiled by GCC gcc
>> (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit
>>
>> shared_preload_libraries = 'pg_stat_statements,auto_explain'
>> custom_variable_classes = 'auto_explain'
>> auto_explain.log_min_duration = '10s'
>> auto_explain.log_analyze = true
>> auto_explain.log_buffers = true
>>
>> I was testing to see if any of the settings above got applied after
>> issuing a "pg_ctl reload" rather than a restart, and I was surprised
>> to see that I could crash my db cluster. I realize that the docs say
>> to issue a restart, but the crash seems a tad user-unfriendly.
>>
>> Any other details I should provide?
>
> A backtrace from the core dump would help a lot. Can you get one? More
> precise instructions on how to reproduce this would also be nice.

I tried generating a core dump according to the steps at
http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD
but either I was doing it incorrectly, or it wasn't dumping core
anywhere I could find.

The steps to reproduce are pretty simple.

Set the following in the postgresql.conf file:

shared_preload_libraries = 'pg_stat_statements,auto_explain'
custom_variable_classes = 'auto_explain'
auto_explain.log_min_duration = '10s'
auto_explain.log_analyze = true
auto_explain.log_buffers = true

As the postgres user, issue "pg_ctl reload"

pg_ctl status will now show that there is no running postmaster.  I
won't have access to those servers for at least another couple of
hours, but if there are some steps I should try to get a core dump I'm
willing to try them when the servers are free again.

The servers are running on CentOS 5.7 and CentOS 5.6, with the
packages from http://yum.pgrpms.org/9.0/redhat/
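
For anyone who wants to script the reproduction steps above, here is a
minimal sketch. It assumes pg_ctl is on the PATH, that the settings listed
above are already in postgresql.conf, and that a non-zero exit status from
"pg_ctl status" means no postmaster is running. The function name, the
data directory argument and the one-second delay are made up for
illustration and are not part of the original report.

```python
import subprocess
import time


def reload_and_check(data_dir, run=subprocess.run, delay=1.0):
    """Issue 'pg_ctl reload', then report whether the postmaster survived.

    Returns True if 'pg_ctl status' exits with 0 afterwards, False otherwise.
    'run' is injectable so the logic can be exercised without a real server.
    """
    # Ask the postmaster to re-read postgresql.conf (sends SIGHUP).
    run(["pg_ctl", "-D", data_dir, "reload"], check=False)

    # Assumption: give the server a moment to process the reload
    # before asking for its status.
    time.sleep(delay)

    # Exit status 0 means a postmaster is running for this data directory.
    status = run(["pg_ctl", "-D", data_dir, "status"], check=False)
    return status.returncode == 0


# Example call (hypothetical data directory, run as the postgres user):
#   reload_and_check("/var/lib/pgsql/9.0/data")
```

On an affected server this would be expected to return False after the
reload, matching the "no running postmaster" result described above.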
bricklen <bricklen@gmail.com> writes:
> The steps to reproduce are pretty simple.

> Set the following in the postgresql.conf file:

> shared_preload_libraries = 'pg_stat_statements,auto_explain'
> custom_variable_classes = 'auto_explain'
> auto_explain.log_min_duration = '10s'
> auto_explain.log_analyze = true
> auto_explain.log_buffers = true

> As the postgres user, issue "pg_ctl reload"

> pg_ctl status will now show that there is no running postmaster.

This looks like the same thing as bug #6097, which is fixed in 9.0.5.

            regards, tom lane
On Tue, Oct 25, 2011 at 2:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> bricklen <bricklen@gmail.com> writes:
>> The steps to reproduce are pretty simple.
>
>> Set the following in the postgresql.conf file:
>
>> shared_preload_libraries = 'pg_stat_statements,auto_explain'
>> custom_variable_classes = 'auto_explain'
>> auto_explain.log_min_duration = '10s'
>> auto_explain.log_analyze = true
>> auto_explain.log_buffers = true
>
>> As the postgres user, issue "pg_ctl reload"
>
>> pg_ctl status will now show that there is no running postmaster.
>
> This looks like the same thing as bug #6097, which is fixed in 9.0.5.
>
>             regards, tom lane
>

Yup, you nailed it. I upgraded to 9.0.5 and the earlier changes no
longer trigger the crash.

Thanks!
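
For reference, a rough way to check whether a server reports a version
from before the fix is to parse the output of "SELECT version()". This is
only a sketch: the function name is made up, and it treats every 9.0.x
release before 9.0.5 as affected, which is an assumption. The thread only
confirms the crash on 9.0.3 and 9.0.4 and its absence on 9.0.5, and says
nothing about other branches.

```python
import re


def affected_by_reload_crash(version_string):
    """Return True if the version string names a 9.0.x release before 9.0.5.

    Assumption: all 9.0.x releases below 9.0.5 are treated as affected;
    only 9.0.3 and 9.0.4 were actually confirmed in the thread.
    Other branches are reported as not affected because the thread
    gives no information about them.
    """
    match = re.search(r"PostgreSQL (\d+)\.(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version_string)
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor) == (9, 0) and patch < 5


# Example with the version string reported at the top of the thread:
#   affected_by_reload_crash(
#       "PostgreSQL 9.0.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc")
```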