Re: BUG #16833: postgresql 13.1 process crash every hour - Mailing list pgsql-bugs

From Alex F
Subject Re: BUG #16833: postgresql 13.1 process crash every hour
Date
Msg-id CAGbr_zXBK08XdeusNBJrF-sEP9tYSToc7o1wKphgSu2gWu+PaA@mail.gmail.com
In response to Re: BUG #16833: postgresql 13.1 process crash every hour  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: BUG #16833: postgresql 13.1 process crash every hour  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-bugs
Dear Peter,
First of all, thank you for your input and for the upcoming fix. In any case, the application should not crash with a segfault; it should just log an error.

Another point I should mention is the amcheck extension and the "magic" query that helped us find the broken index. Without those queries it was completely unclear why the application crashed.
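
For anyone who finds this thread later, here is a rough sketch of how such a check could be scripted. It is not the exact query used in this thread. It assumes the amcheck extension is already installed (CREATE EXTENSION amcheck), that `conn` is a Python DB-API connection using the `%s` parameter style (psycopg2, for example), and the function name `find_broken_indexes` is made up for illustration. Each index is checked separately because bt_index_check() raises an error on the first problem it finds, which would otherwise hide the remaining indexes.

```python
# Sketch only: assumes amcheck is installed and `conn` is a DB-API
# connection with the "%s" parameter style (e.g. psycopg2).

# List every valid, non-temporary B-tree index, smallest first.
LIST_BTREE_INDEXES_SQL = """
SELECT c.oid::regclass::text
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_am am ON am.oid = c.relam
WHERE am.amname = 'btree'
  AND c.relpersistence <> 't'
  AND i.indisready AND i.indisvalid
ORDER BY c.relpages
"""

# heapallindexed => true also verifies that every heap tuple has an
# index entry; it is slower than the plain structural check.
CHECK_INDEX_SQL = "SELECT bt_index_check(%s::regclass, true)"


def find_broken_indexes(conn):
    """Return a list of (index_name, error_message) for failing indexes."""
    cur = conn.cursor()
    cur.execute(LIST_BTREE_INDEXES_SQL)
    names = [row[0] for row in cur.fetchall()]

    broken = []
    for name in names:
        try:
            cur.execute(CHECK_INDEX_SQL, (name,))
            cur.fetchall()
        except Exception as exc:  # driver-specific error class in practice
            # The failed statement aborts the transaction, so roll back
            # before checking the next index.
            conn.rollback()
            broken.append((name, str(exc)))
    return broken
```

An index reported by such a check can then be rebuilt with REINDEX. Note that bt_index_check() reads the whole index (and the table when heapallindexed is true), so a run over a large database can take a while.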

Is it possible to extend the error log so that it helps to understand what exactly went wrong?
For example, if the error log looked like this:
2021-05-14 06:10:54 UTC [22258]: user=,db=,app=,client= LOG:  server process (PID 22273) was terminated by signal 11: Segmentation fault
2021-05-14 06:10:54 UTC [22258]: user=,db=,app=,client= DETAIL:  Failed process was running: REFRESH MATERIALIZED VIEW CONCURRENTLY project.product_master_mv
 ***CAUSED BY violated for index "name_original_idx_s"***
That is, the line marked with *** would really help the user understand the root cause of the issue and significantly decrease database recovery time.
In my case I had to create a separate VM, create a database from scratch, and restore it from a pg_dump. Unfortunately, these actions caused significant downtime.

In a master-standby configuration, WAL replication does not protect standby servers from broken objects (a broken index in the described case).
Please advise: is it possible to use logical replication here? As I understand it, logical replication should not propagate broken objects to the standby.

Thanks for your support!
On Sat, May 15, 2021 at 03:10, Peter Geoghegan <pg@bowt.ie> wrote:
On Fri, May 14, 2021 at 1:13 PM Alex F <phoedos16@gmail.com> wrote:
> Thanks for your support!

I just pushed a commit that adds hardening that will be sufficient to
prevent this being a hard crash. Of course the index should not become
corrupt in the first place, but at least in Postgres 13.4 the same
scenario will result in an error rather than in a hard crash.

Thanks
--
Peter Geoghegan
