Re: Support for N synchronous standby servers - take 2 - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Support for N synchronous standby servers - take 2 |
Date | |
Msg-id | CAD21AoDdC9Ek3E=_dfLi-yHLDYZpitid_Z=b+Z62uqATHbaZJQ@mail.gmail.com |
In response to | Re: Support for N synchronous standby servers - take 2 (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses | Re: Support for N synchronous standby servers - take 2 |
List | pgsql-hackers |
On Wed, Mar 30, 2016 at 11:43 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Tue, Mar 29, 2016 at 5:36 PM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>> I personally don't think it needs such a survive measure. It is
>> very small syntax and the parser reads very short text. If the
>> parser fails in such mode, something more serious should have
>> occurred.
>>
>> At Tue, 29 Mar 2016 16:51:02 +0900, Fujii Masao <masao.fujii@gmail.com> wrote in <CAHGQGwFth8pnYhaLBx0nF8o4qmwctdzEOcWRqEu7HOwgdJGa3g@mail.gmail.com>
>>> On Tue, Mar 29, 2016 at 4:23 PM, Kyotaro HORIGUCHI
>>> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>>> > Hello,
>>> >
>>> > At Mon, 28 Mar 2016 18:38:22 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoAJMDV1EUKMfeyaV24arx4pzUjGHYbY4ZxzKpkiCUvh0Q@mail.gmail.com>
>>> > sawada.mshk> On Mon, Mar 28, 2016 at 5:50 PM, Kyotaro HORIGUCHI
>>> >> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>>> > As mentioned in my comment, SQL parser converts yy_fatal_error
>>> > into ereport(ERROR), which can be caught by the upper PG_TRY (by
>>> > #define'ing fprintf). So it is doable if you mind exit().
>>>
>>> I'm afraid that your idea doesn't work in postmaster. Because ereport(ERROR) is
>>> implicitly promoted to ereport(FATAL) in postmaster. IOW, when an internal
>>> flex fatal error occurs, postmaster just exits instead of jumping out of parser.
>>
>> The ERROR may be LOG or DEBUG2 either, if we think the parser
>> fatal errors are recoverable. guc-file.l is doing so.
>>
>>> ISTM that, when an internal flex fatal error occurs, it's
>>> better to elog(FATAL) and terminate the problematic
>>> process. This might lead to the server crash (e.g., if
>>> postmaster emits a FATAL error, it and its all child processes
>>> will exit soon). But probably we can live with this because the
>>> fatal error basically rarely happens.
>>
>> I agree to this
>>
>>> OTOH, if we make the process keep running even after it gets an internal
>>> fatal error (like Sawada's patch or your idea do), this might cause more
>>> serious problem. Please imagine the case where one walsender gets the fatal
>>> error (e.g., because of OOM), abandon new setting value of
>>> synchronous_standby_names, and keep running with the previous setting value.
>>> OTOH, the other walsender processes successfully parse the setting and
>>> keep running with new setting. In this case, the inconsistency of the setting
>>> which each walsender is based on happens. This completely will mess up the
>>> synchronous replication.
>>
>> On the other hand, guc-file.l seems ignoring parser errors under
>> normal operation, even though it may cause similar inconsistency,
>> if any..
>>
>> | LOG: received SIGHUP, reloading configuration files
>> | LOG: input in flex scanner failed at file "/home/horiguti/data/data_work/postgresql.conf" line 1
>> | LOG: configuration file "/home/horiguti/data/data_work/postgresql.conf" contains errors; no changes were applied
>>
>>> Therefore, I think that it's better to make the problematic process exit
>>> with FATAL error rather than ignore the error and keep it running.
>>
>> +1. Restarting walsender would be far less harmful than keeping
>> it running in doubtful state.
>>
>> Should I wait for the next version or have a look on the latest?
>>
>
> Attached latest patch incorporate some review comments so far, and is
> rebased against current HEAD.
>

Sorry, I attached the wrong patch.
The patch attached to this mail is the correct one.

Regards,

--
Masahiko Sawada
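
For readers following the thread, the inconsistency scenario Fujii describes in the quoted text can be illustrated with a toy model. This is only a sketch and not PostgreSQL code: `parse_setting`, `reload_all`, and `is_consistent` are hypothetical names, the simulated failure stands in for an internal scanner error such as OOM, and the model assumes that a restarted walsender parses the new value successfully.

```python
# Toy model of the scenario discussed above. Not PostgreSQL code:
# every name here is hypothetical and exists only for illustration.

def parse_setting(value, fail=False):
    """Stand-in for the synchronous_standby_names parser.

    `fail=True` simulates an internal scanner failure (e.g. OOM).
    """
    if fail:
        raise MemoryError("simulated internal parser failure")
    return [name.strip() for name in value.split(",")]


def reload_all(old_value, new_value, failing, n_walsenders, exit_on_failure):
    """Return the setting each walsender ends up with after a reload.

    exit_on_failure=False: a failing walsender ignores the error and
        keeps running with the previous setting.
    exit_on_failure=True: a failing walsender exits; its replacement
        starts fresh and (by assumption in this model) parses the new
        value successfully.
    """
    result = []
    for i in range(n_walsenders):
        try:
            current = parse_setting(new_value, fail=(i in failing))
        except MemoryError:
            if exit_on_failure:
                # Process exits; the restarted process re-parses.
                current = parse_setting(new_value)
            else:
                # Process keeps running with the old setting.
                current = parse_setting(old_value)
        result.append(tuple(current))
    return result


def is_consistent(settings):
    """True when every walsender is based on the same setting."""
    return len(set(settings)) == 1


keep_running = reload_all("s1", "s1, s2", failing={1},
                          n_walsenders=3, exit_on_failure=False)
restart = reload_all("s1", "s1, s2", failing={1},
                     n_walsenders=3, exit_on_failure=True)

print(is_consistent(keep_running))  # False: walsender 1 still uses "s1"
print(is_consistent(restart))       # True: all use "s1, s2"
```

In this model, ignoring the failure leaves one walsender on the old value while the others use the new one, which is the mismatch the thread argues against; exiting and restarting brings every process back to the same setting.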