Home > mailing lists

Re: Support for N synchronous standby servers - take 2 - Mailing list pgsql-hackers

From	Masahiko Sawada
Subject	Re: Support for N synchronous standby servers - take 2
Date	March 30, 2016 14:56:27
Msg-id	CAD21AoDdC9Ek3E=_dfLi-yHLDYZpitid_Z=b+Z62uqATHbaZJQ@mail.gmail.com Whole thread
In response to	Re: Support for N synchronous standby servers - take 2 (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses	Re: Support for N synchronous standby servers - take 2
List	pgsql-hackers

Tree view

On Wed, Mar 30, 2016 at 11:43 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Tue, Mar 29, 2016 at 5:36 PM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>> I personally don't think it needs such a survive measure. It is
>> very small syntax and the parser reads very short text. If the
>> parser failes in such mode, something more serious should have
>> occurred.
>>
>> At Tue, 29 Mar 2016 16:51:02 +0900, Fujii Masao <masao.fujii@gmail.com> wrote in
<CAHGQGwFth8pnYhaLBx0nF8o4qmwctdzEOcWRqEu7HOwgdJGa3g@mail.gmail.com>
>>> On Tue, Mar 29, 2016 at 4:23 PM, Kyotaro HORIGUCHI
>>> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>>> > Hello,
>>> >
>>> > At Mon, 28 Mar 2016 18:38:22 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in
<CAD21AoAJMDV1EUKMfeyaV24arx4pzUjGHYbY4ZxzKpkiCUvh0Q@mail.gmail.com>
>>> > sawada.mshk> On Mon, Mar 28, 2016 at 5:50 PM, Kyotaro HORIGUCHI
>>> >> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>>> > As mentioned in my comment, SQL parser converts yy_fatal_error
>>> > into ereport(ERROR), which can be caught by the upper PG_TRY (by
>>> > #define'ing fprintf). So it is doable if you mind exit().
>>>
>>> I'm afraid that your idea doesn't work in postmaster. Because ereport(ERROR) is
>>> implicitly promoted to ereport(FATAL) in postmaster. IOW, when an internal
>>> flex fatal error occurs, postmaster just exits instead of jumping out of parser.
>>
>> If The ERROR may be LOG or DEBUG2 either, if we think the parser
>> fatal erros are recoverable. guc-file.l is doing so.
>>
>>> ISTM that, when an internal flex fatal error occurs, it's
>>> better to elog(FATAL) and terminate the problematic
>>> process. This might lead to the server crash (e.g., if
>>> postmaster emits a FATAL error, it and its all child processes
>>> will exit soon). But probably we can live with this because the
>>> fatal error basically rarely happens.
>>
>> I agree to this
>>
>>> OTOH, if we make the process keep running even after it gets an internal
>>> fatal error (like Sawada's patch or your idea do), this might cause more
>>> serious problem. Please imagine the case where one walsender gets the fatal
>>> error (e.g., because of OOM), abandon new setting value of
>>> synchronous_standby_names, and keep running with the previous setting value.
>>> OTOH, the other walsender processes successfully parse the setting and
>>> keep running with new setting. In this case, the inconsistency of the setting
>>> which each walsender is based on happens. This completely will mess up the
>>> synchronous replication.
>>
>> On the other hand, guc-file.l seems ignoring parser errors under
>> normal operation, even though it may cause similar inconsistency,
>> if any..
>>
>> | LOG:  received SIGHUP, reloading configuration files
>> | LOG:  input in flex scanner failed at file "/home/horiguti/data/data_work/postgresql.conf" line 1
>> | LOG:  configuration file "/home/horiguti/data/data_work/postgresql.conf" contains errors; no changes were applied
>>
>>> Therefore, I think that it's better to make the problematic process exit
>>> with FATAL error rather than ignore the error and keep it running.
>>
>> +1. Restarting walsender would be far less harmful than keeping
>> it running in doubtful state.
>>
>> Sould I wait for the next version or have a look on the latest?
>>
>
> Attached latest patch incorporate some review comments so far, and is
> rebased against current HEAD.
>

Sorry I attached wrong patch.
Attached patch is correct patch.

Regards,

--
Masahiko Sawada

Attachment

multi_sync_replication_v21.patch

pgsql-hackers by date:

From: Masahiko Sawada
Date: 30 March 2016, 14:44:27
Subject: Re: Support for N synchronous standby servers - take 2

From: Aleksander Alekseev
Date: 30 March 2016, 14:59:55
Subject: Re: WIP: Access method extendability

Re: Support for N synchronous standby servers - take 2 - Mailing list pgsql-hackers

Attachment

Previous

Next