Thread: PostgreSQL doesn't stop propley when --slot option is specified with pg_receivexlog.

Hi,

"pg_ctl stop" does't work propley, if --slot option is specified when WAL is flushed only it has switched.
These processes still continue even after the posmaster failed:pg_receivexlog, walsender and logger.

How to reproduce:
1.Start PostgreSQL
2.Create slot
3.Specify --slot option to pg_receivexlog and start it
4.Stop PostgreSQL

Example commands for reproduce:
1.pg_ctl start
2.psql -c "select pg_create_physical_replication_slot('test');"
3.pg_receivexlog --slot='test' -D ./testdir
4.pg_ctl stop

This happen cause --slot option set a flush location to feedback messages,
but it only flush when wal has switched, so walsender wait for the WAL have been replicated forever.

When --slot option is not specified, 'invalid' is set to flush location and it use write location to check replication
soit can stop the process propley.
 

So I thought set 'invalid' to flush location as well in this case, this problem will be solved.
Does --slot option need to set flush location?
Thought?

Regards,

--
Furuya Osamu

On Fri, Nov 14, 2014 at 7:22 PM,  <furuyao@pm.nttdata.co.jp> wrote:
> Hi,
>
> "pg_ctl stop" does't work propley, if --slot option is specified when WAL is flushed only it has switched.
> These processes still continue even after the posmaster failed:pg_receivexlog, walsender and logger.

I could reproduce this problem. At normal shutdown, walsender keeps waiting
for the last WAL record to be replicated and flushed in pg_receivexlog. But
pg_receivexlog issues sync command only when WAL file is switched. Thus,
since pg_receivexlog may never flush the last WAL record, walsender may have
to keep waiting infinitely.

pg_recvlogical handles this problem by calling fsync() when it receives the
request of immediate reply from the server. That is, at shutdown, walsender
sends the request, pg_receivexlog receives it, flushes the last WAL record,
and sends the flush location back to the server. Since walsender can see that
the last WAL record is successfully flushed in pg_receivexlog, it can
exit cleanly.

One idea to the problem is to introduce the same logic as pg_recvlogical has,
to pg_receivexlog. Thought?

Regards,

-- 
Fujii Masao



On 2014-11-15 03:25:16 +0900, Fujii Masao wrote:
> On Fri, Nov 14, 2014 at 7:22 PM,  <furuyao@pm.nttdata.co.jp> wrote:
> > Hi,
> >
> > "pg_ctl stop" does't work propley, if --slot option is specified when WAL is flushed only it has switched.
> > These processes still continue even after the posmaster failed:pg_receivexlog, walsender and logger.
> 
> I could reproduce this problem. At normal shutdown, walsender keeps waiting
> for the last WAL record to be replicated and flushed in pg_receivexlog. But
> pg_receivexlog issues sync command only when WAL file is switched. Thus,
> since pg_receivexlog may never flush the last WAL record, walsender may have
> to keep waiting infinitely.

Right.

> pg_recvlogical handles this problem by calling fsync() when it receives the
> request of immediate reply from the server. That is, at shutdown, walsender
> sends the request, pg_receivexlog receives it, flushes the last WAL record,
> and sends the flush location back to the server. Since walsender can see that
> the last WAL record is successfully flushed in pg_receivexlog, it can
> exit cleanly.
> 
> One idea to the problem is to introduce the same logic as pg_recvlogical has,
> to pg_receivexlog. Thought?

Sounds sane to me. Are you looking into doing that?

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



On Sat, Nov 15, 2014 at 3:42 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-11-15 03:25:16 +0900, Fujii Masao wrote:
>> On Fri, Nov 14, 2014 at 7:22 PM,  <furuyao@pm.nttdata.co.jp> wrote:
>> > "pg_ctl stop" does't work propley, if --slot option is specified when WAL is flushed only it has switched.
>> > These processes still continue even after the posmaster failed:pg_receivexlog, walsender and logger.
>>
>> I could reproduce this problem. At normal shutdown, walsender keeps waiting
>> for the last WAL record to be replicated and flushed in pg_receivexlog. But
>> pg_receivexlog issues sync command only when WAL file is switched. Thus,
>> since pg_receivexlog may never flush the last WAL record, walsender may have
>> to keep waiting infinitely.
>
> Right.
It is surprising that nobody complained about that before,
pg_receivexlog has been released two years ago.

>> pg_recvlogical handles this problem by calling fsync() when it receives the
>> request of immediate reply from the server. That is, at shutdown, walsender
>> sends the request, pg_receivexlog receives it, flushes the last WAL record,
>> and sends the flush location back to the server. Since walsender can see that
>> the last WAL record is successfully flushed in pg_receivexlog, it can
>> exit cleanly.
>>
>> One idea to the problem is to introduce the same logic as pg_recvlogical has,
>> to pg_receivexlog. Thought?
>
> Sounds sane to me. Are you looking into doing that?
Yep, sounds a good thing to do if master requested answer from the
client in the keepalive message. Something like the patch attached
would make the deal.
--
Michael

Attachment
On Sat, Nov 15, 2014 at 9:10 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Sat, Nov 15, 2014 at 3:42 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>> On 2014-11-15 03:25:16 +0900, Fujii Masao wrote:
>>> On Fri, Nov 14, 2014 at 7:22 PM,  <furuyao@pm.nttdata.co.jp> wrote:
>>> > "pg_ctl stop" does't work propley, if --slot option is specified when WAL is flushed only it has switched.
>>> > These processes still continue even after the posmaster failed:pg_receivexlog, walsender and logger.
>>>
>>> I could reproduce this problem. At normal shutdown, walsender keeps waiting
>>> for the last WAL record to be replicated and flushed in pg_receivexlog. But
>>> pg_receivexlog issues sync command only when WAL file is switched. Thus,
>>> since pg_receivexlog may never flush the last WAL record, walsender may have
>>> to keep waiting infinitely.
>>
>> Right.
> It is surprising that nobody complained about that before,
> pg_receivexlog has been released two years ago.

It's not so surprising because the problem can happen only when
replication slot is specified, i.e., the version is 9.4 or later.

>>> pg_recvlogical handles this problem by calling fsync() when it receives the
>>> request of immediate reply from the server. That is, at shutdown, walsender
>>> sends the request, pg_receivexlog receives it, flushes the last WAL record,
>>> and sends the flush location back to the server. Since walsender can see that
>>> the last WAL record is successfully flushed in pg_receivexlog, it can
>>> exit cleanly.
>>>
>>> One idea to the problem is to introduce the same logic as pg_recvlogical has,
>>> to pg_receivexlog. Thought?
>>
>> Sounds sane to me. Are you looking into doing that?
> Yep, sounds a good thing to do if master requested answer from the
> client in the keepalive message. Something like the patch attached
> would make the deal.

Isn't it better to do this only when replication slot is used?

Regards,

-- 
Fujii Masao



On Mon, Nov 17, 2014 at 10:02 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Sat, Nov 15, 2014 at 9:10 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> Yep, sounds a good thing to do if master requested answer from the
>> client in the keepalive message. Something like the patch attached
>> would make the deal.
>
> Isn't it better to do this only when replication slot is used?
Makes sense. What about a check using reportFlushPosition then?
--
Michael

Attachment
On Mon, Nov 17, 2014 at 12:30 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Mon, Nov 17, 2014 at 10:02 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> On Sat, Nov 15, 2014 at 9:10 PM, Michael Paquier
>> <michael.paquier@gmail.com> wrote:
>>> Yep, sounds a good thing to do if master requested answer from the
>>> client in the keepalive message. Something like the patch attached
>>> would make the deal.
>>
>> Isn't it better to do this only when replication slot is used?
> Makes sense. What about a check using reportFlushPosition then?

Sounds reasonable. Thanks for updating the patch!
But the patch could not already be applied to the master cleanly
because c4f99d2 heavily changed the code that the patch also touches...
I rewrote the patch and pushed it to both master and REL9_4_STABLE.
Anyway, thanks!

Regards,

-- 
Fujii Masao



On Wed, Nov 19, 2014 at 6:16 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Mon, Nov 17, 2014 at 12:30 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Mon, Nov 17, 2014 at 10:02 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>> On Sat, Nov 15, 2014 at 9:10 PM, Michael Paquier
>>> <michael.paquier@gmail.com> wrote:
>>>> Yep, sounds a good thing to do if master requested answer from the
>>>> client in the keepalive message. Something like the patch attached
>>>> would make the deal.
>>>
>>> Isn't it better to do this only when replication slot is used?
>> Makes sense. What about a check using reportFlushPosition then?
>
> Sounds reasonable. Thanks for updating the patch!
> But the patch could not already be applied to the master cleanly
> because c4f99d2 heavily changed the code that the patch also touches...
> I rewrote the patch and pushed it to both master and REL9_4_STABLE.
> Anyway, thanks!

Is this:

+       if (reportFlushPosition && lastFlushPosition < blockpos &&
+           walfile != 1)

really correct? Shouldn't that walfile test be against -1 (minus one)?


-- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/



On Wed, Nov 19, 2014 at 5:15 PM, Magnus Hagander <magnus@hagander.net> wrote:
> On Wed, Nov 19, 2014 at 6:16 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> On Mon, Nov 17, 2014 at 12:30 PM, Michael Paquier
>> <michael.paquier@gmail.com> wrote:
>>> On Mon, Nov 17, 2014 at 10:02 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>>> On Sat, Nov 15, 2014 at 9:10 PM, Michael Paquier
>>>> <michael.paquier@gmail.com> wrote:
>>>>> Yep, sounds a good thing to do if master requested answer from the
>>>>> client in the keepalive message. Something like the patch attached
>>>>> would make the deal.
>>>>
>>>> Isn't it better to do this only when replication slot is used?
>>> Makes sense. What about a check using reportFlushPosition then?
>>
>> Sounds reasonable. Thanks for updating the patch!
>> But the patch could not already be applied to the master cleanly
>> because c4f99d2 heavily changed the code that the patch also touches...
>> I rewrote the patch and pushed it to both master and REL9_4_STABLE.
>> Anyway, thanks!
>
> Is this:
>
> +       if (reportFlushPosition && lastFlushPosition < blockpos &&
> +           walfile != 1)
>
> really correct? Shouldn't that walfile test be against -1 (minus one)?

Ooops... You're right. That's really mistake... Just fixed and pushed. Thanks!

Regards,

-- 
Fujii Masao