Thread: [HACKERS] Server Crashes if try to provide slot_name='none' at the time ofcreating subscription.
[HACKERS] Server Crashes if try to provide slot_name='none' at the time ofcreating subscription.
From
tushar
Date:
Hi, Server Crashes if we try to provide slot_name='none' at the time of creating subscription - postgres=# create subscription sub2 connection 'dbname=postgres port=5000 user=centos password=f' publication abc with (slot_name='none'); NOTICE: synchronized table states server closed the connection unexpectedly This probably means the server terminated abnormally before or while processingthe request. The connection to the server was lost. Attempting reset: Failed. !> -- regards,tushar EnterpriseDB https://www.enterprisedb.com/ The Enterprise PostgreSQL Company
Re: [HACKERS] Server Crashes if try to provide slot_name='none' atthe time of creating subscription.
From
Masahiko Sawada
Date:
On Mon, May 15, 2017 at 6:41 PM, tushar <tushar.ahuja@enterprisedb.com> wrote: > Hi, > > Server Crashes if we try to provide slot_name='none' at the time of creating > subscription - > > postgres=# create subscription sub2 connection 'dbname=postgres port=5000 > user=centos password=f' publication abc with (slot_name='none'); > NOTICE: synchronized table states > server closed the connection unexpectedly > This probably means the server terminated abnormally > before or while processing the request. > The connection to the server was lost. Attempting reset: Failed. > !> > Thank you for reporting. I think create_slot and enabled should be set to false forcibly when slot_name = 'none'. Attached patch fixes it, more test and regression test case are needed though. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] Server Crashes if try to provide slot_name='none' atthe time of creating subscription.
From
Kuntal Ghosh
Date:
On Mon, May 15, 2017 at 4:00 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > On Mon, May 15, 2017 at 6:41 PM, tushar <tushar.ahuja@enterprisedb.com> wrote: >> Hi, >> >> Server Crashes if we try to provide slot_name='none' at the time of creating >> subscription - >> >> postgres=# create subscription sub2 connection 'dbname=postgres port=5000 >> user=centos password=f' publication abc with (slot_name='none'); >> NOTICE: synchronized table states >> server closed the connection unexpectedly >> This probably means the server terminated abnormally >> before or while processing the request. >> The connection to the server was lost. Attempting reset: Failed. >> !> >> > > Thank you for reporting. > I think create_slot and enabled should be set to false forcibly when > slot_name = 'none'. Attached patch fixes it, more test and regression > test case are needed though. > I think the patch doesn't solve the problem completely. For example, postgres=# create subscription sub3 connection 'dbname=postgres port=5432 user=edb password=edb' publication abc with (slot_name='none',create_slot=true); ERROR: slot_name = NONE and create slot are mutually exclusive options postgres=# create subscription sub3 connection 'dbname=postgres port=5432 user=edb password=edb' publication abc with (create_slot=true,slot_name='none'); ERROR: could not connect to the publisher: could not connect to server: Connection refused Is the server running locally and accepting connections on Unix domain socket "/tmp/.s.PGSQL.5432"? Changing the order of subscription parameter changes the output. I think the modifications should be done at the end of the function. Attached a patch for the same. -- Thanks & Regards, Kuntal Ghosh EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] Server Crashes if try to provide slot_name='none' atthe time of creating subscription.
From
Masahiko Sawada
Date:
On Mon, May 15, 2017 at 8:06 PM, Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote: > On Mon, May 15, 2017 at 4:00 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: >> On Mon, May 15, 2017 at 6:41 PM, tushar <tushar.ahuja@enterprisedb.com> wrote: >>> Hi, >>> >>> Server Crashes if we try to provide slot_name='none' at the time of creating >>> subscription - >>> >>> postgres=# create subscription sub2 connection 'dbname=postgres port=5000 >>> user=centos password=f' publication abc with (slot_name='none'); >>> NOTICE: synchronized table states >>> server closed the connection unexpectedly >>> This probably means the server terminated abnormally >>> before or while processing the request. >>> The connection to the server was lost. Attempting reset: Failed. >>> !> >>> >> >> Thank you for reporting. >> I think create_slot and enabled should be set to false forcibly when >> slot_name = 'none'. Attached patch fixes it, more test and regression >> test case are needed though. >> > I think the patch doesn't solve the problem completely. For example, > postgres=# create subscription sub3 connection 'dbname=postgres > port=5432 user=edb password=edb' publication abc with > (slot_name='none',create_slot=true); > ERROR: slot_name = NONE and create slot are mutually exclusive options > postgres=# create subscription sub3 connection 'dbname=postgres > port=5432 user=edb password=edb' publication abc with > (create_slot=true,slot_name='none'); > ERROR: could not connect to the publisher: could not connect to > server: Connection refused > Is the server running locally and accepting > connections on Unix domain socket "/tmp/.s.PGSQL.5432"? > > Changing the order of subscription parameter changes the output. I > think the modifications should be done at the end of the function. > Attached a patch for the same. > Yes, you're patch fixes it correctly. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] Server Crashes if try to provide slot_name='none' atthe time of creating subscription.
From
Kuntal Ghosh
Date:
On Mon, May 15, 2017 at 4:39 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: While testing with logical replication, I've found that the server hangs if we create a subscription to the same server. For example, $ ./pg_ctl -D Test start -o "-p 5432" $ ./psql -p 5432 postgres -c "CREATE PUBLICATION abc for all tables with (publish='delete');" $ ./psql -p 5432 postgres -c "create subscription sub connection 'dbname=postgres port=5432 user=edb password=edb' publication abc with (slot_name='abcslot');" NOTICE: synchronized table states LOG: logical decoding found initial starting point at 0/162DF18 DETAIL: Waiting for transactions (approximately 1) older than 560 to end. And, it hangs. Is this an expected behavior? -- Thanks & Regards, Kuntal Ghosh EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Server Crashes if try to provide slot_name='none' atthe time of creating subscription.
From
Masahiko Sawada
Date:
On Mon, May 15, 2017 at 8:22 PM, Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote: > On Mon, May 15, 2017 at 4:39 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > While testing with logical replication, I've found that the server > hangs if we create a subscription to the same server. For example, > > $ ./pg_ctl -D Test start -o "-p 5432" > $ ./psql -p 5432 postgres -c "CREATE PUBLICATION abc for all tables > with (publish='delete');" > $ ./psql -p 5432 postgres -c "create subscription sub connection > 'dbname=postgres port=5432 user=edb password=edb' publication abc with > (slot_name='abcslot');" > NOTICE: synchronized table states > LOG: logical decoding found initial starting point at 0/162DF18 > DETAIL: Waiting for transactions (approximately 1) older than 560 to end. > > And, it hangs. Is this an expected behavior? I guess this is what we're discussing on [1]. Petr answered on that, === Yes that's result of how logical replication slots work, the transaction that needs to finish is your transaction. It can be worked around by creating the slot manually via the SQL interface for example and create the subscription using WITH (NOCREATE SLOT, SLOT NAME = 'your slot') . === [1] https://www.postgresql.org/message-id/20170426165954.GK14000%40momjian.us Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] Server Crashes if try to provide slot_name='none' atthe time of creating subscription.
From
Kuntal Ghosh
Date:
On Mon, May 15, 2017 at 5:04 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > On Mon, May 15, 2017 at 8:22 PM, Kuntal Ghosh > <kuntalghosh.2007@gmail.com> wrote: >> On Mon, May 15, 2017 at 4:39 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: >> >> While testing with logical replication, I've found that the server >> hangs if we create a subscription to the same server. For example, >> >> $ ./pg_ctl -D Test start -o "-p 5432" >> $ ./psql -p 5432 postgres -c "CREATE PUBLICATION abc for all tables >> with (publish='delete');" >> $ ./psql -p 5432 postgres -c "create subscription sub connection >> 'dbname=postgres port=5432 user=edb password=edb' publication abc with >> (slot_name='abcslot');" >> NOTICE: synchronized table states >> LOG: logical decoding found initial starting point at 0/162DF18 >> DETAIL: Waiting for transactions (approximately 1) older than 560 to end. >> >> And, it hangs. Is this an expected behavior? > > I guess this is what we're discussing on [1]. > Petr answered on that, > === > Yes that's result of how logical replication slots work, the transaction > that needs to finish is your transaction. It can be worked around by > creating the slot manually via the SQL interface for example and create > the subscription using WITH (NOCREATE SLOT, SLOT NAME = 'your slot') . > === > > [1] https://www.postgresql.org/message-id/20170426165954.GK14000%40momjian.us > Ohh I see. Thank you. -- Thanks & Regards, Kuntal Ghosh EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Server Crashes if try to provide slot_name='none' atthe time of creating subscription.
From
Masahiko Sawada
Date:
On Mon, May 15, 2017 at 8:09 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > On Mon, May 15, 2017 at 8:06 PM, Kuntal Ghosh > <kuntalghosh.2007@gmail.com> wrote: >> On Mon, May 15, 2017 at 4:00 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: >>> On Mon, May 15, 2017 at 6:41 PM, tushar <tushar.ahuja@enterprisedb.com> wrote: >>>> Hi, >>>> >>>> Server Crashes if we try to provide slot_name='none' at the time of creating >>>> subscription - >>>> >>>> postgres=# create subscription sub2 connection 'dbname=postgres port=5000 >>>> user=centos password=f' publication abc with (slot_name='none'); >>>> NOTICE: synchronized table states >>>> server closed the connection unexpectedly >>>> This probably means the server terminated abnormally >>>> before or while processing the request. >>>> The connection to the server was lost. Attempting reset: Failed. >>>> !> >>>> >>> >>> Thank you for reporting. >>> I think create_slot and enabled should be set to false forcibly when >>> slot_name = 'none'. Attached patch fixes it, more test and regression >>> test case are needed though. >>> >> I think the patch doesn't solve the problem completely. For example, >> postgres=# create subscription sub3 connection 'dbname=postgres >> port=5432 user=edb password=edb' publication abc with >> (slot_name='none',create_slot=true); >> ERROR: slot_name = NONE and create slot are mutually exclusive options >> postgres=# create subscription sub3 connection 'dbname=postgres >> port=5432 user=edb password=edb' publication abc with >> (create_slot=true,slot_name='none'); >> ERROR: could not connect to the publisher: could not connect to >> server: Connection refused >> Is the server running locally and accepting >> connections on Unix domain socket "/tmp/.s.PGSQL.5432"? >> >> Changing the order of subscription parameter changes the output. I >> think the modifications should be done at the end of the function. >> Attached a patch for the same. >> > > Yes, you're patch fixes it correctly. > I've updated Kuntal's patch, added regression test for option combination and updated documentation. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] Server Crashes if try to provide slot_name='none' atthe time of creating subscription.
From
Kuntal Ghosh
Date:
Added to open item lists. On Tue, May 16, 2017 at 6:35 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > On Mon, May 15, 2017 at 8:09 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: >> On Mon, May 15, 2017 at 8:06 PM, Kuntal Ghosh >> <kuntalghosh.2007@gmail.com> wrote: >>> On Mon, May 15, 2017 at 4:00 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: >>>> On Mon, May 15, 2017 at 6:41 PM, tushar <tushar.ahuja@enterprisedb.com> wrote: >>>>> Hi, >>>>> >>>>> Server Crashes if we try to provide slot_name='none' at the time of creating >>>>> subscription - >>>>> >>>>> postgres=# create subscription sub2 connection 'dbname=postgres port=5000 >>>>> user=centos password=f' publication abc with (slot_name='none'); >>>>> NOTICE: synchronized table states >>>>> server closed the connection unexpectedly >>>>> This probably means the server terminated abnormally >>>>> before or while processing the request. >>>>> The connection to the server was lost. Attempting reset: Failed. >>>>> !> >>>>> >>>> >>>> Thank you for reporting. >>>> I think create_slot and enabled should be set to false forcibly when >>>> slot_name = 'none'. Attached patch fixes it, more test and regression >>>> test case are needed though. >>>> >>> I think the patch doesn't solve the problem completely. For example, >>> postgres=# create subscription sub3 connection 'dbname=postgres >>> port=5432 user=edb password=edb' publication abc with >>> (slot_name='none',create_slot=true); >>> ERROR: slot_name = NONE and create slot are mutually exclusive options >>> postgres=# create subscription sub3 connection 'dbname=postgres >>> port=5432 user=edb password=edb' publication abc with >>> (create_slot=true,slot_name='none'); >>> ERROR: could not connect to the publisher: could not connect to >>> server: Connection refused >>> Is the server running locally and accepting >>> connections on Unix domain socket "/tmp/.s.PGSQL.5432"? >>> >>> Changing the order of subscription parameter changes the output. I >>> think the modifications should be done at the end of the function. >>> Attached a patch for the same. >>> >> >> Yes, you're patch fixes it correctly. >> > > I've updated Kuntal's patch, added regression test for option > combination and updated documentation. > > Regards, > > -- > Masahiko Sawada > NIPPON TELEGRAPH AND TELEPHONE CORPORATION > NTT Open Source Software Center -- Thanks & Regards, Kuntal Ghosh EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Server Crashes if try to provide slot_name='none' atthe time of creating subscription.
From
tushar
Date:
On 05/16/2017 06:35 AM, Masahiko Sawada wrote: > I've updated Kuntal's patch, added regression test for option > combination and updated documentation. While testing the patch - I found that after dump/restore , we are getting an error in the log file once we enable the subscription \\create subscription postgres=# CREATE SUBSCRIPTION m1 CONNECTION 'dbname=postgres port=5000 ' PUBLICATION qdd WITH (slot_name='none'); NOTICE: synchronized table states CREATE SUBSCRIPTION \\take the dump [centos@centos-cpula bin]$ ./pg_dump -Fp -p 9000 postgres > /tmp/d.c \\check the syntax [centos@centos-cpula bin]$ cat /tmp/d.c |grep 'create subsc*' -i CREATE SUBSCRIPTION m1 CONNECTION 'dbname=postgres port=5000 ' PUBLICATION qdd WITH (connect = false, slot_name = ''); \\execute this same syntax against a new database postgres=# create database test; CREATE DATABASE postgres=# \c test You are now connected to database "test" as user "centos". test=# CREATE SUBSCRIPTION m1 CONNECTION 'dbname=postgres port=5000 ' PUBLICATION qdd WITH (connect = false, slot_name = ''); WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables CREATE SUBSCRIPTION test=# alter subscription m1 refresh publication ; ERROR: ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled subscriptions test=# alter subscription m1 enable ; ALTER SUBSCRIPTION Check the message in log file 017-05-16 14:04:48.373 BST [18219] LOG: logical replication apply for subscription m1 started 2017-05-16 14:04:48.381 BST [18219] ERROR: could not start WAL streaming: ERROR: replication slot name "" is too short 2017-05-16 14:04:48.382 BST [17843] LOG: worker process: logical replication worker for subscription 16386 (PID 18219) exited with exit code 1 2017-05-16 14:04:53.388 BST [17850] LOG: starting logical replication worker for subscription "m1" 2017-05-16 14:04:53.396 BST [18224] LOG: logical replication apply for subscription m1 started 2017-05-16 14:04:53.403 BST [18224] ERROR: could not start WAL streaming: ERROR: replication slot name "" is too short Is this error message (ERROR: replication slot name "" is too short ) is expected now ? -- regards,tushar EnterpriseDB https://www.enterprisedb.com/ The Enterprise PostgreSQL Company
Re: [HACKERS] Server Crashes if try to provide slot_name='none' atthe time of creating subscription.
From
Masahiko Sawada
Date:
On Tue, May 16, 2017 at 10:06 PM, tushar <tushar.ahuja@enterprisedb.com> wrote: > On 05/16/2017 06:35 AM, Masahiko Sawada wrote: >> >> I've updated Kuntal's patch, added regression test for option >> combination and updated documentation. > > While testing the patch - I found that after dump/restore , we are getting > an error in the log file once we enable the subscription > > \\create subscription > > postgres=# CREATE SUBSCRIPTION m1 CONNECTION 'dbname=postgres port=5000 ' > PUBLICATION qdd WITH (slot_name='none'); > NOTICE: synchronized table states > CREATE SUBSCRIPTION > > \\take the dump > [centos@centos-cpula bin]$ ./pg_dump -Fp -p 9000 postgres > /tmp/d.c > \\check the syntax > [centos@centos-cpula bin]$ cat /tmp/d.c |grep 'create subsc*' -i > CREATE SUBSCRIPTION m1 CONNECTION 'dbname=postgres port=5000 ' PUBLICATION > qdd WITH (connect = false, slot_name = ''); > \\execute this same syntax against a new database > postgres=# create database test; > CREATE DATABASE > postgres=# \c test > You are now connected to database "test" as user "centos". > test=# CREATE SUBSCRIPTION m1 CONNECTION 'dbname=postgres port=5000 ' > PUBLICATION qdd WITH (connect = false, slot_name = ''); > WARNING: tables were not subscribed, you will have to run ALTER > SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables > CREATE SUBSCRIPTION > > test=# alter subscription m1 refresh publication ; > ERROR: ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled > subscriptions > test=# alter subscription m1 enable ; > ALTER SUBSCRIPTION > > Check the message in log file > > 017-05-16 14:04:48.373 BST [18219] LOG: logical replication apply for > subscription m1 started > 2017-05-16 14:04:48.381 BST [18219] ERROR: could not start WAL streaming: > ERROR: replication slot name "" is too short > 2017-05-16 14:04:48.382 BST [17843] LOG: worker process: logical > replication worker for subscription 16386 (PID 18219) exited with exit code > 1 > 2017-05-16 14:04:53.388 BST [17850] LOG: starting logical replication > worker for subscription "m1" > 2017-05-16 14:04:53.396 BST [18224] LOG: logical replication apply for > subscription m1 started > 2017-05-16 14:04:53.403 BST [18224] ERROR: could not start WAL streaming: > ERROR: replication slot name "" is too short > > Is this error message (ERROR: replication slot name "" is too short ) is > expected now ? > I think there are two bugs; pg_dump should dump slot_name = NONE instead of '' and subscription should not be created if given slot name is invalid. The validation check for replication slot name is done when creating it actually but I think it's more safer to check when CREATE SUBSCRIPTION. The bug related to setting slot_name = NONE should be fixed by attached 001 patch, and 002 patch prevents to be specified invalid replication slot name when CREATE SUBSCRIPTION and ALTER SUBSCRIPTION SET. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] Server Crashes if try to provide slot_name='none' atthe time of creating subscription.
From
Peter Eisentraut
Date:
On 5/16/17 22:21, Masahiko Sawada wrote: > I think there are two bugs; pg_dump should dump slot_name = NONE > instead of '' and subscription should not be created if given slot > name is invalid. The validation check for replication slot name is > done when creating it actually but I think it's more safer to check > when CREATE SUBSCRIPTION. The bug related to setting slot_name = NONE > should be fixed by attached 001 patch, and 002 patch prevents to be > specified invalid replication slot name when CREATE SUBSCRIPTION and > ALTER SUBSCRIPTION SET. I have worked through these issues and came up with slightly different patches. The issue in pg_dump should have been checking PQgetisnull() before reading the slot name. That is now fixed. The issue with slot_name = NONE causing a crash was fixed by adding additional error checking. I did not change it so that slot_name = NONE would change the defaults of enabled and create_slot. I think it's confusing if options have dependencies like that. In any case, creating a subscription with slot_name = NONE is probably not useful anyway, so as long as it doesn't crash, we don't need to make it excessively user-friendly. I don't think the subscription side should check the validity of the replication slot name. That is the responsibility of the publication side. The rules may well be different if you replicate between different versions or different build options. This is currently working correctly: If the publication side doesn't like the slot you specify, either because the name is invalid or the slot doesn't exist, you get an appropriate error message. Please check whether everything is working OK for you now. I think this open item is closed now. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Server Crashes if try to provide slot_name='none' atthe time of creating subscription.
From
Masahiko Sawada
Date:
On Thu, May 18, 2017 at 10:33 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > On 5/16/17 22:21, Masahiko Sawada wrote: >> I think there are two bugs; pg_dump should dump slot_name = NONE >> instead of '' and subscription should not be created if given slot >> name is invalid. The validation check for replication slot name is >> done when creating it actually but I think it's more safer to check >> when CREATE SUBSCRIPTION. The bug related to setting slot_name = NONE >> should be fixed by attached 001 patch, and 002 patch prevents to be >> specified invalid replication slot name when CREATE SUBSCRIPTION and >> ALTER SUBSCRIPTION SET. > > I have worked through these issues and came up with slightly different > patches. > > The issue in pg_dump should have been checking PQgetisnull() before > reading the slot name. That is now fixed. Agreed. > > The issue with slot_name = NONE causing a crash was fixed by adding > additional error checking. I did not change it so that slot_name = NONE > would change the defaults of enabled and create_slot. I think it's > confusing if options have dependencies like that. In any case, creating > a subscription with slot_name = NONE is probably not useful anyway, so > as long as it doesn't crash, we don't need to make it excessively > user-friendly. Also agreed. > I don't think the subscription side should check the validity of the > replication slot name. That is the responsibility of the publication > side. The rules may well be different if you replicate between > different versions or different build options. This is currently > working correctly: If the publication side doesn't like the slot you > specify, either because the name is invalid or the slot doesn't exist, > you get an appropriate error message. Yeah, now I understood. Agreed. > Please check whether everything is working OK for you now. I think this > open item is closed now. I've checked these changes, everything is working fine. Thank you! Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center