Thread: Fix a test case in 035_standby_logical_decoding.pl
Hi hackers, In 035_standby_logical_decoding.pl, I think that the check in the following test case should be performed on the standby node, instead of the primary node, as the slot is created on the standby node. The test currently passes because it only checks the return value of psql. It might be more appropriate to check the error message. Please see the attached patch. ``` $node_primary->safe_psql('postgres', 'CREATE DATABASE otherdb'); is( $node_primary->psql( 'otherdb', "SELECT lsn FROM pg_logical_slot_peek_changes('behaves_ok_activeslot', NULL, NULL) ORDER BY lsn DESC LIMIT 1;" ), 3, 'replaying logical slot from another database fails'); ``` The regress log: psql:<stdin>:1: ERROR: replication slot "behaves_ok_activeslot" does not exist [11:23:21.859](0.086s) ok 8 - replaying logical slot from another database fails Regards, Shi Yu
Attachment
Hi, On 4/27/23 10:11 AM, Yu Shi (Fujitsu) wrote: > Hi hackers, > > In 035_standby_logical_decoding.pl, I think that the check in the following test > case should be performed on the standby node, instead of the primary node, as > the slot is created on the standby node. Oh right, the current test is not done on the right node, thanks! > The test currently passes because it > only checks the return value of psql. It might be more appropriate to check the > error message. Agree, and it's consistent with what is being done in 006_logical_decoding.pl. > Please see the attached patch. > + +($result, $stdout, $stderr) = $node_standby->psql( 'otherdb', "SELECT lsn FROM pg_logical_slot_peek_changes('behaves_ok_activeslot', NULL, NULL) ORDER BY lsn DESC LIMIT 1;" - ), - 3, - 'replaying logical slot from another database fails'); + ); +ok( $stderr =~ + m/replication slot "behaves_ok_activeslot" was not created in this database/, + "replaying logical slot from another database fails"); That does look good to me. Nit: I wonder if while at it (as it was already there) we could not remove the " ORDER BY lsn DESC LIMIT 1" part of it. It does not change anything regarding the test but looks more appropriate to me. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Thu, Apr 27, 2023 4:47 PM Drouvot, Bertrand <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > On 4/27/23 10:11 AM, Yu Shi (Fujitsu) wrote: > > Hi hackers, > > > > In 035_standby_logical_decoding.pl, I think that the check in the following test > > case should be performed on the standby node, instead of the primary node, > as > > the slot is created on the standby node. > > Oh right, the current test is not done on the right node, thanks! > > > The test currently passes because it > > only checks the return value of psql. It might be more appropriate to check the > > error message. > > Agree, and it's consistent with what is being done in 006_logical_decoding.pl. > > > Please see the attached patch. > > > > + > +($result, $stdout, $stderr) = $node_standby->psql( > 'otherdb', > "SELECT lsn FROM pg_logical_slot_peek_changes('behaves_ok_activeslot', > NULL, NULL) ORDER BY lsn DESC LIMIT 1;" > - ), > - 3, > - 'replaying logical slot from another database fails'); > + ); > +ok( $stderr =~ > + m/replication slot "behaves_ok_activeslot" was not created in this > database/, > + "replaying logical slot from another database fails"); > > > That does look good to me. > > Nit: I wonder if while at it (as it was already there) we could not remove the " > ORDER BY lsn DESC LIMIT 1" part of it. > It does not change anything regarding the test but looks more appropriate to > me. > Thanks for your reply. I agree with you and I removed it in the attached patch. Regards, Shi Yu
Attachment
On Thu, Apr 27, 2023 at 2:16 PM Drouvot, Bertrand <bertranddrouvot.pg@gmail.com> wrote: > > On 4/27/23 10:11 AM, Yu Shi (Fujitsu) wrote: > > Hi hackers, > > > > In 035_standby_logical_decoding.pl, I think that the check in the following test > > case should be performed on the standby node, instead of the primary node, as > > the slot is created on the standby node. > > Oh right, the current test is not done on the right node, thanks! > > > The test currently passes because it > > only checks the return value of psql. It might be more appropriate to check the > > error message. > > Agree, and it's consistent with what is being done in 006_logical_decoding.pl. > > > Please see the attached patch. > > > > + > +($result, $stdout, $stderr) = $node_standby->psql( > 'otherdb', > "SELECT lsn FROM pg_logical_slot_peek_changes('behaves_ok_activeslot', NULL, NULL) ORDER BY lsn DESC LIMIT 1;" > - ), > - 3, > - 'replaying logical slot from another database fails'); > + ); > +ok( $stderr =~ > + m/replication slot "behaves_ok_activeslot" was not created in this database/, > + "replaying logical slot from another database fails"); > > > That does look good to me. > I agree that that the check should be done on standby but how does it make a difference to check the error message or return value? Won't it be the same for both primary and standby? > Nit: I wonder if while at it (as it was already there) we could not remove the " ORDER BY lsn DESC LIMIT 1" part of it. > It does not change anything regarding the test but looks more appropriate to me. > It may not make a difference as this is anyway an error case but it looks more logical to LIMIT by 1 as you are fetching a single LSN value and it will be consistent with other tests in this file and the test case in the file 006_logical_decoding.pl. BTW, in the same test, I see it uses wait_for_catchup() in one place and wait_for_replay_catchup() in another place after Insert. Isn't it better to use wait_for_replay_catchup in both places? -- With Regards, Amit Kapila.
Hi, On 4/27/23 11:53 AM, Amit Kapila wrote: > On Thu, Apr 27, 2023 at 2:16 PM Drouvot, Bertrand > <bertranddrouvot.pg@gmail.com> wrote: >> >> On 4/27/23 10:11 AM, Yu Shi (Fujitsu) wrote: >>> Hi hackers, >>> >>> In 035_standby_logical_decoding.pl, I think that the check in the following test >>> case should be performed on the standby node, instead of the primary node, as >>> the slot is created on the standby node. >> >> Oh right, the current test is not done on the right node, thanks! >> >>> The test currently passes because it >>> only checks the return value of psql. It might be more appropriate to check the >>> error message. >> >> Agree, and it's consistent with what is being done in 006_logical_decoding.pl. >> >>> Please see the attached patch. >>> >> >> + >> +($result, $stdout, $stderr) = $node_standby->psql( >> 'otherdb', >> "SELECT lsn FROM pg_logical_slot_peek_changes('behaves_ok_activeslot', NULL, NULL) ORDER BY lsn DESC LIMIT1;" >> - ), >> - 3, >> - 'replaying logical slot from another database fails'); >> + ); >> +ok( $stderr =~ >> + m/replication slot "behaves_ok_activeslot" was not created in this database/, >> + "replaying logical slot from another database fails"); >> >> >> That does look good to me. >> > > I agree that that the check should be done on standby but how does it > make a difference to check the error message or return value? Won't it > be the same for both primary and standby? > Yes that would be the same. I think the original idea from Shi-san (please correct me If I'm wrong) was "while at it" let's also make this test on the right node even better. >> Nit: I wonder if while at it (as it was already there) we could not remove the " ORDER BY lsn DESC LIMIT 1" part of it. >> It does not change anything regarding the test but looks more appropriate to me. >> > > It may not make a difference as this is anyway an error case but it > looks more logical to LIMIT by 1 as you are fetching a single LSN > value and it will be consistent with other tests in this file and the > test case in the file 006_logical_decoding.pl. > yeah I think it all depends how one see this test (sort of test a failure or try to get a result and see if it fails). That was a Nit so I don't have a strong opinion on this one. > BTW, in the same test, I see it uses wait_for_catchup() in one place > and wait_for_replay_catchup() in another place after Insert. Isn't it > better to use wait_for_replay_catchup in both places? > They are both using the 'replay' mode so both are perfectly correct but for consistency (and as they don't use the same "target_lsn" ('write' vs 'flush')) I think that using wait_for_replay_catchup() in both places (which is using the 'flush' target lsn) is a good idea. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Thu, Apr 27, 2023 at 4:07 PM Drouvot, Bertrand <bertranddrouvot.pg@gmail.com> wrote: > > On 4/27/23 11:53 AM, Amit Kapila wrote: > > On Thu, Apr 27, 2023 at 2:16 PM Drouvot, Bertrand > > <bertranddrouvot.pg@gmail.com> wrote: > >> > >> On 4/27/23 10:11 AM, Yu Shi (Fujitsu) wrote: > >>> Hi hackers, > >>> > >>> In 035_standby_logical_decoding.pl, I think that the check in the following test > >>> case should be performed on the standby node, instead of the primary node, as > >>> the slot is created on the standby node. > >> > >> Oh right, the current test is not done on the right node, thanks! > >> > >>> The test currently passes because it > >>> only checks the return value of psql. It might be more appropriate to check the > >>> error message. > >> > >> Agree, and it's consistent with what is being done in 006_logical_decoding.pl. > >> > >>> Please see the attached patch. > >>> > >> > >> + > >> +($result, $stdout, $stderr) = $node_standby->psql( > >> 'otherdb', > >> "SELECT lsn FROM pg_logical_slot_peek_changes('behaves_ok_activeslot', NULL, NULL) ORDER BY lsn DESC LIMIT1;" > >> - ), > >> - 3, > >> - 'replaying logical slot from another database fails'); > >> + ); > >> +ok( $stderr =~ > >> + m/replication slot "behaves_ok_activeslot" was not created in this database/, > >> + "replaying logical slot from another database fails"); > >> > >> > >> That does look good to me. > >> > > > > I agree that that the check should be done on standby but how does it > > make a difference to check the error message or return value? Won't it > > be the same for both primary and standby? > > > > Yes that would be the same. I think the original idea from Shi-san (please correct me If I'm wrong) > was "while at it" let's also make this test on the right node even better. > Fair enough. Let''s do it that way then. > >> Nit: I wonder if while at it (as it was already there) we could not remove the " ORDER BY lsn DESC LIMIT 1" part ofit. > >> It does not change anything regarding the test but looks more appropriate to me. > >> > > > > It may not make a difference as this is anyway an error case but it > > looks more logical to LIMIT by 1 as you are fetching a single LSN > > value and it will be consistent with other tests in this file and the > > test case in the file 006_logical_decoding.pl. > > > > yeah I think it all depends how one see this test (sort of test a failure or try to get > a result and see if it fails). That was a Nit so I don't have a strong opinion on this one. > I feel let's be consistent here and keep it as it is. > > BTW, in the same test, I see it uses wait_for_catchup() in one place > > and wait_for_replay_catchup() in another place after Insert. Isn't it > > better to use wait_for_replay_catchup in both places? > > > > They are both using the 'replay' mode so both are perfectly correct but for consistency (and as > they don't use the same "target_lsn" ('write' vs 'flush')) I think that using wait_for_replay_catchup() > in both places (which is using the 'flush' target lsn) is a good idea. > Yeah, let's use wait_for_replay_catchup() at both places. -- With Regards, Amit Kapila.
On Thu, Apr 27, 2023 9:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Apr 27, 2023 at 4:07 PM Drouvot, Bertrand > <bertranddrouvot.pg@gmail.com> wrote: > > > > On 4/27/23 11:53 AM, Amit Kapila wrote: > > > On Thu, Apr 27, 2023 at 2:16 PM Drouvot, Bertrand > > > <bertranddrouvot.pg@gmail.com> wrote: > > >> > > >> On 4/27/23 10:11 AM, Yu Shi (Fujitsu) wrote: > > >>> Hi hackers, > > >>> > > >>> In 035_standby_logical_decoding.pl, I think that the check in the following > test > > >>> case should be performed on the standby node, instead of the primary > node, as > > >>> the slot is created on the standby node. > > >> > > >> Oh right, the current test is not done on the right node, thanks! > > >> > > >>> The test currently passes because it > > >>> only checks the return value of psql. It might be more appropriate to check > the > > >>> error message. > > >> > > >> Agree, and it's consistent with what is being done in > 006_logical_decoding.pl. > > >> > > >>> Please see the attached patch. > > >>> > > >> > > >> + > > >> +($result, $stdout, $stderr) = $node_standby->psql( > > >> 'otherdb', > > >> "SELECT lsn FROM > pg_logical_slot_peek_changes('behaves_ok_activeslot', NULL, NULL) ORDER BY > lsn DESC LIMIT 1;" > > >> - ), > > >> - 3, > > >> - 'replaying logical slot from another database fails'); > > >> + ); > > >> +ok( $stderr =~ > > >> + m/replication slot "behaves_ok_activeslot" was not created in this > database/, > > >> + "replaying logical slot from another database fails"); > > >> > > >> > > >> That does look good to me. > > >> > > > > > > I agree that that the check should be done on standby but how does it > > > make a difference to check the error message or return value? Won't it > > > be the same for both primary and standby? > > > > > > > Yes that would be the same. I think the original idea from Shi-san (please > correct me If I'm wrong) > > was "while at it" let's also make this test on the right node even better. > > > > Fair enough. Let''s do it that way then. > > > >> Nit: I wonder if while at it (as it was already there) we could not remove the > " ORDER BY lsn DESC LIMIT 1" part of it. > > >> It does not change anything regarding the test but looks more appropriate > to me. > > >> > > > > > > It may not make a difference as this is anyway an error case but it > > > looks more logical to LIMIT by 1 as you are fetching a single LSN > > > value and it will be consistent with other tests in this file and the > > > test case in the file 006_logical_decoding.pl. > > > > > > > yeah I think it all depends how one see this test (sort of test a failure or try to > get > > a result and see if it fails). That was a Nit so I don't have a strong opinion on > this one. > > > > I feel let's be consistent here and keep it as it is. > > > > BTW, in the same test, I see it uses wait_for_catchup() in one place > > > and wait_for_replay_catchup() in another place after Insert. Isn't it > > > better to use wait_for_replay_catchup in both places? > > > > > > > They are both using the 'replay' mode so both are perfectly correct but for > consistency (and as > > they don't use the same "target_lsn" ('write' vs 'flush')) I think that using > wait_for_replay_catchup() > > in both places (which is using the 'flush' target lsn) is a good idea. > > > > Yeah, let's use wait_for_replay_catchup() at both places. > Thanks for your reply. Your suggestions make sense to me. Please see the attached patch. Regards, Shi Yu