Thread: Improve error detections in TAP tests by spreading safe_psql

Improve error detections in TAP tests by spreading safe_psql

From
Michael Paquier
Date:
Hi all,

This is a follow-up of the discussion which happened here after Tom
has committed fb57f40:
https://www.postgresql.org/message-id/20190828012439.GA1965@paquier.xyz

I have reviewed the TAP tests, and we have much more spots where it
is better to use PostgresNode::safe_psql instead PostgresNode::psql so
as the test dies immediately if there is a failure on a query which
should never fail.

Attached are the spots I have found.  Any thoughts or opinions?
Thanks,
--
Michael

Attachment

Re: Improve error detections in TAP tests by spreading safe_psql

From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes:
> This is a follow-up of the discussion which happened here after Tom
> has committed fb57f40:
> https://www.postgresql.org/message-id/20190828012439.GA1965@paquier.xyz
> I have reviewed the TAP tests, and we have much more spots where it
> is better to use PostgresNode::safe_psql instead PostgresNode::psql so
> as the test dies immediately if there is a failure on a query which
> should never fail.

After a brief review of node->psql calls in the TAP tests, I'm
of the opinion that what we should do is revert fb57f40 and instead
change PostgresNode::psql so that on_error_die defaults to 1, then
fix the *very* small number of callers that need it to be zero.
Otherwise we're just going to be fighting this same fire forevermore.
Moreover, that's going to lead to a much smaller patch.

> Attached are the spots I have found.  Any thoughts or opinions?

Well, for instance, you missed SSLServer.pm, in which every single
one of the psql calls is wrong.

If we go this route we'd have to back-patch the change, else it'd
be a serious hazard for test case back-patching.  But the buildfarm
would, more or less by definition, find any oversights --- so that
doesn't scare me much.

            regards, tom lane



Re: Improve error detections in TAP tests by spreading safe_psql

From
Tom Lane
Date:
I wrote:
> After a brief review of node->psql calls in the TAP tests, I'm
> of the opinion that what we should do is revert fb57f40 and instead
> change PostgresNode::psql so that on_error_die defaults to 1, then
> fix the *very* small number of callers that need it to be zero.
> Otherwise we're just going to be fighting this same fire forevermore.
> Moreover, that's going to lead to a much smaller patch.

Hmm .. I experimented with doing it this way, as attached, and I guess
I have to take back the assertion that it'd be a much smaller patch.
I found 47 calls that'd need to be changed to set on_error_die to 0,
whereas your patch is touching 44 calls (though I think it missed some).
I still think this is a safer, less bug-prone way to proceed though.

The attached patch just follows a mechanical rule of "set on_error_die
to 0 in exactly those calls where the surrounding code verifies the
exit code is what it expects".  That leads to a lot of code that
looks like, say,

 # Insert should work on standby
 is( $node_standby->psql(
         'postgres',
-        qq{insert into testtab select generate_series(1,1000), 'foo';}),
+        qq{insert into testtab select generate_series(1,1000), 'foo';},
+        on_error_die => 0),
     0,
     'INSERT succeeds with truncated relation FSM');

I have to wonder if it isn't better to just drop the is() test
and let PostgresNode::psql issue the failure.  The only thing
the is() is really buying us is the ability to label the test
with an ID string, which is helpful, but um ... couldn't that
just be a comment?  Or we could think about adding another
optional parameter:

$node_standby->psql(
    'postgres',
    qq{insert into testtab select generate_series(1,1000), 'foo';},
     test_name => 'INSERT succeeds with truncated relation FSM');

            regards, tom lane

diff --git a/src/bin/pg_ctl/t/004_logrotate.pl b/src/bin/pg_ctl/t/004_logrotate.pl
index 71dbfd2..8760804 100644
--- a/src/bin/pg_ctl/t/004_logrotate.pl
+++ b/src/bin/pg_ctl/t/004_logrotate.pl
@@ -19,7 +19,7 @@ $node->start();

 # Verify that log output gets to the file

-$node->psql('postgres', 'SELECT 1/0');
+$node->psql('postgres', 'SELECT 1/0', on_error_die => 0);

 my $current_logfiles = slurp_file($node->data_dir . '/current_logfiles');

@@ -75,7 +75,7 @@ chomp $lfname;

 # Verify that log output gets to this file, too

-$node->psql('postgres', 'fee fi fo fum');
+$node->psql('postgres', 'fee fi fo fum', on_error_die => 0);

 my $second_logfile;
 for (my $attempts = 0; $attempts < $max_attempts; $attempts++)
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index f064ea1..53f4824 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -3274,6 +3274,7 @@ $node->psql(
     'postgres',
     "CREATE COLLATION testing FROM \"C\"; DROP COLLATION testing;",
     on_error_stop => 0,
+    on_error_die  => 0,
     stderr        => \$collation_check_stderr);

 if ($collation_check_stderr !~ /ERROR: /)
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index b82d3f6..7cc1c88 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -487,12 +487,10 @@ INSERT INTO seeded_random(seed, rand, val) VALUES
 }

 # check that all runs generated the same 4 values
-my ($ret, $out, $err) = $node->psql('postgres',
+my $out = $node->safe_psql('postgres',
     'SELECT seed, rand, val, COUNT(*) FROM seeded_random GROUP BY seed, rand, val'
 );

-ok($ret == 0,  "psql seeded_random count ok");
-ok($err eq '', "psql seeded_random count stderr is empty");
 ok($out =~ /\b$seed\|uniform\|1\d\d\d\|2/,
     "psql seeded_random count uniform");
 ok( $out =~ /\b$seed\|exponential\|2\d\d\d\|2/,
diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl
index 3a3b0eb..b58304e 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -45,7 +45,10 @@ sub test_role

     $status_string = 'success' if ($expected_res eq 0);

-    my $res = $node->psql('postgres', undef, extra_params => [ '-U', $role ]);
+    my $res = $node->psql(
+        'postgres', undef,
+        on_error_die => 0,
+        extra_params => [ '-U', $role ]);
     is($res, $expected_res,
         "authentication $status_string for method $method, role $role");
     return;
diff --git a/src/test/authentication/t/002_saslprep.pl b/src/test/authentication/t/002_saslprep.pl
index c4b335c..0a75736 100644
--- a/src/test/authentication/t/002_saslprep.pl
+++ b/src/test/authentication/t/002_saslprep.pl
@@ -42,7 +42,10 @@ sub test_login
     $status_string = 'success' if ($expected_res eq 0);

     $ENV{"PGPASSWORD"} = $password;
-    my $res = $node->psql('postgres', undef, extra_params => [ '-U', $role ]);
+    my $res = $node->psql(
+        'postgres', undef,
+        on_error_die => 0,
+        extra_params => [ '-U', $role ]);
     is($res, $expected_res,
         "authentication $status_string for role $role with password $password"
     );
diff --git a/src/test/kerberos/t/001_auth.pl b/src/test/kerberos/t/001_auth.pl
index e3eb052..f5801d0 100644
--- a/src/test/kerberos/t/001_auth.pl
+++ b/src/test/kerberos/t/001_auth.pl
@@ -175,6 +175,7 @@ sub test_access
     my ($res, $stdoutres, $stderrres) = $node->psql(
         'postgres',
         "$server_check",
+        on_error_die => 0,
         extra_params => [
             '-XAtd',
             $node->connstr('postgres')
diff --git a/src/test/ldap/t/001_auth.pl b/src/test/ldap/t/001_auth.pl
index daa2270..cb3298b 100644
--- a/src/test/ldap/t/001_auth.pl
+++ b/src/test/ldap/t/001_auth.pl
@@ -164,8 +164,10 @@ sub test_access
 {
     my ($node, $role, $expected_res, $test_name) = @_;

-    my $res =
-      $node->psql('postgres', 'SELECT 1', extra_params => [ '-U', $role ]);
+    my $res = $node->psql(
+        'postgres', 'SELECT 1',
+        on_error_die => 0,
+        extra_params => [ '-U', $role ]);
     is($res, $expected_res, $test_name);
     return;
 }
diff --git a/src/test/modules/commit_ts/t/002_standby.pl b/src/test/modules/commit_ts/t/002_standby.pl
index f376b59..8b7f8b9 100644
--- a/src/test/modules/commit_ts/t/002_standby.pl
+++ b/src/test/modules/commit_ts/t/002_standby.pl
@@ -51,9 +51,10 @@ $standby->poll_query_until('postgres',
 $standby->safe_psql('postgres', 'checkpoint');

 # This one should raise an error now
-my ($ret, $standby_ts_stdout, $standby_ts_stderr) = $standby->psql('postgres',
-    'select ts.* from pg_class, pg_xact_commit_timestamp(xmin) ts where relname = \'t10\''
-);
+my ($ret, $standby_ts_stdout, $standby_ts_stderr) = $standby->psql(
+    'postgres',
+    'select ts.* from pg_class, pg_xact_commit_timestamp(xmin) ts where relname = \'t10\'',
+    on_error_die => 0);
 is($ret, 3, 'standby errors when master turned feature off');
 is($standby_ts_stdout, '',
     "standby gives no value when master turned feature off");
diff --git a/src/test/modules/commit_ts/t/003_standby_2.pl b/src/test/modules/commit_ts/t/003_standby_2.pl
index 9165d50..10a5c78 100644
--- a/src/test/modules/commit_ts/t/003_standby_2.pl
+++ b/src/test/modules/commit_ts/t/003_standby_2.pl
@@ -40,8 +40,8 @@ $standby->restart;

 my ($psql_ret, $standby_ts_stdout, $standby_ts_stderr) = $standby->psql(
     'postgres',
-    qq{SELECT ts.* FROM pg_class, pg_xact_commit_timestamp(xmin) AS ts WHERE relname = 't10'}
-);
+    qq{SELECT ts.* FROM pg_class, pg_xact_commit_timestamp(xmin) AS ts WHERE relname = 't10'},
+    on_error_die => 0);
 is($psql_ret, 3, 'expect error when getting commit timestamp after restart');
 is($standby_ts_stdout, '', "standby does not return a value after restart");
 like(
diff --git a/src/test/modules/commit_ts/t/004_restart.pl b/src/test/modules/commit_ts/t/004_restart.pl
index bd4b943..95041a3 100644
--- a/src/test/modules/commit_ts/t/004_restart.pl
+++ b/src/test/modules/commit_ts/t/004_restart.pl
@@ -12,21 +12,27 @@ $node_master->start;

 my ($ret, $stdout, $stderr);

-($ret, $stdout, $stderr) =
-  $node_master->psql('postgres', qq[SELECT pg_xact_commit_timestamp('0');]);
+($ret, $stdout, $stderr) = $node_master->psql(
+    'postgres',
+    qq[SELECT pg_xact_commit_timestamp('0');],
+    on_error_die => 0);
 is($ret, 3, 'getting ts of InvalidTransactionId reports error');
 like(
     $stderr,
     qr/cannot retrieve commit timestamp for transaction/,
     'expected error from InvalidTransactionId');

-($ret, $stdout, $stderr) =
-  $node_master->psql('postgres', qq[SELECT pg_xact_commit_timestamp('1');]);
+($ret, $stdout, $stderr) = $node_master->psql(
+    'postgres',
+    qq[SELECT pg_xact_commit_timestamp('1');],
+    on_error_die => 0);
 is($ret,    0,  'getting ts of BootstrapTransactionId succeeds');
 is($stdout, '', 'timestamp of BootstrapTransactionId is null');

-($ret, $stdout, $stderr) =
-  $node_master->psql('postgres', qq[SELECT pg_xact_commit_timestamp('2');]);
+($ret, $stdout, $stderr) = $node_master->psql(
+    'postgres',
+    qq[SELECT pg_xact_commit_timestamp('2');],
+    on_error_die => 0);
 is($ret,    0,  'getting ts of FrozenTransactionId succeeds');
 is($stdout, '', 'timestamp of FrozenTransactionId is null');

@@ -102,8 +108,10 @@ LANGUAGE plpgsql;
 ));
 $node_master->safe_psql('postgres', 'CALL consume_xid(2000)');

-($ret, $stdout, $stderr) = $node_master->psql('postgres',
-    qq[SELECT pg_xact_commit_timestamp('$xid');]);
+($ret, $stdout, $stderr) = $node_master->psql(
+    'postgres',
+    qq[SELECT pg_xact_commit_timestamp('$xid');],
+    on_error_die => 0);
 is($ret, 3, 'no commit timestamp from enable tx when cts disabled');
 like(
     $stderr,
@@ -120,8 +128,10 @@ my $xid_disabled = $node_master->safe_psql(
 ]);

 # Should be inaccessible
-($ret, $stdout, $stderr) = $node_master->psql('postgres',
-    qq[SELECT pg_xact_commit_timestamp('$xid_disabled');]);
+($ret, $stdout, $stderr) = $node_master->psql(
+    'postgres',
+    qq[SELECT pg_xact_commit_timestamp('$xid_disabled');],
+    on_error_die => 0);
 is($ret, 3, 'no commit timestamp when disabled');
 like(
     $stderr,
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index 270bd6c..bf7b3f7 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -32,8 +32,7 @@ PostgresNode - class representing PostgreSQL server instance
   my $cmdret = $node->psql('postgres', 'SELECT pg_sleep(60)',
       stdout => \$stdout, stderr => \$stderr,
       timeout => 30, timed_out => \$timed_out,
-      extra_params => ['--single-transaction'],
-      on_error_die => 1)
+      extra_params => ['--single-transaction'])
   print "Sleep timed out" if $timed_out;

   # Similar thing, more convenient in common cases
@@ -1332,10 +1331,12 @@ disabled.  That may be overridden by passing extra psql parameters.
 stdout and stderr are transformed to UNIX line endings if on Windows. Any
 trailing newline is removed.

-Dies on failure to invoke psql but not if psql exits with a nonzero
-return code (unless on_error_die specified).
+By default, dies with an informative error message if psql exits with a
+nonzero return code.  Set on_error_die to 0 to have the return code be
+returned, instead.

-If psql exits because of a signal, an exception is raised.
+If psql cannot be invoked, or it exits because of a signal,
+an exception is raised.

 =over

@@ -1357,10 +1358,11 @@ By default, the B<psql> method invokes the B<psql> program with ON_ERROR_STOP=1
 set, so SQL execution is stopped at the first error and exit code 2 is
 returned.  Set B<on_error_stop> to 0 to ignore errors instead.

-=item on_error_die => 0
+=item on_error_die => 1

-By default, this method returns psql's result code. Pass on_error_die to
-instead die with an informative message.
+By default, this method dies if psql returns a nonzero exit code.
+Set B<on_error_die> to 0 to return the exit code and let the caller
+handle errors.

 =item timeout => 'interval'

@@ -1381,18 +1383,18 @@ If given, it must be an array reference containing additional parameters to B<ps

 e.g.

+    $node->psql('postgres', $sql);
+
+dies with an informative message if $sql fails.
+
     my ($stdout, $stderr, $timed_out);
     my $cmdret = $node->psql('postgres', 'SELECT pg_sleep(60)',
-        stdout => \$stdout, stderr => \$stderr,
+        stdout => \$stdout, stderr => \$stderr, on_error_die => 0,
         timeout => 30, timed_out => \$timed_out,
         extra_params => ['--single-transaction'])

 will set $cmdret to undef and $timed_out to a true value.

-    $node->psql('postgres', $sql, on_error_die => 1);
-
-dies with an informative message if $sql fails.
-
 =cut

 sub psql
@@ -1424,7 +1426,7 @@ sub psql
     }

     $params{on_error_stop} = 1 unless defined $params{on_error_stop};
-    $params{on_error_die}  = 0 unless defined $params{on_error_die};
+    $params{on_error_die}  = 1 unless defined $params{on_error_die};

     push @psql_params, '-v', 'ON_ERROR_STOP=1' if $params{on_error_stop};
     push @psql_params, @{ $params{extra_params} }
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 3c743d7..ed11733 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -60,10 +60,18 @@ print "standby 2: $result\n";
 is($result, qq(1002), 'check streamed content on standby 2');

 # Check that only READ-only queries can run on standbys
-is($node_standby_1->psql('postgres', 'INSERT INTO tab_int VALUES (1)'),
-    3, 'read-only queries on standby 1');
-is($node_standby_2->psql('postgres', 'INSERT INTO tab_int VALUES (1)'),
-    3, 'read-only queries on standby 2');
+is( $node_standby_1->psql(
+        'postgres',
+        'INSERT INTO tab_int VALUES (1)',
+        on_error_die => 0),
+    3,
+    'read-only queries on standby 1');
+is( $node_standby_2->psql(
+        'postgres',
+        'INSERT INTO tab_int VALUES (1)',
+        on_error_die => 0),
+    3,
+    'read-only queries on standby 2');

 # Tests for connection parameter target_session_attrs
 note "testing connection parameter \"target_session_attrs\"";
@@ -94,8 +102,9 @@ sub test_target_session_attrs

     # The client used for the connection does not matter, only the backend
     # point does.
-    my ($ret, $stdout, $stderr) =
-      $node1->psql('postgres', 'SHOW port;',
+    my ($ret, $stdout, $stderr) = $node1->psql(
+        'postgres', 'SHOW port;',
+        on_error_die => 0,
         extra_params => [ '-d', $connstr ]);
     is( $status == $ret && $stdout eq $target_node->port,
         1,
@@ -138,26 +147,26 @@ my $connstr_db     = "$connstr_common replication=database dbname=postgres";
 # Test SHOW ALL
 my ($ret, $stdout, $stderr) = $node_master->psql(
     'postgres', 'SHOW ALL;',
-    on_error_die => 1,
+    on_error_die => 0,
     extra_params => [ '-d', $connstr_rep ]);
 ok($ret == 0, "SHOW ALL with replication role and physical replication");
 ($ret, $stdout, $stderr) = $node_master->psql(
     'postgres', 'SHOW ALL;',
-    on_error_die => 1,
+    on_error_die => 0,
     extra_params => [ '-d', $connstr_db ]);
 ok($ret == 0, "SHOW ALL with replication role and logical replication");

 # Test SHOW with a user-settable parameter
 ($ret, $stdout, $stderr) = $node_master->psql(
     'postgres', 'SHOW work_mem;',
-    on_error_die => 1,
+    on_error_die => 0,
     extra_params => [ '-d', $connstr_rep ]);
 ok( $ret == 0,
     "SHOW with user-settable parameter, replication role and physical replication"
 );
 ($ret, $stdout, $stderr) = $node_master->psql(
     'postgres', 'SHOW work_mem;',
-    on_error_die => 1,
+    on_error_die => 0,
     extra_params => [ '-d', $connstr_db ]);
 ok( $ret == 0,
     "SHOW with user-settable parameter, replication role and logical replication"
@@ -166,14 +175,14 @@ ok( $ret == 0,
 # Test SHOW with a superuser-settable parameter
 ($ret, $stdout, $stderr) = $node_master->psql(
     'postgres', 'SHOW primary_conninfo;',
-    on_error_die => 1,
+    on_error_die => 0,
     extra_params => [ '-d', $connstr_rep ]);
 ok( $ret == 0,
     "SHOW with superuser-settable parameter, replication role and physical replication"
 );
 ($ret, $stdout, $stderr) = $node_master->psql(
     'postgres', 'SHOW primary_conninfo;',
-    on_error_die => 1,
+    on_error_die => 0,
     extra_params => [ '-d', $connstr_db ]);
 ok( $ret == 0,
     "SHOW with superuser-settable parameter, replication role and logical replication"
@@ -190,7 +199,8 @@ $node_master->append_conf('postgresql.conf', "max_replication_slots = 4");
 $node_master->restart;
 is( $node_master->psql(
         'postgres',
-        qq[SELECT pg_create_physical_replication_slot('$slotname_1');]),
+        qq[SELECT pg_create_physical_replication_slot('$slotname_1');],
+        on_error_die => 0),
     0,
     'physical slot created on master');
 $node_standby_1->append_conf('postgresql.conf',
@@ -201,7 +211,8 @@ $node_standby_1->append_conf('postgresql.conf', "max_replication_slots = 4");
 $node_standby_1->restart;
 is( $node_standby_1->psql(
         'postgres',
-        qq[SELECT pg_create_physical_replication_slot('$slotname_2');]),
+        qq[SELECT pg_create_physical_replication_slot('$slotname_2');],
+        on_error_die => 0),
     0,
     'physical slot created on intermediate replica');
 $node_standby_2->append_conf('postgresql.conf',
diff --git a/src/test/recovery/t/006_logical_decoding.pl b/src/test/recovery/t/006_logical_decoding.pl
index c23cc4d..9d36ffb 100644
--- a/src/test/recovery/t/006_logical_decoding.pl
+++ b/src/test/recovery/t/006_logical_decoding.pl
@@ -95,8 +95,8 @@ $node_master->safe_psql('postgres', 'CREATE DATABASE otherdb');

 is( $node_master->psql(
         'otherdb',
-        "SELECT lsn FROM pg_logical_slot_peek_changes('test_slot', NULL, NULL) ORDER BY lsn DESC LIMIT 1;"
-    ),
+        "SELECT lsn FROM pg_logical_slot_peek_changes('test_slot', NULL, NULL) ORDER BY lsn DESC LIMIT 1;",
+        on_error_die => 0),
     3,
     'replaying logical slot from another database fails');

@@ -119,8 +119,12 @@ SKIP:
     $node_master->poll_query_until('otherdb',
         "SELECT EXISTS (SELECT 1 FROM pg_replication_slots WHERE slot_name = 'otherdb_slot' AND active_pid IS NOT
NULL)"
     ) or die "slot never became active";
-    is($node_master->psql('postgres', 'DROP DATABASE otherdb'),
-        3, 'dropping a DB with active logical slots fails');
+    is( $node_master->psql(
+            'postgres',
+            'DROP DATABASE otherdb',
+            on_error_die => 0),
+        3,
+        'dropping a DB with active logical slots fails');
     $pg_recvlogical->kill_kill;
     is($node_master->slot('otherdb_slot')->{'slot_name'},
         undef, 'logical slot still exists');
@@ -130,8 +134,10 @@ $node_master->poll_query_until('otherdb',
     "SELECT EXISTS (SELECT 1 FROM pg_replication_slots WHERE slot_name = 'otherdb_slot' AND active_pid IS NULL)"
 ) or die "slot never became inactive";

-is($node_master->psql('postgres', 'DROP DATABASE otherdb'),
-    0, 'dropping a DB with inactive logical slots succeeds');
+is( $node_master->psql(
+        'postgres', 'DROP DATABASE otherdb', on_error_die => 0),
+    0,
+    'dropping a DB with inactive logical slots succeeds');
 is($node_master->slot('otherdb_slot')->{'slot_name'},
     undef, 'logical slot was actually dropped with DB');

diff --git a/src/test/recovery/t/008_fsm_truncation.pl b/src/test/recovery/t/008_fsm_truncation.pl
index ddab464..551a883 100644
--- a/src/test/recovery/t/008_fsm_truncation.pl
+++ b/src/test/recovery/t/008_fsm_truncation.pl
@@ -91,6 +91,7 @@ $node_standby->restart;
 # Insert should work on standby
 is( $node_standby->psql(
         'postgres',
-        qq{insert into testtab select generate_series(1,1000), 'foo';}),
+        qq{insert into testtab select generate_series(1,1000), 'foo';},
+        on_error_die => 0),
     0,
     'INSERT succeeds with truncated relation FSM');
diff --git a/src/test/recovery/t/009_twophase.pl b/src/test/recovery/t/009_twophase.pl
index 1b748ad..d23e690 100644
--- a/src/test/recovery/t/009_twophase.pl
+++ b/src/test/recovery/t/009_twophase.pl
@@ -76,10 +76,16 @@ $cur_master->psql(
 $cur_master->stop;
 $cur_master->start;

-$psql_rc = $cur_master->psql('postgres', "COMMIT PREPARED 'xact_009_1'");
+$psql_rc = $cur_master->psql(
+    'postgres',
+    "COMMIT PREPARED 'xact_009_1'",
+    on_error_die => 0);
 is($psql_rc, '0', 'Commit prepared transaction after restart');

-$psql_rc = $cur_master->psql('postgres', "ROLLBACK PREPARED 'xact_009_2'");
+$psql_rc = $cur_master->psql(
+    'postgres',
+    "ROLLBACK PREPARED 'xact_009_2'",
+    on_error_die => 0);
 is($psql_rc, '0', 'Rollback prepared transaction after restart');

 ###############################################################################
@@ -104,10 +110,16 @@ $cur_master->psql(
 $cur_master->teardown_node;
 $cur_master->start;

-$psql_rc = $cur_master->psql('postgres', "COMMIT PREPARED 'xact_009_3'");
+$psql_rc = $cur_master->psql(
+    'postgres',
+    "COMMIT PREPARED 'xact_009_3'",
+    on_error_die => 0);
 is($psql_rc, '0', 'Commit prepared transaction after teardown');

-$psql_rc = $cur_master->psql('postgres', "ROLLBACK PREPARED 'xact_009_4'");
+$psql_rc = $cur_master->psql(
+    'postgres',
+    "ROLLBACK PREPARED 'xact_009_4'",
+    on_error_die => 0);
 is($psql_rc, '0', 'Rollback prepared transaction after teardown');

 ###############################################################################
@@ -131,7 +143,10 @@ $cur_master->psql(
 $cur_master->teardown_node;
 $cur_master->start;

-$psql_rc = $cur_master->psql('postgres', "COMMIT PREPARED 'xact_009_5'");
+$psql_rc = $cur_master->psql(
+    'postgres',
+    "COMMIT PREPARED 'xact_009_5'",
+    on_error_die => 0);
 is($psql_rc, '0', 'Replay several transactions with same GID');

 ###############################################################################
@@ -157,7 +172,8 @@ $psql_rc = $cur_master->psql(
     INSERT INTO t_009_tbl VALUES (16, 'issued to ${cur_master_name}');
     -- This prepare can fail due to conflicting GID or locks conflicts if
     -- replay did not fully cleanup its state on previous commit.
-    PREPARE TRANSACTION 'xact_009_7';");
+    PREPARE TRANSACTION 'xact_009_7';",
+    on_error_die => 0);
 is($psql_rc, '0', "Cleanup of shared memory state for 2PC commit");

 $cur_master->psql('postgres', "COMMIT PREPARED 'xact_009_7'");
@@ -223,8 +239,10 @@ $cur_master_name = $cur_master->name;

 # because london is not running at this point, we can't use syncrep commit
 # on this command
-$psql_rc = $cur_master->psql('postgres',
-    "SET synchronous_commit = off; COMMIT PREPARED 'xact_009_10'");
+$psql_rc = $cur_master->psql(
+    'postgres',
+    "SET synchronous_commit = off; COMMIT PREPARED 'xact_009_10'",
+    on_error_die => 0);
 is($psql_rc, '0', "Restore of prepared transaction on promoted standby");

 # restart old master as new standby
@@ -355,10 +373,16 @@ $cur_master->psql(
 $cur_master->teardown_node;
 $cur_master->start;

-$psql_rc = $cur_master->psql('postgres', "COMMIT PREPARED 'xact_009_14'");
+$psql_rc = $cur_master->psql(
+    'postgres',
+    "COMMIT PREPARED 'xact_009_14'",
+    on_error_die => 0);
 is($psql_rc, '0', 'Commit prepared transaction after teardown');

-$psql_rc = $cur_master->psql('postgres', "ROLLBACK PREPARED 'xact_009_15'");
+$psql_rc = $cur_master->psql(
+    'postgres',
+    "ROLLBACK PREPARED 'xact_009_15'",
+    on_error_die => 0);
 is($psql_rc, '0', 'Rollback prepared transaction after teardown');

 ###############################################################################
@@ -382,10 +406,16 @@ $cur_master->psql(
 $cur_master->stop;
 $cur_master->start;

-$psql_rc = $cur_master->psql('postgres', "COMMIT PREPARED 'xact_009_16'");
+$psql_rc = $cur_master->psql(
+    'postgres',
+    "COMMIT PREPARED 'xact_009_16'",
+    on_error_die => 0);
 is($psql_rc, '0', 'Commit prepared transaction after restart');

-$psql_rc = $cur_master->psql('postgres', "ROLLBACK PREPARED 'xact_009_17'");
+$psql_rc = $cur_master->psql(
+    'postgres',
+    "ROLLBACK PREPARED 'xact_009_17'",
+    on_error_die => 0);
 is($psql_rc, '0', 'Rollback prepared transaction after restart');

 ###############################################################################
diff --git a/src/test/recovery/t/010_logical_decoding_timelines.pl
b/src/test/recovery/t/010_logical_decoding_timelines.pl
index e582b20..5555a94 100644
--- a/src/test/recovery/t/010_logical_decoding_timelines.pl
+++ b/src/test/recovery/t/010_logical_decoding_timelines.pl
@@ -83,8 +83,9 @@ $node_replica->start;

 # If we drop 'dropme' on the master, the standby should drop the
 # db and associated slot.
-is($node_master->psql('postgres', 'DROP DATABASE dropme'),
-    0, 'dropped DB with logical slot OK on master');
+is( $node_master->psql('postgres', 'DROP DATABASE dropme', on_error_die => 0),
+    0,
+    'dropped DB with logical slot OK on master');
 $node_master->wait_for_catchup($node_replica, 'replay',
     $node_master->lsn('insert'));
 is( $node_replica->safe_psql(
@@ -141,9 +142,10 @@ $node_replica->safe_psql('postgres',
     "INSERT INTO decoding(blah) VALUES ('after failover');");

 # Shouldn't be able to read from slot created after base backup
-($ret, $stdout, $stderr) = $node_replica->psql('postgres',
-    "SELECT data FROM pg_logical_slot_peek_changes('after_basebackup', NULL, NULL, 'include-xids', '0',
'skip-empty-xacts','1');" 
-);
+($ret, $stdout, $stderr) = $node_replica->psql(
+    'postgres',
+    "SELECT data FROM pg_logical_slot_peek_changes('after_basebackup', NULL, NULL, 'include-xids', '0',
'skip-empty-xacts','1');", 
+    on_error_die => 0);
 is($ret, 3, 'replaying from after_basebackup slot fails');
 like(
     $stderr,
@@ -154,7 +156,8 @@ like(
 ($ret, $stdout, $stderr) = $node_replica->psql(
     'postgres',
     "SELECT data FROM pg_logical_slot_peek_changes('before_basebackup', NULL, NULL, 'include-xids', '0',
'skip-empty-xacts','1');", 
-    timeout => 30);
+    on_error_die => 0,
+    timeout      => 30);
 is($ret, 0, 'replay from slot before_basebackup succeeds');

 my $final_expected_output_bb = q(BEGIN
diff --git a/src/test/recovery/t/012_subtransactions.pl b/src/test/recovery/t/012_subtransactions.pl
index 292cd40..fa13f22 100644
--- a/src/test/recovery/t/012_subtransactions.pl
+++ b/src/test/recovery/t/012_subtransactions.pl
@@ -167,7 +167,10 @@ is($psql_out, '-1', "Not visible");
 ($node_master, $node_standby) = ($node_standby, $node_master);
 $node_standby->enable_streaming($node_master);
 $node_standby->start;
-$psql_rc = $node_master->psql('postgres', "COMMIT PREPARED 'xact_012_1'");
+$psql_rc = $node_master->psql(
+    'postgres',
+    "COMMIT PREPARED 'xact_012_1'",
+    on_error_die => 0);
 is($psql_rc, '0',
     "Restore of PGPROC_MAX_CACHED_SUBXIDS+ prepared transaction on promoted standby"
 );
@@ -204,7 +207,10 @@ is($psql_out, '-1', "Not visible");
 ($node_master, $node_standby) = ($node_standby, $node_master);
 $node_standby->enable_streaming($node_master);
 $node_standby->start;
-$psql_rc = $node_master->psql('postgres', "ROLLBACK PREPARED 'xact_012_1'");
+$psql_rc = $node_master->psql(
+    'postgres',
+    "ROLLBACK PREPARED 'xact_012_1'",
+    on_error_die => 0);
 is($psql_rc, '0',
     "Rollback of PGPROC_MAX_CACHED_SUBXIDS+ prepared transaction on promoted standby"
 );
diff --git a/src/test/subscription/t/100_bugs.pl b/src/test/subscription/t/100_bugs.pl
index 366a7a9..d7066c2 100644
--- a/src/test/subscription/t/100_bugs.pl
+++ b/src/test/subscription/t/100_bugs.pl
@@ -87,14 +87,16 @@ $node_publisher->safe_psql('postgres',

 is( $node_publisher->psql(
         'postgres',
-        "CREATE TEMPORARY TABLE tt1 AS SELECT 1 AS a; UPDATE tt1 SET a = 2;"),
+        "CREATE TEMPORARY TABLE tt1 AS SELECT 1 AS a; UPDATE tt1 SET a = 2;",
+        on_error_die => 0),
     0,
     'update to temporary table without replica identity with FOR ALL TABLES publication'
 );

 is( $node_publisher->psql(
         'postgres',
-        "CREATE UNLOGGED TABLE tu1 AS SELECT 1 AS a; UPDATE tu1 SET a = 2;"),
+        "CREATE UNLOGGED TABLE tu1 AS SELECT 1 AS a; UPDATE tu1 SET a = 2;",
+        on_error_die => 0),
     0,
     'update to unlogged table without replica identity with FOR ALL TABLES publication'
 );

Re: Improve error detections in TAP tests by spreading safe_psql

From
Michael Paquier
Date:
On Wed, Aug 28, 2019 at 12:19:08PM -0400, Tom Lane wrote:
> The attached patch just follows a mechanical rule of "set on_error_die
> to 0 in exactly those calls where the surrounding code verifies the
> exit code is what it expects".  That leads to a lot of code that
> looks like, say,
>
>  # Insert should work on standby
>  is( $node_standby->psql(
>          'postgres',
> -        qq{insert into testtab select generate_series(1,1000), 'foo';}),
> +        qq{insert into testtab select generate_series(1,1000), 'foo';},
> +        on_error_die => 0),
>      0,
>      'INSERT succeeds with truncated relation FSM');

I am fine with that approach.  There is an argument for dropping
safe_psql then?

> I have to wonder if it isn't better to just drop the is() test
> and let PostgresNode::psql issue the failure.  The only thing
> the is() is really buying us is the ability to label the test
> with an ID string, which is helpful, but um ... couldn't that
> just be a comment?

I got the same thought as you on this point about why the is() is
actually necessary if we know that it would fail.  An advantage of the
current code is that we are able to catch all errors in a given run at
once.  If the test dies immediately, you cannot catch that and this
needs repetitive retries if there is no dependency between each step
of the test.  An argument against back-patching the stuff changing the
default flag are tests which rely on the current behavior. This could
be annoying for some people doing advanced testing.
--
Michael

Attachment

Re: Improve error detections in TAP tests by spreading safe_psql

From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes:
> On Wed, Aug 28, 2019 at 12:19:08PM -0400, Tom Lane wrote:
>> The attached patch just follows a mechanical rule of "set on_error_die
>> to 0 in exactly those calls where the surrounding code verifies the
>> exit code is what it expects".

> I am fine with that approach.  There is an argument for dropping
> safe_psql then?

Well, it's useful if you just want the stdout back.  But its name
is a bit misleading if the default behavior of psql is just as
safe.  Not sure whether renaming it is worthwhile.

>> I have to wonder if it isn't better to just drop the is() test
>> and let PostgresNode::psql issue the failure.

> I got the same thought as you on this point about why the is() is
> actually necessary if we know that it would fail.  An advantage of the
> current code is that we are able to catch all errors in a given run at
> once.

Yeah, but only if the test cases are independent, which I think is
mostly not true in our TAP scripts.  Otherwise you're just looking
at cascading errors.

> An argument against back-patching the stuff changing the
> default flag are tests which rely on the current behavior. This could
> be annoying for some people doing advanced testing.

Yup, the tradeoff is people possibly having test scripts outside
our tree that would break, versus the possibility that we'll back-patch
test changes that don't behave as expected in back branches.  I don't
know if the former is true, but I'm afraid the latter is a certainty
if we don't back-patch.

            regards, tom lane



Re: Improve error detections in TAP tests by spreading safe_psql

From
Michael Paquier
Date:
On Wed, Aug 28, 2019 at 09:44:58PM -0400, Tom Lane wrote:
> Well, it's useful if you just want the stdout back.  But its name
> is a bit misleading if the default behavior of psql is just as
> safe.  Not sure whether renaming it is worthwhile.

It is not that complicated enough to capture stdout with
PostgresNode::psql either, so if we are making the default of one
(PostgresNode::psql) look like the other (PostgresNode::safe_psql),
then we lose no actual feature by dropping safe_psql.

> Yeah, but only if the test cases are independent, which I think is
> mostly not true in our TAP scripts.  Otherwise you're just looking
> at cascading errors.

Yep.  We have a couple of cases though where things are quite
independent, like the 2PC suite divided into sequences of different
transactions, but mostly there are dependencies between one step and
the follow-up ones.

> Yup, the tradeoff is people possibly having test scripts outside
> our tree that would break, versus the possibility that we'll back-patch
> test changes that don't behave as expected in back branches.  I don't
> know if the former is true, but I'm afraid the latter is a certainty
> if we don't back-patch.

Exactly, that's why I am wondering how wide breakages in those
out-of-the-tree tests would go as we have PostgresNode since 9.6.
Facing this choice makes me uneasy, which is why I would tend to only
do incompatible things on HEAD.  Another, safer, possibility would be
to introduce in back-branches an extra psql-like routine enforcing
errors which we use in our tests, to keep the former ones for
compatibility, and to drop the old ones on HEAD.  It is never fun to
face sudden errors on a system when doing a minor upgrade.
--
Michael

Attachment