Thread: Re: 12.1 not useable: clientlib fails after a dozen queries (GSSAPI ?)

Re: 12.1 not useable: clientlib fails after a dozen queries (GSSAPI ?)

From
Tom Lane
Date:
[ redirecting to -hackers ]

Peter <pmc@citylink.dinoex.sub.org> writes:
> But I just recognize something of interest (which I had taken for
> granted when importing the database): the flaw does NOT appear when
> accessing the database from the server's local system (with TCP and
> GSSAPI encryption active). Only from remote system.
> But then, if I go on the local system, and change the mtu:
> # ifconfig lo0 mtu 1500
> and restart the server, then I get the exact same errors locally.

Oh-ho, that is interesting.

Looking at our regression tests for gssenc, I observe that they
only try to transport a negligible amount of data (viz, single-row
boolean results).  So we'd not notice any problem that involved
multiple-packet messages.

I modified the kerberos test so that it tries a query with a less
negligible amount of data, and what I find is:

* passes on Fedora 30, with either default or 1500 mtu
* passes on FreeBSD 12.0 with default mtu
* FAILS on FreeBSD 12.0 with mtu = 1500

I haven't run it further to ground than that, but there's definitely
something fishy here.  Based on just these results one would be hard
pressed to say if it's our bug or FreeBSD's, but your report that you
don't see the failure with PG 11 makes it sound like our problem.

OTOH, I also find that there's some hysteresis in the behavior:
once it's failed, reverting the mtu to the default setting doesn't
necessarily make subsequent runs pass.  It's really hard to explain
that behavior if it's our bug.

I tested today's HEAD of our code with up-to-date FreeBSD 12.0-RELEASE-p12
running on amd64 bare metal, no jails or emulators or VIMAGE or anything.

Attached are proposed test patch, as well as client-side regression log
output from a failure.  (There's no evidence of distress in the
postmaster log, same as your report.)

            regards, tom lane

diff --git a/src/test/kerberos/t/001_auth.pl b/src/test/kerberos/t/001_auth.pl
index e3eb052..0fc18e3 100644
--- a/src/test/kerberos/t/001_auth.pl
+++ b/src/test/kerberos/t/001_auth.pl
@@ -19,7 +19,7 @@ use Test::More;
 
 if ($ENV{with_gssapi} eq 'yes')
 {
-    plan tests => 12;
+    plan tests => 15;
 }
 else
 {
@@ -195,6 +195,29 @@ sub test_access
     return;
 }
 
+# As above, but test for an arbitrary query result.
+sub test_query
+{
+    my ($node, $role, $query, $expected, $gssencmode, $test_name) = @_;
+
+    # need to connect over TCP/IP for Kerberos
+    my ($res, $stdoutres, $stderrres) = $node->psql(
+        'postgres',
+        "$query",
+        extra_params => [
+            '-XAtd',
+            $node->connstr('postgres')
+              . " host=$host hostaddr=$hostaddr $gssencmode",
+            '-U',
+            $role
+        ]);
+
+    is($res, 0, $test_name);
+    like($stdoutres, $expected, $test_name);
+    is($stderrres, "", $test_name);
+    return;
+}
+
 unlink($node->data_dir . '/pg_hba.conf');
 $node->append_conf('pg_hba.conf',
     qq{host all all $hostaddr/32 gss map=mymap});
@@ -231,6 +254,15 @@ test_access(
     "gssencmode=require",
     "succeeds with GSS-encrypted access required with host hba");
 
+# Test that we can transport a reasonable amount of data.
+test_query(
+    $node,
+    "test1",
+    'SELECT * FROM generate_series(1, 100000);',
+    qr/^1\n.*\n1024\n.*\n9999\n.*\n100000$/s,
+    "gssencmode=require",
+    "transporting 100K lines works");
+
 unlink($node->data_dir . '/pg_hba.conf');
 $node->append_conf('pg_hba.conf',
     qq{hostgssenc all all $hostaddr/32 gss map=mymap});
1..15
# Checking port 56615
# Found port 56615
# setting up Kerberos
# Running: /usr/local/bin/krb5-config --version
# Running: /usr/local/sbin/kdb5_util create -s -P secret0
Loading random data
Initializing database '/usr/home/tgl/pgsql/src/test/kerberos/tmp_check/krb5kdc/principal' for realm 'EXAMPLE.COM',
master key name 'K/M@EXAMPLE.COM'
# Running: /usr/local/sbin/kadmin.local -q addprinc -pw secret1 test1
WARNING: no policy specified for test1@EXAMPLE.COM; defaulting to no policy
Authenticating as principal test1/admin@EXAMPLE.COM with password.
Principal "test1@EXAMPLE.COM" created.
# Running: /usr/local/sbin/kadmin.local -q addprinc -randkey postgres/auth-test-localhost.postgresql.example.com
WARNING: no policy specified for postgres/auth-test-localhost.postgresql.example.com@EXAMPLE.COM; defaulting to no
policy
Authenticating as principal test1/admin@EXAMPLE.COM with password.
Principal "postgres/auth-test-localhost.postgresql.example.com@EXAMPLE.COM" created.
# Running: /usr/local/sbin/kadmin.local -q ktadd -k /usr/home/tgl/pgsql/src/test/kerberos/tmp_check/krb5.keytab
postgres/auth-test-localhost.postgresql.example.com
Authenticating as principal test1/admin@EXAMPLE.COM with password.
Entry for principal postgres/auth-test-localhost.postgresql.example.com with kvno 2, encryption type
aes256-cts-hmac-sha1-96added to keytab WRFILE:/usr/home/tgl/pgsql/src/test/kerberos/tmp_check/krb5.keytab. 
Entry for principal postgres/auth-test-localhost.postgresql.example.com with kvno 2, encryption type
aes128-cts-hmac-sha1-96added to keytab WRFILE:/usr/home/tgl/pgsql/src/test/kerberos/tmp_check/krb5.keytab. 
# Running: /usr/local/sbin/krb5kdc -P /usr/home/tgl/pgsql/src/test/kerberos/tmp_check/krb5kdc.pid
# setting up PostgreSQL instance
# Checking port 56616
# Found port 56616
Name: node
Data directory: /usr/home/tgl/pgsql/src/test/kerberos/tmp_check/t_001_auth_node_data/pgdata
Backup directory: /usr/home/tgl/pgsql/src/test/kerberos/tmp_check/t_001_auth_node_data/backup
Archive directory: /usr/home/tgl/pgsql/src/test/kerberos/tmp_check/t_001_auth_node_data/archives
Connection string: port=56616 host=/tmp/bB3RP5P9de
Log file: /usr/home/tgl/pgsql/src/test/kerberos/tmp_check/log/001_auth_node.log
# Running: initdb -D /usr/home/tgl/pgsql/src/test/kerberos/tmp_check/t_001_auth_node_data/pgdata -A trust -N
The files belonging to this database system will be owned by user "tgl".
This user must also own the server process.

The database cluster will be initialized with locale "C".
The default database encoding has accordingly been set to "SQL_ASCII".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory /usr/home/tgl/pgsql/src/test/kerberos/tmp_check/t_001_auth_node_data/pgdata ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... America/New_York
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok

Sync to disk skipped.
The data directory might become corrupt if the operating system crashes.

Success. You can now start the database server using:

    pg_ctl -D /usr/home/tgl/pgsql/src/test/kerberos/tmp_check/t_001_auth_node_data/pgdata -l logfile start

# Running: /usr/home/tgl/pgsql/src/test/kerberos/../../../src/test/regress/pg_regress --config-auth
/usr/home/tgl/pgsql/src/test/kerberos/tmp_check/t_001_auth_node_data/pgdata
### Starting node "node"
# Running: pg_ctl -D /usr/home/tgl/pgsql/src/test/kerberos/tmp_check/t_001_auth_node_data/pgdata -l
/usr/home/tgl/pgsql/src/test/kerberos/tmp_check/log/001_auth_node.log-o --cluster-name=node start 
waiting for server to start.... done
server started
# Postmaster PID for node "node" is 1810
# running tests
### Restarting node "node"
# Running: pg_ctl -D /usr/home/tgl/pgsql/src/test/kerberos/tmp_check/t_001_auth_node_data/pgdata -l
/usr/home/tgl/pgsql/src/test/kerberos/tmp_check/log/001_auth_node.logrestart 
waiting for server to shut down.... done
server stopped
waiting for server to start.... done
server started
# Postmaster PID for node "node" is 1821
ok 1 - fails without ticket
# Running: /usr/local/bin/kinit test1
Password for test1@EXAMPLE.COM:
ok 2 - fails without mapping
### Restarting node "node"
# Running: pg_ctl -D /usr/home/tgl/pgsql/src/test/kerberos/tmp_check/t_001_auth_node_data/pgdata -l
/usr/home/tgl/pgsql/src/test/kerberos/tmp_check/log/001_auth_node.logrestart 
waiting for server to shut down.... done
server stopped
waiting for server to start.... done
server started
# Postmaster PID for node "node" is 1837
ok 3 - succeeds with mapping with default gssencmode and host hba
ok 4 - succeeds with GSS-encrypted access preferred with host hba
ok 5 - succeeds with GSS-encrypted access required with host hba
not ok 6 - transporting 100K lines works

#   Failed test 'transporting 100K lines works'
#   at t/001_auth.pl line 215.
#          got: '3'
#     expected: '0'
not ok 7 - transporting 100K lines works

#   Failed test 'transporting 100K lines works'
#   at t/001_auth.pl line 216.
#                   ''
#     doesn't match '(?^s:^1\n.*\n1024\n.*\n9999\n.*\n100000$)'
not ok 8 - transporting 100K lines works

#   Failed test 'transporting 100K lines works'
#   at t/001_auth.pl line 217.
#          got: 'psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle
# psql:<stdin>:1: message type 0x44 arrived from server while idle'
#     expected: ''
### Restarting node "node"
# Running: pg_ctl -D /usr/home/tgl/pgsql/src/test/kerberos/tmp_check/t_001_auth_node_data/pgdata -l
/usr/home/tgl/pgsql/src/test/kerberos/tmp_check/log/001_auth_node.logrestart 
waiting for server to shut down.... done
server stopped
waiting for server to start.... done
server started
# Postmaster PID for node "node" is 1854
ok 9 - succeeds with GSS-encrypted access preferred and hostgssenc hba
ok 10 - succeeds with GSS-encrypted access required and hostgssenc hba
ok 11 - fails with GSS encryption disabled and hostgssenc hba
### Restarting node "node"
# Running: pg_ctl -D /usr/home/tgl/pgsql/src/test/kerberos/tmp_check/t_001_auth_node_data/pgdata -l
/usr/home/tgl/pgsql/src/test/kerberos/tmp_check/log/001_auth_node.logrestart 
waiting for server to shut down.... done
server stopped
waiting for server to start.... done
server started
# Postmaster PID for node "node" is 1869
ok 12 - succeeds with GSS-encrypted access preferred and hostnogssenc hba, but no encryption
ok 13 - fails with GSS-encrypted access required and hostnogssenc hba
ok 14 - succeeds with GSS encryption disabled and hostnogssenc hba
### Restarting node "node"
# Running: pg_ctl -D /usr/home/tgl/pgsql/src/test/kerberos/tmp_check/t_001_auth_node_data/pgdata -l
/usr/home/tgl/pgsql/src/test/kerberos/tmp_check/log/001_auth_node.logrestart 
waiting for server to shut down.... done
server stopped
waiting for server to start.... done
server started
# Postmaster PID for node "node" is 1885
ok 15 - succeeds with include_realm=0 and defaults
### Stopping node "node" using mode immediate
# Running: pg_ctl -D /usr/home/tgl/pgsql/src/test/kerberos/tmp_check/t_001_auth_node_data/pgdata -m immediate stop
waiting for server to shut down.... done
server stopped
# No postmaster PID for node "node"
# Looks like you failed 3 tests of 15.

Re: 12.1 not useable: clientlib fails after a dozen queries (GSSAPI ?)

From
Tom Lane
Date:
I wrote:
> I haven't run it further to ground than that, but there's definitely
> something fishy here.  Based on just these results one would be hard
> pressed to say if it's our bug or FreeBSD's, but your report that you
> don't see the failure with PG 11 makes it sound like our problem.

Ah, I have it: whoever wrote pg_GSS_read() failed to pay attention
to the fact that setting errno is a critical part of its API.
Sometimes it returns -1 while leaving errno in a state that causes
pqReadData to draw the wrong conclusions.  In particular that can
happen when it reads an incomplete packet, and that's very timing
dependent, which is why this is so ticklish to reproduce.

I'll have a patch in a little while.

            regards, tom lane



Re: 12.1 not useable: clientlib fails after a dozen queries (GSSAPI?)

From
Stephen Frost
Date:
Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> I wrote:
> > I haven't run it further to ground than that, but there's definitely
> > something fishy here.  Based on just these results one would be hard
> > pressed to say if it's our bug or FreeBSD's, but your report that you
> > don't see the failure with PG 11 makes it sound like our problem.
>
> Ah, I have it: whoever wrote pg_GSS_read() failed to pay attention
> to the fact that setting errno is a critical part of its API.
> Sometimes it returns -1 while leaving errno in a state that causes
> pqReadData to draw the wrong conclusions.  In particular that can
> happen when it reads an incomplete packet, and that's very timing
> dependent, which is why this is so ticklish to reproduce.

Ah-hah.  Not sure if that was Robbie or myself (probably me, really,
since I rewrote a great deal of that code).  I agree that the regression
tests don't test with very much data, but I tested pushing quite a bit
of data through and didn't see any issues with my testing.  Apparently I
was getting pretty lucky. :/

> I'll have a patch in a little while.

That's fantastic, thanks!

Stephen

Attachment

Re: 12.1 not useable: clientlib fails after a dozen queries (GSSAPI ?)

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> Ah-hah.  Not sure if that was Robbie or myself (probably me, really,
> since I rewrote a great deal of that code).  I agree that the regression
> tests don't test with very much data, but I tested pushing quite a bit
> of data through and didn't see any issues with my testing.  Apparently I
> was getting pretty lucky. :/

You were *very* lucky, because this code is absolutely full of mistakes
related to incomplete reads, inadequate or outright wrong error handling,
etc.

I was nearly done cleaning that up, when it sank into me that
fe-secure-gssapi.c uses static buffers for partially-read or
partially-encoded data.  That means that any client trying to use
multiple GSSAPI-encrypted connections is very likely to see breakage
due to different connections trying to use the same buffers concurrently.
I wonder whether that doesn't explain the complaints mentioned upthread
from the Ruby folks.

(be-secure-gssapi.c is coded identically, but there it's OK since
any backend only has one client connection to deal with.)

>> I'll have a patch in a little while.

> That's fantastic, thanks!

This is gonna take longer than I thought.

            regards, tom lane



Re: 12.1 not useable: clientlib fails after a dozen queries (GSSAPI?)

From
Peter
Date:
On Fri, Jan 10, 2020 at 12:59:09PM -0500, Tom Lane wrote:
! [ redirecting to -hackers ]

! I modified the kerberos test so that it tries a query with a less
! negligible amount of data, and what I find is:
! 
! * passes on Fedora 30, with either default or 1500 mtu
! * passes on FreeBSD 12.0 with default mtu
! * FAILS on FreeBSD 12.0 with mtu = 1500

So, it is not related to only VIMAGE @ R11.3, and -more important to
me- it is not only happening in my kitchen. Thank You very much :)

! OTOH, I also find that there's some hysteresis in the behavior:
! once it's failed, reverting the mtu to the default setting doesn't
! necessarily make subsequent runs pass.  It's really hard to explain
! that behavior if it's our bug.

That's affirmative. Made me go astray a few times when trying to
isolate it.

On Fri, Jan 10, 2020 at 02:25:22PM -0500, Tom Lane wrote:
! Ah, I have it: whoever wrote pg_GSS_read() failed to pay attention
! to the fact that setting errno is a critical part of its API.
! Sometimes it returns -1 while leaving errno in a state that causes

Wow, that's fast. My probability-guess this morning was either some
hard-coded 8192-byte buffer, or something taking an [EWOULDBLOCK] for
OK. Then I decided to not look into the code, as You will be much
faster anyway, and there are other pieces of software where I do
not have such a competent peer to talk to...

Anyway, thanks a lot!
PMc



Re: 12.1 not useable: clientlib fails after a dozen queries (GSSAPI ?)

From
Stephen Frost
Date:
Greetings,

On Fri, Jan 10, 2020 at 15:58 Tom Lane <tgl@sss.pgh.pa.us> wrote:
Stephen Frost <sfrost@snowman.net> writes:
> Ah-hah.  Not sure if that was Robbie or myself (probably me, really,
> since I rewrote a great deal of that code).  I agree that the regression
> tests don't test with very much data, but I tested pushing quite a bit
> of data through and didn't see any issues with my testing.  Apparently I
> was getting pretty lucky. :/

You were *very* lucky, because this code is absolutely full of mistakes
related to incomplete reads, inadequate or outright wrong error handling,
etc.

I guess so..  I specifically remember running into problems transferring large data sets before I rewrote the code but after doing so it was reliable (for me anyway...).

I was nearly done cleaning that up, when it sank into me that
fe-secure-gssapi.c uses static buffers for partially-read or
partially-encoded data.  That means that any client trying to use
multiple GSSAPI-encrypted connections is very likely to see breakage
due to different connections trying to use the same buffers concurrently.

Ughhh. That’s a completely valid point and one I should have thought of.

I wonder whether that doesn't explain the complaints mentioned upthread
from the Ruby folks.

No- the segfault issue has been demonstrated to be able to reproduce without any PG code involved at all, and it also involved threads with only one connection, at least as I recall (on my phone atm).

(be-secure-gssapi.c is coded identically, but there it's OK since
any backend only has one client connection to deal with.)

Right...  I actually wrote the backend code first and then largely copied it to the front end, and then adjusted it, but obviously insufficiently as I had been thinking of just the one connection that the backend has to deal with.

Thanks!

Stephen

Re: 12.1 not useable: clientlib fails after a dozen queries (GSSAPI ?)

From
Tom Lane
Date:
Here's a draft patch that cleans up all the logic errors I could find.

I also expanded the previous patch for the kerberos test so that it
verifies that we can upload a nontrivial amount of data to the server,
as well as download.

I also spent a fair amount of effort on removing cosmetic differences
between the comparable logic in be-secure-gssapi.c and fe-secure-gssapi.c,
such that the two files can now be diff'd to confirm that be_gssapi_write
and be_gssapi_read implement identical logic to pg_GSS_write and
pg_GSS_read.  (They did not, before :-(.)

This does not deal with the problem that libpq shouldn't be using
static data space for this purpose.  It seems reasonable to me to
leave that for a separate patch.

This passes tests for me, on my FreeBSD build with lo0 mtu = 1500.
It wouldn't hurt to get some more mileage on it though.  Peter,
I didn't follow how to set up the "packet radio speed" environment
that you mentioned, but if you could beat on this with that setup
it would surely be useful.

            regards, tom lane

diff --git a/src/backend/libpq/be-secure-gssapi.c b/src/backend/libpq/be-secure-gssapi.c
index b02d3dd..9a598e9 100644
--- a/src/backend/libpq/be-secure-gssapi.c
+++ b/src/backend/libpq/be-secure-gssapi.c
@@ -5,7 +5,6 @@
  *
  * Portions Copyright (c) 2018-2020, PostgreSQL Global Development Group
  *
- *
  * IDENTIFICATION
  *  src/backend/libpq/be-secure-gssapi.c
  *
@@ -28,10 +27,10 @@
  * Handle the encryption/decryption of data using GSSAPI.
  *
  * In the encrypted data stream on the wire, we break up the data
- * into packets where each packet starts with a sizeof(uint32)-byte
- * length (not allowed to be larger than the buffer sizes defined
- * below) and then the encrypted data of that length immediately
- * following.
+ * into packets where each packet starts with a uint32-size length
+ * word (in network byte order), then encrypted data of that length
+ * immediately following.  Decryption yields the same data stream
+ * that would appear when not using encryption.
  *
  * Encrypted data typically ends up being larger than the same data
  * unencrypted, so we use fixed-size buffers for handling the
@@ -42,8 +41,11 @@
  *
  * NOTE: The client and server have to agree on the max packet size,
  * because we have to pass an entire packet to GSSAPI at a time and we
- * don't want the other side to send arbitrairly huge packets as we
+ * don't want the other side to send arbitrarily huge packets as we
  * would have to allocate memory for them to then pass them to GSSAPI.
+ *
+ * Therefore, these two #define's are effectively part of the protocol
+ * spec and can't ever be changed.
  */
 #define PQ_GSS_SEND_BUFFER_SIZE 16384
 #define PQ_GSS_RECV_BUFFER_SIZE 16384
@@ -69,23 +71,31 @@ uint32        max_packet_size;    /* Maximum size we can encrypt and fit the
                                  * results into our output buffer */

 /*
- * Attempt to write len bytes of data from ptr along a GSSAPI-encrypted connection.
+ * Attempt to write len bytes of data from ptr to a GSSAPI-encrypted connection.
  *
- * Connection must be fully established (including authentication step) before
- * calling.  Returns the bytes actually consumed once complete.  Data is
- * internally buffered; in the case of an incomplete write, the amount of data we
- * processed (encrypted into our output buffer to be sent) will be returned.  If
- * an error occurs or we would block, a negative value is returned and errno is
- * set appropriately.
+ * The connection must be already set up for GSSAPI encryption (i.e., GSSAPI
+ * transport negotiation is complete).
  *
- * To continue writing in the case of EWOULDBLOCK and similar, call this function
- * again with matching ptr and len parameters.
+ * Returns the number of data bytes consumed, or on failure, returns -1
+ * with errno set appropriately.  (For fatal errors, we may just elog and
+ * exit, if errno wouldn't be sufficient to describe the error.)  For
+ * retryable errors, caller should call again (passing the same data)
+ * once the socket is ready.
+ *
+ * Data is internally buffered; in the case of an incomplete write, the
+ * return value is the amount of caller data we consumed, not the amount
+ * physically sent.
  */
 ssize_t
 be_gssapi_write(Port *port, void *ptr, size_t len)
 {
+    OM_uint32    major,
+                minor;
+    gss_buffer_desc input,
+                output;
     size_t        bytes_to_encrypt = len;
     size_t        bytes_encrypted = 0;
+    gss_ctx_id_t gctx = port->gss->ctx;

     /*
      * Loop through encrypting data and sending it out until
@@ -95,13 +105,8 @@ be_gssapi_write(Port *port, void *ptr, size_t len)
      */
     while (bytes_to_encrypt || PqGSSSendPointer)
     {
-        OM_uint32    major,
-                    minor;
-        gss_buffer_desc input,
-                    output;
         int            conf_state = 0;
         uint32        netlen;
-        pg_gssinfo *gss = port->gss;

         /*
          * Check if we have data in the encrypted output buffer that needs to
@@ -117,17 +122,14 @@ be_gssapi_write(Port *port, void *ptr, size_t len)
             if (ret <= 0)
             {
                 /*
-                 * If we encrypted some data and it's in our output buffer,
-                 * but send() is saying that we would block, then tell the
-                 * caller how far we got with encrypting the data so that they
-                 * can call us again with whatever is left, at which point we
-                 * will try to send the remaining encrypted data first and
-                 * then move on to encrypting the rest of the data.
+                 * Report any successfully-consumed data; if there was none,
+                 * reflect the secure_raw_write result up to our caller.  In
+                 * the former case, we effectively assume that any interesting
+                 * failure condition will recur on the next try.
                  */
-                if (bytes_encrypted != 0 && (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR))
+                if (bytes_encrypted)
                     return bytes_encrypted;
-                else
-                    return ret;
+                return ret;
             }

             /*
@@ -151,14 +153,9 @@ be_gssapi_write(Port *port, void *ptr, size_t len)
             return bytes_encrypted;

         /*
-         * max_packet_size is the maximum amount of unencrypted data that,
-         * when encrypted, will fit into our encrypted-data output buffer.
-         *
-         * If we are being asked to send more than max_packet_size unencrypted
-         * data, then we will loop and create multiple packets, each with
-         * max_packet_size unencrypted data encrypted in them (at least, until
-         * secure_raw_write returns a failure saying we would be blocked, at
-         * which point we will let the caller know how far we got).
+         * Check how much we are being asked to send, if it's too much, then
+         * we will have to loop and possibly be called multiple times to get
+         * through all the data.
          */
         if (bytes_to_encrypt > max_packet_size)
             input.length = max_packet_size;
@@ -171,7 +168,7 @@ be_gssapi_write(Port *port, void *ptr, size_t len)
         output.length = 0;

         /* Create the next encrypted packet */
-        major = gss_wrap(&minor, gss->ctx, 1, GSS_C_QOP_DEFAULT,
+        major = gss_wrap(&minor, gctx, 1, GSS_C_QOP_DEFAULT,
                          &input, &conf_state, &output);
         if (major != GSS_S_COMPLETE)
             pg_GSS_error(FATAL, gettext_noop("GSSAPI wrap error"), major, minor);
@@ -189,7 +186,7 @@ be_gssapi_write(Port *port, void *ptr, size_t len)
         bytes_encrypted += input.length;
         bytes_to_encrypt -= input.length;

-        /* 4 network-order length bytes, then payload */
+        /* 4 network-order bytes of length, then payload */
         netlen = htonl(output.length);
         memcpy(PqGSSSendBuffer + PqGSSSendPointer, &netlen, sizeof(uint32));
         PqGSSSendPointer += sizeof(uint32);
@@ -202,10 +199,15 @@ be_gssapi_write(Port *port, void *ptr, size_t len)
 }

 /*
- * Read up to len bytes from a GSSAPI-encrypted connection into ptr.  Call
- * only after the connection has been fully established (i.e., GSSAPI
- * authentication is complete).  On success, returns the number of bytes
- * written into ptr; otherwise, returns -1 and sets errno appropriately.
+ * Read up to len bytes of data into ptr from a GSSAPI-encrypted connection.
+ *
+ * The connection must be already set up for GSSAPI encryption (i.e., GSSAPI
+ * transport negotiation is complete).
+ *
+ * Returns the number of data bytes read, or on failure, returns -1
+ * with errno set appropriately.  (For fatal errors, we may just elog and
+ * exit, if errno wouldn't be sufficient to describe the error.)  For
+ * retryable errors, caller should call again once the socket is ready.
  */
 ssize_t
 be_gssapi_read(Port *port, void *ptr, size_t len)
@@ -217,22 +219,22 @@ be_gssapi_read(Port *port, void *ptr, size_t len)
     ssize_t        ret;
     size_t        bytes_to_return = len;
     size_t        bytes_returned = 0;
-    int            conf_state = 0;
-    pg_gssinfo *gss = port->gss;
+    gss_ctx_id_t gctx = port->gss->ctx;

     /*
-     * The goal here is to read an incoming encrypted packet, one at a time,
-     * decrypt it into our out buffer, returning to the caller what they asked
-     * for, and then saving anything else for the next call.
+     * The plan here is to read one incoming encrypted packet into
+     * PqGSSRecvBuffer, decrypt it into PqGSSResultBuffer, and then dole out
+     * data from there to the caller.  When we exhaust the current input
+     * packet, read another.
      *
-     * First we look to see if we have unencrypted bytes available and, if so,
-     * copy those to the result.  If the caller asked for more than we had
-     * immediately available, then we try to read a packet off the wire and
-     * decrypt it.  If the read would block, then return the amount of
-     * unencrypted data we copied into the caller's ptr.
+     * If the caller asks for more bytes than one decrypted packet, then we
+     * should try to return all bytes asked for.  We only give up when
+     * secure_raw_read indicates read failure.
      */
     while (bytes_to_return)
     {
+        int            conf_state = 0;
+
         /* Check if we have data in our buffer that we can return immediately */
         if (PqGSSResultPointer < PqGSSResultLength)
         {
@@ -248,56 +250,46 @@ be_gssapi_read(Port *port, void *ptr, size_t len)
             bytes_to_return -= bytes_to_copy;
             bytes_returned += bytes_to_copy;

-            /* Check if our result buffer is now empty and, if so, reset */
-            if (PqGSSResultPointer == PqGSSResultLength)
-                PqGSSResultPointer = PqGSSResultLength = 0;
-
             continue;
         }

+        /* Reset buffer pointers whenever result buffer is empty */
+        PqGSSResultPointer = PqGSSResultLength = 0;
+
         /*
-         * At this point, our output buffer should be empty with more bytes
-         * being requested to be read.  We are now ready to load the next
-         * packet and decrypt it (entirely) into our buffer.
+         * At this point, our output buffer is empty with more bytes being
+         * requested to be read.  We are now ready to load the next packet and
+         * decrypt it (entirely) into our output buffer.
          *
-         * If we get a partial read back while trying to read a packet off the
-         * wire then we return the number of unencrypted bytes we were able to
-         * copy (if any, if we didn't copy any, then we return whatever
-         * secure_raw_read returned when we called it; likely -1) into the
-         * caller's ptr and wait to be called again, until we get a full
-         * packet to decrypt.
+         * If we are unable to collect a full packet off the wire, return any
+         * bytes we already emitted, and wait to be called again.
          */

-        /* Check if we have the size of the packet already in our buffer. */
+        /* Collect the length if we haven't already */
         if (PqGSSRecvLength < sizeof(uint32))
         {
-            /*
-             * We were not able to get the length of the packet last time, so
-             * we need to do that first.
-             */
             ret = secure_raw_read(port, PqGSSRecvBuffer + PqGSSRecvLength,
                                   sizeof(uint32) - PqGSSRecvLength);
-            if (ret < 0)
+
+            /* If ret <= 0, secure_raw_read already set the correct errno */
+            if (ret <= 0)
                 return bytes_returned ? bytes_returned : ret;

             PqGSSRecvLength += ret;

-            /*
-             * If we only got part of the packet length, then return however
-             * many unencrypted bytes we copied to the caller and wait to be
-             * called again.
-             */
+            /* If we still haven't got the length, return to the caller */
             if (PqGSSRecvLength < sizeof(uint32))
-                return bytes_returned;
+            {
+                if (bytes_returned)
+                    return bytes_returned;
+                errno = EWOULDBLOCK;
+                return -1;
+            }
         }

-        /*
-         * We have the length of the next packet at this point, so pull it out
-         * and then read whatever we have left of the packet to read.
-         */
+        /* Decode the packet length and check for overlength packet */
         input.length = ntohl(*(uint32 *) PqGSSRecvBuffer);

-        /* Check for over-length packet */
         if (input.length > PQ_GSS_RECV_BUFFER_SIZE - sizeof(uint32))
             ereport(FATAL,
                     (errmsg("oversize GSSAPI packet sent by the client (%zu > %zu)",
@@ -310,37 +302,31 @@ be_gssapi_read(Port *port, void *ptr, size_t len)
          */
         ret = secure_raw_read(port, PqGSSRecvBuffer + PqGSSRecvLength,
                               input.length - (PqGSSRecvLength - sizeof(uint32)));
-        if (ret < 0)
+        /* If ret <= 0, secure_raw_read already set the correct errno */
+        if (ret <= 0)
             return bytes_returned ? bytes_returned : ret;

         PqGSSRecvLength += ret;

-        /*
-         * If we got less than the rest of the packet then we need to return
-         * and be called again.  If we didn't have any bytes to return on this
-         * run then return -1 and set errno to EWOULDBLOCK.
-         */
+        /* If we don't yet have the whole packet, return to the caller */
         if (PqGSSRecvLength - sizeof(uint32) < input.length)
         {
-            if (!bytes_returned)
-            {
-                errno = EWOULDBLOCK;
-                return -1;
-            }
-
-            return bytes_returned;
+            if (bytes_returned)
+                return bytes_returned;
+            errno = EWOULDBLOCK;
+            return -1;
         }

         /*
          * We now have the full packet and we can perform the decryption and
-         * refill our output buffer, then loop back up to pass that back to
-         * the user.
+         * refill our output buffer, then loop back up to pass data back to
+         * the caller.
          */
         output.value = NULL;
         output.length = 0;
         input.value = PqGSSRecvBuffer + sizeof(uint32);

-        major = gss_unwrap(&minor, gss->ctx, &input, &output, &conf_state, NULL);
+        major = gss_unwrap(&minor, gctx, &input, &output, &conf_state, NULL);
         if (major != GSS_S_COMPLETE)
             pg_GSS_error(FATAL, gettext_noop("GSSAPI unwrap error"),
                          major, minor);
@@ -350,10 +336,9 @@ be_gssapi_read(Port *port, void *ptr, size_t len)
                     (errmsg("incoming GSSAPI message did not use confidentiality")));

         memcpy(PqGSSResultBuffer, output.value, output.length);
-
         PqGSSResultLength = output.length;

-        /* Our buffer is now empty, reset it */
+        /* Our receive buffer is now empty, reset it */
         PqGSSRecvLength = 0;

         gss_release_buffer(&minor, &output);
@@ -379,37 +364,27 @@ read_or_wait(Port *port, ssize_t len)
      * Keep going until we either read in everything we were asked to, or we
      * error out.
      */
-    while (PqGSSRecvLength != len)
+    while (PqGSSRecvLength < len)
     {
         ret = secure_raw_read(port, PqGSSRecvBuffer + PqGSSRecvLength, len - PqGSSRecvLength);

         /*
-         * If we got back an error and it wasn't just EWOULDBLOCK/EAGAIN, then
-         * give up.
+         * If we got back an error and it wasn't just
+         * EWOULDBLOCK/EAGAIN/EINTR, then give up.
          */
-        if (ret < 0 && !(errno == EWOULDBLOCK || errno == EAGAIN))
+        if (ret < 0 &&
+            !(errno == EWOULDBLOCK || errno == EAGAIN || errno == EINTR))
             return -1;

         /*
          * Ok, we got back either a positive value, zero, or a negative result
-         * but EWOULDBLOCK or EAGAIN was set.
+         * indicating we should retry.
          *
-         * If it was zero or negative, then we try to wait on the socket to be
+         * If it was zero or negative, then we wait on the socket to be
          * readable again.
          */
         if (ret <= 0)
         {
-            /*
-             * If we got back less than zero, indicating an error, and that
-             * wasn't just a EWOULDBLOCK/EAGAIN, then give up.
-             */
-            if (ret < 0 && !(errno == EWOULDBLOCK || errno == EAGAIN))
-                return -1;
-
-            /*
-             * We got back either zero, or -1 with EWOULDBLOCK/EAGAIN, so wait
-             * on socket to be readable again.
-             */
             WaitLatchOrSocket(MyLatch,
                               WL_SOCKET_READABLE | WL_EXIT_ON_PM_DEATH,
                               port->sock, 0, WAIT_EVENT_GSS_OPEN_SERVER);
@@ -430,7 +405,7 @@ read_or_wait(Port *port, ssize_t len)
                 if (ret == 0)
                     return -1;
             }
-            else
+            if (ret < 0)
                 continue;
         }

@@ -561,6 +536,16 @@ secure_open_gssapi(Port *port)
             while (PqGSSSendStart != sizeof(uint32) + output.length)
             {
                 ret = secure_raw_write(port, PqGSSSendBuffer + PqGSSSendStart, sizeof(uint32) + output.length -
PqGSSSendStart);
+
+                /*
+                 * If we got back an error and it wasn't just
+                 * EWOULDBLOCK/EAGAIN/EINTR, then give up.
+                 */
+                if (ret < 0 &&
+                    !(errno == EWOULDBLOCK || errno == EAGAIN || errno == EINTR))
+                    return -1;
+
+                /* Wait and retry if we couldn't write yet */
                 if (ret <= 0)
                 {
                     WaitLatchOrSocket(MyLatch,
@@ -580,7 +565,7 @@ secure_open_gssapi(Port *port)

         /*
          * If we got back that the connection is finished being set up, now
-         * that's we've sent the last packet, exit our loop.
+         * that we've sent the last packet, exit our loop.
          */
         if (complete_next)
             break;
@@ -588,10 +573,11 @@ secure_open_gssapi(Port *port)

     /*
      * Determine the max packet size which will fit in our buffer, after
-     * accounting for the length
+     * accounting for the length.  be_gssapi_write will need this.
      */
     major = gss_wrap_size_limit(&minor, port->gss->ctx, 1, GSS_C_QOP_DEFAULT,
-                                PQ_GSS_SEND_BUFFER_SIZE - sizeof(uint32), &max_packet_size);
+                                PQ_GSS_SEND_BUFFER_SIZE - sizeof(uint32),
+                                &max_packet_size);

     if (GSS_ERROR(major))
         pg_GSS_error(FATAL, gettext_noop("GSSAPI size check error"),
diff --git a/src/interfaces/libpq/fe-secure-gssapi.c b/src/interfaces/libpq/fe-secure-gssapi.c
index e937024..91a5930 100644
--- a/src/interfaces/libpq/fe-secure-gssapi.c
+++ b/src/interfaces/libpq/fe-secure-gssapi.c
@@ -18,6 +18,7 @@
 #include "libpq-int.h"
 #include "port/pg_bswap.h"

+
 /*
  * Require encryption support, as well as mutual authentication and
  * tamperproofing measures.
@@ -26,15 +27,28 @@
     GSS_C_SEQUENCE_FLAG | GSS_C_CONF_FLAG | GSS_C_INTEG_FLAG

 /*
- * We use fixed-size buffers for handling the encryption/decryption
- * which are larger than PQComm's buffer will typically be to minimize
- * the times where we have to make multiple packets and therefore sets
- * of recv/send calls for a single read/write call to us.
+ * Handle the encryption/decryption of data using GSSAPI.
+ *
+ * In the encrypted data stream on the wire, we break up the data
+ * into packets where each packet starts with a uint32-size length
+ * word (in network byte order), then encrypted data of that length
+ * immediately following.  Decryption yields the same data stream
+ * that would appear when not using encryption.
+ *
+ * Encrypted data typically ends up being larger than the same data
+ * unencrypted, so we use fixed-size buffers for handling the
+ * encryption/decryption which are larger than PQComm's buffer will
+ * typically be to minimize the times where we have to make multiple
+ * packets and therefore sets of recv/send calls for a single
+ * read/write call to us.
  *
  * NOTE: The client and server have to agree on the max packet size,
  * because we have to pass an entire packet to GSSAPI at a time and we
- * don't want the other side to send arbitrairly huge packets as we
+ * don't want the other side to send arbitrarily huge packets as we
  * would have to allocate memory for them to then pass them to GSSAPI.
+ *
+ * Therefore, these two #define's are effectively part of the protocol
+ * spec and can't ever be changed.
  */
 #define PQ_GSS_SEND_BUFFER_SIZE 16384
 #define PQ_GSS_RECV_BUFFER_SIZE 16384
@@ -48,8 +62,6 @@ static int    PqGSSSendStart;        /* Next index to send a byte in

 /* PqGSSRecvBuffer is for *encrypted* data */
 static char PqGSSRecvBuffer[PQ_GSS_RECV_BUFFER_SIZE];
-static int    PqGSSRecvPointer;    /* Next index to read a byte from
-                                 * PqGSSRecvBuffer */
 static int    PqGSSRecvLength;    /* End of data available in PqGSSRecvBuffer */

 /* PqGSSResultBuffer is for *unencrypted* data */
@@ -62,22 +74,31 @@ uint32        max_packet_size;    /* Maximum size we can encrypt and fit the
                                  * results into our output buffer */

 /*
- * Write len bytes of data from ptr along a GSSAPI-encrypted connection.  Note
- * that the connection must be already set up for GSSAPI encryption (i.e.,
- * GSSAPI transport negotiation is complete).  Returns len when all data has
- * been written; retry when errno is EWOULDBLOCK or similar with the same
- * values of ptr and len.  On non-socket failures, will log an error message.
+ * Attempt to write len bytes of data from ptr to a GSSAPI-encrypted connection.
+ *
+ * The connection must be already set up for GSSAPI encryption (i.e., GSSAPI
+ * transport negotiation is complete).
+ *
+ * Returns the number of data bytes consumed, or on failure, returns -1
+ * with errno set appropriately.  If the errno indicates a non-retryable
+ * error, a message is put into conn->errorMessage.  For retryable errors,
+ * caller should call again (passing the same data) once the socket is ready.
+ *
+ * Data is internally buffered; in the case of an incomplete write, the
+ * return value is the amount of caller data we consumed, not the amount
+ * physically sent.
  */
 ssize_t
 pg_GSS_write(PGconn *conn, const void *ptr, size_t len)
 {
-    gss_buffer_desc input,
-                output = GSS_C_EMPTY_BUFFER;
     OM_uint32    major,
                 minor;
+    gss_buffer_desc input,
+                output = GSS_C_EMPTY_BUFFER;
     ssize_t        ret = -1;
     size_t        bytes_to_encrypt = len;
     size_t        bytes_encrypted = 0;
+    gss_ctx_id_t gctx = conn->gctx;

     /*
      * Loop through encrypting data and sending it out until
@@ -101,25 +122,22 @@ pg_GSS_write(PGconn *conn, const void *ptr, size_t len)
             ssize_t        amount = PqGSSSendPointer - PqGSSSendStart;

             ret = pqsecure_raw_write(conn, PqGSSSendBuffer + PqGSSSendStart, amount);
-            if (ret < 0)
+            if (ret <= 0)
             {
                 /*
-                 * If we encrypted some data and it's in our output buffer,
-                 * but send() is saying that we would block, then tell the
-                 * client how far we got with encrypting the data so that they
-                 * can call us again with whatever is left, at which point we
-                 * will try to send the remaining encrypted data first and
-                 * then move on to encrypting the rest of the data.
+                 * Report any successfully-consumed data; if there was none,
+                 * reflect the pqsecure_raw_write result up to our caller.  In
+                 * the former case, we effectively assume that any interesting
+                 * failure condition will recur on the next try.
                  */
-                if (bytes_encrypted != 0 && (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR))
+                if (bytes_encrypted)
                     return bytes_encrypted;
-                else
-                    return ret;
+                return ret;
             }

             /*
-             * Partial write, move forward that far in our buffer and try
-             * again
+             * Check if this was a partial write, and if so, move forward that
+             * far in our buffer and try again.
              */
             if (ret != amount)
             {
@@ -152,18 +170,25 @@ pg_GSS_write(PGconn *conn, const void *ptr, size_t len)
         output.value = NULL;
         output.length = 0;

-        /* Create the next encrypted packet */
-        major = gss_wrap(&minor, conn->gctx, 1, GSS_C_QOP_DEFAULT,
+        /*
+         * Create the next encrypted packet.  Any failure here is considered a
+         * hard failure, so we return -1 even though some caller data has been
+         * consumed.
+         */
+        major = gss_wrap(&minor, gctx, 1, GSS_C_QOP_DEFAULT,
                          &input, &conf_state, &output);
         if (major != GSS_S_COMPLETE)
         {
             pg_GSS_error(libpq_gettext("GSSAPI wrap error"), conn, major, minor);
+            errno = EIO;        /* for lack of a better idea */
             goto cleanup;
         }
-        else if (conf_state == 0)
+
+        if (conf_state == 0)
         {
             printfPQExpBuffer(&conn->errorMessage,
                               libpq_gettext("outgoing GSSAPI message would not use confidentiality\n"));
+            errno = EIO;        /* for lack of a better idea */
             goto cleanup;
         }

@@ -173,6 +198,7 @@ pg_GSS_write(PGconn *conn, const void *ptr, size_t len)
                               libpq_gettext("client tried to send oversize GSSAPI packet (%zu > %zu)\n"),
                               (size_t) output.length,
                               PQ_GSS_SEND_BUFFER_SIZE - sizeof(uint32));
+            errno = EIO;        /* for lack of a better idea */
             goto cleanup;
         }

@@ -198,9 +224,14 @@ cleanup:

 /*
  * Read up to len bytes of data into ptr from a GSSAPI-encrypted connection.
- * Note that GSSAPI transport must already have been negotiated.  Returns the
- * number of bytes read into ptr; otherwise, returns -1.  Retry with the same
- * ptr and len when errno is EWOULDBLOCK or similar.
+ *
+ * The connection must be already set up for GSSAPI encryption (i.e., GSSAPI
+ * transport negotiation is complete).
+ *
+ * Returns the number of data bytes read, or on failure, returns -1
+ * with errno set appropriately.  If the errno indicates a non-retryable
+ * error, a message is put into conn->errorMessage.  For retryable errors,
+ * caller should call again once the socket is ready.
  */
 ssize_t
 pg_GSS_read(PGconn *conn, void *ptr, size_t len)
@@ -209,24 +240,20 @@ pg_GSS_read(PGconn *conn, void *ptr, size_t len)
                 minor;
     gss_buffer_desc input = GSS_C_EMPTY_BUFFER,
                 output = GSS_C_EMPTY_BUFFER;
-    ssize_t        ret = 0;
+    ssize_t        ret;
     size_t        bytes_to_return = len;
     size_t        bytes_returned = 0;
+    gss_ctx_id_t gctx = conn->gctx;

     /*
-     * The goal here is to read an incoming encrypted packet, one at a time,
-     * decrypt it into our out buffer, returning to the caller what they asked
-     * for, and then saving anything else for the next call.
-     *
-     * We get a read request, we look if we have cleartext bytes available
-     * and, if so, copy those to the result, and then we try to decrypt the
-     * next packet.
-     *
-     * We should not try to decrypt the next packet until the read buffer is
-     * completely empty.
+     * The plan here is to read one incoming encrypted packet into
+     * PqGSSRecvBuffer, decrypt it into PqGSSResultBuffer, and then dole out
+     * data from there to the caller.  When we exhaust the current input
+     * packet, read another.
      *
      * If the caller asks for more bytes than one decrypted packet, then we
-     * should try to return all bytes asked for.
+     * should try to return all bytes asked for.  We only give up when
+     * pqsecure_raw_read indicates read failure.
      */
     while (bytes_to_return)
     {
@@ -247,52 +274,54 @@ pg_GSS_read(PGconn *conn, void *ptr, size_t len)
             bytes_to_return -= bytes_to_copy;
             bytes_returned += bytes_to_copy;

-            /* Check if our result buffer is now empty and, if so, reset */
-            if (PqGSSResultPointer == PqGSSResultLength)
-                PqGSSResultPointer = PqGSSResultLength = 0;
-
             continue;
         }

+        /* Reset buffer pointers whenever result buffer is empty */
+        PqGSSResultPointer = PqGSSResultLength = 0;
+
         /*
-         * At this point, our output buffer should be empty with more bytes
-         * being requested to be read.  We are now ready to load the next
-         * packet and decrypt it (entirely) into our buffer.
+         * At this point, our output buffer is empty with more bytes being
+         * requested to be read.  We are now ready to load the next packet and
+         * decrypt it (entirely) into our output buffer.
          *
-         * If we get a partial read back while trying to read a packet off the
-         * wire then we return back what bytes we were able to return and wait
-         * to be called again, until we get a full packet to decrypt.
+         * If we are unable to collect a full packet off the wire, return any
+         * bytes we already emitted, and wait to be called again.
          */

-        /* Check if we got a partial read just trying to get the length */
+        /* Collect the length if we haven't already */
         if (PqGSSRecvLength < sizeof(uint32))
         {
-            /* Try to get whatever of the length we still need */
             ret = pqsecure_raw_read(conn, PqGSSRecvBuffer + PqGSSRecvLength,
                                     sizeof(uint32) - PqGSSRecvLength);
-            if (ret < 0)
+
+            /* If ret <= 0, pqsecure_raw_read already set the correct errno */
+            if (ret <= 0)
                 return bytes_returned ? bytes_returned : ret;

             PqGSSRecvLength += ret;
+
+            /* If we still haven't got the length, return to the caller */
             if (PqGSSRecvLength < sizeof(uint32))
-                return bytes_returned;
+            {
+                if (bytes_returned)
+                    return bytes_returned;
+                errno = EWOULDBLOCK;
+                return -1;
+            }
         }

-        /*
-         * We should have the whole length at this point, so pull it out and
-         * then read whatever we have left of the packet
-         */
+        /* Decode the packet length and check for overlength packet */
         input.length = ntohl(*(uint32 *) PqGSSRecvBuffer);

-        /* Check for over-length packet */
         if (input.length > PQ_GSS_RECV_BUFFER_SIZE - sizeof(uint32))
         {
             printfPQExpBuffer(&conn->errorMessage,
                               libpq_gettext("oversize GSSAPI packet sent by the server (%zu > %zu)\n"),
                               (size_t) input.length,
                               PQ_GSS_RECV_BUFFER_SIZE - sizeof(uint32));
-            ret = -1;
-            goto cleanup;
+            errno = EIO;        /* for lack of a better idea */
+            return -1;
         }

         /*
@@ -301,47 +330,55 @@ pg_GSS_read(PGconn *conn, void *ptr, size_t len)
          */
         ret = pqsecure_raw_read(conn, PqGSSRecvBuffer + PqGSSRecvLength,
                                 input.length - (PqGSSRecvLength - sizeof(uint32)));
-        if (ret < 0)
+        /* If ret <= 0, pqsecure_raw_read already set the correct errno */
+        if (ret <= 0)
             return bytes_returned ? bytes_returned : ret;

-        /*
-         * If we got less than the rest of the packet then we need to return
-         * and be called again.
-         */
         PqGSSRecvLength += ret;
+
+        /* If we don't yet have the whole packet, return to the caller */
         if (PqGSSRecvLength - sizeof(uint32) < input.length)
-            return bytes_returned ? bytes_returned : -1;
+        {
+            if (bytes_returned)
+                return bytes_returned;
+            errno = EWOULDBLOCK;
+            return -1;
+        }

         /*
          * We now have the full packet and we can perform the decryption and
-         * refill our output buffer, then loop back up to pass that back to
-         * the user.
+         * refill our output buffer, then loop back up to pass data back to
+         * the caller.  Note that error exits below here must take care of
+         * releasing the gss output buffer.
          */
         output.value = NULL;
         output.length = 0;
         input.value = PqGSSRecvBuffer + sizeof(uint32);

-        major = gss_unwrap(&minor, conn->gctx, &input, &output, &conf_state, NULL);
+        major = gss_unwrap(&minor, gctx, &input, &output, &conf_state, NULL);
         if (major != GSS_S_COMPLETE)
         {
             pg_GSS_error(libpq_gettext("GSSAPI unwrap error"), conn,
                          major, minor);
             ret = -1;
+            errno = EIO;        /* for lack of a better idea */
             goto cleanup;
         }
-        else if (conf_state == 0)
+
+        if (conf_state == 0)
         {
             printfPQExpBuffer(&conn->errorMessage,
                               libpq_gettext("incoming GSSAPI message did not use confidentiality\n"));
             ret = -1;
+            errno = EIO;        /* for lack of a better idea */
             goto cleanup;
         }

         memcpy(PqGSSResultBuffer, output.value, output.length);
         PqGSSResultLength = output.length;

-        /* Our buffer is now empty, reset it */
-        PqGSSRecvPointer = PqGSSRecvLength = 0;
+        /* Our receive buffer is now empty, reset it */
+        PqGSSRecvLength = 0;

         gss_release_buffer(&minor, &output);
     }
@@ -365,10 +402,13 @@ static PostgresPollingStatusType
 gss_read(PGconn *conn, void *recv_buffer, size_t length, ssize_t *ret)
 {
     *ret = pqsecure_raw_read(conn, recv_buffer, length);
-    if (*ret < 0 && errno == EWOULDBLOCK)
-        return PGRES_POLLING_READING;
-    else if (*ret < 0)
-        return PGRES_POLLING_FAILED;
+    if (*ret < 0)
+    {
+        if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
+            return PGRES_POLLING_READING;
+        else
+            return PGRES_POLLING_FAILED;
+    }

     /* Check for EOF */
     if (*ret == 0)
@@ -382,6 +422,13 @@ gss_read(PGconn *conn, void *recv_buffer, size_t length, ssize_t *ret)
             return PGRES_POLLING_READING;

         *ret = pqsecure_raw_read(conn, recv_buffer, length);
+        if (*ret < 0)
+        {
+            if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
+                return PGRES_POLLING_READING;
+            else
+                return PGRES_POLLING_FAILED;
+        }
         if (*ret == 0)
             return PGRES_POLLING_FAILED;
     }
@@ -410,7 +457,7 @@ pqsecure_open_gss(PGconn *conn)
     /* Check for data that needs to be written */
     if (first)
     {
-        PqGSSSendPointer = PqGSSSendStart = PqGSSRecvPointer = PqGSSRecvLength = PqGSSResultPointer =
PqGSSResultLength= 0; 
+        PqGSSSendPointer = PqGSSSendStart = PqGSSRecvLength = PqGSSResultPointer = PqGSSResultLength = 0;
         first = 0;
     }

@@ -422,12 +469,17 @@ pqsecure_open_gss(PGconn *conn)
         ssize_t        amount = PqGSSSendPointer - PqGSSSendStart;

         ret = pqsecure_raw_write(conn, PqGSSSendBuffer + PqGSSSendStart, amount);
-        if (ret < 0 && errno == EWOULDBLOCK)
-            return PGRES_POLLING_WRITING;
+        if (ret < 0)
+        {
+            if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
+                return PGRES_POLLING_WRITING;
+            else
+                return PGRES_POLLING_FAILED;
+        }

-        if (ret != amount)
+        if (ret < amount)
         {
-            PqGSSSendStart += amount;
+            PqGSSSendStart += ret;
             return PGRES_POLLING_WRITING;
         }

@@ -536,7 +588,7 @@ pqsecure_open_gss(PGconn *conn)
                                  &output, NULL, NULL);

     /* GSS Init Sec Context uses the whole packet, so clear it */
-    PqGSSRecvPointer = PqGSSRecvLength = 0;
+    PqGSSRecvLength = 0;

     if (GSS_ERROR(major))
     {
@@ -559,7 +611,7 @@ pqsecure_open_gss(PGconn *conn)

         /*
          * Determine the max packet size which will fit in our buffer, after
-         * accounting for the length
+         * accounting for the length.  pg_GSS_write will need this.
          */
         major = gss_wrap_size_limit(&minor, conn->gctx, 1, GSS_C_QOP_DEFAULT,
                                     PQ_GSS_SEND_BUFFER_SIZE - sizeof(uint32), &max_packet_size);
diff --git a/src/test/kerberos/t/001_auth.pl b/src/test/kerberos/t/001_auth.pl
index e3eb052..b3aeea9 100644
--- a/src/test/kerberos/t/001_auth.pl
+++ b/src/test/kerberos/t/001_auth.pl
@@ -19,7 +19,7 @@ use Test::More;

 if ($ENV{with_gssapi} eq 'yes')
 {
-    plan tests => 12;
+    plan tests => 18;
 }
 else
 {
@@ -166,15 +166,15 @@ $node->safe_psql('postgres', 'CREATE USER test1;');

 note "running tests";

+# Test connection success or failure, and if success, that query returns true.
 sub test_access
 {
-    my ($node, $role, $server_check, $expected_res, $gssencmode, $test_name)
-      = @_;
+    my ($node, $role, $query, $expected_res, $gssencmode, $test_name) = @_;

     # need to connect over TCP/IP for Kerberos
     my ($res, $stdoutres, $stderrres) = $node->psql(
         'postgres',
-        "$server_check",
+        "$query",
         extra_params => [
             '-XAtd',
             $node->connstr('postgres')
@@ -195,6 +195,29 @@ sub test_access
     return;
 }

+# As above, but test for an arbitrary query result.
+sub test_query
+{
+    my ($node, $role, $query, $expected, $gssencmode, $test_name) = @_;
+
+    # need to connect over TCP/IP for Kerberos
+    my ($res, $stdoutres, $stderrres) = $node->psql(
+        'postgres',
+        "$query",
+        extra_params => [
+            '-XAtd',
+            $node->connstr('postgres')
+              . " host=$host hostaddr=$hostaddr $gssencmode",
+            '-U',
+            $role
+        ]);
+
+    is($res, 0, $test_name);
+    like($stdoutres, $expected, $test_name);
+    is($stderrres, "", $test_name);
+    return;
+}
+
 unlink($node->data_dir . '/pg_hba.conf');
 $node->append_conf('pg_hba.conf',
     qq{host all all $hostaddr/32 gss map=mymap});
@@ -231,6 +254,27 @@ test_access(
     "gssencmode=require",
     "succeeds with GSS-encrypted access required with host hba");

+# Test that we can transport a reasonable amount of data.
+test_query(
+    $node,
+    "test1",
+    'SELECT * FROM generate_series(1, 100000);',
+    qr/^1\n.*\n1024\n.*\n9999\n.*\n100000$/s,
+    "gssencmode=require",
+    "receiving 100K lines works");
+
+test_query(
+    $node,
+    "test1",
+    "CREATE TABLE mytab (f1 int primary key);\n"
+      . "COPY mytab FROM STDIN;\n"
+      . join("\n", (1 .. 100000))
+      . "\n\\.\n"
+      . "SELECT COUNT(*) FROM mytab;",
+    qr/^100000$/s,
+    "gssencmode=require",
+    "sending 100K lines works");
+
 unlink($node->data_dir . '/pg_hba.conf');
 $node->append_conf('pg_hba.conf',
     qq{hostgssenc all all $hostaddr/32 gss map=mymap});

Re: 12.1 not useable: clientlib fails after a dozen queries (GSSAPI ?)

From
Tom Lane
Date:
I wrote:
> Here's a draft patch that cleans up all the logic errors I could find.

So last night I was assuming that this problem just requires more careful
attention to what to return in the error exit paths.  In the light of
morning, though, I realize that the algorithms involved in
be-secure-gssapi.c and fe-secure-gssapi.c are just fundamentally wrong:

* On the read side, the code will keep looping until it gets a no-data
error from the underlying socket call.  This is silly.  In every or
almost every use, the caller's read length request corresponds to the
size of a buffer that's meant to be larger than typical messages, so
that betting that we're going to fill that buffer completely is the
wrong way to bet.  Meanwhile, it's fairly likely that the incoming
encrypted packet's length *does* correspond to some actual message
boundary; that would only not happen if the sender is forced to break
up a message, which ought to be a minority situation, else our buffer
size choices are too small.  So it's very likely that the looping just
results in doubling the number of read() calls that are made, with
half of them failing with EWOULDBLOCK.  What we should do instead is
return to the caller whenever we finish handing back the decrypted
contents of a packet.  We can do the read() on the next call, after
the caller's dealt with that data.

* On the write side, if the code encrypts some data and then gets
EWOULDBLOCK trying to write it, it will tell the caller that it
successfully wrote that data.  If that was all the data the caller
had to write (again, not so unlikely) this is a catastrophic
mistake, because the caller will be satisfied and will go to sleep,
rather than calling again to attempt another write.  What we *must*
do is to reflect the write failure verbatim whether or not we
encrypted some data.  We must remember how much data we encrypted
and then discount that much of the caller's supplied data next time.
There are hints in the existing comments that somebody understood
this at one point, but the code isn't acting that way today.

I expect that I can prove point B by hot-wiring pqsecure_raw_write
to randomly return EWOULDBLOCK (instead of making any write attempt)
every so often.  I think strace will be enough to confirm point A,
but haven't tried it yet.

            regards, tom lane



Re: 12.1 not useable: clientlib fails after a dozen queries (GSSAPI ?)

From
Tom Lane
Date:
I wrote:
> So last night I was assuming that this problem just requires more careful
> attention to what to return in the error exit paths.  In the light of
> morning, though, I realize that the algorithms involved in
> be-secure-gssapi.c and fe-secure-gssapi.c are just fundamentally wrong:

Here's a revised patch that attempts to deal with those issues.
(Still doesn't touch the static-buffer issue, though.)

The 0002 patch isn't meant for commit, but testing with that gives me
a whole lot more confidence that the gssapi code deals with EWOULDBLOCK
correctly.

            regards, tom lane

diff --git a/src/backend/libpq/be-secure-gssapi.c b/src/backend/libpq/be-secure-gssapi.c
index b02d3dd..87cb6ea 100644
--- a/src/backend/libpq/be-secure-gssapi.c
+++ b/src/backend/libpq/be-secure-gssapi.c
@@ -5,7 +5,6 @@
  *
  * Portions Copyright (c) 2018-2020, PostgreSQL Global Development Group
  *
- *
  * IDENTIFICATION
  *  src/backend/libpq/be-secure-gssapi.c
  *
@@ -28,32 +27,36 @@
  * Handle the encryption/decryption of data using GSSAPI.
  *
  * In the encrypted data stream on the wire, we break up the data
- * into packets where each packet starts with a sizeof(uint32)-byte
- * length (not allowed to be larger than the buffer sizes defined
- * below) and then the encrypted data of that length immediately
- * following.
+ * into packets where each packet starts with a uint32-size length
+ * word (in network byte order), then encrypted data of that length
+ * immediately following.  Decryption yields the same data stream
+ * that would appear when not using encryption.
  *
  * Encrypted data typically ends up being larger than the same data
  * unencrypted, so we use fixed-size buffers for handling the
  * encryption/decryption which are larger than PQComm's buffer will
  * typically be to minimize the times where we have to make multiple
- * packets and therefore sets of recv/send calls for a single
- * read/write call to us.
+ * packets (and therefore multiple recv/send calls for a single
+ * read/write call to us).
  *
  * NOTE: The client and server have to agree on the max packet size,
  * because we have to pass an entire packet to GSSAPI at a time and we
- * don't want the other side to send arbitrairly huge packets as we
+ * don't want the other side to send arbitrarily huge packets as we
  * would have to allocate memory for them to then pass them to GSSAPI.
+ *
+ * Therefore, these two #define's are effectively part of the protocol
+ * spec and can't ever be changed.
  */
 #define PQ_GSS_SEND_BUFFER_SIZE 16384
 #define PQ_GSS_RECV_BUFFER_SIZE 16384

 /* PqGSSSendBuffer is for *encrypted* data */
 static char PqGSSSendBuffer[PQ_GSS_SEND_BUFFER_SIZE];
-static int    PqGSSSendPointer;    /* Next index to store a byte in
-                                 * PqGSSSendBuffer */
-static int    PqGSSSendStart;        /* Next index to send a byte in
+static int    PqGSSSendLength;    /* End of data available in PqGSSSendBuffer */
+static int    PqGSSSendNext;        /* Next index to send a byte from
                                  * PqGSSSendBuffer */
+static int    PqGSSSendConsumed;    /* Number of *unencrypted* bytes consumed for
+                                 * current contents of PqGSSSendBuffer */

 /* PqGSSRecvBuffer is for *encrypted* data */
 static char PqGSSRecvBuffer[PQ_GSS_RECV_BUFFER_SIZE];
@@ -61,73 +64,87 @@ static int    PqGSSRecvLength;    /* End of data available in PqGSSRecvBuffer */

 /* PqGSSResultBuffer is for *unencrypted* data */
 static char PqGSSResultBuffer[PQ_GSS_RECV_BUFFER_SIZE];
-static int    PqGSSResultPointer; /* Next index to read a byte from
-                                 * PqGSSResultBuffer */
 static int    PqGSSResultLength;    /* End of data available in PqGSSResultBuffer */
+static int    PqGSSResultNext;    /* Next index to read a byte from
+                                 * PqGSSResultBuffer */

 uint32        max_packet_size;    /* Maximum size we can encrypt and fit the
                                  * results into our output buffer */

 /*
- * Attempt to write len bytes of data from ptr along a GSSAPI-encrypted connection.
+ * Attempt to write len bytes of data from ptr to a GSSAPI-encrypted connection.
  *
- * Connection must be fully established (including authentication step) before
- * calling.  Returns the bytes actually consumed once complete.  Data is
- * internally buffered; in the case of an incomplete write, the amount of data we
- * processed (encrypted into our output buffer to be sent) will be returned.  If
- * an error occurs or we would block, a negative value is returned and errno is
- * set appropriately.
+ * The connection must be already set up for GSSAPI encryption (i.e., GSSAPI
+ * transport negotiation is complete).
  *
- * To continue writing in the case of EWOULDBLOCK and similar, call this function
- * again with matching ptr and len parameters.
+ * On success, returns the number of data bytes consumed (possibly less than
+ * len).  On failure, returns -1 with errno set appropriately.  (For fatal
+ * errors, we may just elog and exit, if errno wouldn't be sufficient to
+ * describe the error.)  For retryable errors, caller should call again
+ * (passing the same data) once the socket is ready.
  */
 ssize_t
 be_gssapi_write(Port *port, void *ptr, size_t len)
 {
-    size_t        bytes_to_encrypt = len;
-    size_t        bytes_encrypted = 0;
+    OM_uint32    major,
+                minor;
+    gss_buffer_desc input,
+                output;
+    size_t        bytes_sent = 0;
+    size_t        bytes_to_encrypt;
+    size_t        bytes_encrypted;
+    gss_ctx_id_t gctx = port->gss->ctx;

     /*
-     * Loop through encrypting data and sending it out until
+     * When we get a failure, we must not tell the caller we have successfully
+     * transmitted everything, else it won't retry.  Hence a "success"
+     * (positive) return value must only count source bytes corresponding to
+     * fully-transmitted encrypted packets.  The amount of source data
+     * corresponding to the current partly-transmitted packet is remembered in
+     * PqGSSSendConsumed.  On a retry, the caller *must* be sending that data
+     * again, so if it offers a len less than that, something is wrong.
+     */
+    if (len < PqGSSSendConsumed)
+        elog(FATAL, "GSSAPI caller failed to retransmit all data needing to be retried");
+
+    /* Discount whatever source data we already encrypted. */
+    bytes_to_encrypt = len - PqGSSSendConsumed;
+    bytes_encrypted = PqGSSSendConsumed;
+
+    /*
+     * Loop through encrypting data and sending it out until it's all done or
      * secure_raw_write() complains (which would likely mean that the socket
      * is non-blocking and the requested send() would block, or there was some
-     * kind of actual error) and then return.
+     * kind of actual error).
      */
-    while (bytes_to_encrypt || PqGSSSendPointer)
+    while (bytes_to_encrypt || PqGSSSendLength)
     {
-        OM_uint32    major,
-                    minor;
-        gss_buffer_desc input,
-                    output;
         int            conf_state = 0;
         uint32        netlen;
-        pg_gssinfo *gss = port->gss;

         /*
          * Check if we have data in the encrypted output buffer that needs to
-         * be sent, and if so, try to send it.  If we aren't able to, return
-         * that back up to the caller.
+         * be sent (possibly left over from a previous call), and if so, try
+         * to send it.  If we aren't able to, return that fact back up to the
+         * caller.
          */
-        if (PqGSSSendPointer)
+        if (PqGSSSendLength)
         {
             ssize_t        ret;
-            ssize_t        amount = PqGSSSendPointer - PqGSSSendStart;
+            ssize_t        amount = PqGSSSendLength - PqGSSSendNext;

-            ret = secure_raw_write(port, PqGSSSendBuffer + PqGSSSendStart, amount);
+            ret = secure_raw_write(port, PqGSSSendBuffer + PqGSSSendNext, amount);
             if (ret <= 0)
             {
                 /*
-                 * If we encrypted some data and it's in our output buffer,
-                 * but send() is saying that we would block, then tell the
-                 * caller how far we got with encrypting the data so that they
-                 * can call us again with whatever is left, at which point we
-                 * will try to send the remaining encrypted data first and
-                 * then move on to encrypting the rest of the data.
+                 * Report any previously-sent data; if there was none, reflect
+                 * the secure_raw_write result up to our caller.  When there
+                 * was some, we're effectively assuming that any interesting
+                 * failure condition will recur on the next try.
                  */
-                if (bytes_encrypted != 0 && (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR))
-                    return bytes_encrypted;
-                else
-                    return ret;
+                if (bytes_sent)
+                    return bytes_sent;
+                return ret;
             }

             /*
@@ -136,29 +153,27 @@ be_gssapi_write(Port *port, void *ptr, size_t len)
              */
             if (ret != amount)
             {
-                PqGSSSendStart += ret;
+                PqGSSSendNext += ret;
                 continue;
             }

+            /* We've successfully sent whatever data was in that packet. */
+            bytes_sent += PqGSSSendConsumed;
+
             /* All encrypted data was sent, our buffer is empty now. */
-            PqGSSSendPointer = PqGSSSendStart = 0;
+            PqGSSSendLength = PqGSSSendNext = PqGSSSendConsumed = 0;
         }

         /*
          * Check if there are any bytes left to encrypt.  If not, we're done.
          */
         if (!bytes_to_encrypt)
-            return bytes_encrypted;
+            break;

         /*
-         * max_packet_size is the maximum amount of unencrypted data that,
-         * when encrypted, will fit into our encrypted-data output buffer.
-         *
-         * If we are being asked to send more than max_packet_size unencrypted
-         * data, then we will loop and create multiple packets, each with
-         * max_packet_size unencrypted data encrypted in them (at least, until
-         * secure_raw_write returns a failure saying we would be blocked, at
-         * which point we will let the caller know how far we got).
+         * Check how much we are being asked to send, if it's too much, then
+         * we will have to loop and possibly be called multiple times to get
+         * through all the data.
          */
         if (bytes_to_encrypt > max_packet_size)
             input.length = max_packet_size;
@@ -171,7 +186,7 @@ be_gssapi_write(Port *port, void *ptr, size_t len)
         output.length = 0;

         /* Create the next encrypted packet */
-        major = gss_wrap(&minor, gss->ctx, 1, GSS_C_QOP_DEFAULT,
+        major = gss_wrap(&minor, gctx, 1, GSS_C_QOP_DEFAULT,
                          &input, &conf_state, &output);
         if (major != GSS_S_COMPLETE)
             pg_GSS_error(FATAL, gettext_noop("GSSAPI wrap error"), major, minor);
@@ -188,24 +203,34 @@ be_gssapi_write(Port *port, void *ptr, size_t len)

         bytes_encrypted += input.length;
         bytes_to_encrypt -= input.length;
+        PqGSSSendConsumed += input.length;

-        /* 4 network-order length bytes, then payload */
+        /* 4 network-order bytes of length, then payload */
         netlen = htonl(output.length);
-        memcpy(PqGSSSendBuffer + PqGSSSendPointer, &netlen, sizeof(uint32));
-        PqGSSSendPointer += sizeof(uint32);
+        memcpy(PqGSSSendBuffer + PqGSSSendLength, &netlen, sizeof(uint32));
+        PqGSSSendLength += sizeof(uint32);

-        memcpy(PqGSSSendBuffer + PqGSSSendPointer, output.value, output.length);
-        PqGSSSendPointer += output.length;
+        memcpy(PqGSSSendBuffer + PqGSSSendLength, output.value, output.length);
+        PqGSSSendLength += output.length;
     }

-    return bytes_encrypted;
+    /* If we get here, our counters should all match up. */
+    Assert(bytes_sent == len);
+    Assert(bytes_sent == bytes_encrypted);
+
+    return bytes_sent;
 }

 /*
- * Read up to len bytes from a GSSAPI-encrypted connection into ptr.  Call
- * only after the connection has been fully established (i.e., GSSAPI
- * authentication is complete).  On success, returns the number of bytes
- * written into ptr; otherwise, returns -1 and sets errno appropriately.
+ * Read up to len bytes of data into ptr from a GSSAPI-encrypted connection.
+ *
+ * The connection must be already set up for GSSAPI encryption (i.e., GSSAPI
+ * transport negotiation is complete).
+ *
+ * Returns the number of data bytes read, or on failure, returns -1
+ * with errno set appropriately.  (For fatal errors, we may just elog and
+ * exit, if errno wouldn't be sufficient to describe the error.)  For
+ * retryable errors, caller should call again once the socket is ready.
  */
 ssize_t
 be_gssapi_read(Port *port, void *ptr, size_t len)
@@ -215,89 +240,85 @@ be_gssapi_read(Port *port, void *ptr, size_t len)
     gss_buffer_desc input,
                 output;
     ssize_t        ret;
-    size_t        bytes_to_return = len;
     size_t        bytes_returned = 0;
-    int            conf_state = 0;
-    pg_gssinfo *gss = port->gss;
+    gss_ctx_id_t gctx = port->gss->ctx;

     /*
-     * The goal here is to read an incoming encrypted packet, one at a time,
-     * decrypt it into our out buffer, returning to the caller what they asked
-     * for, and then saving anything else for the next call.
-     *
-     * First we look to see if we have unencrypted bytes available and, if so,
-     * copy those to the result.  If the caller asked for more than we had
-     * immediately available, then we try to read a packet off the wire and
-     * decrypt it.  If the read would block, then return the amount of
-     * unencrypted data we copied into the caller's ptr.
+     * The plan here is to read one incoming encrypted packet into
+     * PqGSSRecvBuffer, decrypt it into PqGSSResultBuffer, and then dole out
+     * data from there to the caller.  When we exhaust the current input
+     * packet, read another.
      */
-    while (bytes_to_return)
+    while (bytes_returned < len)
     {
+        int            conf_state = 0;
+
         /* Check if we have data in our buffer that we can return immediately */
-        if (PqGSSResultPointer < PqGSSResultLength)
+        if (PqGSSResultNext < PqGSSResultLength)
         {
-            int            bytes_in_buffer = PqGSSResultLength - PqGSSResultPointer;
-            int            bytes_to_copy = bytes_in_buffer < len - bytes_returned ? bytes_in_buffer : len -
bytes_returned;
+            size_t        bytes_in_buffer = PqGSSResultLength - PqGSSResultNext;
+            size_t        bytes_to_copy = Min(bytes_in_buffer, len - bytes_returned);

             /*
-             * Copy the data from our output buffer into the caller's buffer,
-             * at the point where we last left off filling their buffer
+             * Copy the data from our result buffer into the caller's buffer,
+             * at the point where we last left off filling their buffer.
              */
-            memcpy((char *) ptr + bytes_returned, PqGSSResultBuffer + PqGSSResultPointer, bytes_to_copy);
-            PqGSSResultPointer += bytes_to_copy;
-            bytes_to_return -= bytes_to_copy;
+            memcpy((char *) ptr + bytes_returned, PqGSSResultBuffer + PqGSSResultNext, bytes_to_copy);
+            PqGSSResultNext += bytes_to_copy;
             bytes_returned += bytes_to_copy;

-            /* Check if our result buffer is now empty and, if so, reset */
-            if (PqGSSResultPointer == PqGSSResultLength)
-                PqGSSResultPointer = PqGSSResultLength = 0;
-
-            continue;
+            /*
+             * At this point, we've either filled the caller's buffer or
+             * emptied our result buffer.  Either way, return to caller.  In
+             * the second case, we could try to read another encrypted packet,
+             * but the odds are good that there isn't one available.  (If this
+             * isn't true, we chose too small a max packet size.)  In any
+             * case, there's no harm letting the caller process the data we've
+             * already returned.
+             */
+            break;
         }

+        /* Result buffer is empty, so reset buffer pointers */
+        PqGSSResultLength = PqGSSResultNext = 0;
+
         /*
-         * At this point, our output buffer should be empty with more bytes
-         * being requested to be read.  We are now ready to load the next
-         * packet and decrypt it (entirely) into our buffer.
-         *
-         * If we get a partial read back while trying to read a packet off the
-         * wire then we return the number of unencrypted bytes we were able to
-         * copy (if any, if we didn't copy any, then we return whatever
-         * secure_raw_read returned when we called it; likely -1) into the
-         * caller's ptr and wait to be called again, until we get a full
-         * packet to decrypt.
+         * Because we chose above to return immediately as soon as we emit
+         * some data, bytes_returned must be zero at this point.  Therefore
+         * the failure exits below can just return -1 without worrying about
+         * whether we already emitted some data.
+         */
+        Assert(bytes_returned == 0);
+
+        /*
+         * At this point, our result buffer is empty with more bytes being
+         * requested to be read.  We are now ready to load the next packet and
+         * decrypt it (entirely) into our result buffer.
          */

-        /* Check if we have the size of the packet already in our buffer. */
+        /* Collect the length if we haven't already */
         if (PqGSSRecvLength < sizeof(uint32))
         {
-            /*
-             * We were not able to get the length of the packet last time, so
-             * we need to do that first.
-             */
             ret = secure_raw_read(port, PqGSSRecvBuffer + PqGSSRecvLength,
                                   sizeof(uint32) - PqGSSRecvLength);
-            if (ret < 0)
-                return bytes_returned ? bytes_returned : ret;
+
+            /* If ret <= 0, secure_raw_read already set the correct errno */
+            if (ret <= 0)
+                return ret;

             PqGSSRecvLength += ret;

-            /*
-             * If we only got part of the packet length, then return however
-             * many unencrypted bytes we copied to the caller and wait to be
-             * called again.
-             */
+            /* If we still haven't got the length, return to the caller */
             if (PqGSSRecvLength < sizeof(uint32))
-                return bytes_returned;
+            {
+                errno = EWOULDBLOCK;
+                return -1;
+            }
         }

-        /*
-         * We have the length of the next packet at this point, so pull it out
-         * and then read whatever we have left of the packet to read.
-         */
+        /* Decode the packet length and check for overlength packet */
         input.length = ntohl(*(uint32 *) PqGSSRecvBuffer);

-        /* Check for over-length packet */
         if (input.length > PQ_GSS_RECV_BUFFER_SIZE - sizeof(uint32))
             ereport(FATAL,
                     (errmsg("oversize GSSAPI packet sent by the client (%zu > %zu)",
@@ -310,37 +331,29 @@ be_gssapi_read(Port *port, void *ptr, size_t len)
          */
         ret = secure_raw_read(port, PqGSSRecvBuffer + PqGSSRecvLength,
                               input.length - (PqGSSRecvLength - sizeof(uint32)));
-        if (ret < 0)
-            return bytes_returned ? bytes_returned : ret;
+        /* If ret <= 0, secure_raw_read already set the correct errno */
+        if (ret <= 0)
+            return ret;

         PqGSSRecvLength += ret;

-        /*
-         * If we got less than the rest of the packet then we need to return
-         * and be called again.  If we didn't have any bytes to return on this
-         * run then return -1 and set errno to EWOULDBLOCK.
-         */
+        /* If we don't yet have the whole packet, return to the caller */
         if (PqGSSRecvLength - sizeof(uint32) < input.length)
         {
-            if (!bytes_returned)
-            {
-                errno = EWOULDBLOCK;
-                return -1;
-            }
-
-            return bytes_returned;
+            errno = EWOULDBLOCK;
+            return -1;
         }

         /*
          * We now have the full packet and we can perform the decryption and
-         * refill our output buffer, then loop back up to pass that back to
-         * the user.
+         * refill our result buffer, then loop back up to pass data back to
+         * the caller.
          */
         output.value = NULL;
         output.length = 0;
         input.value = PqGSSRecvBuffer + sizeof(uint32);

-        major = gss_unwrap(&minor, gss->ctx, &input, &output, &conf_state, NULL);
+        major = gss_unwrap(&minor, gctx, &input, &output, &conf_state, NULL);
         if (major != GSS_S_COMPLETE)
             pg_GSS_error(FATAL, gettext_noop("GSSAPI unwrap error"),
                          major, minor);
@@ -350,10 +363,9 @@ be_gssapi_read(Port *port, void *ptr, size_t len)
                     (errmsg("incoming GSSAPI message did not use confidentiality")));

         memcpy(PqGSSResultBuffer, output.value, output.length);
-
         PqGSSResultLength = output.length;

-        /* Our buffer is now empty, reset it */
+        /* Our receive buffer is now empty, reset it */
         PqGSSRecvLength = 0;

         gss_release_buffer(&minor, &output);
@@ -379,37 +391,27 @@ read_or_wait(Port *port, ssize_t len)
      * Keep going until we either read in everything we were asked to, or we
      * error out.
      */
-    while (PqGSSRecvLength != len)
+    while (PqGSSRecvLength < len)
     {
         ret = secure_raw_read(port, PqGSSRecvBuffer + PqGSSRecvLength, len - PqGSSRecvLength);

         /*
-         * If we got back an error and it wasn't just EWOULDBLOCK/EAGAIN, then
-         * give up.
+         * If we got back an error and it wasn't just
+         * EWOULDBLOCK/EAGAIN/EINTR, then give up.
          */
-        if (ret < 0 && !(errno == EWOULDBLOCK || errno == EAGAIN))
+        if (ret < 0 &&
+            !(errno == EWOULDBLOCK || errno == EAGAIN || errno == EINTR))
             return -1;

         /*
          * Ok, we got back either a positive value, zero, or a negative result
-         * but EWOULDBLOCK or EAGAIN was set.
+         * indicating we should retry.
          *
-         * If it was zero or negative, then we try to wait on the socket to be
+         * If it was zero or negative, then we wait on the socket to be
          * readable again.
          */
         if (ret <= 0)
         {
-            /*
-             * If we got back less than zero, indicating an error, and that
-             * wasn't just a EWOULDBLOCK/EAGAIN, then give up.
-             */
-            if (ret < 0 && !(errno == EWOULDBLOCK || errno == EAGAIN))
-                return -1;
-
-            /*
-             * We got back either zero, or -1 with EWOULDBLOCK/EAGAIN, so wait
-             * on socket to be readable again.
-             */
             WaitLatchOrSocket(MyLatch,
                               WL_SOCKET_READABLE | WL_EXIT_ON_PM_DEATH,
                               port->sock, 0, WAIT_EVENT_GSS_OPEN_SERVER);
@@ -430,7 +432,7 @@ read_or_wait(Port *port, ssize_t len)
                 if (ret == 0)
                     return -1;
             }
-            else
+            if (ret < 0)
                 continue;
         }

@@ -460,7 +462,8 @@ secure_open_gssapi(Port *port)
                 minor;

     /* initialize state variables */
-    PqGSSSendPointer = PqGSSSendStart = PqGSSRecvLength = PqGSSResultPointer = PqGSSResultLength = 0;
+    PqGSSSendLength = PqGSSSendNext = PqGSSSendConsumed = 0;
+    PqGSSRecvLength = PqGSSResultLength = PqGSSResultNext = 0;

     /*
      * Use the configured keytab, if there is one.  Unfortunately, Heimdal
@@ -542,7 +545,7 @@ secure_open_gssapi(Port *port)
          * Check if we have data to send and, if we do, make sure to send it
          * all
          */
-        if (output.length != 0)
+        if (output.length > 0)
         {
             uint32        netlen = htonl(output.length);

@@ -553,14 +556,27 @@ secure_open_gssapi(Port *port)
                                 PQ_GSS_SEND_BUFFER_SIZE - sizeof(uint32))));

             memcpy(PqGSSSendBuffer, (char *) &netlen, sizeof(uint32));
-            PqGSSSendPointer += sizeof(uint32);
+            PqGSSSendLength += sizeof(uint32);

-            memcpy(PqGSSSendBuffer + PqGSSSendPointer, output.value, output.length);
-            PqGSSSendPointer += output.length;
+            memcpy(PqGSSSendBuffer + PqGSSSendLength, output.value, output.length);
+            PqGSSSendLength += output.length;

-            while (PqGSSSendStart != sizeof(uint32) + output.length)
+            /* we don't bother with PqGSSSendConsumed here */
+
+            while (PqGSSSendNext < PqGSSSendLength)
             {
-                ret = secure_raw_write(port, PqGSSSendBuffer + PqGSSSendStart, sizeof(uint32) + output.length -
PqGSSSendStart);
+                ret = secure_raw_write(port, PqGSSSendBuffer + PqGSSSendNext,
+                                       PqGSSSendLength - PqGSSSendNext);
+
+                /*
+                 * If we got back an error and it wasn't just
+                 * EWOULDBLOCK/EAGAIN/EINTR, then give up.
+                 */
+                if (ret < 0 &&
+                    !(errno == EWOULDBLOCK || errno == EAGAIN || errno == EINTR))
+                    return -1;
+
+                /* Wait and retry if we couldn't write yet */
                 if (ret <= 0)
                 {
                     WaitLatchOrSocket(MyLatch,
@@ -569,18 +585,18 @@ secure_open_gssapi(Port *port)
                     continue;
                 }

-                PqGSSSendStart += ret;
+                PqGSSSendNext += ret;
             }

             /* Done sending the packet, reset our buffer */
-            PqGSSSendStart = PqGSSSendPointer = 0;
+            PqGSSSendLength = PqGSSSendNext = 0;

             gss_release_buffer(&minor, &output);
         }

         /*
          * If we got back that the connection is finished being set up, now
-         * that's we've sent the last packet, exit our loop.
+         * that we've sent the last packet, exit our loop.
          */
         if (complete_next)
             break;
@@ -588,10 +604,11 @@ secure_open_gssapi(Port *port)

     /*
      * Determine the max packet size which will fit in our buffer, after
-     * accounting for the length
+     * accounting for the length.  be_gssapi_write will need this.
      */
     major = gss_wrap_size_limit(&minor, port->gss->ctx, 1, GSS_C_QOP_DEFAULT,
-                                PQ_GSS_SEND_BUFFER_SIZE - sizeof(uint32), &max_packet_size);
+                                PQ_GSS_SEND_BUFFER_SIZE - sizeof(uint32),
+                                &max_packet_size);

     if (GSS_ERROR(major))
         pg_GSS_error(FATAL, gettext_noop("GSSAPI size check error"),
diff --git a/src/interfaces/libpq/fe-secure-gssapi.c b/src/interfaces/libpq/fe-secure-gssapi.c
index e937024..054cf92 100644
--- a/src/interfaces/libpq/fe-secure-gssapi.c
+++ b/src/interfaces/libpq/fe-secure-gssapi.c
@@ -18,6 +18,7 @@
 #include "libpq-int.h"
 #include "port/pg_bswap.h"

+
 /*
  * Require encryption support, as well as mutual authentication and
  * tamperproofing measures.
@@ -26,116 +27,157 @@
     GSS_C_SEQUENCE_FLAG | GSS_C_CONF_FLAG | GSS_C_INTEG_FLAG

 /*
- * We use fixed-size buffers for handling the encryption/decryption
- * which are larger than PQComm's buffer will typically be to minimize
- * the times where we have to make multiple packets and therefore sets
- * of recv/send calls for a single read/write call to us.
+ * Handle the encryption/decryption of data using GSSAPI.
+ *
+ * In the encrypted data stream on the wire, we break up the data
+ * into packets where each packet starts with a uint32-size length
+ * word (in network byte order), then encrypted data of that length
+ * immediately following.  Decryption yields the same data stream
+ * that would appear when not using encryption.
+ *
+ * Encrypted data typically ends up being larger than the same data
+ * unencrypted, so we use fixed-size buffers for handling the
+ * encryption/decryption which are larger than PQComm's buffer will
+ * typically be to minimize the times where we have to make multiple
+ * packets (and therefore multiple recv/send calls for a single
+ * read/write call to us).
  *
  * NOTE: The client and server have to agree on the max packet size,
  * because we have to pass an entire packet to GSSAPI at a time and we
- * don't want the other side to send arbitrairly huge packets as we
+ * don't want the other side to send arbitrarily huge packets as we
  * would have to allocate memory for them to then pass them to GSSAPI.
+ *
+ * Therefore, these two #define's are effectively part of the protocol
+ * spec and can't ever be changed.
  */
 #define PQ_GSS_SEND_BUFFER_SIZE 16384
 #define PQ_GSS_RECV_BUFFER_SIZE 16384

 /* PqGSSSendBuffer is for *encrypted* data */
 static char PqGSSSendBuffer[PQ_GSS_SEND_BUFFER_SIZE];
-static int    PqGSSSendPointer;    /* Next index to store a byte in
-                                 * PqGSSSendBuffer */
-static int    PqGSSSendStart;        /* Next index to send a byte in
+static int    PqGSSSendLength;    /* End of data available in PqGSSSendBuffer */
+static int    PqGSSSendNext;        /* Next index to send a byte from
                                  * PqGSSSendBuffer */
+static int    PqGSSSendConsumed;    /* Number of *unencrypted* bytes consumed for
+                                 * current contents of PqGSSSendBuffer */

 /* PqGSSRecvBuffer is for *encrypted* data */
 static char PqGSSRecvBuffer[PQ_GSS_RECV_BUFFER_SIZE];
-static int    PqGSSRecvPointer;    /* Next index to read a byte from
-                                 * PqGSSRecvBuffer */
 static int    PqGSSRecvLength;    /* End of data available in PqGSSRecvBuffer */

 /* PqGSSResultBuffer is for *unencrypted* data */
 static char PqGSSResultBuffer[PQ_GSS_RECV_BUFFER_SIZE];
-static int    PqGSSResultPointer; /* Next index to read a byte from
-                                 * PqGSSResultBuffer */
 static int    PqGSSResultLength;    /* End of data available in PqGSSResultBuffer */
+static int    PqGSSResultNext;    /* Next index to read a byte from
+                                 * PqGSSResultBuffer */

 uint32        max_packet_size;    /* Maximum size we can encrypt and fit the
                                  * results into our output buffer */

 /*
- * Write len bytes of data from ptr along a GSSAPI-encrypted connection.  Note
- * that the connection must be already set up for GSSAPI encryption (i.e.,
- * GSSAPI transport negotiation is complete).  Returns len when all data has
- * been written; retry when errno is EWOULDBLOCK or similar with the same
- * values of ptr and len.  On non-socket failures, will log an error message.
+ * Attempt to write len bytes of data from ptr to a GSSAPI-encrypted connection.
+ *
+ * The connection must be already set up for GSSAPI encryption (i.e., GSSAPI
+ * transport negotiation is complete).
+ *
+ * On success, returns the number of data bytes consumed (possibly less than
+ * len).  On failure, returns -1 with errno set appropriately.  If the errno
+ * indicates a non-retryable error, a message is put into conn->errorMessage.
+ * For retryable errors, caller should call again (passing the same data)
+ * once the socket is ready.
  */
 ssize_t
 pg_GSS_write(PGconn *conn, const void *ptr, size_t len)
 {
-    gss_buffer_desc input,
-                output = GSS_C_EMPTY_BUFFER;
     OM_uint32    major,
                 minor;
+    gss_buffer_desc input,
+                output = GSS_C_EMPTY_BUFFER;
     ssize_t        ret = -1;
-    size_t        bytes_to_encrypt = len;
-    size_t        bytes_encrypted = 0;
+    size_t        bytes_sent = 0;
+    size_t        bytes_to_encrypt;
+    size_t        bytes_encrypted;
+    gss_ctx_id_t gctx = conn->gctx;
+
+    /*
+     * When we get a failure, we must not tell the caller we have successfully
+     * transmitted everything, else it won't retry.  Hence a "success"
+     * (positive) return value must only count source bytes corresponding to
+     * fully-transmitted encrypted packets.  The amount of source data
+     * corresponding to the current partly-transmitted packet is remembered in
+     * PqGSSSendConsumed.  On a retry, the caller *must* be sending that data
+     * again, so if it offers a len less than that, something is wrong.
+     */
+    if (len < PqGSSSendConsumed)
+    {
+        printfPQExpBuffer(&conn->errorMessage,
+                          "GSSAPI caller failed to retransmit all data needing to be retried\n");
+        errno = EINVAL;
+        return -1;
+    }
+
+    /* Discount whatever source data we already encrypted. */
+    bytes_to_encrypt = len - PqGSSSendConsumed;
+    bytes_encrypted = PqGSSSendConsumed;

     /*
-     * Loop through encrypting data and sending it out until
+     * Loop through encrypting data and sending it out until it's all done or
      * pqsecure_raw_write() complains (which would likely mean that the socket
      * is non-blocking and the requested send() would block, or there was some
-     * kind of actual error) and then return.
+     * kind of actual error).
      */
-    while (bytes_to_encrypt || PqGSSSendPointer)
+    while (bytes_to_encrypt || PqGSSSendLength)
     {
         int            conf_state = 0;
         uint32        netlen;

         /*
          * Check if we have data in the encrypted output buffer that needs to
-         * be sent, and if so, try to send it.  If we aren't able to, return
-         * that back up to the caller.
+         * be sent (possibly left over from a previous call), and if so, try
+         * to send it.  If we aren't able to, return that fact back up to the
+         * caller.
          */
-        if (PqGSSSendPointer)
+        if (PqGSSSendLength)
         {
             ssize_t        ret;
-            ssize_t        amount = PqGSSSendPointer - PqGSSSendStart;
+            ssize_t        amount = PqGSSSendLength - PqGSSSendNext;

-            ret = pqsecure_raw_write(conn, PqGSSSendBuffer + PqGSSSendStart, amount);
-            if (ret < 0)
+            ret = pqsecure_raw_write(conn, PqGSSSendBuffer + PqGSSSendNext, amount);
+            if (ret <= 0)
             {
                 /*
-                 * If we encrypted some data and it's in our output buffer,
-                 * but send() is saying that we would block, then tell the
-                 * client how far we got with encrypting the data so that they
-                 * can call us again with whatever is left, at which point we
-                 * will try to send the remaining encrypted data first and
-                 * then move on to encrypting the rest of the data.
+                 * Report any previously-sent data; if there was none, reflect
+                 * the pqsecure_raw_write result up to our caller.  When there
+                 * was some, we're effectively assuming that any interesting
+                 * failure condition will recur on the next try.
                  */
-                if (bytes_encrypted != 0 && (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR))
-                    return bytes_encrypted;
-                else
-                    return ret;
+                if (bytes_sent)
+                    return bytes_sent;
+                return ret;
             }

             /*
-             * Partial write, move forward that far in our buffer and try
-             * again
+             * Check if this was a partial write, and if so, move forward that
+             * far in our buffer and try again.
              */
             if (ret != amount)
             {
-                PqGSSSendStart += ret;
+                PqGSSSendNext += ret;
                 continue;
             }

+            /* We've successfully sent whatever data was in that packet. */
+            bytes_sent += PqGSSSendConsumed;
+
             /* All encrypted data was sent, our buffer is empty now. */
-            PqGSSSendPointer = PqGSSSendStart = 0;
+            PqGSSSendLength = PqGSSSendNext = PqGSSSendConsumed = 0;
         }

         /*
          * Check if there are any bytes left to encrypt.  If not, we're done.
          */
         if (!bytes_to_encrypt)
-            return bytes_encrypted;
+            break;

         /*
          * Check how much we are being asked to send, if it's too much, then
@@ -152,18 +194,24 @@ pg_GSS_write(PGconn *conn, const void *ptr, size_t len)
         output.value = NULL;
         output.length = 0;

-        /* Create the next encrypted packet */
-        major = gss_wrap(&minor, conn->gctx, 1, GSS_C_QOP_DEFAULT,
+        /*
+         * Create the next encrypted packet.  Any failure here is considered a
+         * hard failure, so we return -1 even if bytes_sent > 0.
+         */
+        major = gss_wrap(&minor, gctx, 1, GSS_C_QOP_DEFAULT,
                          &input, &conf_state, &output);
         if (major != GSS_S_COMPLETE)
         {
             pg_GSS_error(libpq_gettext("GSSAPI wrap error"), conn, major, minor);
+            errno = EIO;        /* for lack of a better idea */
             goto cleanup;
         }
-        else if (conf_state == 0)
+
+        if (conf_state == 0)
         {
             printfPQExpBuffer(&conn->errorMessage,
                               libpq_gettext("outgoing GSSAPI message would not use confidentiality\n"));
+            errno = EIO;        /* for lack of a better idea */
             goto cleanup;
         }

@@ -173,22 +221,28 @@ pg_GSS_write(PGconn *conn, const void *ptr, size_t len)
                               libpq_gettext("client tried to send oversize GSSAPI packet (%zu > %zu)\n"),
                               (size_t) output.length,
                               PQ_GSS_SEND_BUFFER_SIZE - sizeof(uint32));
+            errno = EIO;        /* for lack of a better idea */
             goto cleanup;
         }

         bytes_encrypted += input.length;
         bytes_to_encrypt -= input.length;
+        PqGSSSendConsumed += input.length;

         /* 4 network-order bytes of length, then payload */
         netlen = htonl(output.length);
-        memcpy(PqGSSSendBuffer + PqGSSSendPointer, &netlen, sizeof(uint32));
-        PqGSSSendPointer += sizeof(uint32);
+        memcpy(PqGSSSendBuffer + PqGSSSendLength, &netlen, sizeof(uint32));
+        PqGSSSendLength += sizeof(uint32);

-        memcpy(PqGSSSendBuffer + PqGSSSendPointer, output.value, output.length);
-        PqGSSSendPointer += output.length;
+        memcpy(PqGSSSendBuffer + PqGSSSendLength, output.value, output.length);
+        PqGSSSendLength += output.length;
     }

-    ret = bytes_encrypted;
+    /* If we get here, our counters should all match up. */
+    Assert(bytes_sent == len);
+    Assert(bytes_sent == bytes_encrypted);
+
+    ret = bytes_sent;

 cleanup:
     if (output.value != NULL)
@@ -198,9 +252,14 @@ cleanup:

 /*
  * Read up to len bytes of data into ptr from a GSSAPI-encrypted connection.
- * Note that GSSAPI transport must already have been negotiated.  Returns the
- * number of bytes read into ptr; otherwise, returns -1.  Retry with the same
- * ptr and len when errno is EWOULDBLOCK or similar.
+ *
+ * The connection must be already set up for GSSAPI encryption (i.e., GSSAPI
+ * transport negotiation is complete).
+ *
+ * Returns the number of data bytes read, or on failure, returns -1
+ * with errno set appropriately.  If the errno indicates a non-retryable
+ * error, a message is put into conn->errorMessage.  For retryable errors,
+ * caller should call again once the socket is ready.
  */
 ssize_t
 pg_GSS_read(PGconn *conn, void *ptr, size_t len)
@@ -209,90 +268,94 @@ pg_GSS_read(PGconn *conn, void *ptr, size_t len)
                 minor;
     gss_buffer_desc input = GSS_C_EMPTY_BUFFER,
                 output = GSS_C_EMPTY_BUFFER;
-    ssize_t        ret = 0;
-    size_t        bytes_to_return = len;
+    ssize_t        ret;
     size_t        bytes_returned = 0;
+    gss_ctx_id_t gctx = conn->gctx;

     /*
-     * The goal here is to read an incoming encrypted packet, one at a time,
-     * decrypt it into our out buffer, returning to the caller what they asked
-     * for, and then saving anything else for the next call.
-     *
-     * We get a read request, we look if we have cleartext bytes available
-     * and, if so, copy those to the result, and then we try to decrypt the
-     * next packet.
-     *
-     * We should not try to decrypt the next packet until the read buffer is
-     * completely empty.
-     *
-     * If the caller asks for more bytes than one decrypted packet, then we
-     * should try to return all bytes asked for.
+     * The plan here is to read one incoming encrypted packet into
+     * PqGSSRecvBuffer, decrypt it into PqGSSResultBuffer, and then dole out
+     * data from there to the caller.  When we exhaust the current input
+     * packet, read another.
      */
-    while (bytes_to_return)
+    while (bytes_returned < len)
     {
         int            conf_state = 0;

         /* Check if we have data in our buffer that we can return immediately */
-        if (PqGSSResultPointer < PqGSSResultLength)
+        if (PqGSSResultNext < PqGSSResultLength)
         {
-            int            bytes_in_buffer = PqGSSResultLength - PqGSSResultPointer;
-            int            bytes_to_copy = bytes_in_buffer < len - bytes_returned ? bytes_in_buffer : len -
bytes_returned;
+            size_t        bytes_in_buffer = PqGSSResultLength - PqGSSResultNext;
+            size_t        bytes_to_copy = Min(bytes_in_buffer, len - bytes_returned);

             /*
-             * Copy the data from our output buffer into the caller's buffer,
-             * at the point where we last left off filling their buffer
+             * Copy the data from our result buffer into the caller's buffer,
+             * at the point where we last left off filling their buffer.
              */
-            memcpy((char *) ptr + bytes_returned, PqGSSResultBuffer + PqGSSResultPointer, bytes_to_copy);
-            PqGSSResultPointer += bytes_to_copy;
-            bytes_to_return -= bytes_to_copy;
+            memcpy((char *) ptr + bytes_returned, PqGSSResultBuffer + PqGSSResultNext, bytes_to_copy);
+            PqGSSResultNext += bytes_to_copy;
             bytes_returned += bytes_to_copy;

-            /* Check if our result buffer is now empty and, if so, reset */
-            if (PqGSSResultPointer == PqGSSResultLength)
-                PqGSSResultPointer = PqGSSResultLength = 0;
-
-            continue;
+            /*
+             * At this point, we've either filled the caller's buffer or
+             * emptied our result buffer.  Either way, return to caller.  In
+             * the second case, we could try to read another encrypted packet,
+             * but the odds are good that there isn't one available.  (If this
+             * isn't true, we chose too small a max packet size.)  In any
+             * case, there's no harm letting the caller process the data we've
+             * already returned.
+             */
+            break;
         }

+        /* Result buffer is empty, so reset buffer pointers */
+        PqGSSResultLength = PqGSSResultNext = 0;
+
         /*
-         * At this point, our output buffer should be empty with more bytes
-         * being requested to be read.  We are now ready to load the next
-         * packet and decrypt it (entirely) into our buffer.
-         *
-         * If we get a partial read back while trying to read a packet off the
-         * wire then we return back what bytes we were able to return and wait
-         * to be called again, until we get a full packet to decrypt.
+         * Because we chose above to return immediately as soon as we emit
+         * some data, bytes_returned must be zero at this point.  Therefore
+         * the failure exits below can just return -1 without worrying about
+         * whether we already emitted some data.
          */
+        Assert(bytes_returned == 0);

-        /* Check if we got a partial read just trying to get the length */
+        /*
+         * At this point, our result buffer is empty with more bytes being
+         * requested to be read.  We are now ready to load the next packet and
+         * decrypt it (entirely) into our result buffer.
+         */
+
+        /* Collect the length if we haven't already */
         if (PqGSSRecvLength < sizeof(uint32))
         {
-            /* Try to get whatever of the length we still need */
             ret = pqsecure_raw_read(conn, PqGSSRecvBuffer + PqGSSRecvLength,
                                     sizeof(uint32) - PqGSSRecvLength);
-            if (ret < 0)
-                return bytes_returned ? bytes_returned : ret;
+
+            /* If ret <= 0, pqsecure_raw_read already set the correct errno */
+            if (ret <= 0)
+                return ret;

             PqGSSRecvLength += ret;
+
+            /* If we still haven't got the length, return to the caller */
             if (PqGSSRecvLength < sizeof(uint32))
-                return bytes_returned;
+            {
+                errno = EWOULDBLOCK;
+                return -1;
+            }
         }

-        /*
-         * We should have the whole length at this point, so pull it out and
-         * then read whatever we have left of the packet
-         */
+        /* Decode the packet length and check for overlength packet */
         input.length = ntohl(*(uint32 *) PqGSSRecvBuffer);

-        /* Check for over-length packet */
         if (input.length > PQ_GSS_RECV_BUFFER_SIZE - sizeof(uint32))
         {
             printfPQExpBuffer(&conn->errorMessage,
                               libpq_gettext("oversize GSSAPI packet sent by the server (%zu > %zu)\n"),
                               (size_t) input.length,
                               PQ_GSS_RECV_BUFFER_SIZE - sizeof(uint32));
-            ret = -1;
-            goto cleanup;
+            errno = EIO;        /* for lack of a better idea */
+            return -1;
         }

         /*
@@ -301,47 +364,53 @@ pg_GSS_read(PGconn *conn, void *ptr, size_t len)
          */
         ret = pqsecure_raw_read(conn, PqGSSRecvBuffer + PqGSSRecvLength,
                                 input.length - (PqGSSRecvLength - sizeof(uint32)));
-        if (ret < 0)
-            return bytes_returned ? bytes_returned : ret;
+        /* If ret <= 0, pqsecure_raw_read already set the correct errno */
+        if (ret <= 0)
+            return ret;

-        /*
-         * If we got less than the rest of the packet then we need to return
-         * and be called again.
-         */
         PqGSSRecvLength += ret;
+
+        /* If we don't yet have the whole packet, return to the caller */
         if (PqGSSRecvLength - sizeof(uint32) < input.length)
-            return bytes_returned ? bytes_returned : -1;
+        {
+            errno = EWOULDBLOCK;
+            return -1;
+        }

         /*
          * We now have the full packet and we can perform the decryption and
-         * refill our output buffer, then loop back up to pass that back to
-         * the user.
+         * refill our result buffer, then loop back up to pass data back to
+         * the caller.  Note that error exits below here must take care of
+         * releasing the gss output buffer.
          */
         output.value = NULL;
         output.length = 0;
         input.value = PqGSSRecvBuffer + sizeof(uint32);

-        major = gss_unwrap(&minor, conn->gctx, &input, &output, &conf_state, NULL);
+        major = gss_unwrap(&minor, gctx, &input, &output, &conf_state, NULL);
         if (major != GSS_S_COMPLETE)
         {
             pg_GSS_error(libpq_gettext("GSSAPI unwrap error"), conn,
                          major, minor);
             ret = -1;
+            errno = EIO;        /* for lack of a better idea */
             goto cleanup;
         }
-        else if (conf_state == 0)
+
+        if (conf_state == 0)
         {
             printfPQExpBuffer(&conn->errorMessage,
                               libpq_gettext("incoming GSSAPI message did not use confidentiality\n"));
             ret = -1;
+            errno = EIO;        /* for lack of a better idea */
             goto cleanup;
         }

         memcpy(PqGSSResultBuffer, output.value, output.length);
         PqGSSResultLength = output.length;

-        /* Our buffer is now empty, reset it */
-        PqGSSRecvPointer = PqGSSRecvLength = 0;
+        /* Our receive buffer is now empty, reset it */
+        PqGSSRecvLength = 0;

         gss_release_buffer(&minor, &output);
     }
@@ -365,10 +434,13 @@ static PostgresPollingStatusType
 gss_read(PGconn *conn, void *recv_buffer, size_t length, ssize_t *ret)
 {
     *ret = pqsecure_raw_read(conn, recv_buffer, length);
-    if (*ret < 0 && errno == EWOULDBLOCK)
-        return PGRES_POLLING_READING;
-    else if (*ret < 0)
-        return PGRES_POLLING_FAILED;
+    if (*ret < 0)
+    {
+        if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
+            return PGRES_POLLING_READING;
+        else
+            return PGRES_POLLING_FAILED;
+    }

     /* Check for EOF */
     if (*ret == 0)
@@ -382,6 +454,13 @@ gss_read(PGconn *conn, void *recv_buffer, size_t length, ssize_t *ret)
             return PGRES_POLLING_READING;

         *ret = pqsecure_raw_read(conn, recv_buffer, length);
+        if (*ret < 0)
+        {
+            if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
+                return PGRES_POLLING_READING;
+            else
+                return PGRES_POLLING_FAILED;
+        }
         if (*ret == 0)
             return PGRES_POLLING_FAILED;
     }
@@ -407,31 +486,37 @@ pqsecure_open_gss(PGconn *conn)
     gss_buffer_desc input = GSS_C_EMPTY_BUFFER,
                 output = GSS_C_EMPTY_BUFFER;

-    /* Check for data that needs to be written */
+    /* Initialize, if first time through */
     if (first)
     {
-        PqGSSSendPointer = PqGSSSendStart = PqGSSRecvPointer = PqGSSRecvLength = PqGSSResultPointer =
PqGSSResultLength= 0; 
+        PqGSSSendLength = PqGSSSendNext = PqGSSSendConsumed = 0;
+        PqGSSRecvLength = PqGSSResultLength = PqGSSResultNext = 0;
         first = 0;
     }

     /*
      * Check if we have anything to send from a prior call and if so, send it.
      */
-    if (PqGSSSendPointer)
+    if (PqGSSSendLength)
     {
-        ssize_t        amount = PqGSSSendPointer - PqGSSSendStart;
+        ssize_t        amount = PqGSSSendLength - PqGSSSendNext;

-        ret = pqsecure_raw_write(conn, PqGSSSendBuffer + PqGSSSendStart, amount);
-        if (ret < 0 && errno == EWOULDBLOCK)
-            return PGRES_POLLING_WRITING;
+        ret = pqsecure_raw_write(conn, PqGSSSendBuffer + PqGSSSendNext, amount);
+        if (ret < 0)
+        {
+            if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
+                return PGRES_POLLING_WRITING;
+            else
+                return PGRES_POLLING_FAILED;
+        }

-        if (ret != amount)
+        if (ret < amount)
         {
-            PqGSSSendStart += amount;
+            PqGSSSendNext += ret;
             return PGRES_POLLING_WRITING;
         }

-        PqGSSSendPointer = PqGSSSendStart = 0;
+        PqGSSSendLength = PqGSSSendNext = 0;
     }

     /*
@@ -536,7 +621,7 @@ pqsecure_open_gss(PGconn *conn)
                                  &output, NULL, NULL);

     /* GSS Init Sec Context uses the whole packet, so clear it */
-    PqGSSRecvPointer = PqGSSRecvLength = 0;
+    PqGSSRecvLength = 0;

     if (GSS_ERROR(major))
     {
@@ -544,7 +629,8 @@ pqsecure_open_gss(PGconn *conn)
                      conn, major, minor);
         return PGRES_POLLING_FAILED;
     }
-    else if (output.length == 0)
+
+    if (output.length == 0)
     {
         /*
          * We're done - hooray!  Kind of gross, but we need to disable SSL
@@ -553,20 +639,26 @@ pqsecure_open_gss(PGconn *conn)
 #ifdef USE_SSL
         conn->allow_ssl_try = false;
 #endif
+
+        /* Clean up */
         gss_release_cred(&minor, &conn->gcred);
         conn->gcred = GSS_C_NO_CREDENTIAL;
         conn->gssenc = true;

         /*
          * Determine the max packet size which will fit in our buffer, after
-         * accounting for the length
+         * accounting for the length.  pg_GSS_write will need this.
          */
         major = gss_wrap_size_limit(&minor, conn->gctx, 1, GSS_C_QOP_DEFAULT,
-                                    PQ_GSS_SEND_BUFFER_SIZE - sizeof(uint32), &max_packet_size);
+                                    PQ_GSS_SEND_BUFFER_SIZE - sizeof(uint32),
+                                    &max_packet_size);

         if (GSS_ERROR(major))
+        {
             pg_GSS_error(libpq_gettext("GSSAPI size check error"), conn,
                          major, minor);
+            return PGRES_POLLING_FAILED;
+        }

         return PGRES_POLLING_OK;
     }
@@ -583,14 +675,16 @@ pqsecure_open_gss(PGconn *conn)
     netlen = htonl(output.length);

     memcpy(PqGSSSendBuffer, (char *) &netlen, sizeof(uint32));
-    PqGSSSendPointer += sizeof(uint32);
+    PqGSSSendLength += sizeof(uint32);
+
+    memcpy(PqGSSSendBuffer + PqGSSSendLength, output.value, output.length);
+    PqGSSSendLength += output.length;

-    memcpy(PqGSSSendBuffer + PqGSSSendPointer, output.value, output.length);
-    PqGSSSendPointer += output.length;
+    /* We don't bother with PqGSSSendConsumed here */

     gss_release_buffer(&minor, &output);

-    /* Asked to be called again to write data */
+    /* Ask to be called again to write data */
     return PGRES_POLLING_WRITING;
 }

diff --git a/src/test/kerberos/t/001_auth.pl b/src/test/kerberos/t/001_auth.pl
index e3eb052..b3aeea9 100644
--- a/src/test/kerberos/t/001_auth.pl
+++ b/src/test/kerberos/t/001_auth.pl
@@ -19,7 +19,7 @@ use Test::More;

 if ($ENV{with_gssapi} eq 'yes')
 {
-    plan tests => 12;
+    plan tests => 18;
 }
 else
 {
@@ -166,15 +166,15 @@ $node->safe_psql('postgres', 'CREATE USER test1;');

 note "running tests";

+# Test connection success or failure, and if success, that query returns true.
 sub test_access
 {
-    my ($node, $role, $server_check, $expected_res, $gssencmode, $test_name)
-      = @_;
+    my ($node, $role, $query, $expected_res, $gssencmode, $test_name) = @_;

     # need to connect over TCP/IP for Kerberos
     my ($res, $stdoutres, $stderrres) = $node->psql(
         'postgres',
-        "$server_check",
+        "$query",
         extra_params => [
             '-XAtd',
             $node->connstr('postgres')
@@ -195,6 +195,29 @@ sub test_access
     return;
 }

+# As above, but test for an arbitrary query result.
+sub test_query
+{
+    my ($node, $role, $query, $expected, $gssencmode, $test_name) = @_;
+
+    # need to connect over TCP/IP for Kerberos
+    my ($res, $stdoutres, $stderrres) = $node->psql(
+        'postgres',
+        "$query",
+        extra_params => [
+            '-XAtd',
+            $node->connstr('postgres')
+              . " host=$host hostaddr=$hostaddr $gssencmode",
+            '-U',
+            $role
+        ]);
+
+    is($res, 0, $test_name);
+    like($stdoutres, $expected, $test_name);
+    is($stderrres, "", $test_name);
+    return;
+}
+
 unlink($node->data_dir . '/pg_hba.conf');
 $node->append_conf('pg_hba.conf',
     qq{host all all $hostaddr/32 gss map=mymap});
@@ -231,6 +254,27 @@ test_access(
     "gssencmode=require",
     "succeeds with GSS-encrypted access required with host hba");

+# Test that we can transport a reasonable amount of data.
+test_query(
+    $node,
+    "test1",
+    'SELECT * FROM generate_series(1, 100000);',
+    qr/^1\n.*\n1024\n.*\n9999\n.*\n100000$/s,
+    "gssencmode=require",
+    "receiving 100K lines works");
+
+test_query(
+    $node,
+    "test1",
+    "CREATE TABLE mytab (f1 int primary key);\n"
+      . "COPY mytab FROM STDIN;\n"
+      . join("\n", (1 .. 100000))
+      . "\n\\.\n"
+      . "SELECT COUNT(*) FROM mytab;",
+    qr/^100000$/s,
+    "gssencmode=require",
+    "sending 100K lines works");
+
 unlink($node->data_dir . '/pg_hba.conf');
 $node->append_conf('pg_hba.conf',
     qq{hostgssenc all all $hostaddr/32 gss map=mymap});
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 2ae507a..cf763d7 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -331,6 +331,12 @@ secure_raw_write(Port *port, const void *ptr, size_t len)
 {
     ssize_t        n;

+    if (random() < INT_MAX / 4)
+    {
+        errno = EWOULDBLOCK;
+        return -1;
+    }
+
 #ifdef WIN32
     pgwin32_noblock = true;
 #endif
diff --git a/src/interfaces/libpq/fe-secure.c b/src/interfaces/libpq/fe-secure.c
index 52f6e87..c94587d 100644
--- a/src/interfaces/libpq/fe-secure.c
+++ b/src/interfaces/libpq/fe-secure.c
@@ -27,6 +27,7 @@
 #include <signal.h>
 #include <fcntl.h>
 #include <ctype.h>
+#include <limits.h>

 #ifdef WIN32
 #include "win32.h"
@@ -329,6 +330,12 @@ pqsecure_raw_write(PGconn *conn, const void *ptr, size_t len)

     DECLARE_SIGPIPE_INFO(spinfo);

+    if (random() < INT_MAX / 2)
+    {
+        SOCK_ERRNO_SET(EWOULDBLOCK);
+        return -1;
+    }
+
 #ifdef MSG_NOSIGNAL
     if (conn->sigpipe_flag)
         flags |= MSG_NOSIGNAL;

Re: 12.1 not useable: clientlib fails after a dozen queries (GSSAPI ?)

From
Tom Lane
Date:
I wrote:
> (Still doesn't touch the static-buffer issue, though.)

And here's a delta patch for that.  This fixes an additional portability
hazard, which is that there were various places doing stuff like

        input.length = ntohl(*(uint32 *) PqGSSRecvBuffer);

That's a SIGBUS hazard on alignment-picky hardware, because there is
no guarantee that a variable that's just declared "char ...[...]"
will have any particular alignment.  But malloc'ing the space will
provide maxaligned storage.

My FreeBSD testing has now given me enough confidence in these patches
that I'm just going to go ahead and push them.  But, if you'd like to
do some more testing in advance of 12.2 release, please do.

            regards, tom lane

diff --git a/src/backend/libpq/be-secure-gssapi.c b/src/backend/libpq/be-secure-gssapi.c
index 87cb6ea..c25cfda 100644
--- a/src/backend/libpq/be-secure-gssapi.c
+++ b/src/backend/libpq/be-secure-gssapi.c
@@ -50,27 +50,30 @@
 #define PQ_GSS_SEND_BUFFER_SIZE 16384
 #define PQ_GSS_RECV_BUFFER_SIZE 16384

-/* PqGSSSendBuffer is for *encrypted* data */
-static char PqGSSSendBuffer[PQ_GSS_SEND_BUFFER_SIZE];
+/*
+ * Since we manage at most one GSS-encrypted connection per backend,
+ * we can just keep all this state in static variables.  The char *
+ * variables point to buffers that are allocated once and re-used.
+ */
+static char *PqGSSSendBuffer;    /* Encrypted data waiting to be sent */
 static int    PqGSSSendLength;    /* End of data available in PqGSSSendBuffer */
 static int    PqGSSSendNext;        /* Next index to send a byte from
                                  * PqGSSSendBuffer */
 static int    PqGSSSendConsumed;    /* Number of *unencrypted* bytes consumed for
                                  * current contents of PqGSSSendBuffer */

-/* PqGSSRecvBuffer is for *encrypted* data */
-static char PqGSSRecvBuffer[PQ_GSS_RECV_BUFFER_SIZE];
+static char *PqGSSRecvBuffer;    /* Received, encrypted data */
 static int    PqGSSRecvLength;    /* End of data available in PqGSSRecvBuffer */

-/* PqGSSResultBuffer is for *unencrypted* data */
-static char PqGSSResultBuffer[PQ_GSS_RECV_BUFFER_SIZE];
+static char *PqGSSResultBuffer; /* Decryption of data in gss_RecvBuffer */
 static int    PqGSSResultLength;    /* End of data available in PqGSSResultBuffer */
 static int    PqGSSResultNext;    /* Next index to read a byte from
                                  * PqGSSResultBuffer */

-uint32        max_packet_size;    /* Maximum size we can encrypt and fit the
+static uint32 PqGSSMaxPktSize;    /* Maximum size we can encrypt and fit the
                                  * results into our output buffer */

+
 /*
  * Attempt to write len bytes of data from ptr to a GSSAPI-encrypted connection.
  *
@@ -175,8 +178,8 @@ be_gssapi_write(Port *port, void *ptr, size_t len)
          * we will have to loop and possibly be called multiple times to get
          * through all the data.
          */
-        if (bytes_to_encrypt > max_packet_size)
-            input.length = max_packet_size;
+        if (bytes_to_encrypt > PqGSSMaxPktSize)
+            input.length = PqGSSMaxPktSize;
         else
             input.length = bytes_to_encrypt;

@@ -461,7 +464,20 @@ secure_open_gssapi(Port *port)
     OM_uint32    major,
                 minor;

-    /* initialize state variables */
+    /*
+     * Allocate buffers and initialize state variables.  By malloc'ing the
+     * buffers at this point, we avoid wasting static data space in processes
+     * that will never use them, and we ensure that the buffers are
+     * sufficiently aligned for the length-word accesses that we do in some
+     * places in this file.
+     */
+    PqGSSSendBuffer = malloc(PQ_GSS_SEND_BUFFER_SIZE);
+    PqGSSRecvBuffer = malloc(PQ_GSS_RECV_BUFFER_SIZE);
+    PqGSSResultBuffer = malloc(PQ_GSS_RECV_BUFFER_SIZE);
+    if (!PqGSSSendBuffer || !PqGSSRecvBuffer || !PqGSSResultBuffer)
+        ereport(FATAL,
+                (errcode(ERRCODE_OUT_OF_MEMORY),
+                 errmsg("out of memory")));
     PqGSSSendLength = PqGSSSendNext = PqGSSSendConsumed = 0;
     PqGSSRecvLength = PqGSSResultLength = PqGSSResultNext = 0;

@@ -608,7 +624,7 @@ secure_open_gssapi(Port *port)
      */
     major = gss_wrap_size_limit(&minor, port->gss->ctx, 1, GSS_C_QOP_DEFAULT,
                                 PQ_GSS_SEND_BUFFER_SIZE - sizeof(uint32),
-                                &max_packet_size);
+                                &PqGSSMaxPktSize);

     if (GSS_ERROR(major))
         pg_GSS_error(FATAL, gettext_noop("GSSAPI size check error"),
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 89b1346..80b54bc 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -462,7 +462,7 @@ pqDropConnection(PGconn *conn, bool flushInput)
     /* Always discard any unsent data */
     conn->outCount = 0;

-    /* Free authentication state */
+    /* Free authentication/encryption state */
 #ifdef ENABLE_GSS
     {
         OM_uint32    min_s;
@@ -471,6 +471,21 @@ pqDropConnection(PGconn *conn, bool flushInput)
             gss_delete_sec_context(&min_s, &conn->gctx, GSS_C_NO_BUFFER);
         if (conn->gtarg_nam)
             gss_release_name(&min_s, &conn->gtarg_nam);
+        if (conn->gss_SendBuffer)
+        {
+            free(conn->gss_SendBuffer);
+            conn->gss_SendBuffer = NULL;
+        }
+        if (conn->gss_RecvBuffer)
+        {
+            free(conn->gss_RecvBuffer);
+            conn->gss_RecvBuffer = NULL;
+        }
+        if (conn->gss_ResultBuffer)
+        {
+            free(conn->gss_ResultBuffer);
+            conn->gss_ResultBuffer = NULL;
+        }
     }
 #endif
 #ifdef ENABLE_SSPI
diff --git a/src/interfaces/libpq/fe-secure-gssapi.c b/src/interfaces/libpq/fe-secure-gssapi.c
index 054cf92..9fd06ea 100644
--- a/src/interfaces/libpq/fe-secure-gssapi.c
+++ b/src/interfaces/libpq/fe-secure-gssapi.c
@@ -53,26 +53,22 @@
 #define PQ_GSS_SEND_BUFFER_SIZE 16384
 #define PQ_GSS_RECV_BUFFER_SIZE 16384

-/* PqGSSSendBuffer is for *encrypted* data */
-static char PqGSSSendBuffer[PQ_GSS_SEND_BUFFER_SIZE];
-static int    PqGSSSendLength;    /* End of data available in PqGSSSendBuffer */
-static int    PqGSSSendNext;        /* Next index to send a byte from
-                                 * PqGSSSendBuffer */
-static int    PqGSSSendConsumed;    /* Number of *unencrypted* bytes consumed for
-                                 * current contents of PqGSSSendBuffer */
-
-/* PqGSSRecvBuffer is for *encrypted* data */
-static char PqGSSRecvBuffer[PQ_GSS_RECV_BUFFER_SIZE];
-static int    PqGSSRecvLength;    /* End of data available in PqGSSRecvBuffer */
-
-/* PqGSSResultBuffer is for *unencrypted* data */
-static char PqGSSResultBuffer[PQ_GSS_RECV_BUFFER_SIZE];
-static int    PqGSSResultLength;    /* End of data available in PqGSSResultBuffer */
-static int    PqGSSResultNext;    /* Next index to read a byte from
-                                 * PqGSSResultBuffer */
-
-uint32        max_packet_size;    /* Maximum size we can encrypt and fit the
-                                 * results into our output buffer */
+/*
+ * We need these state variables per-connection.  To allow the functions
+ * in this file to look mostly like those in be-secure-gssapi.c, set up
+ * these macros.
+ */
+#define PqGSSSendBuffer (conn->gss_SendBuffer)
+#define PqGSSSendLength (conn->gss_SendLength)
+#define PqGSSSendNext (conn->gss_SendNext)
+#define PqGSSSendConsumed (conn->gss_SendConsumed)
+#define PqGSSRecvBuffer (conn->gss_RecvBuffer)
+#define PqGSSRecvLength (conn->gss_RecvLength)
+#define PqGSSResultBuffer (conn->gss_ResultBuffer)
+#define PqGSSResultLength (conn->gss_ResultLength)
+#define PqGSSResultNext (conn->gss_ResultNext)
+#define PqGSSMaxPktSize (conn->gss_MaxPktSize)
+

 /*
  * Attempt to write len bytes of data from ptr to a GSSAPI-encrypted connection.
@@ -184,8 +180,8 @@ pg_GSS_write(PGconn *conn, const void *ptr, size_t len)
          * we will have to loop and possibly be called multiple times to get
          * through all the data.
          */
-        if (bytes_to_encrypt > max_packet_size)
-            input.length = max_packet_size;
+        if (bytes_to_encrypt > PqGSSMaxPktSize)
+            input.length = PqGSSMaxPktSize;
         else
             input.length = bytes_to_encrypt;

@@ -477,7 +473,6 @@ gss_read(PGconn *conn, void *recv_buffer, size_t length, ssize_t *ret)
 PostgresPollingStatusType
 pqsecure_open_gss(PGconn *conn)
 {
-    static int    first = 1;
     ssize_t        ret;
     OM_uint32    major,
                 minor;
@@ -486,12 +481,25 @@ pqsecure_open_gss(PGconn *conn)
     gss_buffer_desc input = GSS_C_EMPTY_BUFFER,
                 output = GSS_C_EMPTY_BUFFER;

-    /* Initialize, if first time through */
-    if (first)
+    /*
+     * If first time through for this connection, allocate buffers and
+     * initialize state variables.  By malloc'ing the buffers separately, we
+     * ensure that they are sufficiently aligned for the length-word accesses
+     * that we do in some places in this file.
+     */
+    if (PqGSSSendBuffer == NULL)
     {
+        PqGSSSendBuffer = malloc(PQ_GSS_SEND_BUFFER_SIZE);
+        PqGSSRecvBuffer = malloc(PQ_GSS_RECV_BUFFER_SIZE);
+        PqGSSResultBuffer = malloc(PQ_GSS_RECV_BUFFER_SIZE);
+        if (!PqGSSSendBuffer || !PqGSSRecvBuffer || !PqGSSResultBuffer)
+        {
+            printfPQExpBuffer(&conn->errorMessage,
+                              libpq_gettext("out of memory\n"));
+            return PGRES_POLLING_FAILED;
+        }
         PqGSSSendLength = PqGSSSendNext = PqGSSSendConsumed = 0;
         PqGSSRecvLength = PqGSSResultLength = PqGSSResultNext = 0;
-        first = 0;
     }

     /*
@@ -651,7 +659,7 @@ pqsecure_open_gss(PGconn *conn)
          */
         major = gss_wrap_size_limit(&minor, conn->gctx, 1, GSS_C_QOP_DEFAULT,
                                     PQ_GSS_SEND_BUFFER_SIZE - sizeof(uint32),
-                                    &max_packet_size);
+                                    &PqGSSMaxPktSize);

         if (GSS_ERROR(major))
         {
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 1e36be2..79bc378 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -493,6 +493,23 @@ struct pg_conn
     bool        try_gss;        /* GSS attempting permitted */
     bool        gssenc;            /* GSS encryption is usable */
     gss_cred_id_t gcred;        /* GSS credential temp storage. */
+
+    /* GSS encryption I/O state --- see fe-secure-gssapi.c */
+    char       *gss_SendBuffer; /* Encrypted data waiting to be sent */
+    int            gss_SendLength; /* End of data available in gss_SendBuffer */
+    int            gss_SendNext;    /* Next index to send a byte from
+                                 * gss_SendBuffer */
+    int            gss_SendConsumed;    /* Number of *unencrypted* bytes consumed
+                                     * for current contents of gss_SendBuffer */
+    char       *gss_RecvBuffer; /* Received, encrypted data */
+    int            gss_RecvLength; /* End of data available in gss_RecvBuffer */
+    char       *gss_ResultBuffer;    /* Decryption of data in gss_RecvBuffer */
+    int            gss_ResultLength;    /* End of data available in
+                                     * gss_ResultBuffer */
+    int            gss_ResultNext; /* Next index to read a byte from
+                                 * gss_ResultBuffer */
+    uint32        gss_MaxPktSize; /* Maximum size we can encrypt and fit the
+                                 * results into our output buffer */
 #endif

 #ifdef ENABLE_SSPI

Re: 12.1 not useable: clientlib fails after a dozen queries (GSSAPI?)

From
Peter
Date:
On Fri, Jan 10, 2020 at 10:51:50PM -0500, Tom Lane wrote:
! Here's a draft patch that cleans up all the logic errors I could find.

Okiee, thank You!
Let's see (was a bit busy yesterday trying to upgrade pgadmin3 -
difficult matter), now lets sort this out:

With the first patch applied (as from Friday - applied only on the
client side), the application did appear to work well.

But then, when engaging bandwidth-limiting to some modem-speed, it did
not work: psql would receive all (or most of) the data from a SELECT,
but then somehow not recognize the end of it and sit there and wait for
whatever:

> flowmdev=> select * from flows;
> ^CCancel request sent
> ^CCancel request sent


Now with the new patches 0001+0003 applied, on both server & client,
all now running 12.1 release, on a first run I did not perceive
a malfunction, bandwidth limited or not.
I'll leave them applied, but this here will not experience serious
loads; You'll need somebody else to test for that...


rgds,
PMc



Re: 12.1 not useable: clientlib fails after a dozen queries (GSSAPI ?)

From
Tom Lane
Date:
Peter <pmc@citylink.dinoex.sub.org> writes:
> With the first patch applied (as from Friday - applied only on the
> client side), the application did appear to work well.
> But then, when engaging bandwidth-limiting to some modem-speed, it did
> not work: psql would receive all (or most of) the data from a SELECT,
> but then somehow not recognize the end of it and sit there and wait for
> whatever:

Yeah, that's just the behavior I'd expect (and was able to reproduce
here) because of the additional design problem.

> Now with the new patches 0001+0003 applied, on both server & client,
> all now running 12.1 release, on a first run I did not perceive
> a malfunction, bandwidth limited or not.
> I'll leave them applied, but this here will not experience serious
> loads; You'll need somebody else to test for that...

Cool, let us know if you do see any problems.

            regards, tom lane



Re: 12.1 not useable: clientlib fails after a dozen queries (GSSAPI?)

From
Stephen Frost
Date:
Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> I wrote:
> > Here's a draft patch that cleans up all the logic errors I could find.
>
> So last night I was assuming that this problem just requires more careful
> attention to what to return in the error exit paths.  In the light of
> morning, though, I realize that the algorithms involved in
> be-secure-gssapi.c and fe-secure-gssapi.c are just fundamentally wrong:
>
> * On the read side, the code will keep looping until it gets a no-data
> error from the underlying socket call.  This is silly.  In every or
> almost every use, the caller's read length request corresponds to the
> size of a buffer that's meant to be larger than typical messages, so
> that betting that we're going to fill that buffer completely is the
> wrong way to bet.  Meanwhile, it's fairly likely that the incoming
> encrypted packet's length *does* correspond to some actual message
> boundary; that would only not happen if the sender is forced to break
> up a message, which ought to be a minority situation, else our buffer
> size choices are too small.  So it's very likely that the looping just
> results in doubling the number of read() calls that are made, with
> half of them failing with EWOULDBLOCK.  What we should do instead is
> return to the caller whenever we finish handing back the decrypted
> contents of a packet.  We can do the read() on the next call, after
> the caller's dealt with that data.

Yeah, I agree that this is a better approach.  Doing unnecessary
read()'s certainly isn't ideal but beyond being silly it doesn't sound
like this was fundamentally broken..? (yes, the error cases certainly
weren't properly being handled, I understand that)

> * On the write side, if the code encrypts some data and then gets
> EWOULDBLOCK trying to write it, it will tell the caller that it
> successfully wrote that data.  If that was all the data the caller
> had to write (again, not so unlikely) this is a catastrophic
> mistake, because the caller will be satisfied and will go to sleep,
> rather than calling again to attempt another write.  What we *must*
> do is to reflect the write failure verbatim whether or not we
> encrypted some data.  We must remember how much data we encrypted
> and then discount that much of the caller's supplied data next time.
> There are hints in the existing comments that somebody understood
> this at one point, but the code isn't acting that way today.

That's a case I hadn't considered and you're right- the algorithm
certainly wouldn't work in such a case.  I don't recall specifically if
the code had handled it better previously, or not, but I do recall there
was something previously about being given a buffer and then having the
API defined as "give me back the exact same buffer because I had to
stop" and I recall finding that to ugly, but I get it now, seeing this
issue.  I'd certainly be happier if there was a better alternative but I
don't know that there really is.

Thanks,

Stephen

Attachment

Re: 12.1 not useable: clientlib fails after a dozen queries (GSSAPI ?)

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> ... We must remember how much data we encrypted
>> and then discount that much of the caller's supplied data next time.
>> There are hints in the existing comments that somebody understood
>> this at one point, but the code isn't acting that way today.

> That's a case I hadn't considered and you're right- the algorithm
> certainly wouldn't work in such a case.  I don't recall specifically if
> the code had handled it better previously, or not, but I do recall there
> was something previously about being given a buffer and then having the
> API defined as "give me back the exact same buffer because I had to
> stop" and I recall finding that to ugly, but I get it now, seeing this
> issue.  I'd certainly be happier if there was a better alternative but I
> don't know that there really is.

Yeah.  The only bright spot is that there's no reason for the caller
to change its mind about what it wants to write, so that this restriction
doesn't really affect anything.  (The next call might potentially add
*more* data at the end, but that's fine.)

I realized when I got into it that my sketch above also considered only
part of the problem.  In the general case, we might've encrypted some data
from the current write request and successfully sent it, and then
encrypted some more data but been unable to (fully) send that packet.
In this situation, it's best to report that we wrote however much data
corresponds to the fully sent packet(s).  That way the caller can discard
that data from its buffer.  We can't report the data corresponding to the
in-progress packet as being written, though, or we have the
might-not-get-another-call problem.  Fortunately the API already has the
notion of a partial write, since the underlying socket calls do.

            regards, tom lane



Re: 12.1 not useable: clientlib fails after a dozen queries (GSSAPI?)

From
Stephen Frost
Date:
Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> >> ... We must remember how much data we encrypted
> >> and then discount that much of the caller's supplied data next time.
> >> There are hints in the existing comments that somebody understood
> >> this at one point, but the code isn't acting that way today.
>
> > That's a case I hadn't considered and you're right- the algorithm
> > certainly wouldn't work in such a case.  I don't recall specifically if
> > the code had handled it better previously, or not, but I do recall there
> > was something previously about being given a buffer and then having the
> > API defined as "give me back the exact same buffer because I had to
> > stop" and I recall finding that to ugly, but I get it now, seeing this
> > issue.  I'd certainly be happier if there was a better alternative but I
> > don't know that there really is.
>
> Yeah.  The only bright spot is that there's no reason for the caller
> to change its mind about what it wants to write, so that this restriction
> doesn't really affect anything.  (The next call might potentially add
> *more* data at the end, but that's fine.)

Right, makes sense.

> I realized when I got into it that my sketch above also considered only
> part of the problem.  In the general case, we might've encrypted some data
> from the current write request and successfully sent it, and then
> encrypted some more data but been unable to (fully) send that packet.
> In this situation, it's best to report that we wrote however much data
> corresponds to the fully sent packet(s).  That way the caller can discard
> that data from its buffer.  We can't report the data corresponding to the
> in-progress packet as being written, though, or we have the
> might-not-get-another-call problem.  Fortunately the API already has the
> notion of a partial write, since the underlying socket calls do.

Yeah, I see how that's also an issue and agree that it makes sense to
report back what's been written and sent as a partial write, and not
report back everything we've "consumed" since we might not get called
again in that case.

Thanks!

Stephen

Attachment