Re: strange error reporting - Mailing list pgsql-hackers

From Tom Lane
Subject Re: strange error reporting
Msg-id 3792030.1620053267@sss.pgh.pa.us
In response to Re: strange error reporting  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, May 3, 2021 at 6:08 AM Peter Eisentraut
> <peter.eisentraut@enterprisedb.com> wrote:
>> Throwing the socket address in there seems a bit distracting and
>> misleading, and it also pushes off the actual information very far to
>> the end.  (Also, in some cases the socket path is very long, making the
>> actual information even harder to find.)  By the time you get to this
>> error, you have already connected, so mentioning the server address
>> seems secondary at best.

> It feels a little counterintuitive to me too but I am nevertheless
> inclined to believe that it's an improvement. When multi-host
> connection strings are used, the server address may not be clear. In
> fact, even when they're not, it may not be clear to a new user that
> socket communication is used, and it may not be clear where the socket
> is located.

Yeah.  The specific problem I'm concerned about solving here is
"I wasn't connecting to the server I thought I was", which could be
a contributing factor in almost any connection-time failure.  The
multi-host-connection-string feature made that issue noticeably worse,
but surely we've all seen trouble reports that boiled down to that
even before that feature came in.

As you say, we could perhaps redesign the messages to provide this
info in another order.  But it'd be difficult, and I think it might
come out even more confusing in cases where libpq tried several
servers on the way to finally failing.  The old code's error
reporting for such cases completely sucked, whereas now you get
a reasonably complete trace of the attempts.  As a quick example,
for a case of bad hostname followed by wrong port:

$ psql -d "host=foo1,sss2 port=5432,5342"
psql: error: could not translate host name "foo1" to address: Name or service not known
connection to server at "sss2" (192.168.1.48), port 5342 failed: Connection refused
        Is the server running on that host and accepting TCP/IP connections?

v13 renders this as

$ psql -d "host=foo1,sss2 port=5432,5342"
psql: error: could not translate host name "foo1" to address: Name or service not known
could not connect to server: Connection refused
        Is the server running on host "sss2" (192.168.1.48) and accepting
        TCP/IP connections on port 5342?

Now, of course the big problem there is the lack of consistency about
how the two errors are laid out; but I'd argue that putting the
server identity info first is better than putting it later.
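To make the "server identity first" layout concrete, here is a minimal Python sketch. It is only an illustration, not libpq's actual code: the function names pair_hosts_and_ports and format_failure are hypothetical. It assumes the documented libpq rule that a comma-separated port list is matched to the host list by position, and that a single port applies to every host.

```python
def pair_hosts_and_ports(host, port):
    """Pair each host with its port by position (hypothetical helper).

    Assumes the libpq convention: either one port for all hosts,
    or a port list of the same length as the host list.
    """
    hosts = host.split(",")
    ports = port.split(",")
    if len(ports) == 1:
        ports = ports * len(hosts)  # single port is reused for every host
    if len(ports) != len(hosts):
        raise ValueError("port list must match host list in length")
    return list(zip(hosts, ports))


def format_failure(host, port, reason, addr=None):
    """Build one trace line that names the server before the reason."""
    where = f'"{host}"'
    if addr is not None:
        where += f" ({addr})"
    return f"connection to server at {where}, port {port} failed: {reason}"


if __name__ == "__main__":
    for h, p in pair_hosts_and_ports("foo1,sss2", "5432,5342"):
        print(h, p)
    print(format_failure("sss2", "5342", "Connection refused",
                         addr="192.168.1.48"))
```

Because every line starts with the host, address, and port, a reader scanning a multi-attempt trace can tell which server produced each failure before reading the reason.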

Also, if you experiment with other cases, such as some of the servers
complaining about a wrong user name, the old output makes it even harder
to tell which server said what.

            regards, tom lane


