Re: An I/O error occurred while sending to the backend (PG 13.4) - Mailing list pgsql-performance

From Justin Pryzby
Subject Re: An I/O error occurred while sending to the backend (PG 13.4)
Date
Msg-id 20220228220503.GF25269@telsasoft.com
In response to RE: An I/O error occurred while sending to the backend (PG 13.4)  ("ldh@laurent-hasson.com" <ldh@laurent-hasson.com>)
Responses RE: An I/O error occurred while sending to the backend (PG 13.4)  ("ldh@laurent-hasson.com" <ldh@laurent-hasson.com>)
List pgsql-performance
On Mon, Feb 28, 2022 at 09:43:09PM +0000, ldh@laurent-hasson.com wrote:
>    On Wed, Feb 23, 2022 at 07:04:15PM -0600, Justin Pryzby wrote:
>    >  > And the aforementioned network trace.  You could set a capture filter on TCP
>    >  > SYN|RST so it's not absurdly large.  From my notes, it might look like this:
>    >  > (tcp[tcpflags]&(tcp-rst|tcp-syn|tcp-fin)!=0)
>    >  
>    >  I'd also add '|| icmp'.  My hunch is that you'll see some ICMP (not "ping")
>    >  being sent by an intermediate gateway, resulting in the connection being
>    >  reset.
> 
> I am so sorry but I do not understand what you are asking me to do. I am
> unfamiliar with these commands. Is this a postgres configuration file? Is
> this something I just do once or something I leave on to hopefully catch it
> when the issue occurs? Is this something to do on the DB machine or the ETL
> machine? FYI:

It's no problem.

I suggest that you run Wireshark with a capture filter to try to show *why* the
connections are failing.  I think the capture filter might look like:

(icmp || (tcp[tcpflags] & (tcp-rst|tcp-syn|tcp-fin)!=0)) && host 10.64.17.211

With the "host" filtering for the IP address of the *remote* machine.
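
To make it easier to reuse the filter from either machine, here is a minimal
sketch that builds the same expression for a given remote address.  The helper
name build_capture_filter is hypothetical, and the address is validated with
Python's standard ipaddress module; the resulting string is what you would
paste into Wireshark's capture filter box.

```python
import ipaddress


def build_capture_filter(remote_host: str) -> str:
    """Build a pcap capture filter matching ICMP plus TCP SYN/RST/FIN
    packets exchanged with remote_host (hypothetical helper)."""
    # Raises ValueError if remote_host is not a valid IP address.
    ipaddress.ip_address(remote_host)
    tcp_flags = "tcp[tcpflags] & (tcp-rst|tcp-syn|tcp-fin) != 0"
    return f"(icmp || ({tcp_flags})) && host {remote_host}"


if __name__ == "__main__":
    # When capturing on the DB machine, the remote side is the ETL machine.
    print(build_capture_filter("10.64.17.211"))
```

When capturing on the ETL machine instead, pass the DB machine's address
(10.64.17.210) as the remote host.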

You could run that on whichever machine is more convenient and leave it running
for however long it takes for that error to happen.  You'll then be able to
save a .pcap file for inspection.  I expect it'll show either a TCP RST or an
ICMP message.  Whichever side sent that is where the problem is.  I still
suspect the issue isn't in postgres.
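
As a rough illustration of what the TCP part of that filter selects, the sketch
below decodes the TCP flags byte (the byte that tcp[tcpflags] refers to) using
the standard bit values for FIN, SYN and RST.  The function names are
hypothetical and only meant to show the bit test; Wireshark does this decoding
for you when you inspect the capture.

```python
# Standard TCP flag bits within the flags byte of the TCP header.
TCP_FIN = 0x01
TCP_SYN = 0x02
TCP_RST = 0x04

_NAMES = [(TCP_FIN, "FIN"), (TCP_SYN, "SYN"), (TCP_RST, "RST")]


def interesting_flags(flags_byte):
    """Return the names of the FIN/SYN/RST flags set in flags_byte."""
    return [name for bit, name in _NAMES if flags_byte & bit]


def matches_capture_filter(flags_byte):
    """True if a TCP packet with these flags passes the flag test
    tcp[tcpflags] & (tcp-rst|tcp-syn|tcp-fin) != 0."""
    return (flags_byte & (TCP_RST | TCP_SYN | TCP_FIN)) != 0


if __name__ == "__main__":
    # 0x14 is RST+ACK, 0x10 is a plain ACK.
    for flags in (0x14, 0x10):
        print(hex(flags), interesting_flags(flags), matches_capture_filter(flags))
```

Ordinary data packets carry only ACK (and possibly PSH), so they are skipped,
which is what keeps the capture file small.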

>   - My ETL machine is on 10.64.17.211
>   - My DB machine is on 10.64.17.210
>   - Both on Windows Server 2012 R2, x64

These network details make my theory about an intermediate gateway unlikely.

The two machines appear to be on the same subnet with no intermediate gateways,
communicating directly via a hub/switch/crossover cable.  If that's true, then
each will have the other's hardware address in its ARP table after pinging from
one to the other.
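
Whether the two addresses really share a subnet depends on the netmask, which
wasn't given.  The sketch below checks it with Python's standard ipaddress
module, assuming a /24 prefix (255.255.255.0) purely for illustration; the
function name same_subnet is hypothetical, and the real prefix length should be
taken from the machines' network settings.

```python
import ipaddress


def same_subnet(addr_a, addr_b, prefix_len):
    """Return True if both addresses fall in the same network for the
    given prefix length (hypothetical helper)."""
    net_a = ipaddress.ip_interface(f"{addr_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{addr_b}/{prefix_len}").network
    return net_a == net_b


if __name__ == "__main__":
    # ETL and DB machine addresses from the thread; /24 is an assumption.
    print(same_subnet("10.64.17.211", "10.64.17.210", 24))
```

If this returns False for the actual netmask, traffic between the machines is
routed through a gateway, and the ICMP theory becomes plausible again.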

-- 
Justin


