On Mon, Feb 28, 2022 at 09:43:09PM +0000, ldh@laurent-hasson.com wrote:
> On Wed, Feb 23, 2022 at 07:04:15PM -0600, Justin Pryzby wrote:
> > > And the aforementioned network trace. You could set a capture filter on TCP
> > > SYN|RST so it's not absurdly large. From my notes, it might look like this:
> > > (tcp[tcpflags]&(tcp-rst|tcp-syn|tcp-fin)!=0)
> >
> > I'd also add '|| icmp'. My hunch is that you'll see some ICMP (not "ping")
> > being sent by an intermediate gateway, resulting in the connection being
> > reset.
>
> I am so sorry but I do not understand what you are asking me to do. I am
> unfamiliar with these commands. Is this a postgres configuration file? Is
> this something I just do once or something I leave on to hopefully catch it
> when the issue occurs? Is this something to do on the DB machine or the ETL
> machine? FYI:
It's no problem.
I suggest that you run wireshark with a capture filter to try to show *why* the
connections are failing. I think the capture filter might look like:
(icmp || (tcp[tcpflags] & (tcp-rst|tcp-syn|tcp-fin)!=0)) && host 10.64.17.211
With the "host" filtering for the IP address of the *remote* machine.
You could run that on whichever machine is more convenient and leave it running
for however long it takes for that error to happen. You'll be able to save a
.pcap file for inspection. I suppose it'll show either a TCP RST or an ICMP
message. Whichever side sent that is where the problem is. I still suspect the
issue isn't in postgres.
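If opening the capture in the wireshark GUI is inconvenient, the saved file
can also be summarized with a short script. This is only a rough sketch under
some assumptions: the capture is saved in the classic .pcap format (recent
wireshark versions may default to pcapng, which this does not handle), the
link type is plain Ethernet without VLAN tags, and the traffic is IPv4. The
function name summarize_resets and the file name capture.pcap are made up for
illustration.

```python
import struct


def summarize_resets(pcap_bytes):
    """Return (src, dst, kind) for each ICMP packet and each TCP segment with RST set."""
    magic = pcap_bytes[:4]
    # The magic number tells us the byte order (microsecond or nanosecond variant).
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"
    else:
        raise ValueError("not a classic pcap file (pcapng is not handled here)")

    # The link type is the last field of the 24-byte global header; 1 = Ethernet.
    linktype = struct.unpack(endian + "I", pcap_bytes[20:24])[0]
    if linktype != 1:
        raise ValueError("only Ethernet captures are handled in this sketch")

    events = []
    offset = 24
    while offset + 16 <= len(pcap_bytes):
        # Per-packet header: ts_sec, ts_frac, captured length, original length.
        _sec, _frac, incl_len, _orig_len = struct.unpack(
            endian + "IIII", pcap_bytes[offset:offset + 16]
        )
        frame = pcap_bytes[offset + 16:offset + 16 + incl_len]
        offset += 16 + incl_len

        # Need 14 bytes of Ethernet plus a minimal 20-byte IPv4 header.
        if len(frame) < 34 or frame[12:14] != b"\x08\x00":
            continue
        ip = frame[14:]
        header_len = (ip[0] & 0x0F) * 4
        protocol = ip[9]
        src = ".".join(str(b) for b in ip[12:16])
        dst = ".".join(str(b) for b in ip[16:20])
        payload = ip[header_len:]

        if protocol == 1 and len(payload) >= 2:
            # ICMP: the first two bytes are type and code (type 3 = unreachable).
            events.append((src, dst, f"ICMP type {payload[0]} code {payload[1]}"))
        elif protocol == 6 and len(payload) >= 14 and payload[13] & 0x04:
            # TCP: the flags live in byte 13 of the header; 0x04 is RST.
            events.append((src, dst, "TCP RST"))
    return events


# Example use (file name is hypothetical):
#   with open("capture.pcap", "rb") as f:
#       for src, dst, kind in summarize_resets(f.read()):
#           print(src, "->", dst, kind)
```

The "src" column of whatever it prints is the side that sent the RST or ICMP,
which is the thing to look at first.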
> - My ETL machine is on 10.64.17.211
> - My DB machine is on 10.64.17.210
> - Both on Windows Server 2012 R2, x64
These network details make my theory unlikely. The two machines appear to be
on the same subnet with no intermediate gateways, communicating directly via a
hub/switch/crossover cable. If that's true, then after pinging from one to the
other, each will have the other's hardware address in its ARP table.
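The same-subnet guess can be sanity-checked with a few lines of Python. The
netmask wasn't given, so the /24 below is purely an assumption, and the
function name same_subnet is made up for illustration. (The two addresses
differ only in their last bit, so the answer is the same for any prefix length
up to /31.)

```python
import ipaddress


def same_subnet(ip_a, ip_b, prefix_len):
    """Return True if both addresses fall in the same network for this prefix length."""
    net_a = ipaddress.ip_interface(f"{ip_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{ip_b}/{prefix_len}").network
    return net_a == net_b


# ETL and DB machine addresses from the message; the /24 is an assumed netmask.
print(same_subnet("10.64.17.211", "10.64.17.210", 24))  # True
```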
--
Justin