Re: BackgroundPsql swallowing errors on windows - Mailing list pgsql-hackers

From Andres Freund
Subject Re: BackgroundPsql swallowing errors on windows
Date
Msg-id qo3x3ycl5fokh4tumm3lo6rn6ijvl7mmes7k24fm3kuv4xaupt@px4vl5ocnvth
In response to Re: BackgroundPsql swallowing errors on windows  (Noah Misch <noah@leadboat.com>)
Responses Re: BackgroundPsql swallowing errors on windows
List pgsql-hackers
Hi,

On 2025-02-16 10:47:40 -0800, Noah Misch wrote:
> On Sun, Feb 16, 2025 at 01:02:01PM -0500, Andres Freund wrote:
> > On 2025-02-16 09:39:43 -0800, Noah Misch wrote:
> > > On Thu, Feb 13, 2025 at 12:39:04PM -0500, Andres Freund wrote:
> > > > I suspect what's happening is that the communication with the
> > > > external process allows for reordering between stdout/stderr.
> > > > 
> > > > And indeed, changing BackgroundPsql::query() to emit the banner on both stdout
> > > > and stderr and waiting on both seems to fix the issue.
> > > 
> > > That makes sense.  I wondered how one might fix IPC::Run to preserve the
> > > relative timing of stdout and stderr, not perturbing the timing the way that
> > > disrupted your test run.  I can think of two strategies:
> > > 
> > > - Remove the proxy.
> > > 
> > > - Make pipe data visible to Perl variables only when at least one of the
> > >   proxy<-program_under_test pipes had no data ready to read.  In other words,
> > >   if both pipes have data ready, make all that data visible to Perl code
> > >   simultaneously.  (When both the stdout pipe and the stderr pipe have data
> > >   ready, one can't determine data arrival order.)
> > > 
> > > Is there a possibly-less-invasive change that might work?
> > 
> > I don't really know enough about IPC::Run's internals to answer. My
> > interpretation of how it might work, purely from observation, is that it opens
> > one tcp connection for each "pipe" and that that's what's introducing the
> > potential of reordering, as the different sockets can have different delivery
> > timeframes.
> 
> Right.
> 
> > If that's it, it seems proxying all the pipes through one
> > connection might be an option.
> 
> It would.  Thanks.  However, I think that would entail modifying the program
> under test to cooperate with the arrangement.  When running an ordinary
> program that does write(1, ...) and write(2, ...), the read end needs some way
> to deal with the uncertainty about which write happened first.  dup2(1, 2)
> solves the order ambiguity, but it loses other signal.

I think what's happening in this case must go beyond just that. AFAICT, just
calling ->pump_nb() would otherwise solve it. My uninformed theory is that two
TCP connections are used. With two pipes, the sequence

P1: write(1)
P1: write(2)
P2: read(1)
P2: read(2)

could never result in P2 missing data on either of the reads. But with two
TCP sockets there can be a delay between send() completing and recv() on the
other side seeing the data, even on a local system (e.g. because the TCP stack
waits a while for more data before sending).

To avoid that, the proxy program could read from N pipes and then proxy them
through *one* socket, prefixing each chunk of data with information about
which pipe it came from. IPC::Run could then split the data again, using the
added prefix.

I don't think that would require modifying the program under test?
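
To make the framing idea concrete, here is a minimal sketch. It is in Python
purely for illustration, and it is not how IPC::Run works today. The header
layout (1-byte stream id, 4-byte length) and the names encode_frame and
Demuxer are made up for this example. The proxy side tags each chunk it reads
from a pipe, and the reader side splits the single byte stream back into
per-pipe chunks, in the order the proxy wrote them.

```python
import struct

# Hypothetical frame header: stream id (1 byte) + payload length (4 bytes),
# network byte order. This layout is an assumption, not an IPC::Run format.
HEADER = struct.Struct("!BI")


def encode_frame(stream_id: int, payload: bytes) -> bytes:
    """Proxy side: tag a chunk read from one pipe before sending it
    over the single shared socket."""
    return HEADER.pack(stream_id, len(payload)) + payload


class Demuxer:
    """Reader side: split the single byte stream back into per-pipe chunks."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, data: bytes):
        """Accept bytes as they arrive from the socket and return the
        complete (stream_id, payload) frames, in the order they were sent."""
        self._buf += data
        frames = []
        while len(self._buf) >= HEADER.size:
            stream_id, length = HEADER.unpack_from(self._buf)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # partial frame, wait for more data
            frames.append((stream_id, bytes(self._buf[HEADER.size:end])))
            del self._buf[:end]
        return frames


# Usage: the proxy saw a write to stdout (1) followed by one to stderr (2).
wire = encode_frame(1, b"banner\n") + encode_frame(2, b"ERROR: oops\n")
demux = Demuxer()
# The socket may deliver the bytes in arbitrary pieces; the order survives.
frames = demux.feed(wire[:3]) + demux.feed(wire[3:])
```

Because everything travels over one connection, the reader sees the chunks in
the order the proxy forwarded them. The program under test still just writes
to its ordinary stdout and stderr pipes.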

Greetings,

Andres Freund


