On Thu, Mar 20, 2025 at 01:05:05PM -0400, Andres Freund wrote:
> On 2025-03-19 18:17:37 -0400, Andres Freund wrote:
> > On 2025-03-19 14:25:30 -0700, Noah Misch wrote:
> > > > + * marked as failed. In case of a partial read, some buffers may be
> > > > + * ok.
> > > > + */
> > > > + failed =
> > > > + prior_result.status == ARS_ERROR
> > > > + || prior_result.result <= buf_off;
> > >
> > > I didn't run an experiment to check the following, but I think this should be
> > > s/<=/</. Suppose we requested two blocks and read some amount of bytes
> > > [1*BLCKSZ, 2*BLSCKSZ - 1]. md_readv_complete will store result=1. buf_off==0
> > > should compute failed=false here, but buf_off==1 should compute failed=true.
> >
> > Huh, you might be right. I thought I wrote a test for this, I wonder why it
> > didn't catch the problem...
>
> It was correct as-is. With result=1 you get precisely the result you describe
> as the desired outcome, no?
> prior_result.result <= buf_off
> ->
> 1 <= 0 -> failed = 0
> 1 <= 1 -> failed = 1
>
> but if it were < as you suggest:
>
> prior_result.result < buf_off
> ->
> 1 < 0 -> failed = 0
> 1 < 1 -> failed = 0
>
> I.e. we would assume that the second buffer also completed.
That's right. I see it now. My mistake.
> What does concern me is that the existing tests do *not* catch the problem if
> I turn "<=" into "<". The second buffer in this case wrongly gets marked as
> valid. We do retry the read (because bufmgr.c thinks only one block was read),
> but find the buffer to already be valid.
>
> The reason the test doesn't fail, is that the way I set up the "short read"
> tests. The injection point runs after the IO completed and just modifies the
> result. However, the actual buffer contents still got modified.
>
>
> The easiest way around that seems to be to have the injection point actually
> zero out the remaining memory.
Sounds reasonable and sufficient.
FYI, I've resumed the comprehensive review. That's still ongoing.