Hi,
On 2025-03-19 18:17:37 -0400, Andres Freund wrote:
> On 2025-03-19 14:25:30 -0700, Noah Misch wrote:
> > > + * marked as failed. In case of a partial read, some buffers may be
> > > + * ok.
> > > + */
> > > + failed =
> > > + prior_result.status == ARS_ERROR
> > > + || prior_result.result <= buf_off;
> >
> > I didn't run an experiment to check the following, but I think this should be
> > s/<=/</. Suppose we requested two blocks and read some amount of bytes
> > [1*BLCKSZ, 2*BLSCKSZ - 1]. md_readv_complete will store result=1. buf_off==0
> > should compute failed=false here, but buf_off==1 should compute failed=true.
>
> Huh, you might be right. I thought I wrote a test for this, I wonder why it
> didn't catch the problem...
It was correct as-is. With result=1 you get precisely the result you describe
as the desired outcome, no?
prior_result.result <= buf_off
->
1 <= 0 -> failed = 0
1 <= 1 -> failed = 1
but if it were < as you suggest:
prior_result.result < buf_off
->
1 < 0 -> failed = 0
1 < 1 -> failed = 0
I.e. we would assume that the second buffer also completed.
What does concern me is that the existing tests do *not* catch the problem if
I turn "<=" into "<". The second buffer in this case wrongly gets marked as
valid. We do retry the read (because bufmgr.c thinks only one block was read),
but find the buffer to already be valid.
The reason the test doesn't fail, is that the way I set up the "short read"
tests. The injection point runs after the IO completed and just modifies the
result. However, the actual buffer contents still got modified.
The easiest way around that seems to be to have the injection point actually
zero out the remaining memory. Not pretty, but it'd be harder to just submit
shortend IOs in multiple IO methods. It'd be even better if we could
trivially use something like randomize_mem(), but it's only conditionally
compiled...
Greetings,
Andres Freund