> On Fri, Mar 28, 2025 at 09:23:13AM GMT, Filip Janus wrote:
> > + else
> > + if (nbytesOriginal - file->pos != 0)
> > + /* curOffset must be corrected also if compression is
> > + * enabled, nbytes was changed by compression but we
> > + * have to use the original value of nbytes
> > + */
> > + file->curOffset-=bytestowrite;
> >
> > It's not something introduced by the compression patch - the first part
> > is what we used to do before. But I find it a bit confusing - isn't it
> > mixing the correction of "logical file position" adjustment we did
> > before, and also the adjustment possibly needed due to compression?
> >
> > In fact, isn't it going to fail if the code gets multiple loops in
> >
> > while (wpos < file->nbytes)
> > {
> > ...
> > }
> >
> > because bytestowrite will be the value from the last loop? I haven't
> > tried, but I guess writing wide tuples (more than 8k) might fail.
> >
>
> I will definitely test it with larger tuples than 8K.
>
> Maybe I don't understand it correctly,
> the adjustment is performed in the case that file->nbytes and file->pos
> differ.
> So it must persist also if we are working with the compressed data, but the
> problem is that data stored and compressed on disk has different sizes than
> data incoming uncompressed ones, so what should be the correction value.
> By debugging, I realized that the correction should correspond to the size
> of
> bytestowrite from the last iteration of the loop.
I agree, this looks strange. If the idea is to set curOffset to its
original value + pos, and the original value was advanced multiple times
by bytestowrite, it seems incorrect to adjust it by bytestowrite, it
seems incorrect to adjust it only once. From what I see current tests do
not exercise a case where the while will get multiple loops, so it looks
fine.
At the same time maybe I'm missing something, but how exactly such test
for 8k tuples and multiple loops in the while block should look like?
E.g. when I force a hash join on a table with a single wide text column,
the minimal tuple that is getting written to the temporary file still
has rather small length, I assume due to toasting. Is there some other
way to achieve that?