Thread: BufFileSize() doesn't work on a "shared" BufFiles

BufFileSize() doesn't work on a "shared" BufFiles

From
Heikki Linnakangas
Date:
Hi,

I've been noodling around the tuplesort/logtape code, reviewing the 
changes that were made in v11 to support parallel sorting, and also 
rebasing my older patch to replace the polyphase merge with a simple 
balanced merge.

Looking at TapeShare:

> /*
>  * The approach tuplesort.c takes to parallel external sorts is that workers,
>  * whose state is almost the same as independent serial sorts, are made to
>  * produce a final materialized tape of sorted output in all cases.  This is
>  * frozen, just like any case requiring a final materialized tape.  However,
>  * there is one difference, which is that freezing will also export an
>  * underlying shared fileset BufFile for sharing.  Freezing produces TapeShare
>  * metadata for the worker when this happens, which is passed along through
>  * shared memory to leader.
>  *
>  * The leader process can then pass an array of TapeShare metadata (one per
>  * worker participant) to LogicalTapeSetCreate(), alongside a handle to a
>  * shared fileset, which is sufficient to construct a new logical tapeset that
>  * consists of each of the tapes materialized by workers.
>  *
>  * Note that while logtape.c does create an empty leader tape at the end of the
>  * tapeset in the leader case, it can never be written to due to a restriction
>  * in the shared buffile infrastructure.
>  */
> typedef struct TapeShare
> {
>     /*
>      * firstblocknumber is first block that should be read from materialized
>      * tape.
>      *
>      * buffilesize is the size of associated BufFile following freezing.
>      */
>     long        firstblocknumber;
>     off_t        buffilesize;
> } TapeShare;

Why is 'buffilesize' part of the exported state? The leader process can 
easily call BufFileSize() itself, instead of having the worker process 
pass it through shared memory, right? Wrong. I tried to change it that 
way, and after some debugging, I realized that BufFileSize() doesn't 
work if it's called on a "shared" BufFile. It always returns 0. That's 
because it uses FileGetSize(), which in turn only works on a temporary 
file. None of this is mentioned in the comments :-(.

Perhaps that would be OK, if it was properly commented. But it's not 
actually hard to make BufFileSize() work on shared files, too, so I 
think we should do that.

Another little bug I noticed is that BufFileAppend() incorrectly resets 
the 'offsets' of the source files. You won't notice if you call 
BufFileAppend() immediately after BufFileOpenShared(), but if you had 
called BufFileSeek() or BufFileRead() on the shared BufFile handle 
before calling BufFileAppend(), it would get confused.

I propose the attached patch.

- Heikki

Attachment

Re: BufFileSize() doesn't work on a "shared" BufFiles

From
Peter Geoghegan
Date:
On Mon, Apr 30, 2018 at 10:18 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> Perhaps that would be OK, if it was properly commented. But it's not
> actually hard to make BufFileSize() work on shared files, too, so I think we
> should do that.

I agree that this could use a comment, but I don't see much point in
adding the extra FileSeek(). The fact that fd.c is unwilling to track
filesize for non-temp files (where "non-temp" means a
PathNameOpenTemporaryFile()-returned/exported file) is due to
vfdP->fileSize being involved in temp_file_limit enforcement. Maybe a
FD_TEMP_FILE_LIMIT assertion within FileGetSize() would help?

> Another little bug I noticed is that BufFileAppend() incorrectly resets the
> 'offsets' of the source files. You won't notice if you call BufFileAppend()
> immediately after BufFileOpenShared(), but if you had called BufFileSeek()
> or BufFileRead() on the shared BufFile handle before calling
> BufFileAppend(), it would get confused.

Seems worth fixing.

-- 
Peter Geoghegan


Re: BufFileSize() doesn't work on a "shared" BufFiles

From
Heikki Linnakangas
Date:
On 30/04/18 22:00, Peter Geoghegan wrote:
> On Mon, Apr 30, 2018 at 10:18 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> Perhaps that would be OK, if it was properly commented. But it's not
>> actually hard to make BufFileSize() work on shared files, too, so I think we
>> should do that.
> 
> I agree that this could use a comment, but I don't see much point in
> adding the extra FileSeek(). The fact that fd.c is unwilling to track
> filesize for non-temp files (where "non-temp" means a
> PathNameOpenTemporaryFile()-returned/exported file) is due to
> vfdP->fileSize being involved in temp_file_limit enforcement. Maybe a
> FD_TEMP_FILE_LIMIT assertion within FileGetSize() would help?

Seems simpler to just make it work. It's not like the extra lseek() is 
going to make any performance difference. Functions that work only under 
certain conditions are annoying.

Pushed, thanks.

- Heikki