Re: [HACKERS] WIP: [[Parallel] Shared] Hash - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: [HACKERS] WIP: [[Parallel] Shared] Hash |
Date | |
Msg-id | CAH2-Wz=+9Rfqh5UdvdW9rGezdhrMGGH-JL1X9FXXVZdeeGeOJA@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] WIP: [[Parallel] Shared] Hash (Thomas Munro <thomas.munro@enterprisedb.com>) |
Responses |
Re: [HACKERS] WIP: [[Parallel] Shared] Hash
Re: [HACKERS] WIP: [[Parallel] Shared] Hash |
List | pgsql-hackers |
On Tue, Mar 21, 2017 at 5:07 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: >> buffile.c should stop pretending to care about anything other than >> temp files, IMV. 100% of all clients that want temporary files go >> through buffile.c. 100% of all clients that want non-temp files (files >> which are not marked FD_TEMPORARY) access fd.c directly, rather than >> going through buffile.c. > > I still need BufFile because I want buffering. > > There are 3 separate characteristics enabled by flags with 'temporary' > in their name. I think we should consider separating the concerns by > splitting and renaming them: > > 1. Segmented BufFile behaviour. I propose renaming BufFile's isTemp > member to isSegmented, because that is what it really does. I want > that feature independently without getting confused about lifetimes. > Tested with small MAX_PHYSICAL_FILESIZE as you suggested. I would have proposed to get rid of the isTemp field entirely. It is always true with current usage, any only #ifdef NOT_USED code presumes that it could be any other way. BufFile is all about temp files, which ISTM should be formalized. The whole point of BufFile is to segment fd.c temp file segments. Who would ever want to use BufFile without that capability anyway? > 2. The temp_file_limit system. Currently this applies to fd.c files > opened with FD_TEMPORARY. You're right that we shouldn't be able to > escape that sanity check on disk space just because we want to manage > disk file ownership differently. I propose that we create a new flag > FD_TEMP_FILE_LIMIT that can be set independentlyisTemp of the flags > controlling disk file lifetime. When working with SharedBufFileSet, > the limit applies to each backend in respect of files it created, > while it has them open. This seems a lot simpler than any > shared-temp-file-limit type scheme and is vaguely similar to the way > work_mem applies in each backend for parallel query. I agree that that makes sense as a user-visible behavior of temp_file_limit. This user-visible behavior is what I actually implemented for parallel CREATE INDEX. > 3. Delete-on-close/delete-at-end-of-xact. I don't want to use that > facility so I propose disconnecting it from the above. We c{ould > rename those fd.c-internal flags FD_TEMPORARY and FD_XACT_TEMPORARY to > FD_DELETE_AT_CLOSE and FD_DELETE_AT_EOXACT. This reliably unlink()s all files, albeit while relying on unlink() ENOENT as a condition that terminates deletion of one particular worker's BufFile's segments. However, because you effectively no longer use resowner.c, ISTM that there is still a resource leak in error paths. ResourceOwnerReleaseInternal() won't call FileClose() for temp-ish files (that are not quite temp files in the current sense) in the absence of no other place managing to do so, such as BufFileClose(). How can you be sure that you'll actually close() the FD itself (not vFD) within fd.c in the event of an error? Or Delete(), which does some LRU maintenance for backend's local VfdCache? If I follow the new code correctly, then it doesn't matter that you've unlink()'d to take care of the more obvious resource management chore. You can still have a reference leak like this, if I'm not mistaken, because you still have backend local state (local VfdCache) that is left totally decoupled with the new "shadow resource manager" for shared BufFiles. > As shown in 0008-hj-shared-buf-file-v8.patch. Thoughts? A less serious issue I've also noticed is that you add palloc() calls, implicitly using the current memory context, within buffile.c. BufFileOpenTagged() has some, for example. However, there is a note that we don't need to save the memory context when we open a BufFile because we always repalloc(). That is no longer the case here. -- Peter Geoghegan
pgsql-hackers by date: