Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions |
Date | |
Msg-id | CAA4eK1J+TEFyhbAEkRo-nibh+kjo0=aXtT2YbdEP1cvF2gOgmA@mail.gmail.com Whole thread Raw |
In response to | Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions (Dilip Kumar <dilipbalaut@gmail.com>) |
Responses |
Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions |
List | pgsql-hackers |
On Fri, Aug 7, 2020 at 2:04 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Aug 6, 2020 at 2:43 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > .. > > This patch's functionality can be independently verified by SQL APIs > > Your changes look fine to me. > I have pushed that patch last week and attached are the remaining patches. I have made a few changes in the next patch 0001-Extend-the-BufFile-interface.patch and have some comments on it which are as below: 1. case SEEK_END: - /* could be implemented, not needed currently */ + + /* + * Get the file size of the last file to get the last offset of + * that file. + */ + newFile = file->numFiles - 1; + newOffset = FileSize(file->files[file->numFiles - 1]); + if (newOffset < 0) + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not determine size of temporary file \"%s\" from BufFile \"%s\": %m", + FilePathName(file->files[file->numFiles - 1]), + file->name))); + break; break; There is no need for multiple breaks in the above code. I have fixed this one in the attached patch. 2. +void +BufFileTruncateShared(BufFile *file, int fileno, off_t offset) +{ + int newFile = file->numFiles; + off_t newOffset = file->curOffset; + char segment_name[MAXPGPATH]; + int i; + + /* Loop over all the files upto the fileno which we want to truncate. */ + for (i = file->numFiles - 1; i >= fileno; i--) + { + /* + * Except the fileno, we can directly delete other files. If the + * offset is 0 then we can delete the fileno file as well unless it is + * the first file. + */ + if ((i != fileno || offset == 0) && fileno != 0) + { + SharedSegmentName(segment_name, file->name, i); + FileClose(file->files[i]); + if (!SharedFileSetDelete(file->fileset, segment_name, true)) + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not delete shared fileset \"%s\": %m", + segment_name))); + newFile--; + newOffset = MAX_PHYSICAL_FILESIZE; + } + else + { + if (FileTruncate(file->files[i], offset, + WAIT_EVENT_BUFFILE_TRUNCATE) < 0) + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not truncate file \"%s\": %m", + FilePathName(file->files[i])))); + + newOffset = offset; + } + } + + file->numFiles = newFile; + file->curOffset = newOffset; +} In the end, you have only set 'numFiles' and 'curOffset' members of BufFile and left others. I think other members like 'curFile' also need to be set especially for the case where we have deleted segments at the end, also, shouldn't we need to set 'pos' and 'nbytes' as we do in BufFileSeek. If there is some reason that we don't to set these other members then maybe it is better to add a comment to make it clear. Another thing we need to think here whether we need to flush the buffer data for the dirty buffer? Consider a case where we truncate the file up to a position that falls in the buffer. Now we might truncate the file and part of buffer contents will become invalid, next time if we flush such a buffer then the file can contain the garbage or maybe this will be handled if we update the position in buffer appropriately but all of this should be explained in comments. If what I said is correct, then we still can skip buffer flush in some cases as we do in BufFileSeek. Also, consider if we need to do other handling (convert seek to "start of next seg" to "end of last seg") as we do after changing the seek position in BufFileSeek. 3. /* * Initialize a space for temporary files that can be opened by other backends. * Other backends must attach to it before accessing it. Associate this * SharedFileSet with 'seg'. Any contained files will be deleted when the * last backend detaches. * * We can also use this interface if the temporary files are used only by * single backend but the files need to be opened and closed multiple times * and also the underlying files need to survive across transactions. For * such cases, dsm segment 'seg' should be passed as NULL. We remove such * files on proc exit. * * Files will be distributed over the tablespaces configured in * temp_tablespaces. * * Under the covers the set is one or more directories which will eventually * be deleted when there are no backends attached. */ void SharedFileSetInit(SharedFileSet *fileset, dsm_segment *seg) { .. I think we can remove the part of the above comment after 'eventually be deleted' (see last sentence in comment) because now the files can be removed in more than one way and we have explained that in the comments before this last sentence of the comment. If you can rephrase it differently to cover the other case as well, then that is fine too. -- With Regards, Amit Kapila.
Attachment
pgsql-hackers by date: