Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions - Mailing list pgsql-hackers
From | Dilip Kumar |
---|---|
Subject | Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions |
Date | |
Msg-id | CAFiTN-ufHPL5VAk=BW5abxWq2OVv55niuNnc7Kjh5tcCPJmQYQ@mail.gmail.com Whole thread Raw |
In response to | Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
|
List | pgsql-hackers |
On Tue, Jun 2, 2020 at 4:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Jun 2, 2020 at 3:59 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Thu, May 28, 2020 at 5:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > Also, what if the changes file size overflows "OS file size limit"? > > > If we agree that the above are problems then do you think we should > > > explore using BufFile interface (see storage/file/buffile.c) to avoid > > > all such problems? > > > > I also think that the file size is a problem. I think we can use > > BufFile with some modifications. We can not use the > > BufFileCreateTemp, because of few reasons > > 1) files get deleted on close, but we have to open/close on every > > stream start/stop. > > 2) even if we try to avoid closing we need to the BufFile pointers > > (which take 8192k per file) because there is no option to pass the > > file name. > > > > I thin for our use case BufFileCreateShared is more suitable. I think > > we need to do some modifications so that we can use these apps without > > SharedFileSet. Otherwise, we need to unnecessarily need to create > > SharedFileSet for each transaction and also need to maintain it in xid > > array or xid hash until transaction commit/abort. So I suggest > > following modifications in shared files set so that we can > > conveniently use it. > > 1. ChooseTablespace(const SharedFileSet fileset, const char name) > > if fileset is NULL then select the DEFAULTTABLESPACEOID > > 2. SharedFileSetPath(char path, SharedFileSet fileset, Oid tablespace) > > If fileset is NULL then in directory path we can use MyProcPID or > > something instead of fileset->creator_pid. > > > > Hmm, I find these modifications a bit ad-hoc. So, not sure if it is > better than the patch maintains sharedfileset information. I think we might do something better here, maybe by supplying function pointer or so, but maintaining sharedfileset which contains different tablespace/mutext which we don't need at all for our purpose also doesn't sound very appealing. Let me see if I can not come up with some clean way of avoiding the need to shared-fileset then maybe we can go with the shared fileset idea. > > 3. Pass some parameter to BufFileOpenShared, so that it can open the > > file in RW mode instead of read-only mode. > > > > This seems okay. > > > > > > 2. > > > apply_handle_stream_abort() > > > { > > > .. > > > + /* discard the subxacts added later */ > > > + nsubxacts = subidx; > > > + > > > + /* write the updated subxact list */ > > > + subxact_info_write(MyLogicalRepWorker->subid, xid); > > > .. > > > } > > > > > > Here, if subxacts becomes zero, then also subxact_info_write will > > > create a new file and write checksum. > > > > How, will it create the new file, in fact it will write nsubxacts as 0 > > in the existing file, and I think we need to do that right so that in > > next open we will know that the nsubxact is 0. > > > > I think subxact_info_write > > > should have a check for nsubxacts > 0 before writing to the file. > > > > But, even if nsubxacts become 0 we want to write the file so that we > > can overwrite the previous info. > > > > Can't we just remove the file for such a case? But, as of now, we expect if it is not a first-time stream start then the file exists. Actually, currently, it's very easy that if it is not the first segment we always expect that the file must exist, otherwise an error. Now if it is not the first segment then we will need to handle multiple cases. a) subxact_info_read need to handle the error case, because the file may not exist because there was no subxact in last stream or it was deleted because nsubxact become 0. b) subxact_info_write, there will be multiple cases that if nsubxact was already 0 then we can avoid writing the file, but if it become 0 now we need to remove the file. Let me think more on that. > > apply_handle_stream_abort() > { > .. > + /* XXX optimize the search by bsearch on sorted data */ > + for (i = nsubxacts; i > 0; i--) > + { > + if (subxacts[i - 1].xid == subxid) > + { > + subidx = (i - 1); > + found = true; > + break; > + } > + } > + > + /* > + * If it's an empty sub-transaction then we will not find the subxid > + * here so just free the memory and return. > + */ > + if (!found) > + { > + /* Free the subxacts memory */ > + if (subxacts) > + pfree(subxacts); > + > + subxacts = NULL; > + subxact_last = InvalidTransactionId; > + nsubxacts = 0; > + nsubxacts_max = 0; > + > + return; > + } > .. > } > > I have one question regarding the above code. Isn't it possible that > a particular subtransaction id doesn't have any change but others do > we have? For ex. cases like below: > > postgres=# begin; > BEGIN > postgres=*# insert into t1 values(1); > INSERT 0 1 > postgres=*# savepoint s1; > SAVEPOINT > postgres=*# savepoint s2; > SAVEPOINT > postgres=*# insert into t1 values(2); > INSERT 0 1 > postgres=*# insert into t1 values(3); > INSERT 0 1 > postgres=*# Rollback to savepoint s1; > ROLLBACK > postgres=*# commit; > > Here, we have performed Rolledback to savepoint s1 which doesn't have > any change of its own. I think this would have handled but just > wanted to confirm. But internally, that will send abort for the s2 first, and for that, we will find xid and truncate, and later we will send abort for s1 but that we will not find and do nothing? Anyway, I will test it and let you know. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: