Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions |
Date | |
Msg-id | CAA4eK1LV3ZD+zHXH36EtSLa8iLxkRdibaW-Wf1u1bHXvRsak1w@mail.gmail.com Whole thread Raw |
In response to | Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions (Dilip Kumar <dilipbalaut@gmail.com>) |
Responses |
Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
|
List | pgsql-hackers |
On Tue, Jun 2, 2020 at 3:59 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, May 28, 2020 at 5:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > Also, what if the changes file size overflows "OS file size limit"? > > If we agree that the above are problems then do you think we should > > explore using BufFile interface (see storage/file/buffile.c) to avoid > > all such problems? > > I also think that the file size is a problem. I think we can use > BufFile with some modifications. We can not use the > BufFileCreateTemp, because of few reasons > 1) files get deleted on close, but we have to open/close on every > stream start/stop. > 2) even if we try to avoid closing we need to the BufFile pointers > (which take 8192k per file) because there is no option to pass the > file name. > > I thin for our use case BufFileCreateShared is more suitable. I think > we need to do some modifications so that we can use these apps without > SharedFileSet. Otherwise, we need to unnecessarily need to create > SharedFileSet for each transaction and also need to maintain it in xid > array or xid hash until transaction commit/abort. So I suggest > following modifications in shared files set so that we can > conveniently use it. > 1. ChooseTablespace(const SharedFileSet fileset, const char name) > if fileset is NULL then select the DEFAULTTABLESPACEOID > 2. SharedFileSetPath(char path, SharedFileSet fileset, Oid tablespace) > If fileset is NULL then in directory path we can use MyProcPID or > something instead of fileset->creator_pid. > Hmm, I find these modifications a bit ad-hoc. So, not sure if it is better than the patch maintains sharedfileset information. > 3. Pass some parameter to BufFileOpenShared, so that it can open the > file in RW mode instead of read-only mode. > This seems okay. > > > 2. > > apply_handle_stream_abort() > > { > > .. > > + /* discard the subxacts added later */ > > + nsubxacts = subidx; > > + > > + /* write the updated subxact list */ > > + subxact_info_write(MyLogicalRepWorker->subid, xid); > > .. > > } > > > > Here, if subxacts becomes zero, then also subxact_info_write will > > create a new file and write checksum. > > How, will it create the new file, in fact it will write nsubxacts as 0 > in the existing file, and I think we need to do that right so that in > next open we will know that the nsubxact is 0. > > I think subxact_info_write > > should have a check for nsubxacts > 0 before writing to the file. > > But, even if nsubxacts become 0 we want to write the file so that we > can overwrite the previous info. > Can't we just remove the file for such a case? apply_handle_stream_abort() { .. + /* XXX optimize the search by bsearch on sorted data */ + for (i = nsubxacts; i > 0; i--) + { + if (subxacts[i - 1].xid == subxid) + { + subidx = (i - 1); + found = true; + break; + } + } + + /* + * If it's an empty sub-transaction then we will not find the subxid + * here so just free the memory and return. + */ + if (!found) + { + /* Free the subxacts memory */ + if (subxacts) + pfree(subxacts); + + subxacts = NULL; + subxact_last = InvalidTransactionId; + nsubxacts = 0; + nsubxacts_max = 0; + + return; + } .. } I have one question regarding the above code. Isn't it possible that a particular subtransaction id doesn't have any change but others do we have? For ex. cases like below: postgres=# begin; BEGIN postgres=*# insert into t1 values(1); INSERT 0 1 postgres=*# savepoint s1; SAVEPOINT postgres=*# savepoint s2; SAVEPOINT postgres=*# insert into t1 values(2); INSERT 0 1 postgres=*# insert into t1 values(3); INSERT 0 1 postgres=*# Rollback to savepoint s1; ROLLBACK postgres=*# commit; Here, we have performed Rolledback to savepoint s1 which doesn't have any change of its own. I think this would have handled but just wanted to confirm. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: