Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions - Mailing list pgsql-hackers

From | Amit Kapila |
---|---|
Subject | Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions |
Date | |
Msg-id | CAA4eK1L5PyRZMS0B8C+d_RCHo0VX6hu6D6tPnXnqPhy4tcNtFQ@mail.gmail.com |
In response to | Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions (Dilip Kumar <dilipbalaut@gmail.com>) |
Responses | Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions; Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions |
List | pgsql-hackers |
On Thu, Jan 9, 2020 at 10:30 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Jan 9, 2020 at 9:35 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Jan 8, 2020 at 1:12 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > I have observed one more design issue.
> > >
> >
> > Good observation.
> >
> > > The problem is that when we get toasted chunks, we remember the
> > > changes in memory (a hash table) but don't stream them until we get
> > > the actual change on the main table. Now, the problem is that we
> > > might get the changes of the toast table and the main table in
> > > different streams. So basically, in a stream, if we have only got
> > > the toasted tuples, then even after ReorderBufferStreamTXN the
> > > memory usage will not be reduced.
> > >
> >
> > I think we can't split such changes across different streams (unless
> > we design an entirely new solution to send partial changes of toast
> > data), so we need to send them together. We can keep a flag like
> > data_complete in ReorderBufferTxn and mark it complete only when we
> > are able to assemble the entire tuple. Now, whenever we try to
> > stream the changes once we reach the memory threshold, we can check
> > whether the data_complete flag is true; if so, then only send the
> > changes, otherwise, we can pick the next largest transaction. I
> > think we can retry it a few times, and if we get incomplete data for
> > multiple transactions, then we can decide to spill the transaction,
> > or maybe we can directly spill the first largest transaction which
> > has incomplete data.
> >
>
> Yeah, we might do something along this line. Basically, we need to
> mark the top-transaction as data-incomplete if any of its
> subtransactions has incomplete data (it will always be the latest
> subtransaction of the top transaction). Also, for streaming, we are
> checking the largest top transaction, whereas for spilling we just
> need the largest (sub)transaction. So, while picking the largest top
> transaction for streaming, we also need to decide how we will go for
> the spill if we get a few transactions with incomplete data. Do we
> spill all the subtransactions under this top transaction, or do we
> again find the largest (sub)transaction for spilling?
>

I think it is better to do the latter, as that will lead to the spill
of only the required changes (the minimum changes needed to get the
memory usage below the threshold).

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
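To make the stream-vs-spill policy discussed above concrete, here is a minimal C sketch of the selection loop. This is an illustration under stated assumptions, not PostgreSQL source: the data_complete flag, the ReorderBufferLargestCompleteTopTXN() helper, and MAX_STREAM_RETRIES are hypothetical names for the ideas in the proposal, while ReorderBufferStreamTXN, ReorderBufferSerializeTXN, ReorderBufferLargestTXN, and logical_decoding_work_mem follow the naming used in the thread and in reorderbuffer.c.

```c
/*
 * Sketch only (assumptions noted above): once the reorder buffer
 * crosses the memory threshold, try to stream the largest top-level
 * transaction whose toast tuples are fully assembled; after a few
 * failed attempts, fall back to spilling the largest (sub)transaction
 * to disk, which frees only the minimum number of changes.
 */
#define MAX_STREAM_RETRIES 3	/* "retry it a few times", per the proposal */

static void
ReorderBufferCheckMemoryLimit(ReorderBuffer *rb)
{
	/* Nothing to do until we cross the memory threshold. */
	if (rb->size < logical_decoding_work_mem * 1024L)
		return;

	for (int i = 0; i < MAX_STREAM_RETRIES; i++)
	{
		/*
		 * Hypothetical helper: returns the i-th largest top-level
		 * transaction, so each retry examines the next candidate.
		 */
		ReorderBufferTXN *txn = ReorderBufferLargestCompleteTopTXN(rb, i);

		if (txn != NULL && txn->data_complete)
		{
			/* Safe to stream: no partially assembled toast data. */
			ReorderBufferStreamTXN(rb, txn);
			return;
		}
	}

	/*
	 * Every candidate had incomplete toast data: spill the largest
	 * (sub)transaction instead, per the "do the latter" conclusion.
	 */
	ReorderBufferSerializeTXN(rb, ReorderBufferLargestTXN(rb));
}
```

One design note this sketch makes visible: streaming operates on top-level transactions (toast completeness is tracked at the top level), while spilling targets the single largest (sub)transaction, which is why the fallback frees only the minimum memory needed.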