Re: Pluggable Storage - Andres's take - Mailing list pgsql-hackers
From | Amit Khandekar |
---|---|
Subject | Re: Pluggable Storage - Andres's take |
Date | |
Msg-id | CAJ3gD9ceWjbQV5fyt=3WBSBf4hG+2ur1m4sB9RbkwPGfof+F7A@mail.gmail.com Whole thread Raw |
In response to | Re: Pluggable Storage - Andres's take (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Pluggable Storage - Andres's take
|
List | pgsql-hackers |
On Thu, 21 Feb 2019 at 18:06, Robert Haas <robertmhaas@gmail.com> wrote: > > On Thu, Feb 21, 2019 at 6:44 AM Amit Khandekar <amitdkhan.pg@gmail.com> wrote: > > Ok, so something like XidHorizonPrefetchState ? On similar lines, does > > prefetch_buffer() function name sound too generic as well ? > > Yeah, that sounds good. > And, yeah, then maybe rename the function too. Renamed the function to xid_horizon_prefetch_buffer(). > > > > +/* > > > + * An arbitrary way to come up with a pre-fetch distance that grows with io > > > + * concurrency, but is at least 10 and not more than the max effective io > > > + * concurrency. > > > + */ > > > > > > This comment is kinda useless, because it only tells us what the code > > > does (which is obvious anyway) and not why it does that. Saying that > > > your formula is arbitrary may not be the best way to attract support > > > for it. > > > > Well, I had checked the way the number of drive spindles > > (effective_io_concurrency) is used to calculate the prefetch distance > > for bitmap heap scans (ComputeIoConcurrency). Basically I think the > > intention behind that method is to come up with a number that makes it > > highly likely that we pre-fetch a block of each of the drive spindles. > > But I didn't get how that exactly works, all the less for non-parallel > > bitmap scans. Same is the case for the pre-fetching that we do here > > for xid-horizon stuff, where we do the block reads sequentially. Me > > and Andres discussed this offline, and he was of the opinion that this > > formula won't help here, and instead we just keep a constant distance > > that is some number greater than effective_io_concurrency. I agree > > that instead of saying "arbitrary" we should explain why we have done > > that, and before that, come up with an agreed-upon formula. > > Maybe something like: We don't use the regular formula to determine > how much to prefetch here, but instead just add a constant to > effective_io_concurrency. That's because it seems best to do some > prefetching here even when effective_io_concurrency is set to 0, but > if the DBA thinks it's OK to do more prefetching for other operations, > then it's probably OK to do more prefetching in this case, too. It > may be that this formula is too simplistic, but at the moment we have > no evidence of that or any idea about what would work better. Thanks for writing it down for me. I think this is good-to-go as a comment; so I put this as-is into the patch. > > > > + for (i = prefetch_state->next_item; i < nitems && count < prefetch_count; i++) > > > > > > It looks strange to me that next_item is stored in prefetch_state and > > > nitems is passed around as an argument. Is there some reason why it's > > > like that? > > > > We could keep the max count in the structure itself as well. There > > isn't any specific reason for not keeping it there. It's just that > > this function prefetch_state () is not a general function for > > maintaining a prefetch state that spans across function calls; so we > > might as well just pass the max count to that function instead of > > having another field in that structure. I am not inclined specifically > > towards either of the approaches. > > All right, count me as +0.5 for putting a copy in the structure. Have put the nitems into the structure. Thanks for the review. Attached v2. -- Thanks, -Amit Khandekar EnterpriseDB Corporation The Postgres Database Company
Attachment
pgsql-hackers by date: