Re: heapgettup refactoring - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Re: heapgettup refactoring |
Date | |
Msg-id | CAApHDvo41W-30nfKiy+14XnZp+DeotVuMNxwU3pyZfU-xRO0Hw@mail.gmail.com Whole thread Raw |
In response to | Re: heapgettup refactoring (Melanie Plageman <melanieplageman@gmail.com>) |
Responses |
Re: heapgettup refactoring
|
List | pgsql-hackers |
On Wed, 8 Feb 2023 at 09:41, Melanie Plageman <melanieplageman@gmail.com> wrote: > > On Tue, Feb 07, 2023 at 05:40:13PM +1300, David Rowley wrote: > > I ended up adjusting HeapScanDescData more than what is minimally > > required to remove rs_inited. I wondered if rs_cindex should be closer > > to rs_cblock in the struct so that we're more likely to be adjusting > > the same cache line than if that field were closer to the end of the > > struct. We don't need rs_coffset and rs_cindex at the same time, so I > > made it a union. I see that the struct is still 712 bytes before and > > after this change. I've not yet tested to see if there are any > > performance gains to this change. > > > > I like the idea of using a union. Using the tests mentioned in [1], I tested out remove_HeapScanDescData_rs_inited_field.patch. It's not looking very promising at all. seqscan.sql test: master (e2c78e7ab) tps = 27.769076 (without initial connection time) tps = 28.155233 (without initial connection time) tps = 26.990389 (without initial connection time) master + remove_HeapScanDescData_rs_inited_field.patch tps = 23.990490 (without initial connection time) tps = 23.450662 (without initial connection time) tps = 23.600194 (without initial connection time) master + remove_HeapScanDescData_rs_inited_field.patch without union HeapScanDescData change (just remove rs_inited field) tps = 24.419007 (without initial connection time) tps = 24.221389 (without initial connection time) tps = 24.187756 (without initial connection time) countall.sql test: master (e2c78e7ab) tps = 33.999408 (without initial connection time) tps = 33.664292 (without initial connection time) tps = 33.869115 (without initial connection time) master + remove_HeapScanDescData_rs_inited_field.patch tps = 31.194316 (without initial connection time) tps = 30.804987 (without initial connection time) tps = 30.770236 (without initial connection time) master + remove_HeapScanDescData_rs_inited_field.patch without union HeapScanDescData change (just remove rs_inited field) tps = 32.626187 (without initial connection time) tps = 32.876362 (without initial connection time) tps = 32.481729 (without initial connection time) I don't really have any explanation for why this slows performance so much. My thoughts are that if the performance of scans is this sensitive to the order of the fields in the struct then it's an independent project to learn out why and what we can realistically change to get the best performance here. > So, I was wondering what we should do about initializing rs_coffset here > since it doesn't fall under "don't initialize it because it is only used > for page-at-a-time mode". It might not be required for us to initialize > it in initscan, but we do bother to initialize other "current scan > state" fields. We could check if pagemode is enabled and initialize > rs_coffset or rs_cindex depending on that. Maybe master should be initialising this field already. I didn't quite see it as important as it's never used before rs_inited is set to true. Maybe setting it to InvalidOffsetNumber is a good idea just in case something tries to use it before it gets set. > > * Note: when we fall off the end of the scan in either direction, we > > - * reset rs_inited. This means that a further request with the same > > - * scan direction will restart the scan, which is a bit odd, but a > > - * request with the opposite scan direction will start a fresh scan > > + * reset rs_cblock to InvalidBlockNumber. This means that a further request > > + * with the same scan direction will restart the scan, which is a bit odd, but > > + * a request with the opposite scan direction will start a fresh scan > > * in the proper direction. The latter is required behavior for cursors, > > * while the former case is generally undefined behavior in Postgres > > * so we don't care too much. > > Not the fault of this patch, but I am having trouble parsing this > comment. What does restart the scan mean? I get that it is undefined > behavior, but it is confusing because it kind of sounds like a rescan > which is not what it is (right?). And what exactly is a fresh scan? It > is probably correct, I just don't really understand what it is saying. > Feel free to ignore this aside, as I think your change is correctly > updating the comment. I struggled with this too. It just looks incorrect. As far as I see it, once the scan ends we do the same thing regardless of what the scan direction is. Maybe it's worth looking back at when that comment was added and seeing if it was true when it was written and then see what changed. I think it's worth improving that independently. I think I'd like to focus on the cleanup stuff before looking into getting rid of rs_inited. I'm not sure I'm going to get time to do the required performance tests to look into why removing rs_inited slows things down so much. > Also, a few random other complaints about the comments in > HeapScanDescData that are also not the fault of this patch: > > - why is this comment there twice? > /* rs_numblocks is usually InvalidBlockNumber, meaning "scan whole rel" */ Seems to date back to 2019. I'll push something shortly to remove the additional one. > - above the union it said (prior to this patch also) > /* scan current state */ > which is in contrast to > /* state set up at initscan time */ > from the top, but, arguably, rs_strategy is set up at initscan time I'm not sure what the best thing to do about that is. Given the performance numbers I showed above after removing the rs_inited field, I'd rather not move that field in the struct up to the other fields that are set in initscan(). Perhaps you can suggest a patch which improves the comments? > - who or what is NB? > /* NB: if rs_cbuf is not InvalidBuffer, we hold a pin on that buffer */ Latin for "note". It's used quite commonly in our code. David [1] https://postgr.es/m/CAApHDvpnA9SGp3OeXr4cYqX_w=NYN2YMzf2zfrABPNDsUqoNqw@mail.gmail.com
pgsql-hackers by date: