Thread: Minor optimizations in lazy_scan_heap
I was looking at the code in lazy_scan_heap() and I realized there are a couple of low-hanging optimizations that we can do there.

1. The for-loop walks through each block of the relation. But if scan_all is set to false, which would be the case most often, we can jump over to the next not-all-visible block directly (after considering SKIP_PAGES_THRESHOLD etc). I understand that the cost of a no-op iteration may not be considerable, but it looks unnecessary. And it can matter when there are thousands or millions of consecutive all-visible blocks in a large table.

2. We also do a visibilitymap_test() for each block. I think it would be more prudent to have a visibility map API, say visibilitymap_test_range(), which can take a range of blocks and return the first not-all-visible block from the range (a rough sketch follows below). Internally, the function can then test several blocks at a time. We can still do this without holding a lock on the VM buffer, because when scan_all is false we don't care much about the correctness of the visibility check anyway. Also, this function can later be optimized if we start saving some summary information about visibility maps, in which case we can find the first not-all-visible block more efficiently.

3. I also thought that the call to vacuum_delay_point() for every visibility check is not required and a simple CHECK_FOR_INTERRUPTS() would be good enough. Later I realized that we may need it because the visibility map check can do an IO for the VM page. But if we do 2, then we can at least limit vacuum_delay_point() to one call per VM page, instead of one per bit. I concede that the cost of calling vacuum_delay_point() may not be too high, but it again looks unnecessary and can be taken care of by a slight refactoring of the code.

Comments? Anyone think any/all of the above is useful?

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
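To make points 1 and 2 concrete, here is a rough sketch of how the main loop in lazy_scan_heap() could use such an API. This is simplified pseudo-C, not actual code: visibilitymap_test_range() is only the function proposed above and does not exist, and the loop is reduced to the parts relevant here.

/*
 * Sketch only, assuming the proposed (hypothetical) API:
 * visibilitymap_test_range() returns the first block in [blkno, nblocks)
 * whose all-visible bit is clear, testing several map bits per call
 * internally.
 */
for (blkno = 0; blkno < nblocks; blkno++)
{
    if (!scan_all)
    {
        BlockNumber next_unskippable;

        next_unskippable = visibilitymap_test_range(onerel, blkno,
                                                    nblocks, &vmbuffer);

        /*
         * Jump over the all-visible run instead of iterating through it,
         * but only when it is long enough to be worth skipping (point 1).
         */
        if (next_unskippable - blkno >= SKIP_PAGES_THRESHOLD)
        {
            blkno = next_unskippable;
            if (blkno >= nblocks)
                break;      /* rest of the relation is all-visible */
        }

        /* one delay point per range tested, not one per bit (point 3) */
        vacuum_delay_point();
    }

    /* ... scan heap block blkno as before ... */
}

The range call also gives a natural place to batch the delay point, since a single call can cover many map bits.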
On Mon, Dec 3, 2012 at 1:23 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> I was looking at the code in lazy_scan_heap() and I realized there are
> a couple of low-hanging optimizations that we can do there.
> [...]
> Comments ? Anyone thinks any/all of above is useful ?

I doubt that any of these things make enough difference to be worth
bothering with, but if you have benchmark results suggesting otherwise
I'm all ears.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Dec 4, 2012 at 11:02 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Dec 3, 2012 at 1:23 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
>>
>> Comments ? Anyone thinks any/all of above is useful ?
>
> I doubt that any of these things make enough difference to be worth
> bothering with,
You're right. These are not big-ticket optimisations, but I still felt they are worth doing because tiny bits add up over time, and also because the code may become a little simpler. The benchmarks don't show anything interesting though. The time taken to scan 100K+ bits is sub-second, so even when I tried with the attached patch, the numbers did not show any noticeable difference. It might be worth trying with a table of 1M or 10M data blocks, but I don't have the hardware to test that.

The patch itself can be improved further; in particular, we could optimise the loop to test 32 bits at a time, instead of the 8 I am testing currently. Not sure it's worth it though.
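For anyone who wants to play with the idea, here is a minimal standalone toy of the byte-at-a-time test. It is an illustrative sketch, not the attached patch: the block numbering and bit order are simplified assumptions (one VM bit per heap block, as in the current map format). It skips eight blocks per comparison whenever a whole map byte is 0xFF; widening the fast path to uint32 reads would skip 32 blocks per comparison, as suggested above.

#include <stdint.h>
#include <stdio.h>

/*
 * Toy illustration only: find the first clear bit in a packed bitmap,
 * skipping 8 "blocks" at a time when the whole byte is 0xFF.
 */
static long
first_clear_bit(const uint8_t *map, long nbits)
{
    long    blkno = 0;

    while (blkno < nbits)
    {
        uint8_t byte = map[blkno / 8];

        if (byte == 0xFF)
        {
            /* all eight blocks covered by this byte are all-visible */
            blkno = (blkno / 8 + 1) * 8;
            continue;
        }
        if ((byte & (1 << (blkno % 8))) == 0)
            return blkno;       /* first not-all-visible block */
        blkno++;
    }
    return -1;                  /* every block is all-visible */
}

int
main(void)
{
    /* blocks 0..18 all-visible except block 13 */
    uint8_t map[3] = {0xFF, 0xDF, 0x07};

    printf("first not-all-visible block: %ld\n", first_clear_bit(map, 19));
    return 0;
}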
Thanks,
Pavan
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee