Re: Confine vacuum skip logic to lazy_scan_skip - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Confine vacuum skip logic to lazy_scan_skip
Date
Msg-id 798102.1761149528@sss.pgh.pa.us
Whole thread Raw
In response to Re: Confine vacuum skip logic to lazy_scan_skip  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
[ seizing on this old commit as being most closely related to the issue ]

Thomas Munro <thomas.munro@gmail.com> writes:
> On Tue, Jul 16, 2024 at 1:52 PM Noah Misch <noah@leadboat.com> wrote:
>> On Mon, Jul 15, 2024 at 03:26:32PM +1200, Thomas Munro wrote:
>> That's reasonable.  radixtree already forbids mutations concurrent with
>> iteration, so there's no new concurrency hazard.  One alternative is
>> per_buffer_data big enough for MaxOffsetNumber, but that might thrash caches
>> measurably.  That patch is good to go apart from these trivialities:

> Thanks!  I have pushed that patch, without those changes you didn't like.

The security team recently updated our Coverity instance to the latest
version, and it's started complaining as follows:


*** CID 1667418:         Memory - corruptions  (OVERRUN)
/srv/coverity/git/pgsql-git/postgresql/src/backend/access/heap/vacuumlazy.c: 2812             in lazy_vacuum_heap_rel()
2806              * already have the correct page pinned anyway.
2807              */
2808             visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
2809
2810             /* We need a non-cleanup exclusive lock to mark dead_items unused */
2811             LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
>>>     CID 1667418:         Memory - corruptions  (OVERRUN)
>>>     Overrunning callee's array of size 291 by passing argument "num_offsets" (which evaluates to 2048) in call to
"lazy_vacuum_heap_page".
2812             lazy_vacuum_heap_page(vacrel, blkno, buf, offsets,
2813                                   num_offsets, vmbuffer);
2814
2815             /* Now that we've vacuumed the page, record its available space */
2816             page = BufferGetPage(buf);
2817             freespace = PageGetHeapFreeSpace(page);


The reason it thinks that num_offsets could be as much as 2048 is
presumably the code a little bit above this:

        OffsetNumber offsets[MaxOffsetNumber];
        ...
        num_offsets = TidStoreGetBlockOffsets(iter_result, offsets, lengthof(offsets));
        Assert(num_offsets <= lengthof(offsets));

However, lazy_vacuum_heap_page blindly assumes that the passed value
will be no more than MaxHeapTuplesPerPage.  It seems like we ought
to get these two functions in sync, either both using MaxOffsetNumber
or both using MaxHeapTuplesPerPage for their array sizes.

It looks to me like MaxHeapTuplesPerPage should be sufficient.
Also, after reading TidStoreGetBlockOffsets I wonder if we
should replace that Assert with

        num_offsets = Min(num_offsets, lengthof(offsets));

Thoughts?

            regards, tom lane



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Question for coverage report
Next
From: Tom Lane
Date:
Subject: Re: Dynamic shared memory areas