Re: Use streaming read API in pgstattuple. - Mailing list pgsql-hackers

From Nazir Bilal Yavuz
Subject Re: Use streaming read API in pgstattuple.
Date
Msg-id CAN55FZ0+xO1Y_Ru2Xe3D0taeJmTkUB2qbg-iwJXC6iX_mpx1Fw@mail.gmail.com
Whole thread Raw
List pgsql-hackers
Hi,

Thank you for working on this!

On Mon, 25 Nov 2024 at 21:17, Kirill Reshke <reshkekirill@gmail.com> wrote:
> While reviewing other threads implementing stream API for various core
> subsystems, I spotted that pgstattuple could also benefit from that.
> So, PFA.
>
> Notice refactoring around pgstat_hash_page and changes in pgstat_page
> signature. This is needed because pgstat_hash_tuple uses
> _hash_getbuf_with_strategy rather than ReadBufferExtended, which is
> OK, but makes adapting the streaming API impossible. So, I changed
> this place. Old codes should behave exactly the same way as new ones.
> I eliminated the additional check that _hash_getbuf_with_strategy
> performed, which was blkno == P_NEW. However, since the streaming API
> already delivers the right buffer, I think it's okay.

I agree that this looks okay.

> This change behaves sanely on my by-hand testing.

I encountered some problems while playing with your patch. This query
[1] fails when the patch is applied although CI passes, it wasn't
failing before. I looked at the code and found that:

=== 1

    blkno = start;
    for (;;)
    {
        /* Get the current relation length */
        LockRelationForExtension(rel, ExclusiveLock);
        nblocks = RelationGetNumberOfBlocks(rel);
        UnlockRelationForExtension(rel, ExclusiveLock);

I think you need to set p.last_exclusive to nblocks here.

=== 2

-        for (; blkno < nblocks; blkno++)
+        while(BufferIsValid(buf = read_stream_next_buffer(stream, NULL)))
         {
             CHECK_FOR_INTERRUPTS();

-            pagefn(&stat, rel, blkno, bstrategy);
+            pagefn(&stat, rel, buf);
         }

blkno doesn't get increased now and this causes an infinite loop.

===

Also, since this problem couldn't be found by the CI, you may want to
add a test for the pgstat_index function.

[1] CREATE EXTENSION pgstattuple; create table test (a int primary
key, b int[]); create index test_hashidx on test using hash (b);
select pgstattuple(oid) from pg_class where relname = 'test_hashidx';

--
Regards,
Nazir Bilal Yavuz
Microsoft



pgsql-hackers by date:

Previous
From: Matthias van de Meent
Date:
Subject: Re: Potential ABI breakage in upcoming minor releases
Next
From: Alvaro Herrera
Date:
Subject: Re: [PATCH] Missing Assert in the code