Thread: Re: Use streaming read API in pgstattuple.

Re: Use streaming read API in pgstattuple.

From: Nazir Bilal Yavuz
Hi,

Thank you for working on this!

On Mon, 25 Nov 2024 at 21:17, Kirill Reshke <reshkekirill@gmail.com> wrote:
> While reviewing other threads implementing stream API for various core
> subsystems, I spotted that pgstattuple could also benefit from that.
> So, PFA.
>
> Note the refactoring around pgstat_hash_page and the changes to the
> pgstat_page signature. This is needed because pgstat_hash_tuple uses
> _hash_getbuf_with_strategy rather than ReadBufferExtended, which is
> OK, but makes adopting the streaming API impossible. So, I changed
> this place. The old code should behave exactly the same way as the new code.
> I eliminated the additional check that _hash_getbuf_with_strategy
> performed, which was blkno == P_NEW. However, since the streaming API
> already delivers the right buffer, I think it's okay.

I agree that this looks okay.

> This change behaves sanely in my manual testing.

I encountered some problems while playing with your patch. Although CI
passes, this query [1] fails when the patch is applied; it wasn't
failing before. I looked at the code and found the following:

=== 1

    blkno = start;
    for (;;)
    {
        /* Get the current relation length */
        LockRelationForExtension(rel, ExclusiveLock);
        nblocks = RelationGetNumberOfBlocks(rel);
        UnlockRelationForExtension(rel, ExclusiveLock);

I think you need to set p.last_exclusive to nblocks here.

=== 2

-        for (; blkno < nblocks; blkno++)
+        while(BufferIsValid(buf = read_stream_next_buffer(stream, NULL)))
         {
             CHECK_FOR_INTERRUPTS();

-            pagefn(&stat, rel, blkno, bstrategy);
+            pagefn(&stat, rel, buf);
         }

blkno is no longer incremented, which causes an infinite loop.

===

Also, since CI couldn't catch this problem, you may want to add a test
for the pgstat_index function.

[1]
CREATE EXTENSION pgstattuple;
create table test (a int primary key, b int[]);
create index test_hashidx on test using hash (b);
select pgstattuple(oid) from pg_class where relname = 'test_hashidx';

--
Regards,
Nazir Bilal Yavuz
Microsoft



Re: Use streaming read API in pgstattuple.

From: Nazir Bilal Yavuz
Hi,

> Hello! Thank you for taking a peek. Your review comments have been
> addressed. Since my changes were wrong, I honestly don't know why this
> worked in version 1. By a miracle.
>
> As for CI, I rechecked v1:
>
> ```
> db2=#
> select * from pgstathashindex('test_hashidx');
>  version | bucket_pages | overflow_pages | bitmap_pages | unused_pages
> | live_items | dead_items | free_percent
> ---------+--------------+----------------+--------------+--------------+------------+------------+--------------
>        4 |            4 |              0 |            1 |            0
> |          0 |          0 |          100
> (1 row)
>
> db2=#
> select * from pgstattuple('test_hashidx');
> ERROR:  could not read blocks 6..16 in file "base/16454/16473": read
> only 0 of 90112 bytes
> ```
>
> In v2 this behaves correctly.

Yes, I confirm that it works.

> Should we add `pgstattuple(...)` after `pgstat*type*index(..)`
> everywhere in pgstattuple regression tests?

Not everywhere, but at least one pgstattuple(...) call for each index type.

There is one behavior change; it may not be important, but I think it
is worth mentioning:

-        for (; blkno < nblocks; blkno++)
+        p.last_exclusive = nblocks;
+
+        while (BufferIsValid(buf = read_stream_next_buffer(stream, NULL)))
         {
             CHECK_FOR_INTERRUPTS();

-            pagefn(&stat, rel, blkno, bstrategy);
+            pagefn(&stat, rel, buf);
         }
+
+        Assert(read_stream_next_buffer(stream, NULL) == InvalidBuffer);

With this change we assume that if the stream returns an invalid
buffer, the stream must be finished. We don't check whether the stream
somehow returned an invalid buffer without actually having finished. In
the upstream version, if we read an invalid buffer, the postgres server
will fail. But in the patched version, the server will continue to run
because it will think the stream has reached its end. This could affect
other streaming read users as well; for example, at least the
read_stream_next_buffer() calls in the collect_corrupt_items()
function face the same issue.


--
Regards,
Nazir Bilal Yavuz
Microsoft



Re: Use streaming read API in pgstattuple.

From: Matheus Alcantara
Hi,


On 29/11/24 04:28, Nazir Bilal Yavuz wrote:
> -        for (; blkno < nblocks; blkno++)
> +        p.last_exclusive = nblocks;
> +
> +        while (BufferIsValid(buf = read_stream_next_buffer(stream, NULL)))
>           {
>               CHECK_FOR_INTERRUPTS();
> 
> -            pagefn(&stat, rel, blkno, bstrategy);
> +            pagefn(&stat, rel, buf);
>           }
> +
> +        Assert(read_stream_next_buffer(stream, NULL) == InvalidBuffer);
> 
> With this change we assume that if the stream returns an invalid
> buffer, the stream must be finished. We don't check whether the stream
> somehow returned an invalid buffer without actually having finished. In
> the upstream version, if we read an invalid buffer, the postgres server
> will fail. But in the patched version, the server will continue to run
> because it will think the stream has reached its end. This could affect
> other streaming read users as well; for example, at least the
> read_stream_next_buffer() calls in the collect_corrupt_items()
> function face the same issue.

Just for reference: in pg_prewarm(), for example, this loop is
implemented as:

for (block = first_block; block <= last_block; ++block)
{
     Buffer        buf;
     ...
     buf = read_stream_next_buffer(stream, NULL);
     ...
}
Assert(read_stream_next_buffer(stream, NULL) == InvalidBuffer);


Would this approach make sense in these cases (this patch and
collect_corrupt_items)?


-- 
Matheus Alcantara



Re: Use streaming read API in pgstattuple.

From: Nazir Bilal Yavuz
Hi,

On Fri, 29 Nov 2024 at 20:05, Matheus Alcantara
<matheusssilv97@gmail.com> wrote:
>
> Hi,
>
>
> On 29/11/24 04:28, Nazir Bilal Yavuz wrote:
> > -        for (; blkno < nblocks; blkno++)
> > +        p.last_exclusive = nblocks;
> > +
> > +        while (BufferIsValid(buf = read_stream_next_buffer(stream, NULL)))
> >           {
> >               CHECK_FOR_INTERRUPTS();
> >
> > -            pagefn(&stat, rel, blkno, bstrategy);
> > +            pagefn(&stat, rel, buf);
> >           }
> > +
> > +        Assert(read_stream_next_buffer(stream, NULL) == InvalidBuffer);
> >
> > With this change we assume that if the stream returns an invalid
> > buffer, the stream must be finished. We don't check whether the stream
> > somehow returned an invalid buffer without actually having finished. In
> > the upstream version, if we read an invalid buffer, the postgres server
> > will fail. But in the patched version, the server will continue to run
> > because it will think the stream has reached its end. This could affect
> > other streaming read users as well; for example, at least the
> > read_stream_next_buffer() calls in the collect_corrupt_items()
> > function face the same issue.
>
> Just for reference; On pg_prewarm() for example this loop is
> implemented as:
>
> for (block = first_block; block <= last_block; ++block)
> {
>      Buffer             buf;
>      ...
>      buf = read_stream_next_buffer(stream, NULL);
>      ...
> }
> Assert(read_stream_next_buffer(stream, NULL) == InvalidBuffer);
>
>
> Would this approach make sense on these cases? (this patch and on
> collect_corrupt_items)

Yes, that should work, and I guess it would be easy to apply this
approach to this patch, but it would be hard to apply it to
collect_corrupt_items().

In cases like pg_prewarm or the current scenario, we know the exact
number of blocks to read, which lets us determine when the stream
should finish (i.e., after the loop has run once per block) and return
an invalid buffer. But in collect_corrupt_items(), the number of loop
iterations is not directly tied to the block count, since some blocks
may be skipped. This makes it difficult to know when the stream should
terminate and return an invalid buffer.


-- 
Regards,
Nazir Bilal Yavuz
Microsoft



Re: Use streaming read API in pgstattuple.

From: Kirill Reshke
Hm, CI fails [0].
This is most likely the reason why `select
pgstattuple('test_hashidx');` was omitted from the pgstattuple tests:
the number of bucket_pages in an empty index is not guaranteed. Maybe
we should populate the base relation and check only non-volatile
fields like live_items and version?


[0]
https://api.cirrus-ci.com/v1/artifact/task/5037800950595584/testrun/build-32/testrun/pgstattuple/regress/regression.diffs

-- 
Best regards,
Kirill Reshke