Thread: Re: Use streaming read API in pgstattuple.
Hi,

Thank you for working on this!

On Mon, 25 Nov 2024 at 21:17, Kirill Reshke <reshkekirill@gmail.com> wrote:
>
> While reviewing other threads implementing stream API for various core
> subsystems, I spotted that pgstattuple could also benefit from that.
> So, PFA.
>
> Notice refactoring around pgstat_hash_page and changes in pgstat_page
> signature. This is needed because pgstat_hash_tuple uses
> _hash_getbuf_with_strategy rather than ReadBufferExtended, which is
> OK, but makes adapting the streaming API impossible. So, I changed
> this place. Old codes should behave exactly the same way as new ones.
> I eliminated the additional check that _hash_getbuf_with_strategy
> performed, which was blkno == P_NEW. However, since the streaming API
> already delivers the right buffer, I think it's okay.

I agree that this looks okay.

> This change behaves sanely on my by-hand testing.

I encountered some problems while playing with your patch. This query
[1] fails when the patch is applied, although CI passes; it wasn't
failing before. I looked at the code and found two issues:

=== 1

blkno = start;
for (;;)
{
    /* Get the current relation length */
    LockRelationForExtension(rel, ExclusiveLock);
    nblocks = RelationGetNumberOfBlocks(rel);
    UnlockRelationForExtension(rel, ExclusiveLock);

I think you need to set p.last_exclusive to nblocks here.

=== 2

-    for (; blkno < nblocks; blkno++)
+    while (BufferIsValid(buf = read_stream_next_buffer(stream, NULL)))
     {
         CHECK_FOR_INTERRUPTS();

-        pagefn(&stat, rel, blkno, bstrategy);
+        pagefn(&stat, rel, buf);
     }

blkno doesn't get increased now, and this causes an infinite loop.

===

Also, since this problem couldn't be found by the CI, you may want to
add a test for the pgstat_index function.

[1]
CREATE EXTENSION pgstattuple;
create table test (a int primary key, b int[]);
create index test_hashidx on test using hash (b);
select pgstattuple(oid) from pg_class where relname = 'test_hashidx';

--
Regards,
Nazir Bilal Yavuz
Microsoft
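
Putting the two fixes together: the block range has to be seeded before
the stream starts, and the stream itself (not a manual counter) drives
the loop. A minimal sketch of what the repaired inner loop might look
like, assuming the patch uses the generic block_range_read_stream_cb
callback with a BlockRangeReadStreamPrivate struct p, and eliding
pgstat_index's outer loop that re-checks the relation length:

BlockRangeReadStreamPrivate p;
ReadStream *stream;
Buffer      buf;

p.current_blocknum = start;
p.last_exclusive = nblocks;     /* the assignment that was missing */

stream = read_stream_begin_relation(READ_STREAM_FULL,
                                    bstrategy,
                                    rel,
                                    MAIN_FORKNUM,
                                    block_range_read_stream_cb,
                                    &p,
                                    0);

/*
 * The callback advances p.current_blocknum internally, so no blkno++
 * is needed here; the stream returns InvalidBuffer once the range is
 * exhausted, which terminates the loop.
 */
while (BufferIsValid(buf = read_stream_next_buffer(stream, NULL)))
{
    CHECK_FOR_INTERRUPTS();
    pagefn(&stat, rel, buf);    /* assumed to release buf, per the patch */
}

read_stream_end(stream);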
Hi,

> Hello! Thank you for taking a peek. Your review comments have been
> corrected. Since my changes were wrong, I honestly don't know why this
> worked in version 1. By a miracle.
>
> As for CI, I rechecked v1:
>
> ```
> db2=# select * from pgstathashindex('test_hashidx');
>  version | bucket_pages | overflow_pages | bitmap_pages | unused_pages | live_items | dead_items | free_percent
> ---------+--------------+----------------+--------------+--------------+------------+------------+--------------
>        4 |            4 |              0 |            1 |            0 |          0 |          0 |          100
> (1 row)
>
> db2=# select * from pgstattuple('test_hashidx');
> ERROR:  could not read blocks 6..16 in file "base/16454/16473": read
> only 0 of 90112 bytes
> ```
>
> In v2 this behaves correctly.

Yes, I confirm that it works.

> Should we add `pgstattuple(...)` after `pgstat*type*index(..)`
> everywhere in pgstattuple regression tests?

Not everywhere, but at least one pgstattuple(...) call for each index
type.

There is one behavior change; it may not be important, but I think it
is worth mentioning:

-    for (; blkno < nblocks; blkno++)
+    p.last_exclusive = nblocks;
+
+    while (BufferIsValid(buf = read_stream_next_buffer(stream, NULL)))
     {
         CHECK_FOR_INTERRUPTS();

-        pagefn(&stat, rel, blkno, bstrategy);
+        pagefn(&stat, rel, buf);
     }
+
+    Assert(read_stream_next_buffer(stream, NULL) == InvalidBuffer);

With this change we assume that if the stream returns an invalid
buffer, the stream must be finished. We don't check for the case where
the stream hasn't finished but somehow read an invalid buffer. In the
upstream version, reading an invalid buffer makes the Postgres server
fail; in the patched version, the server will continue to run because
it will think the stream has reached its end. This could be happening
for other streaming read users too; for example, at least the
read_stream_next_buffer() calls in the collect_corrupt_items() function
face the same issue.

--
Regards,
Nazir Bilal Yavuz
Microsoft
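
Since the number of blocks is known up front here, one way to keep the
stronger upstream behavior would be to count the buffers the stream
actually hands back and cross-check that against the requested range.
A sketch only; the names follow the diff above, and nothing below is in
the patch itself:

BlockNumber nread = 0;

while (BufferIsValid(buf = read_stream_next_buffer(stream, NULL)))
{
    CHECK_FOR_INTERRUPTS();
    pagefn(&stat, rel, buf);
    nread++;
}

/*
 * Exactly nblocks - start blocks were requested from the stream, so a
 * smaller count means it returned InvalidBuffer before the range was
 * exhausted, rather than at the end of it.
 */
Assert(nread == nblocks - start);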
Hi,

On 29/11/24 04:28, Nazir Bilal Yavuz wrote:
> -    for (; blkno < nblocks; blkno++)
> +    p.last_exclusive = nblocks;
> +
> +    while (BufferIsValid(buf = read_stream_next_buffer(stream, NULL)))
>      {
>          CHECK_FOR_INTERRUPTS();
>
> -        pagefn(&stat, rel, blkno, bstrategy);
> +        pagefn(&stat, rel, buf);
>      }
> +
> +    Assert(read_stream_next_buffer(stream, NULL) == InvalidBuffer);
>
> With this change we assume that if the stream returns an invalid
> buffer, the stream must be finished. We don't check for the case where
> the stream hasn't finished but somehow read an invalid buffer. In the
> upstream version, reading an invalid buffer makes the Postgres server
> fail; in the patched version, the server will continue to run because
> it will think the stream has reached its end. This could be happening
> for other streaming read users too; for example, at least the
> read_stream_next_buffer() calls in the collect_corrupt_items()
> function face the same issue.

Just for reference: in pg_prewarm(), for example, this loop is
implemented as:

for (block = first_block; block <= last_block; ++block)
{
    Buffer      buf;
    ...
    buf = read_stream_next_buffer(stream, NULL);
    ...
}
Assert(read_stream_next_buffer(stream, NULL) == InvalidBuffer);

Would this approach make sense in these cases (this patch and
collect_corrupt_items)?

--
Matheus Alcantara
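
Applied to this patch, where nblocks is known before the stream is
created, that counted-loop pattern might look roughly like the sketch
below (names taken from the diffs quoted above; an illustration, not
the patch itself):

for (blkno = start; blkno < nblocks; blkno++)
{
    Buffer      buf;

    CHECK_FOR_INTERRUPTS();

    /*
     * One iteration per block requested from the stream; if the stream
     * ran out early, pagefn would receive InvalidBuffer and fail, much
     * like the upstream behavior, instead of the scan silently ending.
     */
    buf = read_stream_next_buffer(stream, NULL);
    pagefn(&stat, rel, buf);
}
Assert(read_stream_next_buffer(stream, NULL) == InvalidBuffer);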
Hi,

On Fri, 29 Nov 2024 at 20:05, Matheus Alcantara
<matheusssilv97@gmail.com> wrote:
>
> Hi,
>
> On 29/11/24 04:28, Nazir Bilal Yavuz wrote:
> > -    for (; blkno < nblocks; blkno++)
> > +    p.last_exclusive = nblocks;
> > +
> > +    while (BufferIsValid(buf = read_stream_next_buffer(stream, NULL)))
> >      {
> >          CHECK_FOR_INTERRUPTS();
> >
> > -        pagefn(&stat, rel, blkno, bstrategy);
> > +        pagefn(&stat, rel, buf);
> >      }
> > +
> > +    Assert(read_stream_next_buffer(stream, NULL) == InvalidBuffer);
> >
> > With this change we assume that if the stream returns an invalid
> > buffer, the stream must be finished. We don't check for the case
> > where the stream hasn't finished but somehow read an invalid buffer.
> > In the upstream version, reading an invalid buffer makes the
> > Postgres server fail; in the patched version, the server will
> > continue to run because it will think the stream has reached its
> > end. This could be happening for other streaming read users too; for
> > example, at least the read_stream_next_buffer() calls in the
> > collect_corrupt_items() function face the same issue.
>
> Just for reference: in pg_prewarm(), for example, this loop is
> implemented as:
>
> for (block = first_block; block <= last_block; ++block)
> {
>     Buffer      buf;
>     ...
>     buf = read_stream_next_buffer(stream, NULL);
>     ...
> }
> Assert(read_stream_next_buffer(stream, NULL) == InvalidBuffer);
>
> Would this approach make sense in these cases (this patch and
> collect_corrupt_items)?

Yes, that should work, and I guess it would be easy to apply this
approach to this patch, but it would be hard to apply it to
collect_corrupt_items(). In cases like pg_prewarm or the current
scenario, we know the exact number of blocks to read, which allows us
to determine when the stream should finish (e.g., after the loop runs
blockno times) and return an invalid buffer. But in
collect_corrupt_items(), the number of loop iterations required is not
directly tied to blockno, since some blocks may be skipped. This makes
it difficult to know when the stream should terminate and return an
invalid buffer.

--
Regards,
Nazir Bilal Yavuz
Microsoft
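
The count is unpredictable there because collect_corrupt_items()'s
block-number callback can skip blocks, so the stream may legitimately
yield far fewer buffers than the block range spans. A sketch of that
callback shape; the struct name and skip predicate are hypothetical,
for illustration only:

typedef struct SkipRangePrivate
{
    BlockNumber current_blocknum;   /* next block to consider */
    BlockNumber last_exclusive;     /* end of the scanned range */
} SkipRangePrivate;

static BlockNumber
skip_range_read_stream_cb(ReadStream *stream,
                          void *callback_private_data,
                          void *per_buffer_data)
{
    SkipRangePrivate *p = callback_private_data;

    while (p->current_blocknum < p->last_exclusive)
    {
        BlockNumber blkno = p->current_blocknum++;

        /* e.g. skip blocks that are all-visible or all-frozen */
        if (block_can_be_skipped(blkno))    /* hypothetical predicate */
            continue;

        return blkno;
    }

    /* Range exhausted; the stream will now return InvalidBuffer. */
    return InvalidBlockNumber;
}

A counted for-loop over [start, nblocks) therefore cannot know how many
read_stream_next_buffer() calls will succeed, which is why the
pg_prewarm pattern does not transfer directly.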
Hm, CI fails [0].

This is most likely the reason why `select pgstattuple('test_hashidx');`
was omitted from the pgstattuple tests: there is no guarantee about the
number of bucket_pages in an empty index. Maybe we should populate the
base relation and check only non-volatile fields like live_items and
version?

[0] https://api.cirrus-ci.com/v1/artifact/task/5037800950595584/testrun/build-32/testrun/pgstattuple/regress/regression.diffs

--
Best regards,
Kirill Reshke