Thread: Is a pg_stat_force_next_flush() call sufficient for regression tests?
Hi, I noticed eelpout failed with this in stats.sql, in the pg_stat_io tests added a couple months ago [1]: @@ -1415,7 +1415,7 @@ :io_sum_vac_strategy_after_reuses > :io_sum_vac_strategy_before_reuses; ?column? | ?column? ----------+---------- - t | t + t | f (1 row) The failure seems completely unrelated to the new commit, so this seems like some randomness / timing issue. The failing bit does this: ---------------------------- VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128) test_io_vac_strategy; SELECT pg_stat_force_next_flush(); SELECT sum(reuses) AS reuses, sum(reads) AS reads FROM pg_stat_io WHERE context = 'vacuum' \gset io_sum_vac_strategy_after_ SELECT :io_sum_vac_strategy_after_reads > :io_sum_vac_strategy_before_reads, :io_sum_vac_strategy_after_reuses > :io_sum_vac_strategy_before_reuses; ---------------------------- So I'm wondering if pg_stat_force_next_flush() is enough - AFAICS this only sets some flag for the *next* pgstat_report_stat() call, but how do we know that happens before the query execution? Shouldn't there be something like pg_stat_flush() that actually does the flushing, instead of just setting the flag? regards [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=eelpout&dt=2023-07-03%2011%3A09%3A13 -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Is a pg_stat_force_next_flush() call sufficient for regression tests?
From
Kyotaro Horiguchi
Date:
At Mon, 3 Jul 2023 15:45:52 +0200, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote in > So I'm wondering if pg_stat_force_next_flush() is enough - AFAICS this > only sets some flag for the *next* pgstat_report_stat() call, but how do > we know that happens before the query execution? > > Shouldn't there be something like pg_stat_flush() that actually does the > flushing, instead of just setting the flag? The reason for the function is that pg_stat_flush() is supposed not to be called within a transaction. AFAICS pg_stat_force_next_flush() takes effect after a successfull transaction end and before the next command execution. To verify this, I put in an assertion to check that the flag gets consumed before reading of pg_stat_io (a.diff), then ran pgbench with the attached custom script. As expected, it didn't fire at all during several trials. When I wrapped all lines in t.sql within a begin-commit block, the assertion fired off immediately as a matter of course. Is there any chance concurrent backends or some other things can actually hinder the backend from reusing buffers? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Attachment
On 7/4/23 04:29, Kyotaro Horiguchi wrote: > At Mon, 3 Jul 2023 15:45:52 +0200, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote in >> So I'm wondering if pg_stat_force_next_flush() is enough - AFAICS this >> only sets some flag for the *next* pgstat_report_stat() call, but how do >> we know that happens before the query execution? >> >> Shouldn't there be something like pg_stat_flush() that actually does the >> flushing, instead of just setting the flag? > > The reason for the function is that pg_stat_flush() is supposed not to > be called within a transaction. AFAICS pg_stat_force_next_flush() > takes effect after a successfull transaction end and before the next > command execution. > Sure, if we're supposed to report the stats only at the end of a transaction, that makes sense. But then why didn't that happen here? > To verify this, I put in an assertion to check that the flag gets > consumed before reading of pg_stat_io (a.diff), then ran pgbench with > the attached custom script. As expected, it didn't fire at all during > several trials. When I wrapped all lines in t.sql within a > begin-commit block, the assertion fired off immediately as a matter of > course. > If I understand correctly, this just verifies that 1) if everything goes well, we report the stats at the end of the transaction (otherwise the case without BEGIN/COMMIT would fail) 2) we don't report stats when in a transaction (with the BEGIN/COMMIT) But the eelpout failure clearly suggests this may misbehave. > Is there any chance concurrent backends or some other things can > actually hinder the backend from reusing buffers? > No idea. I'm not very familiar with the reworked pgstat system, but either the pgstat_report_stat() was not called for some reason, or it decided there's nothing to report (i.e. have_iostats==false). Not sure why would that happen. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company