Thread: Odd off-by-one dirty buffers and checkpoint buffers written
view bc is just a joining wrapper around pg_buffercache.
regression=# select datname, relname, count(*), sum(count(*)) over () AS total from bc where isdirty group by datname, relname;
datname | relname | count | total
---------+---------+-------+-------
(0 rows)
datname | relname | count | total
---------+---------+-------+-------
(0 rows)
regression=# update tenk1 set stringu1 = stringu1 || '' where (unique1 % 384) = 3;
UPDATE 27
regression=# select datname, relname, count(*), sum(count(*)) over () AS total from bc where isdirty group by datname, relname;
datname | relname | count | total
------------+---------+-------+-------
regression | tenk1 | 3 | 3
(1 row)
regression=# checkpoint;
CHECKPOINT
UPDATE 27
regression=# select datname, relname, count(*), sum(count(*)) over () AS total from bc where isdirty group by datname, relname;
datname | relname | count | total
------------+---------+-------+-------
regression | tenk1 | 3 | 3
(1 row)
regression=# checkpoint;
CHECKPOINT
2022-04-19 23:17:08.256 UTC [161084] LOG: checkpoint starting: immediate force wait
2022-04-19 23:17:08.264 UTC [161084] LOG: checkpoint complete: wrote 4 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.002 s, sync=0.002 s, total=0.009 s; sync files=2, longest=0.002 s, average=0.001 s; distance=12 kB, estimate=72358 kB
2022-04-19 23:17:08.264 UTC [161084] LOG: checkpoint complete: wrote 4 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.002 s, sync=0.002 s, total=0.009 s; sync files=2, longest=0.002 s, average=0.001 s; distance=12 kB, estimate=72358 kB
I've done this four times in a row and while the number of dirty buffers shown each time vary (see below) I see that "wrote N buffers" is always exactly one more than the total count of dirty buffers. I'm just curious if anyone has a quick answer for this unusual correspondence.
David J.
regression=# update tenk1 set stringu1 = stringu1 || '' where (unique1 % 384) = 3;
UPDATE 27
regression=# select datname, relname, count(*), sum(count(*)) over () AS total from bc where isdirty group by datname, relname;
datname | relname | count | total
------------+----------------------+-------+-------
regression | tenk1 | 33 | 102
regression | tenk1_hundred | 9 | 102
regression | tenk1_thous_tenthous | 18 | 102
regression | tenk1_unique1 | 27 | 102
regression | tenk1_unique2 | 15 | 102
(5 rows)
UPDATE 27
regression=# select datname, relname, count(*), sum(count(*)) over () AS total from bc where isdirty group by datname, relname;
datname | relname | count | total
------------+----------------------+-------+-------
regression | tenk1 | 33 | 102
regression | tenk1_hundred | 9 | 102
regression | tenk1_thous_tenthous | 18 | 102
regression | tenk1_unique1 | 27 | 102
regression | tenk1_unique2 | 15 | 102
(5 rows)
2022-04-19 23:13:03.480 UTC [161084] LOG: checkpoint starting: immediate force wait
2022-04-19 23:13:03.523 UTC [161084] LOG: checkpoint complete: wrote 103 buffers (0.6%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.004 s, sync=0.014 s, total=0.044 s; sync files=8, longest=0.008 s, average=0.002 s; distance=721 kB, estimate=110165 kB
2022-04-19 23:13:03.523 UTC [161084] LOG: checkpoint complete: wrote 103 buffers (0.6%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.004 s, sync=0.014 s, total=0.044 s; sync files=8, longest=0.008 s, average=0.002 s; distance=721 kB, estimate=110165 kB
On Tue, Apr 19, 2022 at 04:21:21PM -0700, David G. Johnston wrote: > I've done this four times in a row and while the number of dirty buffers > shown each time vary (see below) I see that "wrote N buffers" is always > exactly one more than the total count of dirty buffers. I'm just curious > if anyone has a quick answer for this unusual correspondence. I see that SlruInternalWritePage() increments ckpt_bufs_written, so my first guess would be that it's due to something like CheckPointCLOG(). -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Tue, Apr 19, 2022 at 4:36 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Tue, Apr 19, 2022 at 04:21:21PM -0700, David G. Johnston wrote:
> I've done this four times in a row and while the number of dirty buffers
> shown each time vary (see below) I see that "wrote N buffers" is always
> exactly one more than the total count of dirty buffers. I'm just curious
> if anyone has a quick answer for this unusual correspondence.
I see that SlruInternalWritePage() increments ckpt_bufs_written, so my
first guess would be that it's due to something like CheckPointCLOG().
I peeked at pg_stat_bgwriter and see an increase in buffers_checkpoint matching the dirty buffers number.
I also looked at pg_stat_slru to try and find the corresponding change caused by:
slru.c:766 (SlruPhysicalWritePage)
pgstat_count_slru_page_written(shared->slru_stats_idx);
I do see (Xact) blks_hit change during this process (after the update/commit, not the checkpoint, though) but it increases by 2 when dirty buffers is 4. I was expecting 4, thinking that blocks and buffers and pages are basically the same things (which [1] seems to affirm).
David J.
At Tue, 19 Apr 2022 17:51:24 -0700, "David G. Johnston" <david.g.johnston@gmail.com> wrote in > On Tue, Apr 19, 2022 at 4:36 PM Nathan Bossart <nathandbossart@gmail.com> > wrote: > > > On Tue, Apr 19, 2022 at 04:21:21PM -0700, David G. Johnston wrote: > > > I've done this four times in a row and while the number of dirty buffers > > > shown each time vary (see below) I see that "wrote N buffers" is always > > > exactly one more than the total count of dirty buffers. I'm just curious > > > if anyone has a quick answer for this unusual correspondence. > > > > I see that SlruInternalWritePage() increments ckpt_bufs_written, so my > > first guess would be that it's due to something like CheckPointCLOG(). > > > > > I peeked at pg_stat_bgwriter and see an increase in buffers_checkpoint > matching the dirty buffers number. > > I also looked at pg_stat_slru to try and find the corresponding change > caused by: > > slru.c:766 (SlruPhysicalWritePage) > pgstat_count_slru_page_written(shared->slru_stats_idx); > > I do see (Xact) blks_hit change during this process (after the > update/commit, not the checkpoint, though) but it increases by 2 when dirty > buffers is 4. I was expecting 4, thinking that blocks and buffers and > pages are basically the same things (which [1] seems to affirm). > > https://www.postgresql.org/message-id/13563.1044552279%40sss.pgh.pa.us If I understand you point correctly.. Xact SLRU is so-called CLOG, on which transaction statuses (running/committed/aborted) are recorded. Its pages are separate objects from table pages, which are out-of-sight of pg_bufferchace. However, the same relationship between pages, blocks and buffers applies to the both cases in parallel. The reason for the 2 hits of Xact SLRU is that once for visibility (MVCC) check and another for commit. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Wed, Apr 20, 2022 at 1:03 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
The reason for the 2 hits of Xact SLRU is that once for visibility
(MVCC) check and another for commit.
Makes sense. Thanks. Now, is the lack of such a detail when looking at pg_stat_slru (for this and the other 6 named caches) an omission by intent or just no one has taken the time to write up what the different caches are holding? I would think a brief sentence for each followed by a link to the main section describing the feature would be decent content to add to the introduction for the view in 28.2.21.
Also, is "other" ever expected to be something besides all zeros?
Thanks!
David J.