Thread: BufferUsage counters' values have changed
Hi hackers,
I noticed that BufferUsage counters' values are strangely different for the
same queries on REL_15_STABLE and REL_16_STABLE. For example, I run
CREATE EXTENSION pg_stat_statements;
CREATE TEMP TABLE test(b int);
INSERT INTO test(b) SELECT generate_series(1,1000);
SELECT query, local_blks_hit, local_blks_read, local_blks_written,
local_blks_dirtied, temp_blks_written FROM pg_stat_statements;
and output on REL_15_STABLE contains
query | INSERT INTO test(b) SELECT generate_series($1,$2)
local_blks_hit | 1005
local_blks_read | 2
local_blks_written | 5
local_blks_dirtied | 5
temp_blks_written | 0
while output on REL_16_STABLE contains
query | INSERT INTO test(b) SELECT generate_series($1,$2)
local_blks_hit | 1006
local_blks_read | 0
local_blks_written | 0
local_blks_dirtied | 5
temp_blks_written | 8
I found a bug that causes one of the differences. Wrong counter is incremented
in ExtendBufferedRelLocal(). The attached patch fixes it and should be applied
to REL_16_STABLE and master. With the patch applied output contains
query | INSERT INTO test(b) SELECT generate_series($1,$2)
local_blks_hit | 1006
local_blks_read | 0
local_blks_written | 8
local_blks_dirtied | 5
temp_blks_written | 0
I still wonder why local_blks_written is greater than it was on REL_15_STABLE,
and why local_blks_read became zero. These changes are caused by fcdda1e4b5.
This code is new to me, and I'm still trying to understand whether it's a bug
in computing the counters or just changes in how many blocks are read/written
during the query execution. If anyone can help me, I would be grateful.
Best regards,
Karina Litskevich
Postgres Professional: http://postgrespro.com/
I noticed that BufferUsage counters' values are strangely different for the
same queries on REL_15_STABLE and REL_16_STABLE. For example, I run
CREATE EXTENSION pg_stat_statements;
CREATE TEMP TABLE test(b int);
INSERT INTO test(b) SELECT generate_series(1,1000);
SELECT query, local_blks_hit, local_blks_read, local_blks_written,
local_blks_dirtied, temp_blks_written FROM pg_stat_statements;
and output on REL_15_STABLE contains
query | INSERT INTO test(b) SELECT generate_series($1,$2)
local_blks_hit | 1005
local_blks_read | 2
local_blks_written | 5
local_blks_dirtied | 5
temp_blks_written | 0
while output on REL_16_STABLE contains
query | INSERT INTO test(b) SELECT generate_series($1,$2)
local_blks_hit | 1006
local_blks_read | 0
local_blks_written | 0
local_blks_dirtied | 5
temp_blks_written | 8
I found a bug that causes one of the differences. Wrong counter is incremented
in ExtendBufferedRelLocal(). The attached patch fixes it and should be applied
to REL_16_STABLE and master. With the patch applied output contains
query | INSERT INTO test(b) SELECT generate_series($1,$2)
local_blks_hit | 1006
local_blks_read | 0
local_blks_written | 8
local_blks_dirtied | 5
temp_blks_written | 0
I still wonder why local_blks_written is greater than it was on REL_15_STABLE,
and why local_blks_read became zero. These changes are caused by fcdda1e4b5.
This code is new to me, and I'm still trying to understand whether it's a bug
in computing the counters or just changes in how many blocks are read/written
during the query execution. If anyone can help me, I would be grateful.
Best regards,
Karina Litskevich
Postgres Professional: http://postgrespro.com/
On Mon, Sep 11, 2023 at 9:08 AM Karina Litskevich <litskevichkarina@gmail.com> wrote:
I found a bug that causes one of the differences. Wrong counter is incremented
in ExtendBufferedRelLocal(). The attached patch fixes it and should be applied
to REL_16_STABLE and master.
I've forgotten to attach the patch. Here it is.
Attachment
Hi, On Mon, 11 Sept 2023 at 14:28, Karina Litskevich <litskevichkarina@gmail.com> wrote: > > Hi hackers, > > I noticed that BufferUsage counters' values are strangely different for the > same queries on REL_15_STABLE and REL_16_STABLE. For example, I run > > CREATE EXTENSION pg_stat_statements; > CREATE TEMP TABLE test(b int); > INSERT INTO test(b) SELECT generate_series(1,1000); > SELECT query, local_blks_hit, local_blks_read, local_blks_written, > local_blks_dirtied, temp_blks_written FROM pg_stat_statements; > > and output on REL_15_STABLE contains > > query | INSERT INTO test(b) SELECT generate_series($1,$2) > local_blks_hit | 1005 > local_blks_read | 2 > local_blks_written | 5 > local_blks_dirtied | 5 > temp_blks_written | 0 > > while output on REL_16_STABLE contains > > query | INSERT INTO test(b) SELECT generate_series($1,$2) > local_blks_hit | 1006 > local_blks_read | 0 > local_blks_written | 0 > local_blks_dirtied | 5 > temp_blks_written | 8 > > > I found a bug that causes one of the differences. Wrong counter is incremented > in ExtendBufferedRelLocal(). The attached patch fixes it and should be applied > to REL_16_STABLE and master. With the patch applied output contains Nice finding! I agree, it should be changed. > query | INSERT INTO test(b) SELECT generate_series($1,$2) > local_blks_hit | 1006 > local_blks_read | 0 > local_blks_written | 8 > local_blks_dirtied | 5 > temp_blks_written | 0 > > > I still wonder why local_blks_written is greater than it was on REL_15_STABLE, > and why local_blks_read became zero. These changes are caused by fcdda1e4b5. > This code is new to me, and I'm still trying to understand whether it's a bug > in computing the counters or just changes in how many blocks are read/written > during the query execution. If anyone can help me, I would be grateful. I spent some time on it: local_blks_read became zero because: 1_ One more cache hit. It was supposed to be local_blks_read but it is local_blks_hit now. This is an assumption, I didn't check this deeply. 2_ Before fcdda1e4b5, there was one local_blks_read coming from buf = ReadBufferExtended(rel, VISIBILITYMAP_FORKNUM, blkno, RBM_ZERO_ON_ERROR, NULL) in freespace.c -> ReadBuffer_common() -> pgBufferUsage.local_blks_read++. But buf = ReadBufferExtended(rel, VISIBILITYMAP_FORKNUM, blkno, RBM_ZERO_ON_ERROR, NULL) is moved into the else case, so it didn't called and local_blks_read isn't incremented. local_blks_written is greater because of the combination of fcdda1e4b5 and 00d1e02be2. In PG_15: RelationGetBufferForTuple() -> ReadBufferBI(P_NEW, RBM_ZERO_AND_LOCK) -> ReadBufferExtended() -> ReadBuffer_common() -> pgBufferUsage.local_blks_written++; (called 5 times) [0] In PG_16: 1_ 5 of the local_blks_written is coming from: RelationGetBufferForTuple() -> RelationAddBlocks() -> ExtendBufferedRelBy() -> ExtendBufferedRelCommon() -> ExtendBufferedRelLocal() -> pgBufferUsage.local_blks_written += extend_by; (extend_by is 1, this is called 5 times) [1] 2_ 3 of the local_blks_written is coming from: RelationGetBufferForTuple() -> RecordAndGetPageWithFreeSpace() -> fsm_set_and_search() -> fsm_readbuf() -> fsm_extend() -> ExtendBufferedRelTo() -> ExtendBufferedRelCommon() -> ExtendBufferedRelLocal() -> pgBufferUsage.local_blks_written += extend_by; (extend_by is 3, this is called 1 time) [2] I think [0] is the same path as [1] but [2] is new. 'fsm extends' wasn't counted in local_blks_written in PG_15. Calling ExtendBufferedRelTo() from fsm_extend() caused 'fsm extends' to be counted in local_blks_written. I am not sure which one is correct. I hope these help. Regards, Nazir Bilal Yavuz Microsoft
Hi, On 2023-09-11 09:23:59 +0300, Karina Litskevich wrote: > On Mon, Sep 11, 2023 at 9:08 AM Karina Litskevich < > litskevichkarina@gmail.com> wrote: > > > I found a bug that causes one of the differences. Wrong counter is > > incremented > > in ExtendBufferedRelLocal(). The attached patch fixes it and should be > > applied > > to REL_16_STABLE and master. > > > > I've forgotten to attach the patch. Here it is. > From 999a3d533a9b74c8568cc8a3d715c287de45dd2c Mon Sep 17 00:00:00 2001 > From: Karina Litskevich <litskevichkarina@gmail.com> > Date: Thu, 7 Sep 2023 17:44:40 +0300 > Subject: [PATCH v1] Fix local_blks_written counter incrementation > > --- > src/backend/storage/buffer/localbuf.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c > index 1735ec7141..567b8d15ef 100644 > --- a/src/backend/storage/buffer/localbuf.c > +++ b/src/backend/storage/buffer/localbuf.c > @@ -431,7 +431,7 @@ ExtendBufferedRelLocal(BufferManagerRelation bmr, > > *extended_by = extend_by; > > - pgBufferUsage.temp_blks_written += extend_by; > + pgBufferUsage.local_blks_written += extend_by; > > return first_block; > } > -- > 2.34.1 > Ugh, you're right. The naming of local vs temp here is pretty unfortunate imo. I wonder if we ought to at least dd a comment to BufferUsage clarifying the situation? Just reading the comments therein one would be hard pressed to figure out which of the variables temp table activity should be added to. I don't think we currently can write a test for this in the core tests, as the relevant data isn't visible anywhere, iirc. Thus I added a test to pg_stat_statements. Afaict it should be stable? Running the attached patch through CI, planning to push after that succeeds, unless somebody has a comment? Greetings, Andres Freund
Attachment
Hi, On 2023-09-13 16:04:00 +0300, Nazir Bilal Yavuz wrote: > On Mon, 11 Sept 2023 at 14:28, Karina Litskevich > <litskevichkarina@gmail.com> wrote: > > > > Hi hackers, > > > > I noticed that BufferUsage counters' values are strangely different for the > > same queries on REL_15_STABLE and REL_16_STABLE. For example, I run > > > > CREATE EXTENSION pg_stat_statements; > > CREATE TEMP TABLE test(b int); > > INSERT INTO test(b) SELECT generate_series(1,1000); > > SELECT query, local_blks_hit, local_blks_read, local_blks_written, > > local_blks_dirtied, temp_blks_written FROM pg_stat_statements; > > > > and output on REL_15_STABLE contains > > > > query | INSERT INTO test(b) SELECT generate_series($1,$2) > > local_blks_hit | 1005 > > local_blks_read | 2 > > local_blks_written | 5 > > local_blks_dirtied | 5 > > temp_blks_written | 0 > > > > while output on REL_16_STABLE contains > > > > query | INSERT INTO test(b) SELECT generate_series($1,$2) > > local_blks_hit | 1006 > > local_blks_read | 0 > > local_blks_written | 0 > > local_blks_dirtied | 5 > > temp_blks_written | 8 > > > > > > I found a bug that causes one of the differences. Wrong counter is incremented > > in ExtendBufferedRelLocal(). The attached patch fixes it and should be applied > > to REL_16_STABLE and master. With the patch applied output contains > > Nice finding! I agree, it should be changed. > > > query | INSERT INTO test(b) SELECT generate_series($1,$2) > > local_blks_hit | 1006 > > local_blks_read | 0 > > local_blks_written | 8 > > local_blks_dirtied | 5 > > temp_blks_written | 0 > > > > > > I still wonder why local_blks_written is greater than it was on REL_15_STABLE, > > and why local_blks_read became zero. These changes are caused by fcdda1e4b5. > > This code is new to me, and I'm still trying to understand whether it's a bug > > in computing the counters or just changes in how many blocks are read/written > > during the query execution. If anyone can help me, I would be grateful. > > I spent some time on it: > > local_blks_read became zero because: > 1_ One more cache hit. It was supposed to be local_blks_read but it is > local_blks_hit now. This is an assumption, I didn't check this deeply. > 2_ Before fcdda1e4b5, there was one local_blks_read coming from > buf = ReadBufferExtended(rel, VISIBILITYMAP_FORKNUM, blkno, > RBM_ZERO_ON_ERROR, NULL) in freespace.c -> ReadBuffer_common() -> > pgBufferUsage.local_blks_read++. > But buf = ReadBufferExtended(rel, VISIBILITYMAP_FORKNUM, blkno, > RBM_ZERO_ON_ERROR, NULL) is moved into the else case, so it didn't > called and local_blks_read isn't incremented. That imo is a legitimate difference / improvement. The read we previously did here was unnecessary. > local_blks_written is greater because of the combination of fcdda1e4b5 > and 00d1e02be2. > In PG_15: > RelationGetBufferForTuple() -> ReadBufferBI(P_NEW, RBM_ZERO_AND_LOCK) > -> ReadBufferExtended() -> ReadBuffer_common() -> > pgBufferUsage.local_blks_written++; (called 5 times) [0] > In PG_16: > 1_ 5 of the local_blks_written is coming from: > RelationGetBufferForTuple() -> RelationAddBlocks() -> > ExtendBufferedRelBy() -> ExtendBufferedRelCommon() -> > ExtendBufferedRelLocal() -> pgBufferUsage.local_blks_written += > extend_by; (extend_by is 1, this is called 5 times) [1] > 2_ 3 of the local_blks_written is coming from: > RelationGetBufferForTuple() -> RecordAndGetPageWithFreeSpace() -> > fsm_set_and_search() -> fsm_readbuf() -> fsm_extend() -> > ExtendBufferedRelTo() -> ExtendBufferedRelCommon() -> > ExtendBufferedRelLocal() -> pgBufferUsage.local_blks_written += > extend_by; (extend_by is 3, this is called 1 time) [2] > > I think [0] is the same path as [1] but [2] is new. 'fsm extends' > wasn't counted in local_blks_written in PG_15. Calling > ExtendBufferedRelTo() from fsm_extend() caused 'fsm extends' to be > counted in local_blks_written. I am not sure which one is correct. I think it's correct to count the fsm writes here. The pg_stat_statement columns aren't specific to the main relation for or such... If anything it was a bug to not count them before. Thanks for looking into this. Greetings, Andres Freund
Hi, On 2023-09-13 11:59:39 -0700, Andres Freund wrote: > Running the attached patch through CI, planning to push after that succeeds, > unless somebody has a comment? And pushed. Thanks Karina for the report and fix! Greetings, Andres Freund
Nazir, Andres, thank you both for help!
On Wed, Sep 13, 2023 at 10:10 PM Andres Freund <andres@anarazel.de> wrote:
On 2023-09-13 16:04:00 +0300, Nazir Bilal Yavuz wrote:
> local_blks_read became zero because:
> 1_ One more cache hit. It was supposed to be local_blks_read but it is
> local_blks_hit now. This is an assumption, I didn't check this deeply.
> 2_ Before fcdda1e4b5, there was one local_blks_read coming from
> buf = ReadBufferExtended(rel, VISIBILITYMAP_FORKNUM, blkno,
> RBM_ZERO_ON_ERROR, NULL) in freespace.c -> ReadBuffer_common() ->
> pgBufferUsage.local_blks_read++.
> But buf = ReadBufferExtended(rel, VISIBILITYMAP_FORKNUM, blkno,
> RBM_ZERO_ON_ERROR, NULL) is moved into the else case, so it didn't
> called and local_blks_read isn't incremented.
That imo is a legitimate difference / improvement. The read we previously did
here was unnecessary.
> local_blks_written is greater because of the combination of fcdda1e4b5
> and 00d1e02be2.
> In PG_15:
> RelationGetBufferForTuple() -> ReadBufferBI(P_NEW, RBM_ZERO_AND_LOCK)
> -> ReadBufferExtended() -> ReadBuffer_common() ->
> pgBufferUsage.local_blks_written++; (called 5 times) [0]
> In PG_16:
> 1_ 5 of the local_blks_written is coming from:
> RelationGetBufferForTuple() -> RelationAddBlocks() ->
> ExtendBufferedRelBy() -> ExtendBufferedRelCommon() ->
> ExtendBufferedRelLocal() -> pgBufferUsage.local_blks_written +=
> extend_by; (extend_by is 1, this is called 5 times) [1]
> 2_ 3 of the local_blks_written is coming from:
> RelationGetBufferForTuple() -> RecordAndGetPageWithFreeSpace() ->
> fsm_set_and_search() -> fsm_readbuf() -> fsm_extend() ->
> ExtendBufferedRelTo() -> ExtendBufferedRelCommon() ->
> ExtendBufferedRelLocal() -> pgBufferUsage.local_blks_written +=
> extend_by; (extend_by is 3, this is called 1 time) [2]
>
> I think [0] is the same path as [1] but [2] is new. 'fsm extends'
> wasn't counted in local_blks_written in PG_15. Calling
> ExtendBufferedRelTo() from fsm_extend() caused 'fsm extends' to be
> counted in local_blks_written. I am not sure which one is correct.
I think it's correct to count the fsm writes here. The pg_stat_statement
columns aren't specific to the main relation for or such... If anything it was
a bug to not count them before.
Best regards,
Karina LitskevichPostgres Professional: http://postgrespro.com/