BufferUsage counters' values have changed

Started by Karina Litskevichover 2 years ago7 messages
#1Karina Litskevich
litskevichkarina@gmail.com

Hi hackers,

I noticed that BufferUsage counters' values are strangely different for the
same queries on REL_15_STABLE and REL_16_STABLE. For example, I run

CREATE EXTENSION pg_stat_statements;
CREATE TEMP TABLE test(b int);
INSERT INTO test(b) SELECT generate_series(1,1000);
SELECT query, local_blks_hit, local_blks_read, local_blks_written,
local_blks_dirtied, temp_blks_written FROM pg_stat_statements;

and output on REL_15_STABLE contains

query | INSERT INTO test(b) SELECT generate_series($1,$2)
local_blks_hit | 1005
local_blks_read | 2
local_blks_written | 5
local_blks_dirtied | 5
temp_blks_written | 0

while output on REL_16_STABLE contains

query | INSERT INTO test(b) SELECT generate_series($1,$2)
local_blks_hit | 1006
local_blks_read | 0
local_blks_written | 0
local_blks_dirtied | 5
temp_blks_written | 8

I found a bug that causes one of the differences. Wrong counter is
incremented
in ExtendBufferedRelLocal(). The attached patch fixes it and should be
applied
to REL_16_STABLE and master. With the patch applied output contains

query | INSERT INTO test(b) SELECT generate_series($1,$2)
local_blks_hit | 1006
local_blks_read | 0
local_blks_written | 8
local_blks_dirtied | 5
temp_blks_written | 0

I still wonder why local_blks_written is greater than it was on
REL_15_STABLE,
and why local_blks_read became zero. These changes are caused by fcdda1e4b5.
This code is new to me, and I'm still trying to understand whether it's a
bug
in computing the counters or just changes in how many blocks are
read/written
during the query execution. If anyone can help me, I would be grateful.

Best regards,
Karina Litskevich
Postgres Professional: http://postgrespro.com/

#2Karina Litskevich
litskevichkarina@gmail.com
In reply to: Karina Litskevich (#1)
1 attachment(s)
Re: BufferUsage counters' values have changed

On Mon, Sep 11, 2023 at 9:08 AM Karina Litskevich <
litskevichkarina@gmail.com> wrote:

I found a bug that causes one of the differences. Wrong counter is
incremented
in ExtendBufferedRelLocal(). The attached patch fixes it and should be
applied
to REL_16_STABLE and master.

I've forgotten to attach the patch. Here it is.

Attachments:

v1-0001-Fix-local_blks_written-counter-incrementation.patchtext/x-patch; charset=US-ASCII; name=v1-0001-Fix-local_blks_written-counter-incrementation.patchDownload
From 999a3d533a9b74c8568cc8a3d715c287de45dd2c Mon Sep 17 00:00:00 2001
From: Karina Litskevich <litskevichkarina@gmail.com>
Date: Thu, 7 Sep 2023 17:44:40 +0300
Subject: [PATCH v1] Fix local_blks_written counter incrementation

---
 src/backend/storage/buffer/localbuf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 1735ec7141..567b8d15ef 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -431,7 +431,7 @@ ExtendBufferedRelLocal(BufferManagerRelation bmr,
 
 	*extended_by = extend_by;
 
-	pgBufferUsage.temp_blks_written += extend_by;
+	pgBufferUsage.local_blks_written += extend_by;
 
 	return first_block;
 }
-- 
2.34.1

#3Nazir Bilal Yavuz
byavuz81@gmail.com
In reply to: Karina Litskevich (#1)
Re: BufferUsage counters' values have changed

Hi,

On Mon, 11 Sept 2023 at 14:28, Karina Litskevich
<litskevichkarina@gmail.com> wrote:

Hi hackers,

I noticed that BufferUsage counters' values are strangely different for the
same queries on REL_15_STABLE and REL_16_STABLE. For example, I run

CREATE EXTENSION pg_stat_statements;
CREATE TEMP TABLE test(b int);
INSERT INTO test(b) SELECT generate_series(1,1000);
SELECT query, local_blks_hit, local_blks_read, local_blks_written,
local_blks_dirtied, temp_blks_written FROM pg_stat_statements;

and output on REL_15_STABLE contains

query | INSERT INTO test(b) SELECT generate_series($1,$2)
local_blks_hit | 1005
local_blks_read | 2
local_blks_written | 5
local_blks_dirtied | 5
temp_blks_written | 0

while output on REL_16_STABLE contains

query | INSERT INTO test(b) SELECT generate_series($1,$2)
local_blks_hit | 1006
local_blks_read | 0
local_blks_written | 0
local_blks_dirtied | 5
temp_blks_written | 8

I found a bug that causes one of the differences. Wrong counter is incremented
in ExtendBufferedRelLocal(). The attached patch fixes it and should be applied
to REL_16_STABLE and master. With the patch applied output contains

Nice finding! I agree, it should be changed.

query | INSERT INTO test(b) SELECT generate_series($1,$2)
local_blks_hit | 1006
local_blks_read | 0
local_blks_written | 8
local_blks_dirtied | 5
temp_blks_written | 0

I still wonder why local_blks_written is greater than it was on REL_15_STABLE,
and why local_blks_read became zero. These changes are caused by fcdda1e4b5.
This code is new to me, and I'm still trying to understand whether it's a bug
in computing the counters or just changes in how many blocks are read/written
during the query execution. If anyone can help me, I would be grateful.

I spent some time on it:

local_blks_read became zero because:
1_ One more cache hit. It was supposed to be local_blks_read but it is
local_blks_hit now. This is an assumption, I didn't check this deeply.
2_ Before fcdda1e4b5, there was one local_blks_read coming from
buf = ReadBufferExtended(rel, VISIBILITYMAP_FORKNUM, blkno,
RBM_ZERO_ON_ERROR, NULL) in freespace.c -> ReadBuffer_common() ->
pgBufferUsage.local_blks_read++.
But buf = ReadBufferExtended(rel, VISIBILITYMAP_FORKNUM, blkno,
RBM_ZERO_ON_ERROR, NULL) is moved into the else case, so it didn't
called and local_blks_read isn't incremented.

local_blks_written is greater because of the combination of fcdda1e4b5
and 00d1e02be2.
In PG_15:
RelationGetBufferForTuple() -> ReadBufferBI(P_NEW, RBM_ZERO_AND_LOCK)
-> ReadBufferExtended() -> ReadBuffer_common() ->
pgBufferUsage.local_blks_written++; (called 5 times) [0]
In PG_16:
1_ 5 of the local_blks_written is coming from:
RelationGetBufferForTuple() -> RelationAddBlocks() ->
ExtendBufferedRelBy() -> ExtendBufferedRelCommon() ->
ExtendBufferedRelLocal() -> pgBufferUsage.local_blks_written +=
extend_by; (extend_by is 1, this is called 5 times) [1]
2_ 3 of the local_blks_written is coming from:
RelationGetBufferForTuple() -> RecordAndGetPageWithFreeSpace() ->
fsm_set_and_search() -> fsm_readbuf() -> fsm_extend() ->
ExtendBufferedRelTo() -> ExtendBufferedRelCommon() ->
ExtendBufferedRelLocal() -> pgBufferUsage.local_blks_written +=
extend_by; (extend_by is 3, this is called 1 time) [2]

I think [0] is the same path as [1] but [2] is new. 'fsm extends'
wasn't counted in local_blks_written in PG_15. Calling
ExtendBufferedRelTo() from fsm_extend() caused 'fsm extends' to be
counted in local_blks_written. I am not sure which one is correct.

I hope these help.

Regards,
Nazir Bilal Yavuz
Microsoft

#4Andres Freund
andres@anarazel.de
In reply to: Karina Litskevich (#2)
1 attachment(s)
Re: BufferUsage counters' values have changed

Hi,

On 2023-09-11 09:23:59 +0300, Karina Litskevich wrote:

On Mon, Sep 11, 2023 at 9:08 AM Karina Litskevich <
litskevichkarina@gmail.com> wrote:

I found a bug that causes one of the differences. Wrong counter is
incremented
in ExtendBufferedRelLocal(). The attached patch fixes it and should be
applied
to REL_16_STABLE and master.

I've forgotten to attach the patch. Here it is.

From 999a3d533a9b74c8568cc8a3d715c287de45dd2c Mon Sep 17 00:00:00 2001
From: Karina Litskevich <litskevichkarina@gmail.com>
Date: Thu, 7 Sep 2023 17:44:40 +0300
Subject: [PATCH v1] Fix local_blks_written counter incrementation

---
src/backend/storage/buffer/localbuf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 1735ec7141..567b8d15ef 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -431,7 +431,7 @@ ExtendBufferedRelLocal(BufferManagerRelation bmr,

*extended_by = extend_by;

-	pgBufferUsage.temp_blks_written += extend_by;
+	pgBufferUsage.local_blks_written += extend_by;

return first_block;
}
--
2.34.1

Ugh, you're right.

The naming of local vs temp here is pretty unfortunate imo. I wonder if we
ought to at least dd a comment to BufferUsage clarifying the situation? Just
reading the comments therein one would be hard pressed to figure out which of
the variables temp table activity should be added to.

I don't think we currently can write a test for this in the core tests, as the
relevant data isn't visible anywhere, iirc. Thus I added a test to
pg_stat_statements. Afaict it should be stable?

Running the attached patch through CI, planning to push after that succeeds,
unless somebody has a comment?

Greetings,

Andres Freund

Attachments:

v2-0001-Fix-tracking-of-temp-table-relation-extensions-as.patchtext/x-diff; charset=us-asciiDownload
From ae7b170e9032af0fd257a39953fe0b6ef7e9637c Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Wed, 13 Sep 2023 11:46:45 -0700
Subject: [PATCH v2] Fix tracking of temp table relation extensions as writes

Karina figured out that I (Andres) confused BufferUsage.temp_blks_written with
BufferUsage.local_blks_written in fcdda1e4b5.

Tests in core PG can't easily test this, as BufferUsage is just used for
EXPLAIN (ANALYZE, BUFFERS) and pg_stat_statements. Thus this commit adds tests
for this to pg_stat_statements.

Reported-by: Karina Litskevich <litskevichkarina@gmail.com>
Author: Karina Litskevich <litskevichkarina@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CACiT8ibxXA6+0amGikbeFhm8B84XdQVo6D0Qfd1pQ1s8zpsnxQ@mail.gmail.com
Backpatch: 16-, where fcdda1e4b5 was merged
---
 src/backend/storage/buffer/localbuf.c       |  2 +-
 contrib/pg_stat_statements/expected/dml.out | 27 +++++++++++++++++++++
 contrib/pg_stat_statements/sql/dml.sql      | 19 +++++++++++++++
 3 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 1735ec71419..567b8d15ef0 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -431,7 +431,7 @@ ExtendBufferedRelLocal(BufferManagerRelation bmr,
 
 	*extended_by = extend_by;
 
-	pgBufferUsage.temp_blks_written += extend_by;
+	pgBufferUsage.local_blks_written += extend_by;
 
 	return first_block;
 }
diff --git a/contrib/pg_stat_statements/expected/dml.out b/contrib/pg_stat_statements/expected/dml.out
index 7b9c8f979ee..ede47a71acc 100644
--- a/contrib/pg_stat_statements/expected/dml.out
+++ b/contrib/pg_stat_statements/expected/dml.out
@@ -139,6 +139,33 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
      1 |    1 | SELECT pg_stat_statements_reset()
 (10 rows)
 
+-- check that [temp] table relation extensions are tracked as writes
+CREATE TABLE pgss_extend_tab (a int, b text);
+CREATE TEMP TABLE pgss_extend_temp_tab (a int, b text);
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+INSERT INTO pgss_extend_tab (a, b) SELECT generate_series(1, 1000), 'something';
+INSERT INTO pgss_extend_temp_tab (a, b) SELECT generate_series(1, 1000), 'something';
+WITH sizes AS (
+  SELECT
+    pg_relation_size('pgss_extend_tab') / current_setting('block_size')::int8 AS rel_size,
+    pg_relation_size('pgss_extend_temp_tab') / current_setting('block_size')::int8 AS temp_rel_size
+)
+SELECT
+    SUM(local_blks_written) >= (SELECT temp_rel_size FROM sizes) AS temp_written_ok,
+    SUM(local_blks_dirtied) >= (SELECT temp_rel_size FROM sizes) AS temp_dirtied_ok,
+    SUM(shared_blks_written) >= (SELECT rel_size FROM sizes) AS written_ok,
+    SUM(shared_blks_dirtied) >= (SELECT rel_size FROM sizes) AS dirtied_ok
+FROM pg_stat_statements;
+ temp_written_ok | temp_dirtied_ok | written_ok | dirtied_ok 
+-----------------+-----------------+------------+------------
+ t               | t               | t          | t
+(1 row)
+
 SELECT pg_stat_statements_reset();
  pg_stat_statements_reset 
 --------------------------
diff --git a/contrib/pg_stat_statements/sql/dml.sql b/contrib/pg_stat_statements/sql/dml.sql
index af2f9fcf73b..3b5d2afb858 100644
--- a/contrib/pg_stat_statements/sql/dml.sql
+++ b/contrib/pg_stat_statements/sql/dml.sql
@@ -73,4 +73,23 @@ MERGE INTO pgss_dml_tab USING pgss_dml_tab st ON (st.a = pgss_dml_tab.a AND st.a
 DROP TABLE pgss_dml_tab;
 
 SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+-- check that [temp] table relation extensions are tracked as writes
+CREATE TABLE pgss_extend_tab (a int, b text);
+CREATE TEMP TABLE pgss_extend_temp_tab (a int, b text);
+SELECT pg_stat_statements_reset();
+INSERT INTO pgss_extend_tab (a, b) SELECT generate_series(1, 1000), 'something';
+INSERT INTO pgss_extend_temp_tab (a, b) SELECT generate_series(1, 1000), 'something';
+WITH sizes AS (
+  SELECT
+    pg_relation_size('pgss_extend_tab') / current_setting('block_size')::int8 AS rel_size,
+    pg_relation_size('pgss_extend_temp_tab') / current_setting('block_size')::int8 AS temp_rel_size
+)
+SELECT
+    SUM(local_blks_written) >= (SELECT temp_rel_size FROM sizes) AS temp_written_ok,
+    SUM(local_blks_dirtied) >= (SELECT temp_rel_size FROM sizes) AS temp_dirtied_ok,
+    SUM(shared_blks_written) >= (SELECT rel_size FROM sizes) AS written_ok,
+    SUM(shared_blks_dirtied) >= (SELECT rel_size FROM sizes) AS dirtied_ok
+FROM pg_stat_statements;
+
 SELECT pg_stat_statements_reset();
-- 
2.38.0

#5Andres Freund
andres@anarazel.de
In reply to: Nazir Bilal Yavuz (#3)
Re: BufferUsage counters' values have changed

Hi,

On 2023-09-13 16:04:00 +0300, Nazir Bilal Yavuz wrote:

On Mon, 11 Sept 2023 at 14:28, Karina Litskevich
<litskevichkarina@gmail.com> wrote:

Hi hackers,

I noticed that BufferUsage counters' values are strangely different for the
same queries on REL_15_STABLE and REL_16_STABLE. For example, I run

CREATE EXTENSION pg_stat_statements;
CREATE TEMP TABLE test(b int);
INSERT INTO test(b) SELECT generate_series(1,1000);
SELECT query, local_blks_hit, local_blks_read, local_blks_written,
local_blks_dirtied, temp_blks_written FROM pg_stat_statements;

and output on REL_15_STABLE contains

query | INSERT INTO test(b) SELECT generate_series($1,$2)
local_blks_hit | 1005
local_blks_read | 2
local_blks_written | 5
local_blks_dirtied | 5
temp_blks_written | 0

while output on REL_16_STABLE contains

query | INSERT INTO test(b) SELECT generate_series($1,$2)
local_blks_hit | 1006
local_blks_read | 0
local_blks_written | 0
local_blks_dirtied | 5
temp_blks_written | 8

I found a bug that causes one of the differences. Wrong counter is incremented
in ExtendBufferedRelLocal(). The attached patch fixes it and should be applied
to REL_16_STABLE and master. With the patch applied output contains

Nice finding! I agree, it should be changed.

query | INSERT INTO test(b) SELECT generate_series($1,$2)
local_blks_hit | 1006
local_blks_read | 0
local_blks_written | 8
local_blks_dirtied | 5
temp_blks_written | 0

I still wonder why local_blks_written is greater than it was on REL_15_STABLE,
and why local_blks_read became zero. These changes are caused by fcdda1e4b5.
This code is new to me, and I'm still trying to understand whether it's a bug
in computing the counters or just changes in how many blocks are read/written
during the query execution. If anyone can help me, I would be grateful.

I spent some time on it:

local_blks_read became zero because:
1_ One more cache hit. It was supposed to be local_blks_read but it is
local_blks_hit now. This is an assumption, I didn't check this deeply.
2_ Before fcdda1e4b5, there was one local_blks_read coming from
buf = ReadBufferExtended(rel, VISIBILITYMAP_FORKNUM, blkno,
RBM_ZERO_ON_ERROR, NULL) in freespace.c -> ReadBuffer_common() ->
pgBufferUsage.local_blks_read++.
But buf = ReadBufferExtended(rel, VISIBILITYMAP_FORKNUM, blkno,
RBM_ZERO_ON_ERROR, NULL) is moved into the else case, so it didn't
called and local_blks_read isn't incremented.

That imo is a legitimate difference / improvement. The read we previously did
here was unnecessary.

local_blks_written is greater because of the combination of fcdda1e4b5
and 00d1e02be2.
In PG_15:
RelationGetBufferForTuple() -> ReadBufferBI(P_NEW, RBM_ZERO_AND_LOCK)
-> ReadBufferExtended() -> ReadBuffer_common() ->
pgBufferUsage.local_blks_written++; (called 5 times) [0]
In PG_16:
1_ 5 of the local_blks_written is coming from:
RelationGetBufferForTuple() -> RelationAddBlocks() ->
ExtendBufferedRelBy() -> ExtendBufferedRelCommon() ->
ExtendBufferedRelLocal() -> pgBufferUsage.local_blks_written +=
extend_by; (extend_by is 1, this is called 5 times) [1]
2_ 3 of the local_blks_written is coming from:
RelationGetBufferForTuple() -> RecordAndGetPageWithFreeSpace() ->
fsm_set_and_search() -> fsm_readbuf() -> fsm_extend() ->
ExtendBufferedRelTo() -> ExtendBufferedRelCommon() ->
ExtendBufferedRelLocal() -> pgBufferUsage.local_blks_written +=
extend_by; (extend_by is 3, this is called 1 time) [2]

I think [0] is the same path as [1] but [2] is new. 'fsm extends'
wasn't counted in local_blks_written in PG_15. Calling
ExtendBufferedRelTo() from fsm_extend() caused 'fsm extends' to be
counted in local_blks_written. I am not sure which one is correct.

I think it's correct to count the fsm writes here. The pg_stat_statement
columns aren't specific to the main relation for or such... If anything it was
a bug to not count them before.

Thanks for looking into this.

Greetings,

Andres Freund

#6Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#4)
Re: BufferUsage counters' values have changed

Hi,

On 2023-09-13 11:59:39 -0700, Andres Freund wrote:

Running the attached patch through CI, planning to push after that succeeds,
unless somebody has a comment?

And pushed.

Thanks Karina for the report and fix!

Greetings,

Andres Freund

#7Karina Litskevich
litskevichkarina@gmail.com
In reply to: Andres Freund (#5)
Re: BufferUsage counters' values have changed

Nazir, Andres, thank you both for help!

On Wed, Sep 13, 2023 at 10:10 PM Andres Freund <andres@anarazel.de> wrote:

On 2023-09-13 16:04:00 +0300, Nazir Bilal Yavuz wrote:

local_blks_read became zero because:
1_ One more cache hit. It was supposed to be local_blks_read but it is
local_blks_hit now. This is an assumption, I didn't check this deeply.
2_ Before fcdda1e4b5, there was one local_blks_read coming from
buf = ReadBufferExtended(rel, VISIBILITYMAP_FORKNUM, blkno,
RBM_ZERO_ON_ERROR, NULL) in freespace.c -> ReadBuffer_common() ->
pgBufferUsage.local_blks_read++.
But buf = ReadBufferExtended(rel, VISIBILITYMAP_FORKNUM, blkno,
RBM_ZERO_ON_ERROR, NULL) is moved into the else case, so it didn't
called and local_blks_read isn't incremented.

That imo is a legitimate difference / improvement. The read we previously
did
here was unnecessary.

local_blks_written is greater because of the combination of fcdda1e4b5
and 00d1e02be2.
In PG_15:
RelationGetBufferForTuple() -> ReadBufferBI(P_NEW, RBM_ZERO_AND_LOCK)
-> ReadBufferExtended() -> ReadBuffer_common() ->
pgBufferUsage.local_blks_written++; (called 5 times) [0]
In PG_16:
1_ 5 of the local_blks_written is coming from:
RelationGetBufferForTuple() -> RelationAddBlocks() ->
ExtendBufferedRelBy() -> ExtendBufferedRelCommon() ->
ExtendBufferedRelLocal() -> pgBufferUsage.local_blks_written +=
extend_by; (extend_by is 1, this is called 5 times) [1]
2_ 3 of the local_blks_written is coming from:
RelationGetBufferForTuple() -> RecordAndGetPageWithFreeSpace() ->
fsm_set_and_search() -> fsm_readbuf() -> fsm_extend() ->
ExtendBufferedRelTo() -> ExtendBufferedRelCommon() ->
ExtendBufferedRelLocal() -> pgBufferUsage.local_blks_written +=
extend_by; (extend_by is 3, this is called 1 time) [2]

I think [0] is the same path as [1] but [2] is new. 'fsm extends'
wasn't counted in local_blks_written in PG_15. Calling
ExtendBufferedRelTo() from fsm_extend() caused 'fsm extends' to be
counted in local_blks_written. I am not sure which one is correct.

I think it's correct to count the fsm writes here. The pg_stat_statement
columns aren't specific to the main relation for or such... If anything it
was
a bug to not count them before.

Best regards,
Karina Litskevich
Postgres Professional: http://postgrespro.com/