pg_walinspect float4/float8 confusion

Started by Peter Eisentrautover 3 years ago3 messages

peter.eisentraut@enterprisedb.com

over 3 years ago

The pg_walinspect function pg_get_wal_stats() has output arguments
declared as float4 (count_percentage, record_size_percentage, etc.), but
the internal computations are all done in type double. Is there a
reason why this is then converted to float4 for output? It probably
doesn't matter in practice, but it seems unnecessarily confusing. Or at
least add a comment so it doesn't look like an accident. Also compare
with pgstattuple, which uses float8 in its SQL interface for similar data.

Bharath Rupireddy

bharath.rupireddyforpostgres@gmail.com

over 3 years ago

In reply to: Peter Eisentraut (#1)

2 attachment(s)

Re: pg_walinspect float4/float8 confusion

On Thu, Sep 8, 2022 at 5:23 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:

The pg_walinspect function pg_get_wal_stats() has output arguments
declared as float4 (count_percentage, record_size_percentage, etc.), but
the internal computations are all done in type double. Is there a
reason why this is then converted to float4 for output? It probably
doesn't matter in practice, but it seems unnecessarily confusing. Or at
least add a comment so it doesn't look like an accident. Also compare
with pgstattuple, which uses float8 in its SQL interface for similar data.

Thanks for finding this. There's no specific reason as such. However,
it's good to be in sync with what code does internally and what it
exposes to the users. pg_walinspect uses double data type (double
precision floating point number) for internal calculations and cuts it
down to single precision floating point number float4 to the users.
Attaching a patch herewith. I'm not sure if this needs to be
backported, if at all, we were to, IMO it should be backported to
reduce the code diff.

While on, I found that pgstattuple uses uint64 for internal percentile
calculations as opposed to double data type for others. Attaching a
small patch to fix it.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v1-0001-Use-float8-datatype-for-percentiles-in-pg_walinsp.patchapplication/octet-stream; name=v1-0001-Use-float8-datatype-for-percentiles-in-pg_walinsp.patchDownload

From df8b908794563cb3aef9ef9ed17228779c00f4d1 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 9 Sep 2022 03:40:37 +0000
Subject: [PATCH v1] Use float8 datatype for percentiles in pg_walinspect stat
 functions

pg_walinspect uses datatype double (double precision floating point
number) for WAL stats percentile calculations and expose them via
float4 (single precision floating point number), which an unnecessary
loss of precision and confusing. Even though, it's harmless that way,
let's use float8 (double precision floating-point number) to be in
sync with what pg_walinspect does internally and what it exposes to
the users. This seems to be the pattern used elsewhere in the code.

Reported-by: Peter Eisentraut
Author: Bharath Rupireddy
Reviewed-by: Peter Eisentraut
Discussion: https://www.postgresql.org/message-id/36ee692b-232f-0484-ce94-dc39d82021ad%40enterprisedb.com
---
 contrib/pg_walinspect/pg_walinspect--1.0.sql | 16 ++++++++--------
 contrib/pg_walinspect/pg_walinspect.c        |  8 ++++----
 doc/src/sgml/pgwalinspect.sgml               | 16 ++++++++--------
 3 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/contrib/pg_walinspect/pg_walinspect--1.0.sql b/contrib/pg_walinspect/pg_walinspect--1.0.sql
index 4243516d8a..08b3dd5556 100644
--- a/contrib/pg_walinspect/pg_walinspect--1.0.sql
+++ b/contrib/pg_walinspect/pg_walinspect--1.0.sql
@@ -80,13 +80,13 @@ CREATE FUNCTION pg_get_wal_stats(IN start_lsn pg_lsn,
     IN  per_record boolean DEFAULT false,
     OUT "resource_manager/record_type" text,
     OUT count int8,
-    OUT count_percentage float4,
+    OUT count_percentage float8,
     OUT record_size int8,
-    OUT record_size_percentage float4,
+    OUT record_size_percentage float8,
     OUT fpi_size int8,
-    OUT fpi_size_percentage float4,
+    OUT fpi_size_percentage float8,
     OUT combined_size int8,
-    OUT combined_size_percentage float4
+    OUT combined_size_percentage float8
 )
 RETURNS SETOF record
 AS 'MODULE_PATHNAME', 'pg_get_wal_stats'
@@ -102,13 +102,13 @@ CREATE FUNCTION pg_get_wal_stats_till_end_of_wal(IN start_lsn pg_lsn,
     IN  per_record boolean DEFAULT false,
     OUT "resource_manager/record_type" text,
     OUT count int8,
-    OUT count_percentage float4,
+    OUT count_percentage float8,
     OUT record_size int8,
-    OUT record_size_percentage float4,
+    OUT record_size_percentage float8,
     OUT fpi_size int8,
-    OUT fpi_size_percentage float4,
+    OUT fpi_size_percentage float8,
     OUT combined_size int8,
-    OUT combined_size_percentage float4
+    OUT combined_size_percentage float8
 )
 RETURNS SETOF record
 AS 'MODULE_PATHNAME', 'pg_get_wal_stats_till_end_of_wal'
diff --git a/contrib/pg_walinspect/pg_walinspect.c b/contrib/pg_walinspect/pg_walinspect.c
index 2f51a8dec4..38fb4106da 100644
--- a/contrib/pg_walinspect/pg_walinspect.c
+++ b/contrib/pg_walinspect/pg_walinspect.c
@@ -430,13 +430,13 @@ FillXLogStatsRow(const char *name,
 
 	values[i++] = CStringGetTextDatum(name);
 	values[i++] = Int64GetDatum(n);
-	values[i++] = Float4GetDatum(n_pct);
+	values[i++] = Float8GetDatum(n_pct);
 	values[i++] = Int64GetDatum(rec_len);
-	values[i++] = Float4GetDatum(rec_len_pct);
+	values[i++] = Float8GetDatum(rec_len_pct);
 	values[i++] = Int64GetDatum(fpi_len);
-	values[i++] = Float4GetDatum(fpi_len_pct);
+	values[i++] = Float8GetDatum(fpi_len_pct);
 	values[i++] = Int64GetDatum(tot_len);
-	values[i++] = Float4GetDatum(tot_len_pct);
+	values[i++] = Float8GetDatum(tot_len_pct);
 
 	Assert(i == ncols);
 }
diff --git a/doc/src/sgml/pgwalinspect.sgml b/doc/src/sgml/pgwalinspect.sgml
index de63a70965..1a1bee7d6a 100644
--- a/doc/src/sgml/pgwalinspect.sgml
+++ b/doc/src/sgml/pgwalinspect.sgml
@@ -164,13 +164,13 @@ postgres=# select start_lsn, end_lsn, prev_lsn, xid, resource_manager, record_ty
                        per_record boolean DEFAULT false,
                        "resource_manager/record_type" OUT text,
                        count OUT int8,
-                       count_percentage OUT float4,
+                       count_percentage OUT float8,
                        record_length OUT int8,
-                       record_length_percentage OUT float4,
+                       record_length_percentage OUT float8,
                        fpi_length OUT int8,
-                       fpi_length_percentage OUT float4,
+                       fpi_length_percentage OUT float8,
                        combined_size OUT int8,
-                       combined_size_percentage OUT float4)
+                       combined_size_percentage OUT float8)
       returns setof record
      </function>
     </term>
@@ -241,13 +241,13 @@ postgres=# select * from pg_get_wal_stats('0/14AFC30', '0/15011D7', true) where
                                        per_record boolean DEFAULT false,
                                        "resource_manager/record_type" OUT text,
                                        count OUT int8,
-                                       count_percentage OUT float4,
+                                       count_percentage OUT float8,
                                        record_length OUT int8,
-                                       record_length_percentage OUT float4,
+                                       record_length_percentage OUT float8,
                                        fpi_length OUT int8,
-                                       fpi_length_percentage OUT float4,
+                                       fpi_length_percentage OUT float8,
                                        combined_size OUT int8,
-                                       combined_size_percentage OUT float4)
+                                       combined_size_percentage OUT float8)
       returns setof record
      </function>
     </term>
-- 
2.34.1

v1-0001-Use-double-datatype-for-percentiles-in-pgstattupl.patchapplication/octet-stream; name=v1-0001-Use-double-datatype-for-percentiles-in-pgstattupl.patchDownload

From 643b954e93d6d57baff511fe6d80e2fbae01058d Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 9 Sep 2022 03:38:11 +0000
Subject: [PATCH v1] Use double datatype for percentiles in pgstattuple

pgstattuple uses datatype double for other percentile calculations
and expose those values to the users via float8 datatype. However,
scanned_percent in struct output_type is of uint64, change it to
use double to be inline with other percentiles.

Reported-by: Peter Eisentraut
Author: Bharath Rupireddy
Reviewed-by: Peter Eisentraut
Discussion: https://www.postgresql.org/message-id/36ee692b-232f-0484-ce94-dc39d82021ad%40enterprisedb.com
---
 contrib/pgstattuple/pgstatapprox.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/pgstattuple/pgstatapprox.c b/contrib/pgstattuple/pgstatapprox.c
index 15ddc32239..af53978064 100644
--- a/contrib/pgstattuple/pgstatapprox.c
+++ b/contrib/pgstattuple/pgstatapprox.c
@@ -38,7 +38,7 @@ Datum		pgstattuple_approx_internal(Oid relid, FunctionCallInfo fcinfo);
 typedef struct output_type
 {
 	uint64		table_len;
-	uint64		scanned_percent;
+	double		scanned_percent;
 	uint64		tuple_count;
 	uint64		tuple_len;
 	double		tuple_percent;
-- 
2.34.1

Peter Eisentraut

peter.eisentraut@enterprisedb.com

over 3 years ago

In reply to: Bharath Rupireddy (#2)

Re: pg_walinspect float4/float8 confusion

On 09.09.22 05:51, Bharath Rupireddy wrote:

On Thu, Sep 8, 2022 at 5:23 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:

The pg_walinspect function pg_get_wal_stats() has output arguments
declared as float4 (count_percentage, record_size_percentage, etc.), but
the internal computations are all done in type double. Is there a
reason why this is then converted to float4 for output? It probably
doesn't matter in practice, but it seems unnecessarily confusing. Or at
least add a comment so it doesn't look like an accident. Also compare
with pgstattuple, which uses float8 in its SQL interface for similar data.

Thanks for finding this. There's no specific reason as such. However,
it's good to be in sync with what code does internally and what it
exposes to the users. pg_walinspect uses double data type (double
precision floating point number) for internal calculations and cuts it
down to single precision floating point number float4 to the users.
Attaching a patch herewith. I'm not sure if this needs to be
backported, if at all, we were to, IMO it should be backported to
reduce the code diff.

done

While on, I found that pgstattuple uses uint64 for internal percentile
calculations as opposed to double data type for others. Attaching a
small patch to fix it.

Good find. I also changed the computation to use 100.0 instead of 100,
so that you actually get non-integer values out of it.

I didn't backpatch this, since it would probably result in a small
behavior change and the results with the previous code are not wrong,
just unnecessarily truncated.