Datum as struct

Started by Peter Eisentraut7 months ago16 messages
Jump to latest
#1Peter Eisentraut
peter_e@gmx.net

Another draft patch set that I had lying around that was mentioned in [0]/messages/by-id/1749799.1752797397@sss.pgh.pa.us.

The idea is to change Datum to a struct so that you can no longer rely
on it being implicitly convertable to other C types, which enforces use
of proper DatumGet*() and *GetDatum() functions, which might ultimately
help with portability and robustness and the like. (An alternative idea
is to make Datum a union so that you don't need so many casts to begin
with. But this has a lot of the same issues, so it can be considered
implicitly here.)

The first three patches clean up the use of the conversion functions.
Some are missing, some are superfluous, some are used the wrong way
around (!!).

The fourth patch tidies up the use of varatt.h macros. These should
arguably not be used on Datum values but instead be converted with
DatumGetPointer(). This is very similar to the patch that I also
included in the thread "Convert varatt.h macros to static inline
functions" that I just posted.

The fifth patch cleans up some untidy use of hash functions with the
wrong argument type.

The last patch is the actual conversion. As you can see there, and as
was also mentioned in [0]/messages/by-id/1749799.1752797397@sss.pgh.pa.us, it's quite a bit of churn. I'm not seriously
proposing it at this point, but maybe it can serve as a guide for
refactoring some of the interfaces to make the impact smaller, or
something like that.

But I think the patches 0001 through 0005 are useful now.

I tested this against the original patch in [0]/messages/by-id/1749799.1752797397@sss.pgh.pa.us. It fixes some of the
issues discussed there but not all of them.

[0]: /messages/by-id/1749799.1752797397@sss.pgh.pa.us
/messages/by-id/1749799.1752797397@sss.pgh.pa.us

Attachments:

v1-0001-Fix-mixups-of-FooGetDatum-vs.-DatumGetFoo.patchtext/plain; charset=UTF-8; name=v1-0001-Fix-mixups-of-FooGetDatum-vs.-DatumGetFoo.patchDownload+9-10
v1-0002-Remove-useless-superfluous-Datum-conversions.patchtext/plain; charset=UTF-8; name=v1-0002-Remove-useless-superfluous-Datum-conversions.patchDownload+26-31
v1-0003-Add-missing-Datum-conversions.patchtext/plain; charset=UTF-8; name=v1-0003-Add-missing-Datum-conversions.patchDownload+112-113
v1-0004-Fix-varatt-versus-Datum-type-confusions.patchtext/plain; charset=UTF-8; name=v1-0004-Fix-varatt-versus-Datum-type-confusions.patchDownload+56-57
v1-0005-Fix-various-hash-function-uses.patchtext/plain; charset=UTF-8; name=v1-0005-Fix-various-hash-function-uses.patchDownload+9-10
v1-0006-WIP-Datum-as-struct.patchtext/plain; charset=UTF-8; name=v1-0006-WIP-Datum-as-struct.patchDownload+915-893
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#1)
Re: Datum as struct

Peter Eisentraut <peter@eisentraut.org> writes:

But I think the patches 0001 through 0005 are useful now.
I tested this against the original patch in [0]. It fixes some of the
issues discussed there but not all of them.

I looked through this. I think 0001-0003 are unconditionally
improvements and you should just commit them. Many of the
pointer-related changes are identical to ones that I had arrived at
while working on cleaning up the gcc warnings from my 8-byte-Datum
patch, so this would create merge conflicts with that patch, and
getting these changes in would make that one shorter.

I do have one review comment on 0003: in
brin_minmax_multi_distance_numeric, you wrote

-	PG_RETURN_FLOAT8(DirectFunctionCall1(numeric_float8, d));
+	PG_RETURN_FLOAT8(DatumGetFloat8(DirectFunctionCall1(numeric_float8, d)));

Another way to do that could be

PG_RETURN_DATUM(DirectFunctionCall1(numeric_float8, d));

I'm not sure we should expect the compiler to be able to elide
the conversions to float8 and back, so I prefer the second way.

Also I see a "// XXX" in pg_get_aios, which I guess is a note
to confirm the data type to use for ioh_id?

My only reservation about 0004 is whether we want to do it like
this, or with the separate "_D" functions that I proposed in
the other thread. Right now the vote seems to be 2 to 1 for
adding DatumGetPointer calls, so unless there are more votes
you should go ahead with 0004 as well.

No objection to 0005 either.

I agree that 0006 is not something we want to commit. Aside
from all the replacements for "(Datum) 0" --- which'd be fine
I guess if there weren't so darn many --- it seems to me that
bits like this are punching too many holes in the abstraction:

-	if (PointerGetDatum(key) != entry->key)
+	if (PointerGetDatum(key).value != entry->key.value)

It'd probably be an improvement to code that like

if ((Pointer) key != DatumGetPointer(entry->key))

which matches the way we do things in many other places.
But in general any direct reference to .value outside the
DatumGetFoo/FooGetDatum functions seems like an abstraction fail.

However, the sheer number of quasi-cosmetic issues you found
this way gives me pause. If we don't have something like
0006 available for test purposes, more bugs of the same
ilk are surely going to creep in.

regards, tom lane

#3Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#2)
Re: Datum as struct

On Thu, Jul 31, 2025 at 01:17:54PM -0400, Tom Lane wrote:

My only reservation about 0004 is whether we want to do it like
this, or with the separate "_D" functions that I proposed in
the other thread. Right now the vote seems to be 2 to 1 for
adding DatumGetPointer calls, so unless there are more votes
you should go ahead with 0004 as well.

Quote from this message:
/messages/by-id/1520348.1753891591@sss.pgh.pa.us

And relevant paragraph:

I think the way forward here is to tackle that head-on and split the
top-level macros into two static inlines, along the lines of
VARDATA(Pointer ptr) and VARDATA_D(Datum dat) where the _D versions
are simply DatumGetPointer and then call the non-D versions.

Will reply on this thread with an option. Spoiler: I have mixed
feelings about the addition of the _D flavors over the additions of
DatumGetPointer() in the existing calls.

@@ -144,7 +144,7 @@ toast_save_datum(Relation rel, Datum value,
int num_indexes;
int validIndex;

-    Assert(!VARATT_IS_EXTERNAL(value));
+    Assert(!VARATT_IS_EXTERNAL(dval));
[...]
+++ b/src/backend/replication/logical/proto.c
             if (isnull[i])
                 continue;
-            else if (VARATT_IS_EXTERNAL_ONDISK(value))
+            else if (VARATT_IS_EXTERNAL_ONDISK(DatumGetPointer(value)))
                 toast_delete_datum(rel, value, is_speculative);
[...]
+++ b/src/backend/replication/logical/proto.c
-		if (att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(values[i]))
+		if (att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(DatumGetPointer(values[i])))

I was wondering the impact of this change in terms of the refactoring
work I've been for TOAST at [1]/messages/by-id/aFOnKHG7Wn-Srnpv@paquier.xyz -- Michael, which is something that I'm also
planning to get in committable shape very soon to ease the addition of
more on-disk external TOAST pointers. When I've looked at the
problem, I've found the current non-distinction between Datums and
varlenas confusing, so forcing more control with types is an
improvement IMO. At the end the impact is I think null, my stuff uses
varlenas in the refactored code rather than Datums.

[1]: /messages/by-id/aFOnKHG7Wn-Srnpv@paquier.xyz -- Michael
--
Michael

#4Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#2)
pgaio_io_get_id() type (was Re: Datum as struct)

On 31.07.25 19:17, Tom Lane wrote:

Also I see a "// XXX" in pg_get_aios, which I guess is a note
to confirm the data type to use for ioh_id?

Yes, the stuff returned from pgaio_io_get_id() should be int, but some
code uses uint32, and also UINT32_MAX as a special marker. Here is a
patch that tries to straighten that out by using int consistently and -1
as a special marker.

Attachments:

0001-Use-consistent-type-for-pgaio_io_get_id-result.patchtext/plain; charset=UTF-8; name=0001-Use-consistent-type-for-pgaio_io_get_id-result.patchDownload+9-10
#5Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#1)
Re: Datum as struct
diff --git a/src/backend/utils/adt/timestamp.c 

b/src/backend/utils/adt/timestamp.c

index 25cff56c3d0..e640b48205b 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -4954,7 +4954,7 @@ timestamptz_trunc_internal(text *units, 

TimestampTz timestamp, pg_tz *tzp)

case DTK_SECOND:
case DTK_MILLISEC:
case DTK_MICROSEC:
- PG_RETURN_TIMESTAMPTZ(timestamp);
+ return timestamp;
break;

default:

This one is an actual bug. With int64-pass-by-reference,
PG_RETURN_TIMESTAMPTZ() returns a newly allocated address. This is
easily reproduced by

SELECT date_trunc( 'week', timestamp with time zone 'infinity' );

which will return a random value on 32-bit systems. (It correctly
returns 'infinity' otherwise.)

I'll see about backpatching fixes for this.

#6Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#5)
Re: Datum as struct

On 05.08.25 20:06, Peter Eisentraut wrote:

diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/ 

adt/timestamp.c

index 25cff56c3d0..e640b48205b 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -4954,7 +4954,7 @@ timestamptz_trunc_internal(text *units, 

TimestampTz timestamp, pg_tz *tzp)

                  case DTK_SECOND:
                  case DTK_MILLISEC:
                  case DTK_MICROSEC:
-                    PG_RETURN_TIMESTAMPTZ(timestamp);
+                    return timestamp;
                     break;

                 default:

This one is an actual bug.  With int64-pass-by-reference,
PG_RETURN_TIMESTAMPTZ() returns a newly allocated address.  This is
easily reproduced by

SELECT date_trunc( 'week', timestamp with time zone 'infinity' );

which will return a random value on 32-bit systems.  (It correctly
returns 'infinity' otherwise.)

I'll see about backpatching fixes for this.

This is new in PG18, so I'm continuing the discussion in the original
thread:
/messages/by-id/CAAvxfHc4084dGzEJR0_pBZkDuqbPGc5wn7gK_M0XR_kRiCdUJQ@mail.gmail.com

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#6)
Re: Datum as struct

Peter Eisentraut <peter@eisentraut.org> writes:

On 05.08.25 20:06, Peter Eisentraut wrote:

-                    PG_RETURN_TIMESTAMPTZ(timestamp);
+                    return timestamp;

This one is an actual bug.

I took another look through this patch series and realized that this
bit from 0003 is also a live bug that should be back-patched, for
exactly the same reason as with the above:

diff --git a/contrib/intarray/_int_selfuncs.c b/contrib/intarray/_int_selfuncs.c
index 6c3b7ace146..9bf64486242 100644
--- a/contrib/intarray/_int_selfuncs.c
+++ b/contrib/intarray/_int_selfuncs.c
@@ -177,7 +177,7 @@ _int_matchsel(PG_FUNCTION_ARGS)
 	if (query->size == 0)
 	{
 		ReleaseVariableStats(vardata);
-		return (Selectivity) 0.0;
+		PG_RETURN_FLOAT8(0.0);
 	}

/*

I think that on a 32-bit machine this would actually result in a
null-pointer core dump, since the 0.0 would be coerced to a zero
Datum value. The line is not reached in our regression tests,
and given the lack of field complaints, it must be hard to reach
in normal use too. Or 32-bit machines are deader than I thought.

regards, tom lane

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#7)
Re: Datum as struct

I wrote:

I think that on a 32-bit machine this would actually result in a
null-pointer core dump, since the 0.0 would be coerced to a zero
Datum value. The line is not reached in our regression tests,
and given the lack of field complaints, it must be hard to reach
in normal use too. Or 32-bit machines are deader than I thought.

On closer inspection, it's unreachable because bqarr_in won't
accept an empty query, and there is no other function that will
create query_int values. So probably it's not worth the trouble
to back-patch.

regards, tom lane

#9Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#8)
Re: Datum as struct

On 07.08.25 03:55, Tom Lane wrote:

I wrote:

I think that on a 32-bit machine this would actually result in a
null-pointer core dump, since the 0.0 would be coerced to a zero
Datum value. The line is not reached in our regression tests,
and given the lack of field complaints, it must be hard to reach
in normal use too. Or 32-bit machines are deader than I thought.

On closer inspection, it's unreachable because bqarr_in won't
accept an empty query, and there is no other function that will
create query_int values. So probably it's not worth the trouble
to back-patch.

I ended up backpatching this, since it was easy enough and I didn't want
to leave such patently broken code lie around. And it will allow us to
label the remaining cleanup patches as "harmless". I'll go commit those
to master next.

#10Andres Freund
andres@anarazel.de
In reply to: Peter Eisentraut (#4)
Re: pgaio_io_get_id() type (was Re: Datum as struct)

Hi,

On 2025-08-05 19:20:20 +0200, Peter Eisentraut wrote:

On 31.07.25 19:17, Tom Lane wrote:

Also I see a "// XXX" in pg_get_aios, which I guess is a note
to confirm the data type to use for ioh_id?

Yes, the stuff returned from pgaio_io_get_id() should be int, but some code
uses uint32, and also UINT32_MAX as a special marker. Here is a patch that
tries to straighten that out by using int consistently and -1 as a special
marker.

From e1bb97e13af226cd15c9a59929aef48bc92c4ac2 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Tue, 5 Aug 2025 19:16:38 +0200
Subject: [PATCH] Use consistent type for pgaio_io_get_id() result

The result of pgaio_io_get_id() was being assigned to a mix of int and
uint32 variables. This fixes it to use int consistently, which seems
the most correct. Also change the queue empty special value in
method_worker.c to -1 from UINT32_MAX.

WFM. I guess I could see some value going with an unsigned value instead of a
signed one, but it really doesn't matter...

Greetings,

Andres

#11Andres Freund
andres@anarazel.de
In reply to: Peter Eisentraut (#1)
Re: Datum as struct

Hi,

On 2025-07-31 16:02:35 +0200, Peter Eisentraut wrote:

Another draft patch set that I had lying around that was mentioned in [0].

Nice, thanks for doing that. I tried this a few years back and was scared away
by the churn, but I think it's well worth tackling.

One thing that would be an interesting follow-up would be to make the struct
not just carry the datum, but also the type of the field, to be filled in by
tuple deforming and the *GetDatum() functions. Then we could assert that the
correct DatumGet*() functions are used. I think that'd allow us to detect a
rather large number of issues that we currently aren't finding - I'd bet a
review of a large patchset that we'd fine a number of old bugs.

From 869185199e7a7b711afb5b51834cbb0691ee8223 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 31 Jul 2025 15:18:02 +0200
Subject: [PATCH v1 3/6] Add missing Datum conversions

Add various missing conversions from and to Datum. The previous code
mostly relied on implicit conversions or its own explicit casts
instead of using the correct DatumGet*() or *GetDatum() functions.

diff --git a/contrib/btree_gist/btree_utils_num.c b/contrib/btree_gist/btree_utils_num.c
index 346ee837d75..446fa930b92 100644
--- a/contrib/btree_gist/btree_utils_num.c
+++ b/contrib/btree_gist/btree_utils_num.c
@@ -119,38 +119,38 @@ gbt_num_fetch(GISTENTRY *entry, const gbtree_ninfo *tinfo)
switch (tinfo->t)
{
case gbt_t_bool:
-			datum = BoolGetDatum(*(bool *) entry->key);
+			datum = BoolGetDatum(*(bool *) DatumGetPointer(entry->key));
break;
case gbt_t_int2:
-			datum = Int16GetDatum(*(int16 *) entry->key);
+			datum = Int16GetDatum(*(int16 *) DatumGetPointer(entry->key));
break;
case gbt_t_int4:
-			datum = Int32GetDatum(*(int32 *) entry->key);
+			datum = Int32GetDatum(*(int32 *) DatumGetPointer(entry->key));
break;
case gbt_t_int8:
-			datum = Int64GetDatum(*(int64 *) entry->key);
+			datum = Int64GetDatum(*(int64 *) DatumGetPointer(entry->key));
break;
case gbt_t_oid:
case gbt_t_enum:
-			datum = ObjectIdGetDatum(*(Oid *) entry->key);
+			datum = ObjectIdGetDatum(*(Oid *) DatumGetPointer(entry->key));
break;
case gbt_t_float4:
-			datum = Float4GetDatum(*(float4 *) entry->key);
+			datum = Float4GetDatum(*(float4 *) DatumGetPointer(entry->key));
break;
case gbt_t_float8:
-			datum = Float8GetDatum(*(float8 *) entry->key);
+			datum = Float8GetDatum(*(float8 *) DatumGetPointer(entry->key));
break;
case gbt_t_date:
-			datum = DateADTGetDatum(*(DateADT *) entry->key);
+			datum = DateADTGetDatum(*(DateADT *) DatumGetPointer(entry->key));
break;
case gbt_t_time:
-			datum = TimeADTGetDatum(*(TimeADT *) entry->key);
+			datum = TimeADTGetDatum(*(TimeADT *) DatumGetPointer(entry->key));
break;
case gbt_t_ts:
-			datum = TimestampGetDatum(*(Timestamp *) entry->key);
+			datum = TimestampGetDatum(*(Timestamp *) DatumGetPointer(entry->key));
break;
case gbt_t_cash:
-			datum = CashGetDatum(*(Cash *) entry->key);
+			datum = CashGetDatum(*(Cash *) DatumGetPointer(entry->key));
break;
default:
datum = entry->key;

Would probably be more readable if you put DatumGetPointer(entry->key) into a
local variable...

From dfd9e4b16e979ac52be682d8c093f883da8d7ba2 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 31 Jul 2025 15:18:02 +0200
Subject: [PATCH v1 5/6] Fix various hash function uses

These instances were using Datum-returning functions where a
lower-level function returning uint32 would be more appropriate.

Indeed.

From 81e56922fe429fae6ebc33b2f518ef88b00ffc42 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 31 Jul 2025 15:18:02 +0200
Subject: [PATCH v1 6/6] WIP: Datum as struct

This changes Datum to a struct so that you can no longer rely on it
being implicitly convertable to other C types.

(I only skimmed this very very briefly)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index f98805fb5f7..e56e0db97c9 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -653,7 +653,7 @@ dblink_fetch(PG_FUNCTION_ARGS)
{
dblink_res_error(conn, conname, res, fail,
"while fetching from cursor \"%s\"", curname);
-		return (Datum) 0;
+		PG_RETURN_DUMMY();
}
else if (PQresultStatus(res) == PGRES_COMMAND_OK)
{
@@ -665,7 +665,7 @@ dblink_fetch(PG_FUNCTION_ARGS)
}
materializeResult(fcinfo, conn, res);
-	return (Datum) 0;
+	PG_RETURN_DUMMY();
}

/*

Probably worth splitting this type of change out into its own commit, this
is a pretty large change, and that could be done independently.

diff --git a/contrib/hstore/hstore_io.c b/contrib/hstore/hstore_io.c
index 9c53877c4a5..d059b31231f 100644
--- a/contrib/hstore/hstore_io.c
+++ b/contrib/hstore/hstore_io.c
@@ -1113,7 +1113,7 @@ hstore_populate_record(PG_FUNCTION_ARGS)
{
for (i = 0; i < ncolumns; ++i)
{
-			values[i] = (Datum) 0;
+			values[i] = DummyDatum;
nulls[i] = true;
}
}

I don't think I like DummyDatum very much, but I don't really have a better
suggestion. It could be worth having a separate NullPointerDatum? Should
also probably be its own patch.

One thing that'd be nice is to mark the value as uninitialized, so that
valgrind etc could find cases where we rely on these dummy datums.

diff --git a/contrib/ltree/_ltree_gist.c b/contrib/ltree/_ltree_gist.c
index 286ad24fbe8..2d71cea7e5a 100644
--- a/contrib/ltree/_ltree_gist.c
+++ b/contrib/ltree/_ltree_gist.c
@@ -84,7 +84,7 @@ _ltree_compress(PG_FUNCTION_ARGS)
entry->rel, entry->page,
entry->offset, false);
}
-	else if (!LTG_ISALLTRUE(entry->key))
+	else if (!LTG_ISALLTRUE(entry->key.value))

This should be DatumGet*(), no?

diff --git a/contrib/sepgsql/label.c b/contrib/sepgsql/label.c
index 996ce174454..5d57563ecb7 100644
--- a/contrib/sepgsql/label.c
+++ b/contrib/sepgsql/label.c
@@ -330,7 +330,7 @@ sepgsql_fmgr_hook(FmgrHookEventType event,
stack = palloc(sizeof(*stack));
stack->old_label = NULL;
stack->new_label = sepgsql_avc_trusted_proc(flinfo->fn_oid);
-				stack->next_private = 0;
+				stack->next_private.value = 0;

MemoryContextSwitchTo(oldcxt);

Probably should use DummyDatum.

Greetings,

Andres Freund

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#11)
Re: Datum as struct

Andres Freund <andres@anarazel.de> writes:

On 2025-07-31 16:02:35 +0200, Peter Eisentraut wrote:

diff --git a/contrib/ltree/_ltree_gist.c b/contrib/ltree/_ltree_gist.c
index 286ad24fbe8..2d71cea7e5a 100644
--- a/contrib/ltree/_ltree_gist.c
+++ b/contrib/ltree/_ltree_gist.c
@@ -84,7 +84,7 @@ _ltree_compress(PG_FUNCTION_ARGS)
entry->rel, entry->page,
entry->offset, false);
}
-	else if (!LTG_ISALLTRUE(entry->key))
+	else if (!LTG_ISALLTRUE(entry->key.value))

This should be DatumGet*(), no?

Indeed. I was just rebasing my 8-byte-Datum patch onto HEAD, and
noted this and one or two other places still missing DatumGetPointer.
I plan to go ahead and commit a cleanup patch with those fixes and
s/(Datum) NULL/(Datum) 0/g as soon as I've finished testing.

FWIW, I don't love the "DummyDatum" name either. Maybe InvalidDatum?

regards, tom lane

#13Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#12)
Re: Datum as struct

On 08.08.25 22:55, Tom Lane wrote:

FWIW, I don't love the "DummyDatum" name either. Maybe InvalidDatum?

I specifically did not use InvalidDatum because, for example, the result
of Int32GetDatum(0) is by no means "invalid". Maybe something like
NullDatum or VoidDatum?

A few related thoughts:

The calls to PG_RETURN_DUMMY() in my patch are in set-returning
functions. I thought maybe a real API macro would be nice here, like
PG_RETURN_SRF_DONE().

Many changes are in the arguments of on_*_exit() functions. I always
thought it was weird layering to use Datum at that level. I mean, would
on_proc_exit(foo, Int64GetDatum(val)) even work correctly on 32-bit
platforms?

Another common coding pattern is something like

if (isnull)
{
d = (Datum) 0;
n = true;
}
else
{
d = actualvalue;
n = false;
}

sometimes with a comment /* keep compiler quiet */ or /* redundant, but
safe */.

I wonder whether it would in general be better to not initialize "d" in
the first case, allowing perhaps static analyzers to detect invalid
accesses later on. On the other hand, this might confuse regular
compilers into warnings, as the comments suggest.

So maybe finding some higher-level changes in these areas could reduce
the churn in the remaining patch significantly.

#14Peter Eisentraut
peter_e@gmx.net
In reply to: Andres Freund (#11)
Re: Datum as struct

On 08.08.25 22:30, Andres Freund wrote:

One thing that would be an interesting follow-up would be to make the struct
not just carry the datum, but also the type of the field, to be filled in by
tuple deforming and the *GetDatum() functions. Then we could assert that the
correct DatumGet*() functions are used. I think that'd allow us to detect a
rather large number of issues that we currently aren't finding

That would make Datum >=9 bytes? Is that something we would need to
worry about in terms of performance?

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#14)
Re: Datum as struct

Peter Eisentraut <peter@eisentraut.org> writes:

On 08.08.25 22:30, Andres Freund wrote:

One thing that would be an interesting follow-up would be to make the struct
not just carry the datum, but also the type of the field, to be filled in by
tuple deforming and the *GetDatum() functions. Then we could assert that the
correct DatumGet*() functions are used. I think that'd allow us to detect a
rather large number of issues that we currently aren't finding

That would make Datum >=9 bytes? Is that something we would need to
worry about in terms of performance?

It'd have to be a cassert-like option, not something you'd enable in
production.

regards, tom lane

#16Peter Eisentraut
peter_e@gmx.net
In reply to: Andres Freund (#10)
Re: pgaio_io_get_id() type (was Re: Datum as struct)

On 08.08.25 22:07, Andres Freund wrote:

On 2025-08-05 19:20:20 +0200, Peter Eisentraut wrote:

On 31.07.25 19:17, Tom Lane wrote:

Also I see a "// XXX" in pg_get_aios, which I guess is a note
to confirm the data type to use for ioh_id?

Yes, the stuff returned from pgaio_io_get_id() should be int, but some code
uses uint32, and also UINT32_MAX as a special marker. Here is a patch that
tries to straighten that out by using int consistently and -1 as a special
marker.

From e1bb97e13af226cd15c9a59929aef48bc92c4ac2 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Tue, 5 Aug 2025 19:16:38 +0200
Subject: [PATCH] Use consistent type for pgaio_io_get_id() result

The result of pgaio_io_get_id() was being assigned to a mix of int and
uint32 variables. This fixes it to use int consistently, which seems
the most correct. Also change the queue empty special value in
method_worker.c to -1 from UINT32_MAX.

WFM. I guess I could see some value going with an unsigned value instead of a
signed one, but it really doesn't matter...

I have committed this.