Custom type's modifiers
Hi,
I’m defining a custom type “MyType” with additional functions and a custom aggregate in a C-coded extension. From a PostgreSQL perspective it is a base type that piggybacks on the bytea type, i.e. LIKE = BYTEA.
But now I need to (re)define MyType to support type modifiers (e.g. MyType(1,14,18)) and I got that done using CREATE TYPE’s TYPMOD_IN and TYPMOD_OUT parameters resulting in the correct packed value getting stored in pg_attribute when I define a column of that type.
But when I pass a MyType value to a function defined in my C extension, how would I access the type modifier value for the argument, which could have been drawn from the catalog or be the result of a cast?
E.g. if I:
SELECT MyFunc('xDEADBEEF'::MyType(1,14,18));
In the C function that MyFunc calls, I get a pointer to the data using the PG_GETARG_BYTEA_P(0) macro and its length using the VARSIZE macro, but I also need the given type modifiers (1, 14 and 18 in the example) before I can process the data correctly. Clearly I'd have to unpack the component values myself from the 32-bit atttypmod value into which the TYPMOD_IN function has packed them, but where would I get access to that value? My type is written in C to be as fast as possible; having to do some SPI-level lookup or involved logic would slow it right down again.
My searches to date only yielded results referring to the value stored for a table in pg_attribute with the possibility of there being a value in HeapTupleHeader obtained by using the PG_GETARG_HEAPTUPLEHEADER(0) macro but that assumes the parameter is a tuple, not the individual value it actually is. I struggle to imagine that the type modifier value isn't being maintained by whatever casts are applied and not getting passed through to the extension already, but where to find it?
Can someone please point me in the right direction here, the name of the structure containing the raw type modifier value, the name of the value in that structure, the name of a macro that accesses it, even if it’s just what keywords to search for in the documentation and/or archives? Even if it’s just a pointer to the code where e.g. the numeric type (which has type modifiers) is implemented so I can see how that code does it. Anything, I’m getting desperate. Perhaps not many before me needed to do this so it's not often mentioned, but surely it is in there somewhere, how else would types like numeric and even varchar actually work (given that the VARSIZE of a varlena gives its actual size, not the maximum as given when the column or value was created)?
Thank you in advance,
Marthin Laubscher
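For reference, the packing and unpacking mentioned above could look roughly like the sketch below. It is plain C with no PostgreSQL headers, and the bit layout (8 bits for the first modifier, 11 bits each for the other two) and the function names are assumptions made for illustration only, not anything PostgreSQL prescribes. The one fixed point is that a type modifier is an int32 and -1 means "no modifier", so the layout keeps the packed value non-negative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: bits 22-29 hold a, bits 11-21 hold b, bits 0-10 hold c.
 * Bits 30-31 stay clear so a packed value can never be negative. */
#define MYTYPE_A_MAX  0xFF   /* 8 bits */
#define MYTYPE_BC_MAX 0x7FF  /* 11 bits */

/* Pack three modifiers into one int32; returns -1 if any is out of range. */
int32_t
mytype_typmod_pack(int a, int b, int c)
{
    uint32_t packed;

    if (a < 0 || a > MYTYPE_A_MAX ||
        b < 0 || b > MYTYPE_BC_MAX ||
        c < 0 || c > MYTYPE_BC_MAX)
        return -1;

    packed = ((uint32_t) a << 22) | ((uint32_t) b << 11) | (uint32_t) c;
    assert(packed <= (uint32_t) INT32_MAX);  /* layout invariant */
    return (int32_t) packed;
}

/* Unpack a typmod; returns false when it is negative (no modifier given). */
bool
mytype_typmod_unpack(int32_t typmod, int *a, int *b, int *c)
{
    if (typmod < 0)
        return false;

    *a = (typmod >> 22) & MYTYPE_A_MAX;
    *b = (typmod >> 11) & MYTYPE_BC_MAX;
    *c = typmod & MYTYPE_BC_MAX;
    return true;
}
```

In an actual extension the pack side would live in the function named by TYPMOD_IN, after parsing the cstring array it receives, and the TYPMOD_OUT function would format the unpacked components back into text.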
Marthin Laubscher <postgres@lobeshare.co.za> writes:
But now I need to (re)define MyType to support type modifiers (e.g. MyType(1,14,18)) and I got that done using CREATE TYPE’s TYPMOD_IN and TYPMOD_OUT parameters resulting in the correct packed value getting stored in pg_attribute when I define a column of that type.
OK ...
But when I pass a MyType value to a function defined in my C extension, how would I access the type modifier value for the argument, which could have been drawn from the catalog or be the result of a cast?
You can't. Whatever info is needed by operations on the type had
better be embedded in the value.
regards, tom lane
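A minimal sketch of what "embedded in the value" could mean in practice: the modifiers are written into a small fixed header at the front of the payload, and every function reads them back from the value it was handed. This is standalone C, not extension code. The 5-byte header layout and all names here are hypothetical; in a real extension the buffer would be palloc'd and this payload would follow the varlena length word.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header: 1 byte for a, 2 bytes each for b and c. */
#define MYTYPE_HDRSZ 5

/* Build header + data in a malloc'd buffer; the caller frees it. */
unsigned char *
mytype_build(uint8_t a, uint16_t b, uint16_t c,
             const void *data, size_t len, size_t *out_len)
{
    unsigned char *buf = malloc(MYTYPE_HDRSZ + len);

    if (buf == NULL)
        return NULL;
    buf[0] = a;
    memcpy(buf + 1, &b, sizeof b);   /* memcpy avoids alignment problems */
    memcpy(buf + 3, &c, sizeof c);
    if (len > 0)
        memcpy(buf + MYTYPE_HDRSZ, data, len);
    *out_len = MYTYPE_HDRSZ + len;
    return buf;
}

/* Read the modifiers back out of the value itself. */
void
mytype_read_mods(const unsigned char *buf, size_t buf_len,
                 uint8_t *a, uint16_t *b, uint16_t *c)
{
    assert(buf_len >= MYTYPE_HDRSZ);
    *a = buf[0];
    memcpy(b, buf + 1, sizeof *b);
    memcpy(c, buf + 3, sizeof *c);
}

/* Return a pointer to the raw data that follows the header. */
const unsigned char *
mytype_data(const unsigned char *buf, size_t buf_len, size_t *data_len)
{
    assert(buf_len >= MYTYPE_HDRSZ);
    *data_len = buf_len - MYTYPE_HDRSZ;
    return buf + MYTYPE_HDRSZ;
}
```

The cost is five extra bytes per value under this assumed layout; the benefit is that no function ever needs a catalog lookup to interpret its argument.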
On 2024/06/27, 17:06, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
You can't. Whatever info is needed by operations on the type had better be embedded in the value.
OK, thanks, that's clear and easy enough. I'll ensure the third parameter to the input function is embedded in my opaque value.
I don't see another function getting passed the value so I'd assume that (unless I return a MyType value from one of my own functions which would follow its internal logic to determine which type modifiers to use) the only way a MyType can get an initial value is via the input function. If the type is in a table column the input function would be called with the default value specified in external format if a value isn't specified during insert, but either way it would always originate from the external format. I suppose when a cast is involved it goes via the external format as well, right?
Are those sound assumptions to make or am I still way off base here?
--- Thanks for your time - Marthin Laubscher
On Thu, Jun 27, 2024 at 8:49 AM Marthin Laubscher <postgres@lobeshare.co.za>
wrote:
I suppose when a cast is involved it goes via the external format as well,
right?
A cast between two types is going to accept a concrete instance of the
input type, in memory, as its argument and then produce a concrete
instance of the output type, in memory, as output. If the input data is
serialized, the constructor for the input type will handle deserialization.
See: create cast
https://www.postgresql.org/docs/current/sql-createcast.html
In particular the phrasing: identical to or binary-coercible to
David J.
Marthin Laubscher <postgres@lobeshare.co.za> writes:
I don't see another function getting passed the value so I'd assume that (unless I return a MyType value from one of my own functions which would follow its internal logic to determine which type modifiers to use) the only way a MyType can get an initial value is via the input function. If the type is in a table column the input function would be called with the default value specified in external format if a value isn't specified during insert, but either way it would always originate from the external format. I suppose when a cast is involved it goes via the external format as well, right?
The only code anywhere in the system that can produce a MyType value
is code you've written. So that has to come originally from your
input function, or cast functions or constructor functions you write.
You'd be well advised to read the CREATE CAST doco about how to
support notations like 'something'::MyType(x,y,z). Although the
input function is expected to be able to apply a typemod if it's
passed one, in most situations coercion to a specific typemod is
handled by invoking a multi-parameter cast function.
regards, tom lane
On 2024/06/27, 18:13, "David G. Johnston" <david.g.johnston@gmail.com> wrote:
A cast between two types is going to accept a concrete instance of the input type, in memory, as its argument and then produce a concrete instance of the output type, in memory, as output. If the input data is serialized, the constructor for the input type will handle deserialization.
I confess to some uncertainty whether the PostgreSQL-specific x::y notation and the standards-based CAST(x AS y) would both be addressed by creating a cast. What you’re saying means both forms engage the same code and defining a cast would cover the :: syntax as well. Thanks for that.
If my understanding is accurate, it means that even when both values are of MyType the cast function would still be invoked so the type logic can determine how to handle (or reject) the cast. The cast would (obviously) require the target type modifiers as well, and the good news is that they are already there as the second parameter of the function. So that’s the other function that receives the type modifier that I was missing. It’s starting to make plenty of sense.
To summarise:
- The type modifiers, encoded by the TYPMOD_IN function, are passed directly as parameters to:
  - the type's input function (parameter 3), and
  - any "with function" cast where the target type has type modifiers.
- Regardless of the syntax used to invoke the cast, the same function will be called.
- Depending on what casts are defined, converting from the external string format to a value of MyType will be handled either by the input function or a cast function. By default (without any casts) only values recognised by input can be converted to MyType values.
-- Thanks for your time – Marthin Laubscher
Marthin Laubscher <postgres@lobeshare.co.za> writes:
I confess to some uncertainty whether the PostgreSQL-specific x::y notation and the standards-based CAST(x AS y) would both be addressed by creating a cast.
Those notations are precisely equivalent, modulo concerns about
operator precedence (ie, if x isn't a single token you might have
to parenthesize x to get (x)::y to mean the same as CAST(x AS y)).
regards, tom lane