Working with PostgreSQL enums in C code

Started by Joseph Adamsover 15 years ago7 messages
#1Joseph Adams
joeyadams3.14159@gmail.com

I encountered a situation while implementing JSON support where I
needed to return an enum value from a C function. To clarify, here's
the SQL:

CREATE TYPE json_type_t AS ENUM ('null', 'string', 'number', 'bool',
'object', 'array');

CREATE OR REPLACE FUNCTION json_type(json)
RETURNS json_type_t
AS 'MODULE_PATHNAME','json_get_type'
LANGUAGE C STRICT IMMUTABLE;

I initially tried looking for another function returning an enum in
the PostgreSQL source tree, but I couldn't find any. I guess this is
because enums are a relatively new feature in PostgreSQL.

I learned that to return an enum value from C, one needs to return the
OID of the right row of the pg_enum table. I eventually managed to
write the code below, which is mostly based on the enum_in function in
src/backend/utils/adt/enum.c .

#define PG_RETURN_ENUM(typname, label) return enumLabelToOid(typname, label)

static Oid enumLabelToOid(const char *typname, const char *label)
{
Oid enumtypoid;
HeapTuple tup;
Oid ret;

enumtypoid = TypenameGetTypid(typname);
Assert(OidIsValid(enumtypoid));

tup = SearchSysCache2(ENUMTYPOIDNAME,
ObjectIdGetDatum(enumtypoid),
CStringGetDatum(label));
Assert(HeapTupleIsValid(tup));

ret = HeapTupleGetOid(tup);

ReleaseSysCache(tup);

return ret;
}

Feel free to nitpick the code above, as I'm still learning. Note that
I replaced the more robust validity checks of enum_in with (quicker?)
asserts, with the assumption that correct programs would only pass
valid values to PG_RETURN_ENUM .

The code using the method above can be found here:
http://git.postgresql.org/gitweb?p=json-datatype.git;a=tree;f=contrib/json;h=1dd813da4016b31f35cb39b01c6d5f0999da672e;hb=092fa046f95580dd7906a07370ca401692a1f818
. My testcases passed, so everything seems to work.

I suppose my PG_RETURN_ENUM macro is nice and simple, except for the
fact that the coder has to keep an enum names table in sync with the
SQL code and the C code. However, going the other way around
(PG_GETARG_ENUM) would need access to that enum names table. Hence,
it'd make sense to have macros for defining this table so both
PG_RETURN_ENUM and PG_GETARG_ENUM can reference it.

I believe that these macros would be a useful addition to the
PostgreSQL function manager API, as they would provide a decent way to
receive and return custom enums from C code. Anyone agree/disagree?

Joey Adams

#2Robert Haas
robertmhaas@gmail.com
In reply to: Joseph Adams (#1)
Re: Working with PostgreSQL enums in C code

On Fri, May 28, 2010 at 12:07 AM, Joseph Adams
<joeyadams3.14159@gmail.com> wrote:

I learned that to return an enum value from C, one needs to return the
OID of the right row of the pg_enum table.  I eventually managed to
write the code below, which is mostly based on the enum_in function in
src/backend/utils/adt/enum.c .

PG_RETURN macros shouldn't do any nontrivial processing (see the
existing ones for references). I assume you have the enum labels
declared in pg_enum.h, so I think you can just return the correct OID
values directly. Declare constants for them in pg_enum.h and then
just do PG_RETURN_OID(whatever).

#define JSONTypeNullOid ...
#define JSONTypeStringOid ...
#define JSONTypeNumberOid ...

It really shouldn't be necessary to do a catalog lookup to retrieve a constant.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#2)
Re: Working with PostgreSQL enums in C code

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, May 28, 2010 at 12:07 AM, Joseph Adams
<joeyadams3.14159@gmail.com> wrote:

I learned that to return an enum value from C, one needs to return the
OID of the right row of the pg_enum table. �I eventually managed to
write the code below, which is mostly based on the enum_in function in
src/backend/utils/adt/enum.c .

PG_RETURN macros shouldn't do any nontrivial processing (see the
existing ones for references).

Yeah, that was my first reaction too. If we don't already have one,
it would be appropriate to provide a "lookup enum value" function
(functionally about the same as enum_in, but designed to be called
conveniently from C). Then, if you needed to work from a textual
enum label, you'd call that function and then PG_RETURN_OID.

However, for a built-in enum type, I agree with Robert's solution of
just #define-ing fixed OIDs for the values of the type.

regards, tom lane

#4Joseph Adams
joeyadams3.14159@gmail.com
In reply to: Tom Lane (#3)
Re: Working with PostgreSQL enums in C code

On Fri, May 28, 2010 at 9:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, May 28, 2010 at 12:07 AM, Joseph Adams
<joeyadams3.14159@gmail.com> wrote:

I learned that to return an enum value from C, one needs to return the
OID of the right row of the pg_enum table.  I eventually managed to
write the code below, which is mostly based on the enum_in function in
src/backend/utils/adt/enum.c .

PG_RETURN macros shouldn't do any nontrivial processing (see the
existing ones for references).

Yeah, that was my first reaction too.  If we don't already have one,
it would be appropriate to provide a "lookup enum value" function
(functionally about the same as enum_in, but designed to be called
conveniently from C).  Then, if you needed to work from a textual
enum label, you'd call that function and then PG_RETURN_OID.

Here is the function I wrote to look up enum values:

Oid enum_label_to_oid(const char *typname, const char *label)
{
Oid enumtypoid;
HeapTuple tup;
Oid ret;

enumtypoid = TypenameGetTypid(typname);
Assert(OidIsValid(enumtypoid));

tup = SearchSysCache2(ENUMTYPOIDNAME,
ObjectIdGetDatum(enumtypoid),
CStringGetDatum(label));
Assert(HeapTupleIsValid(tup));

ret = HeapTupleGetOid(tup);

ReleaseSysCache(tup);

return ret;
}

If this were added to PostgreSQL proper, what source/header files
would it make sense to put it in? enum.c/builtins.h ?

However, for a built-in enum type, I agree with Robert's solution of
just #define-ing fixed OIDs for the values of the type.

I don't know if the JSON datatype will be a contrib module or built-in
yet, but if it were contrib, would it still be better to use fixed
OIDs anyway? One issue with setting this precedent is that new
contrib modules using enums wouldn't be compatible with older versions
of PostgreSQL. One idea might be to insert extend CREATE TYPE name AS
ENUM so OIDs can be specified explicitly, but then that could lead to
OID clashes. That would be a really messy problem for users.

By the way, just curious: why can't the internal representation of an
enum just be an INT starting from 0 by default, like in C? That would
make a heck of a lot more sense, in my opinion. It might also allow
users to do things like this in the future:

CREATE TYPE number AS ENUM ('one' = 1, 'two' = 2, 'three' = 3);

Joey Adams

#5Alvaro Herrera
alvherre@commandprompt.com
In reply to: Joseph Adams (#4)
Re: Working with PostgreSQL enums in C code

Excerpts from Joseph Adams's message of lun jun 07 17:16:12 -0400 2010:

However, for a built-in enum type, I agree with Robert's solution of
just #define-ing fixed OIDs for the values of the type.

I don't know if the JSON datatype will be a contrib module or built-in
yet, but if it were contrib, would it still be better to use fixed
OIDs anyway?

If it were contrib, fixed OIDs wouldn't cut it precisely for that
reason. (A contrib module is not "built-in" for these purposes.)

One issue with setting this precedent is that new
contrib modules using enums wouldn't be compatible with older versions
of PostgreSQL. One idea might be to insert extend CREATE TYPE name AS
ENUM so OIDs can be specified explicitly, but then that could lead to
OID clashes. That would be a really messy problem for users.

Yeah. We've just defined an interface for pg_migrator-only usage,
allowing it to define the OID values of ENUMs; it wasn't considered a
good idea to expose the details to the user.

By the way, just curious: why can't the internal representation of an
enum just be an INT starting from 0 by default, like in C? That would
make a heck of a lot more sense, in my opinion. It might also allow
users to do things like this in the future:

CREATE TYPE number AS ENUM ('one' = 1, 'two' = 2, 'three' = 3);

The problem is that the output function needs to be able to figure out
the value starting with only the datum value. If it had only the "1"
it couldn't know what enum it'd correspond to. The other alternative
would have been to make enums 64 bits wide, carrying the enum OID in 32
bits and the value in the other 32. This was dismissed as too wasteful.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#6Robert Haas
robertmhaas@gmail.com
In reply to: Joseph Adams (#4)
Re: Working with PostgreSQL enums in C code

On Mon, Jun 7, 2010 at 5:16 PM, Joseph Adams <joeyadams3.14159@gmail.com> wrote:

I don't know if the JSON datatype will be a contrib module or built-in
yet, but if it were contrib, would it still be better to use fixed
OIDs anyway?

Part of the point is that EXPLAIN (FORMAT JSON) should return json, so
this needs to be built-in. Otherwise, that won't work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#7Andrew Dunstan
andrew@dunslane.net
In reply to: Joseph Adams (#4)
Re: Working with PostgreSQL enums in C code

Joseph Adams wrote:

By the way, just curious: why can't the internal representation of an
enum just be an INT starting from 0 by default, like in C? That would
make a heck of a lot more sense, in my opinion. It might also allow
users to do things like this in the future:

Please review the debates over the internal representation from several
years ago when enums were implemented. Essentially the difficulty is
that the output function needs to get nothing more than the value
itself, and that means the representation needs to carry with it some
information about *which* enum set it is in.

cheers

andrew