Refactor construct_array() and deconstruct_array() for built-in types

Started by Peter Eisentrautalmost 4 years ago7 messageshackers
Jump to latest
#1Peter Eisentraut
peter_e@gmx.net

[for PG16]

There are many calls to construct_array() and deconstruct_array() for
built-in types, for example, when dealing with system catalog columns.
These all hardcode the type attributes necessary to pass to these functions.

To simplify this a bit, add construct_array_builtin(),
deconstruct_array_builtin() as wrappers that centralize this hardcoded
knowledge. This simplifies many call sites and reduces the amount of
hardcoded stuff that is spread around.

I also considered having genbki.pl generate lookup tables for these
hardcoded values, similar to schemapg.h, but that ultimately seemed
excessive.

Thoughts?

Attachments:

0001-Add-construct_array_builtin-deconstruct_array_builti.patchtext/plain; charset=UTF-8; name=0001-Add-construct_array_builtin-deconstruct_array_builti.patchDownload+276-299
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#1)
Re: Refactor construct_array() and deconstruct_array() for built-in types

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

There are many calls to construct_array() and deconstruct_array() for
built-in types, for example, when dealing with system catalog columns.
These all hardcode the type attributes necessary to pass to these functions.

To simplify this a bit, add construct_array_builtin(),
deconstruct_array_builtin() as wrappers that centralize this hardcoded
knowledge. This simplifies many call sites and reduces the amount of
hardcoded stuff that is spread around.

I also considered having genbki.pl generate lookup tables for these
hardcoded values, similar to schemapg.h, but that ultimately seemed
excessive.

+1 --- the added overhead of the switch statements is probably a
reasonable price to pay for the notational simplification and
bug-proofing.

One minor coding gripe is that compilers that don't know that elog(ERROR)
doesn't return will certainly generate "use of possibly-uninitialized
variable" complaints. Suggest inserting "return NULL;" or similar into
the default: cases. I'd also use more specific error wording to help
people find where they need to add code when they make use of a new type;
maybe like "type %u not supported by construct_array_builtin".

regards, tom lane

#3Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#2)
Re: Refactor construct_array() and deconstruct_array() for built-in types

On 02.05.22 16:48, Tom Lane wrote:

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

There are many calls to construct_array() and deconstruct_array() for
built-in types, for example, when dealing with system catalog columns.
These all hardcode the type attributes necessary to pass to these functions.

To simplify this a bit, add construct_array_builtin(),
deconstruct_array_builtin() as wrappers that centralize this hardcoded
knowledge. This simplifies many call sites and reduces the amount of
hardcoded stuff that is spread around.

I also considered having genbki.pl generate lookup tables for these
hardcoded values, similar to schemapg.h, but that ultimately seemed
excessive.

+1 --- the added overhead of the switch statements is probably a
reasonable price to pay for the notational simplification and
bug-proofing.

One minor coding gripe is that compilers that don't know that elog(ERROR)
doesn't return will certainly generate "use of possibly-uninitialized
variable" complaints. Suggest inserting "return NULL;" or similar into
the default: cases. I'd also use more specific error wording to help
people find where they need to add code when they make use of a new type;
maybe like "type %u not supported by construct_array_builtin".

I have pushed this with the improvements you had suggested.

In reply to: Peter Eisentraut (#3)
Re: Refactor construct_array() and deconstruct_array() for built-in types

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

On 02.05.22 16:48, Tom Lane wrote:

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

There are many calls to construct_array() and deconstruct_array() for
built-in types, for example, when dealing with system catalog columns.
These all hardcode the type attributes necessary to pass to these functions.

To simplify this a bit, add construct_array_builtin(),
deconstruct_array_builtin() as wrappers that centralize this hardcoded
knowledge. This simplifies many call sites and reduces the amount of
hardcoded stuff that is spread around.

I also considered having genbki.pl generate lookup tables for these
hardcoded values, similar to schemapg.h, but that ultimately seemed
excessive.

+1 --- the added overhead of the switch statements is probably a
reasonable price to pay for the notational simplification and
bug-proofing.
One minor coding gripe is that compilers that don't know that
elog(ERROR)
doesn't return will certainly generate "use of possibly-uninitialized
variable" complaints. Suggest inserting "return NULL;" or similar into
the default: cases. I'd also use more specific error wording to help
people find where they need to add code when they make use of a new type;
maybe like "type %u not supported by construct_array_builtin".

I have pushed this with the improvements you had suggested.

I dind't pay much attention to this thread earlier, but I was struck by
the duplication of the switch statement determining the elemlen,
elembyval, and elemalign values between the construct and deconstruct
functions. How about a common function they can both call? Something
like:

static void builtin_type_details(Oid elemtype,
int *elemlen,
bool *elembyval,
char *elemalign);

- ilmari

In reply to: Dagfinn Ilmari Mannsåker (#4)
Re: Refactor construct_array() and deconstruct_array() for built-in types

Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> writes:

I dind't pay much attention to this thread earlier, but I was struck by
the duplication of the switch statement determining the elemlen,
elembyval, and elemalign values between the construct and deconstruct
functions. How about a common function they can both call? Something
like:

static void builtin_type_details(Oid elemtype,
int *elemlen,
bool *elembyval,
char *elemalign);

I just realised that this would require the error message to not include
the function name (which isn't really that critical, since it's a
developer-facing message), but an option would to make it return false
for unknown types, so each of the calling functions can emit their own
error message.

Show quoted text

- ilmari

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dagfinn Ilmari Mannsåker (#5)
Re: Refactor construct_array() and deconstruct_array() for built-in types

=?utf-8?Q?Dagfinn_Ilmari_Manns=C3=A5ker?= <ilmari@ilmari.org> writes:

Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> writes:

I dind't pay much attention to this thread earlier, but I was struck by
the duplication of the switch statement determining the elemlen,
elembyval, and elemalign values between the construct and deconstruct
functions. How about a common function they can both call?

I was wondering about that too while reading the committed version.
However, adding an additional function call would weaken the argument
that this adds just a tolerable amount of overhead, primarily because
you'd need to return a record or introduce pointers or the like.

I just realised that this would require the error message to not include
the function name (which isn't really that critical, since it's a
developer-facing message), but an option would to make it return false
for unknown types, so each of the calling functions can emit their own
error message.

Nah, because the point of that was just to direct people to where
to fix it when they need to. So the message need only refer to
the common function, if we were to change it.

Perhaps a good compromise could be to turn the duplicated code into
a macro that's instantiated in both places? But I don't actually
see anything much wrong with the code as Peter has it.

regards, tom lane

#7Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#6)
Re: Refactor construct_array() and deconstruct_array() for built-in types

On 01.07.22 15:37, Tom Lane wrote:

Perhaps a good compromise could be to turn the duplicated code into
a macro that's instantiated in both places? But I don't actually
see anything much wrong with the code as Peter has it.

There are opportunities to refine this further. For example, there is
similar code in TupleDescInitBuiltinEntry(), and bootstrap.c also
contains hardcoded info on built-in types, and GetCCHashEqFuncs() is
also loosely related. As I mentioned earlier in the thread, one could
have genbki.pl generate support code for this.