DOMAIN/composite TYPE vs. base TYPE
Hello,
I'm considering creating a TYPE for what may be called a "possibly
imprecise date" (pidate). The most obvious use is for recording dates
such as births or deaths of historical individuals, where we may know
that someone died precisely on a given year-month-day, but the birth may
only be known down to year-month or just the year (or perhaps we know
precisely the baptism date [Adam Smith], but not the actual birth, so we
want to record the former but qualified so it can be annotated on
display). Another use is for publications, like magazines that are
issued on a monthly basis or journals that are issued on a quarterly or
seasonal basis.
We currently have two instances of this kind, using a standard DATE
column plus a CHAR(1) column that encodes (on a limited basis for now)
the YMD, YM or Y level of precision, and a simple SQL function to return
a textual representation of the pidate. It would be nice to generalize
this before going further.
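A minimal sketch of the two-column arrangement described above. The table, the function name, and the precision codes 'D', 'M', 'Y' are assumptions for illustration; the actual schema is not shown in this thread.

```sql
-- Hypothetical table: a DATE plus a one-character precision code.
CREATE TABLE person (
    name       text NOT NULL,
    birth_date date,
    birth_prec char(1) NOT NULL DEFAULT 'D'
               CHECK (birth_prec IN ('Y', 'M', 'D'))
);

-- Render the date only down to the recorded precision.
-- Declared STABLE because to_char() on a timestamp is itself STABLE.
CREATE FUNCTION pidate_text(d date, prec char(1)) RETURNS text
LANGUAGE sql STABLE AS $$
    SELECT CASE prec
               WHEN 'Y' THEN to_char(d, 'YYYY')
               WHEN 'M' THEN to_char(d, 'YYYY-MM')
               ELSE          to_char(d, 'YYYY-MM-DD')
           END
$$;

SELECT pidate_text(DATE '2020-07-01', 'M');  -- 2020-07
```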
The first option I explored was creating a composite type with the two
attributes, but that doesn't allow specification of DEFAULTs, NOT NULL
or CHECK expressions on the precision code attribute. It seems I'd have
to create a DOMAIN first, then use DATE and that domain to create a
composite TYPE, to finally use the latter in actual tables. That
layering looks cumbersome.
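The layering described above might look roughly like this. All names are hypothetical, and the precision codes are the same illustrative assumption ('Y', 'M', 'D').

```sql
-- Step 1: a domain that carries the CHECK on the precision code.
CREATE DOMAIN pidate_prec AS char(1)
    CHECK (VALUE IN ('Y', 'M', 'D'));

-- Step 2: a composite type built from DATE and that domain.
CREATE TYPE pidate AS (
    d    date,
    prec pidate_prec
);

-- Step 3: use the composite type in an actual table.
CREATE TABLE author (
    name  text NOT NULL,
    birth pidate
);

INSERT INTO author VALUES ('example', ROW(DATE '2020-07-01', 'M'));

-- Expected to be rejected by the domain's CHECK on the attribute:
-- INSERT INTO author VALUES ('bad', ROW(DATE '2020-07-01', 'X'));
```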
Another option, which I haven't tried, is to subvert PG by creating an
empty table, since that creates a "record type", but even if possible
that would be a hack.
Finally there's the base TYPE. This entails writing some seven
functions "in C or another low-level language" (does PG support *any*
other such language?), plus installing a library with those functions in
a production environment. Doable, yes, but not very friendly either.
Am I overlooking something or is the practice of creating abstractions
in object-relational databases mostly unchanged?
Regards,
Joe
On Sep 28, 2020, at 3:14 PM, Joe Abbate <jma@freedomcircle.com> wrote:
Hello,
I'm considering creating a TYPE for what may be called a "possibly imprecise date" (pidate). The most obvious use is for recording dates such as births or deaths of historical individuals, where we may know that someone died precisely on a given year-month-day, but the birth may only be known down to year-month or just the year (or perhaps we know precisely the baptism date [Adam Smith], but not the actual birth, so we want to record the former but qualified so it can be annotated on display). Another use is for publications, like magazines that are issued on a monthly basis or journals that are issued on a quarterly or seasonal basis.
We currently have two instances of this kind, using a standard DATE column plus a CHAR(1) column that encodes (on a limited basis for now) the YMD, YM or Y level of precision, and a simple SQL function to return a textual representation of the pidate. It would be nice to generalize this before going further.
The first option I explored was creating a composite type with the two attributes, but that doesn't allow specification of DEFAULTs, NOT NULL or CHECK expressions on the precision code attribute. It seems I'd have to create a DOMAIN first, then use DATE and that domain to create a composite TYPE, to finally use the latter in actual tables. That layering looks cumbersome.
Another option, which I haven't tried, is to subvert PG by creating an empty table, since that creates a "record type", but even if possible that would be a hack.
Finally there's the base TYPE. This entails writing some seven functions "in C or another low-level language" (does PG support *any* other such language?), plus installing a library with those functions in a production environment. Doable, yes, but not very friendly either.
Am I overlooking something or is the practice of creating abstractions in object-relational databases mostly unchanged?
Regards,
Joe
just record all three fields (day, month, year) with nulls and do the to-date as needed.
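One way to read this suggestion, as a sketch only: the table and column names are hypothetical, and requiring the year to be present is an added assumption.

```sql
-- Hypothetical table: unknown components are stored as NULL.
CREATE TABLE event (
    yr  integer NOT NULL,
    mon integer CHECK (mon BETWEEN 1 AND 12),
    dy  integer CHECK (dy BETWEEN 1 AND 31),
    -- A day without a month makes no sense.
    CHECK (mon IS NOT NULL OR dy IS NULL)
);

INSERT INTO event VALUES (2020, 7, 15), (2020, 7, NULL), (2020, NULL, NULL);

-- Build a real DATE on demand, defaulting unknown parts to 1.
SELECT yr, mon, dy,
       make_date(yr, coalesce(mon, 1), coalesce(dy, 1)) AS as_date
FROM event;
```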
Joe Abbate <jma@freedomcircle.com> writes:
I'm considering creating a TYPE for what may be called a "possibly
imprecise date" (pidate).
The first option I explored was creating a composite type with the two
attributes, but that doesn't allow specification of DEFAULTs, NOT NULL
or CHECK expressions on the precision code attribute. It seems I'd have
to create a DOMAIN first, then use DATE and that domain to create a
composite TYPE, to finally use the latter in actual tables. That
layering looks cumbersome.
Agreed.
Another option, which I haven't tried, is to subvert PG by creating an
empty table, since that creates a "record type", but even if possible
that would be a hack.
Won't help. Even if the table has constraints, when its rowtype is used
in a standalone context, it only has the features that a standalone
composite type would have (ie, no constraints).
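A small sketch of that behaviour, using a hypothetical table name and illustrative precision codes: the CHECK guards rows stored in the table, but not values of the table's row type used on their own.

```sql
-- Hypothetical table whose row type is borrowed as a "record type".
CREATE TABLE pidate_tbl (
    d    date,
    prec char(1) CHECK (prec IN ('Y', 'M', 'D'))
);

-- Storing a row in the table itself enforces the CHECK:
-- INSERT INTO pidate_tbl VALUES (DATE '2020-07-01', 'X');  -- fails

-- Using the row type standalone does not: this succeeds even
-- though 'X' would violate the table's constraint.
SELECT ROW(DATE '2020-07-01', 'X')::pidate_tbl;
```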
Am I overlooking something or is the practice of creating abstractions
in object-relational databases mostly unchanged?
Domain-over-composite might be a slightly simpler answer than your first
one. It's only available in relatively late-model PG, and I'm not sure
about its performance relative to your other design, but it is an
alternative to think about.
Note that attaching NOT NULL constraints at the domain level is almost
never a good idea, because then you find yourself with a semantically
impossible situation when, say, a column of that type is on the nullable
side of an outer join. We allow such constraints, but they will be
nominally violated in cases like that.
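The outer-join situation can be reproduced with a sketch like the following, where the domain and table names are hypothetical.

```sql
CREATE DOMAIN nn_code AS char(1) NOT NULL;

CREATE TABLE a (id integer PRIMARY KEY);
CREATE TABLE b (id integer PRIMARY KEY, code nn_code);

INSERT INTO a VALUES (1), (2);
INSERT INTO b VALUES (1, 'Y');

-- For id = 2 there is no matching row in b, so b.code comes back
-- NULL even though its declared type is a NOT NULL domain.
SELECT a.id, b.code, b.code IS NULL AS code_is_null
FROM a
LEFT JOIN b ON a.id = b.id
ORDER BY a.id;
```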
regards, tom lane
Hello Rob,
On 28/9/20 17:17, Rob Sargent wrote:
just record all three fields (day, month, year) with nulls and do the to-date as needed.
That is not sufficient. An earlier implementation had something like a
CHAR(8) to record YYYYMMDD, but how can you indicate, for example, an
issue date of a bimonthly magazine, say July-Aug 2020? We can store
2020-07-01 in the DATE attribute, but we need another attribute to
indicate it's really two months. Also, by storing three separate
columns, you lose the beauty of the PG DATE abstraction.
Joe
On 9/28/20 4:31 PM, Joe Abbate wrote:
Hello Rob,
On 28/9/20 17:17, Rob Sargent wrote:
just record all three fields (day, month, year) with nulls and do the
to-date as needed.
That is not sufficient. An earlier implementation had something like a
CHAR(8) to record YYYYMMDD, but how can you indicate, for example, an
issue date of a bimonthly magazine, say July-Aug 2020? We can store
2020-07-01 in the DATE attribute, but we need another attribute to
indicate it's really two months. Also, by storing three separate columns,
you lose the beauty of the PG DATE abstraction.
The Gramps <https://gramps-project.org/blog/> genealogy program has figured
it out; maybe its source code can lend you some clues.
--
Angular momentum makes the world go 'round.
On 29 Sep 2020, at 7:31, Joe Abbate wrote:
Hello Rob,
On 28/9/20 17:17, Rob Sargent wrote:
just record all three fields (day, month, year) with nulls and do the
to-date as needed.
That is not sufficient. An earlier implementation had something like
a CHAR(8) to record YYYYMMDD, but how can you indicate, for example,
an issue date of a bimonthly magazine, say July-Aug 2020? We can
store 2020-07-01 in the DATE attribute, but we need another attribute
to indicate it's really two months. Also, by storing three separate
columns, you lose the beauty of the PG DATE abstraction.
This is only a partial “fix” and goes nowhere near solving the full
wrapper/abstraction problem…
Consider expressing all the component fields as a range. This gives you
the ability to be as precise as you need and still have the benefits of
well-defined comparison functions.
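As a sketch of the range idea using PostgreSQL's built-in daterange type (the dates are arbitrary examples, and the column aliases are illustrative only):

```sql
SELECT
    -- "July-Aug 2020" as a half-open range [2020-07-01, 2020-09-01)
    daterange('2020-07-01', '2020-09-01') AS jul_aug_2020,
    -- A precisely known day is a one-day range
    daterange('2020-07-15', '2020-07-16') AS exact_day,
    -- Containment: does the bimonthly issue cover 10 August?
    daterange('2020-07-01', '2020-09-01') @> DATE '2020-08-10' AS covers_day,
    -- Ordering: is all of 2019 strictly before the issue period?
    daterange('2019-01-01', '2020-01-01')
        << daterange('2020-07-01', '2020-09-01') AS strictly_before;
```

The width of the range then encodes the precision, so no separate code column is needed, at the cost of storing a range even for the common single-day case.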
Regards
Gavan Schneider
——
Gavan Schneider, Sodwalls, NSW, Australia
Hello Tom,
On 28/9/20 17:25, Tom Lane wrote:
Domain-over-composite might be a slightly simpler answer than your first
one. It's only available in relatively late-model PG, and I'm not sure
about its performance relative to your other design, but it is an
alternative to think about.
"Domain-over-composite" meaning create a TYPE first (DATE, CHAR(1)) and
then a DOMAIN based on that type? (1) How late model are we talking?
The DOMAIN syntax doesn't seem changed from PG 11 to PG 13? (2) Can a
CHECK constraint specify attributes of the composite?
Note that attaching NOT NULL constraints at the domain level is almost
never a good idea, because then you find yourself with a semantically
impossible situation when, say, a column of that type is on the nullable
side of an outer join. We allow such constraints, but they will be
nominally violated in cases like that.
NULLs: Tony Hoare's "billion dollars of pain and damage" transported to SQL.
Joe
Hello Gavan,
On 28/9/20 17:52, Gavan Schneider wrote:
Consider expressing all the component fields as a range. This gives you
the ability to be as precise as you need and still have the benefits of
well-defined comparison functions.
I did consider that, but it's a tradeoff between 80% of the cases being
a single precise date, 18% being a single date with some imprecision
around a single month, and the rest with "ranges" of months or other
intervals.
Joe
On 9/28/20 2:58 PM, Joe Abbate wrote:
Hello Tom,
On 28/9/20 17:25, Tom Lane wrote:
Domain-over-composite might be a slightly simpler answer than your first
one. It's only available in relatively late-model PG, and I'm not sure
about its performance relative to your other design, but it is an
alternative to think about.
"Domain-over-composite" meaning create a TYPE first (DATE, CHAR(1)) and
then a DOMAIN based on that type? (1) How late model are we talking?
The DOMAIN syntax doesn't seem changed from PG 11 to PG 13? (2) Can a
CHECK constraint specify attributes of the composite?
Note that attaching NOT NULL constraints at the domain level is almost
never a good idea, because then you find yourself with a semantically
impossible situation when, say, a column of that type is on the nullable
side of an outer join. We allow such constraints, but they will be
nominally violated in cases like that.
NULLs: Tony Hoare's "billion dollars of pain and damage" transported to
SQL.
Except that the case Tom is talking about would occur due to something like:
select * from table_a left join table_b on table_a.id = table_b.id where
table_b.id is null;
That has been very useful to me, and I'm not sure how anything you
replace NULL with to represent 'unknown' would change the situation.
Joe
--
Adrian Klaver
adrian.klaver@aklaver.com
Joe Abbate <jma@freedomcircle.com> writes:
On 28/9/20 17:25, Tom Lane wrote:
Domain-over-composite might be a slightly simpler answer than your first
one. It's only available in relatively late-model PG, and I'm not sure
about its performance relative to your other design, but it is an
alternative to think about.
"Domain-over-composite" meaning create a TYPE first (DATE, CHAR(1)) and
then a DOMAIN based on that type?
Right.
regression=# create type t1 as (d date, t char(1));
CREATE TYPE
regression=# create domain dt1 as t1 check((value).t in ('a', 'b'));
CREATE DOMAIN
(1) How late model are we talking?
The DOMAIN syntax doesn't seem changed from PG 11 to PG 13?
Back to 11, looks like. The syntax didn't change, but v10 complains
ERROR: "t1" is not a valid base type for a domain
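Continuing from the t1/dt1 definitions above, a usage sketch on PostgreSQL 11 or later. The table name and the sample values are assumptions for illustration.

```sql
-- Same definitions as in the session above.
CREATE TYPE t1 AS (d date, t char(1));
CREATE DOMAIN dt1 AS t1 CHECK ((VALUE).t IN ('a', 'b'));

-- Hypothetical table using the domain as a column type.
CREATE TABLE issue (
    title  text NOT NULL,
    issued dt1
);

INSERT INTO issue VALUES ('ok', ROW(DATE '2020-07-01', 'a')::dt1);

-- Expected to fail the domain's CHECK, since 'z' is not allowed:
-- INSERT INTO issue VALUES ('bad', ROW(DATE '2020-07-01', 'z')::dt1);

-- Individual attributes are reached with the (column).field syntax.
SELECT title, (issued).d, (issued).t FROM issue;
```

Because a CHECK passes when its expression evaluates to NULL, a NULL in the t attribute is accepted by this constraint unless the expression also tests for it.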
Note that attaching NOT NULL constraints at the domain level is almost
never a good idea, because then you find yourself with a semantically
impossible situation when, say, a column of that type is on the nullable
side of an outer join. We allow such constraints, but they will be
nominally violated in cases like that.
NULLs: Tony Hoare's "billion dollars of pain and damage" transported to SQL.
I dunno, outer joins are awfully useful. It is true that the SQL
committee has stuck too many not-quite-consistent meanings on NULL,
but on the other hand, several different kinds of NULL might be
worse.
regards, tom lane