FW: bitemporal functionality for PostgreSQL

Started by Luke Porterabout 18 years ago12 messageshackers
Jump to latest
#1Luke Porter
luke_porter@hotmail.com

All

Is there an interest in developing bitemporal functionality in PostgreSQL

Regards

Luke > From: bruce@momjian.us> Subject: Re: bitemporal functionality for PostgreSQL> To: luke_porter@hotmail.com> Date: Fri, 1 Feb 2008 10:08:03 -0500> > Luke Porter wrote:> > > > Bruce> > > > I have a compelling solution to provide a database with bitemporal> > functionality. It is a comprehensive spec. Is there any interest> > in PostgreSQL providing this functionality. The area is sometimes> > referred to as "temporal agility". It has recently been covered> > as an emerging requirement by Garther and C J Date has recently> > lectured on the area (My solution was demo'd at the same event> > he spoke at). It is a transforming approach to data management> > in that it completely removes the need for the developer to> > address the time dimension - the database does it all.> > Yes, I think there would be interest. I would ask on the hackers email> list.> > --> Bruce Momjian <bruce@momjian.us> http://momjian.us&gt; EnterpriseDB http://postgres.enterprisedb.com&gt; > + If your life is a hard drive, Christ can be your backup. +
_________________________________________________________________
Telly addicts unite!
http://www.searchgamesbox.com/tvtown.shtml

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Luke Porter (#1)
Re: FW: bitemporal functionality for PostgreSQL

Luke Porter <luke_porter@hotmail.com> writes:

Is there an interest in developing bitemporal functionality in PostgreSQL

Is that anything like the late, mostly unlamented "time travel" feature?

regards, tom lane

PS: in general, defining what you want with one word and assuming
everyone will grok the full meaning and implications of that is a good
way to get ignored on this list. Assuming we have never heard of
"bitemporal" is a better starting point.

#3A.M.
agentm@themactionfaction.com
In reply to: Luke Porter (#1)
Re: FW: bitemporal functionality for PostgreSQL

On Feb 1, 2008, at 10:42 AM, Luke Porter wrote:

All

Is there an interest in developing bitemporal functionality in
PostgreSQL

Regards

Luke

I can only speak for myself, but- definitely! Based on the googling I
did on "bitemporal database", I kind of do this already with
PostgreSQL. Some of my tables are insert-only and each row includes a
committed time timestamp. That way, I don't need a separate audit log
table, and "fixing" someone's mistake is as simple as copying old
rows. The downside to this is that I need a view to represent the
current "truth" and calculating the truth is more expensive than a
simple table would be.

Can you explain in more detail or provide references to how
PostgreSQL could potentially handle temporal data better?

One idea I had would be to blow the transaction ID up to 128 bits (no
more wrapping!) and have it represent the nanoseconds since the epoch.

Cheers,
M

#4Jeff Davis
pgsql@j-davis.com
In reply to: Luke Porter (#1)
Re: FW: bitemporal functionality for PostgreSQL

On Fri, 2008-02-01 at 15:42 +0000, Luke Porter wrote:

All

Is there an interest in developing bitemporal functionality in
PostgreSQL

I am very interested in this topic, and I maintain the pgsql-temporal
project at:

http://pgfoundry.org/projects/temporal/

It's missing good docs and a few other things that I'd like, but it
provides a good time interval type, including lots of useful operators,
and GiST index support functions.

For instance, you can do queries like:

SELECT att1 FROM mytable WHERE during @> '2001-05-11
01:01:01'::timestamptz;

which is a simple way to get all records where "during" contains the
point in time '2001-05-11 01:01:01'. It's also indexable with GiST,
meaning that query will perform well in a variety of situations.

I'm going to spend some time getting the docs up to speed so people can
actually use it.

Regards,
Jeff Davis

#5Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Jeff Davis (#4)
Re: FW: bitemporal functionality for PostgreSQL

Jeff Davis wrote:

On Fri, 2008-02-01 at 15:42 +0000, Luke Porter wrote:

All

Is there an interest in developing bitemporal functionality in
PostgreSQL

I am very interested in this topic, and I maintain the pgsql-temporal
project at:

http://pgfoundry.org/projects/temporal/

It's missing good docs and a few other things that I'd like, but it
provides a good time interval type, including lots of useful operators,
and GiST index support functions.

For instance, you can do queries like:

SELECT att1 FROM mytable WHERE during @> '2001-05-11
01:01:01'::timestamptz;

which is a simple way to get all records where "during" contains the
point in time '2001-05-11 01:01:01'. It's also indexable with GiST,
meaning that query will perform well in a variety of situations.

I don't know what "bitemporal" is all about, but to me, the
pgsql-temporal approach is clearly the most flexible and attractive.
Good interval handling is not limited to just time, BTW, there's other
applications that deal with intervals of other types like floats, though
time intervals are the most common.

We can get pretty far with a pgfoundry project, providing a good
interval data type with operators, but there's a few things that would
need backend support. For example, queries like:

SELECT * FROM mytable1;
timeatt
-------
10 - 20
30 - 40

SELECT * FROM mytable2,
timeatt
-------
15 - 35

SELECT * FROM mytable1
INTERSECT
SELECT * FROM mytable2;
timeatt
-------
15 - 20
30 - 35

as well as declaring uniqueness constraints, so that there's no rows
with overlapping intervals, foreign key references on intervals etc.

I would suggest a book called "Temporal Data and the Relational Model",
by C. J. Date, Hugh Darwen and Nikos A Lorentzos to anyone who's
interested in temporal issues. It presents a very elegant solution to
temporal issues, fully compatible with the relational model. Unlike
"time travel" kind of approaches.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#6Jeff Davis
pgsql@j-davis.com
In reply to: Heikki Linnakangas (#5)
Re: FW: bitemporal functionality for PostgreSQL

On Sun, 2008-02-03 at 10:48 +0000, Heikki Linnakangas wrote:

I don't know what "bitemporal" is all about, but to me, the
pgsql-temporal approach is clearly the most flexible and attractive.

I think what he means by "bitemporal" is what CJ Date, et al., refer to
as "fully temporal" (as opposed to semi-temporal), that is, dealing with
time intervals rather than time points.

SELECT * FROM mytable1;
timeatt
-------
10 - 20
30 - 40

SELECT * FROM mytable2,
timeatt
-------
15 - 35

SELECT * FROM mytable1
INTERSECT
SELECT * FROM mytable2;
timeatt
-------
15 - 20
30 - 35

as well as declaring uniqueness constraints, so that there's no rows
with overlapping intervals, foreign key references on intervals etc.

I'd like to see those things as well.

I think it would be relatively straightforward to define some kind of
generalized join operator (of which INTERSECT can be a special case),
but optimizing it properly would be difficult. We could extend type
definitions to optionally include the operators necessary to perform the
generalized join. We could do similar things for other relational
operators.

I think for the key constraint (which is not the same as uniqueness),
we'd need to make a new index access method (or perhaps extend GiST
somehow).

If you have any ideas, I'd like to hear them. Obviously, my ideas are
still far away from a feature proposal.

I would suggest a book called "Temporal Data and the Relational Model",
by C. J. Date, Hugh Darwen and Nikos A Lorentzos to anyone who's
interested in temporal issues. It presents a very elegant solution to
temporal issues, fully compatible with the relational model. Unlike
"time travel" kind of approaches.

I second this statement. That's an excellent book, both analytical and
practical. I think my time interval type is fairly faithful to that
approach (although as I said, I really need to document it properly).

Regards,
Jeff Davis

#7Greg Smith
gsmith@gregsmith.com
In reply to: Jeff Davis (#6)
Re: FW: bitemporal functionality for PostgreSQL

On Sun, 3 Feb 2008, Jeff Davis wrote:

I think what he means by "bitemporal" is what CJ Date, et al., refer to
as "fully temporal" (as opposed to semi-temporal), that is, dealing with
time intervals rather than time points.

I think fully temporal is a step of complexity above how some people use
bitemporal but this terminology is slippery.

Bitemporal databases are ones where every transaction gets two timestamps:
one at transaction commit and a second that encodes what time that
transaction is valid as of. A classic example uses a bitemporal employee
promotion table to track the company org chart. With the "valid as of"
timestamp in there, if you structure the query right you can generate a
report as of any particular point in time and find out what the structure
of the organization was at that point. There's all sorts of applications
where being able to track the state not just of the current table but of
where it was at a previous/future time can be handy, and bitemporal
approaches can be easier to handle than something like a history table.

While there's useful syntax to add, I believe one challenge of bitemporal
databases from the -hackers perspective involves how to accelerate the
typical queries people run. Here's some research that looks into things
like how to index the data usefully, and that gives more references into
the bitemp literature: http://www.cs.ucr.edu/~tsotras/temporal.html

I would suggest a book called "Temporal Data and the Relational Model",
by C. J. Date, Hugh Darwen and Nikos A Lorentzos to anyone who's
interested in temporal issues.

I think you need to be familiar with the work set down in both that one
and the Snodgrass/Jensen "Developing Time-Oriented Database Applications
in SQL" before you can even start do anything that's actually new in this
area. Bitemporal tables show up early in that book (P44 of the PDF
http://www.cs.arizona.edu/people/rts/tdbbook.pdf ) The way they use them,
the valid and transaction times are both intervals rather than points,
which I think makes the implementation there look more like Date's
fully-temporal tables.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

#8Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Greg Smith (#7)
Re: FW: bitemporal functionality for PostgreSQL

Le dimanche 03 février 2008, Greg Smith a écrit :

On Sun, 3 Feb 2008, Jeff Davis wrote:

I think what he means by "bitemporal" is what CJ Date, et al., refer to
as "fully temporal" (as opposed to semi-temporal), that is, dealing with
time intervals rather than time points.

I would suggest a book called "Temporal Data and the Relational Model",
by C. J. Date, Hugh Darwen and Nikos A Lorentzos to anyone who's
interested in temporal issues.

I think you need to be familiar with the work set down in both that one
and the Snodgrass/Jensen "Developing Time-Oriented Database Applications
in SQL" before you can even start do anything that's actually new in this
area. Bitemporal tables show up early in that book (P44 of the PDF
http://www.cs.arizona.edu/people/rts/tdbbook.pdf )

I found the following document quite useful to grasp the concepts involved, it
allowed me to decide whether I needed bitemporal feature or not (was not) :)
http://rueping.info/doc/Andreas Rüping -- 2D History.pdf

Hope this helps, regards,
--
dim

#9Jeff Davis
pgsql@j-davis.com
In reply to: Greg Smith (#7)
Re: FW: bitemporal functionality for PostgreSQL

On Sun, 2008-02-03 at 15:56 -0500, Greg Smith wrote:

On Sun, 3 Feb 2008, Jeff Davis wrote:

I think what he means by "bitemporal" is what CJ Date, et al., refer to
as "fully temporal" (as opposed to semi-temporal), that is, dealing with
time intervals rather than time points.

Bitemporal databases are ones where every transaction gets two timestamps:
one at transaction commit and a second that encodes what time that
transaction is valid as of.

Ah, so "bitemporal" means both valid and transactional time stored in
the relation. I believe the definition of "fully temporal" I was
referring to was that it used intervals rather than time points.

It's interesting that you're using time points to represent that data. I
would think time intervals would be much more useful.

I think you need to be familiar with the work set down in both that one
and the Snodgrass/Jensen "Developing Time-Oriented Database Applications
in SQL" before you can even start do anything that's actually new in this
area. Bitemporal tables show up early in that book (P44 of the PDF
http://www.cs.arizona.edu/people/rts/tdbbook.pdf ) The way they use them,
the valid and transaction times are both intervals rather than points,
which I think makes the implementation there look more like Date's
fully-temporal tables.

Although I haven't read the whole Snodgrass book, out of what I did read
it seemed much less helpful. They mix a lot of the analysis of time data
management with SQL idiosyncrasies (and specific implementations), which
is not nearly so useful in something like PostgreSQL where we can make
our own data types.

Also, they use "period" to mean interval, and "interval" to mean
duration -- which is wrong, in my opinion; interval already has a well-
defined mathematical meaning.

Regards,
Jeff Davis

#10Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Jeff Davis (#9)
Re: FW: bitemporal functionality for PostgreSQL

Jeff Davis wrote:

Also, they use "period" to mean interval, and "interval" to mean
duration -- which is wrong, in my opinion; interval already has a well-
defined mathematical meaning.

Agreed, but that mistake actually originates from the SQL standard. The
SQL INTERVAL data type is really a duration, so I guess they had to come
up with a new term for interval in the mathematical meaning.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#110123 zyxw
0123zyxw@gmail.com
In reply to: Heikki Linnakangas (#10)
Re: FW: bitemporal functionality for PostgreSQL

Heikki Linnakangas wrote:

Jeff Davis wrote:

Also, they use "period" to mean interval, and "interval" to mean
duration -- which is wrong, in my opinion; interval already has a well-
defined mathematical meaning.

Agreed, but that mistake actually originates from the SQL standard. The

And plenty of other ISO standards for around the same time, such
as ISO 8601 which defines ways to specify both intervals that have
Start and End times as well as Duration Only intervals.

#12Jeff Davis
pgsql@j-davis.com
In reply to: Heikki Linnakangas (#10)
Re: FW: bitemporal functionality for PostgreSQL

On Mon, 2008-02-04 at 20:50 +0000, Heikki Linnakangas wrote:

Jeff Davis wrote:

Also, they use "period" to mean interval, and "interval" to mean
duration -- which is wrong, in my opinion; interval already has a well-
defined mathematical meaning.

Agreed, but that mistake actually originates from the SQL standard. The
SQL INTERVAL data type is really a duration, so I guess they had to come
up with a new term for interval in the mathematical meaning.

Fair enough. However, that is just an example of the SQL idiosyncrasies
that I was referring to.

I think that kind of thing gets in the way a lot more than it helps,
which is why I found Temporal Data and the Relational Model so
refreshing.

Regards,
Jeff Davis