How to deal with almost recurring data?

Started by Matthias Kleinover 21 years ago6 messagesgeneral
Jump to latest
#1Matthias Klein
matthias@cmklein.de

I am creating a database which is supposed to contain many data entries
(events) that differ only in the date they occur.

So let's say event 1 occurs every Monday, Tuesday and Sunday between
January 1st and May 30th 2005.

How do I store and manage such data in a meaningful way?

The simple idea would be to store the event itself in one table and have
another table containing all the dates (all Mondays, Tuesdays and Sundays
between 2005-01-01 and 2005-05-30) plus a foreign key to event_ID =>
(date, event_id).

The problem is that we are dealing with several tenthousand events,
resulting in several million single dates if I stored it in the described
manner.

That is why I would like to know if there is a better way to store and
manage such information?

Thanks

Matt

#2Csaba Nagy
nagy@ecircle-ag.com
In reply to: Matthias Klein (#1)
Re: How to deal with almost recurring data?

I would say it all depends on what you want to do with the data.
If you want to look up all the possible occurences for an event, it
might be useful to have the simple solution you described. If you will
only look up the next n occurences starting from a given date, you might
be better off storing the rule to derive the dates, and then calculate
them in your software, but this will make your software more complicated
for sure (dealing with date arithmetics can be incredibly complex, I've
been there). I would go with the simple solution as long as there is
enough space in the DB to store all the occurences...

HTH,
Csaba.

Show quoted text

On Tue, 2004-11-16 at 15:53, matthias@cmklein.de wrote:

I am creating a database which is supposed to contain many data entries
(events) that differ only in the date they occur.

So let's say event 1 occurs every Monday, Tuesday and Sunday between
January 1st and May 30th 2005.

How do I store and manage such data in a meaningful way?

The simple idea would be to store the event itself in one table and have
another table containing all the dates (all Mondays, Tuesdays and Sundays
between 2005-01-01 and 2005-05-30) plus a foreign key to event_ID =>
(date, event_id).

The problem is that we are dealing with several tenthousand events,
resulting in several million single dates if I stored it in the described
manner.

That is why I would like to know if there is a better way to store and
manage such information?

Thanks

Matt

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

#3Richard Huxton
dev@archonet.com
In reply to: Matthias Klein (#1)
Re: How to deal with almost recurring data?

matthias@cmklein.de wrote:

I am creating a database which is supposed to contain many data entries
(events) that differ only in the date they occur.

So let's say event 1 occurs every Monday, Tuesday and Sunday between
January 1st and May 30th 2005.

How do I store and manage such data in a meaningful way?

The simple idea would be to store the event itself in one table and have
another table containing all the dates (all Mondays, Tuesdays and Sundays
between 2005-01-01 and 2005-05-30) plus a foreign key to event_ID =>
(date, event_id).

The problem is that we are dealing with several tenthousand events,
resulting in several million single dates if I stored it in the described
manner.

This seems like a reasonable approach. Don't forget that PostgreSQL is
designed to store large amounts of data.

Of course, if there is some pattern to the dates then you might be
better off storing the pattern rather than the individual dates.

--
Richard Huxton
Archonet Ltd

#4Matthias Klein
matthias@cmklein.de
In reply to: Richard Huxton (#3)
Re: How to deal with almost recurring data?

Thanks for the answer.
If I go that way, though, I am afraid I will have very long query times.
And I have no idea how to handle large databases of that kind without
losing performance.

If I store the pattern - is there a way to still perform the query in the
database? Or do I have to do that in my software? (Matching the resultset
to the pattern)

Thanks

Matt

 --- Ursprüngliche Nachricht --- 
Datum: 16.11.2004 16:42
Von: Richard Huxton <dev@archonet.com>
An: matthias@cmklein.de
Betreff: Re: [GENERAL] How to deal with almost recurring data?

matthias@cmklein.de wrote:

I am creating a database which is supposed to contain many data

entries

(events) that differ only in the date they occur.

So let's say event 1 occurs every Monday, Tuesday and Sunday between
January 1st and May 30th 2005.

How do I store and manage such data in a meaningful way?

The simple idea would be to store the event itself in one table and

have

another table containing all the dates (all Mondays, Tuesdays and

Sundays

between 2005-01-01 and 2005-05-30) plus a foreign key to event_ID =>

(date, event_id).

The problem is that we are dealing with several tenthousand events,
resulting in several million single dates if I stored it in the

described

manner.

This seems like a reasonable approach. Don't forget that PostgreSQL is
designed to store large amounts of data.

Of course, if there is some pattern to the dates then you might be
better off storing the pattern rather than the individual dates.

--
Richard Huxton
Archonet Ltd

---------------------------(end of

broadcast)---------------------------

Show quoted text

TIP 6: Have you searched our list archives?

http://archives.postgresql.org

#5Matthias Klein
matthias@cmklein.de
In reply to: Matthias Klein (#4)
Re: How to deal with almost recurring data?

Thanks
The standard case is: I enter a date and 2 parameters of the event and I
get the result set.
Even though with Java date arithmetics are quite simple and I could
therefore extend my software in that fashion, I would much rather perform
all queries entirely in the database.
So in case I go about it the way I described, I will have 4 million
entries in the table that stores the dates and the references to the
event.
Now - how do I handle that database if I want to maintain very good query
times (<1-3 seconds)?
I have no idea how to handle a 1GB database in terms of query performance,
like what tricks to use, what to do and what to avoid.
Thanks

Matt

 --- Ursprüngliche Nachricht --- 
Datum: 16.11.2004 16:32
Von: Csaba Nagy <nagy@ecircle-ag.com>
An: matthias@cmklein.de
Betreff: Re: [GENERAL] How to deal with almost recurring data?

I would say it all depends on what you want to do with the data.
If you want to look up all the possible occurences for an event, it
might be useful to have the simple solution you described. If you will
only look up the next n occurences starting from a given date, you

might

be better off storing the rule to derive the dates, and then calculate
them in your software, but this will make your software more

complicated

for sure (dealing with date arithmetics can be incredibly complex, I've
been there). I would go with the simple solution as long as there is
enough space in the DB to store all the occurences...

HTH,
Csaba.

On Tue, 2004-11-16 at 15:53, matthias@cmklein.de wrote:

I am creating a database which is supposed to contain many data

entries

(events) that differ only in the date they occur.

So let's say event 1 occurs every Monday, Tuesday and Sunday between
January 1st and May 30th 2005.

How do I store and manage such data in a meaningful way?

The simple idea would be to store the event itself in one table and

have

another table containing all the dates (all Mondays, Tuesdays and

Sundays

between 2005-01-01 and 2005-05-30) plus a foreign key to event_ID =>

(date, event_id).

The problem is that we are dealing with several tenthousand events,
resulting in several million single dates if I stored it in the

described

manner.

That is why I would like to know if there is a better way to store

and

manage such information?

Thanks

Matt

---------------------------(end of

broadcast)---------------------------

TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to

majordomo@postgresql.org)

---------------------------(end of

broadcast)---------------------------

Show quoted text

TIP 7: don't forget to increase your free space map settings

#6Csaba Nagy
nagy@ecircle-ag.com
In reply to: Matthias Klein (#5)
Re: How to deal with almost recurring data?

When I've said date arithmetics, I was referring to Java date
arithmetics... I've been there, and I can tell you that it's not
pleasant to have the application go haywire 2 times a year on
sommer/winter time switching, and this is not the only thing which can
go wrong - and you'll only see it after half a year when it makes your
production system unusable... that said, I managed finally to get it
right, but don't expect that Java date arithmetics will just work.
If you go for storing rules instead of dates, that will mean that you
will have to program around your queries, either in the client or in
some procedural language on the server. Java would be easier here I
guess, but make sure you test every possible date/hour/second for the
next 5 years :-)
Regarding speed for the solution storing dates, I guess it could work
fine if you have the right indexes... but I don't know too much about
this, I'm just pretty sure it will be up to your requirements if you
tune it properly.

Cheers,
Csaba.

Show quoted text

On Tue, 2004-11-16 at 18:38, matthias@cmklein.de wrote:

Thanks
The standard case is: I enter a date and 2 parameters of the event and I
get the result set.
Even though with Java date arithmetics are quite simple and I could
therefore extend my software in that fashion, I would much rather perform
all queries entirely in the database.
So in case I go about it the way I described, I will have 4 million
entries in the table that stores the dates and the references to the
event.
Now - how do I handle that database if I want to maintain very good query
times (<1-3 seconds)?
I have no idea how to handle a 1GB database in terms of query performance,
like what tricks to use, what to do and what to avoid.
Thanks

Matt

--- Ursprüngliche Nachricht --- 
Datum: 16.11.2004 16:32
Von: Csaba Nagy <nagy@ecircle-ag.com>
An: matthias@cmklein.de
Betreff: Re: [GENERAL] How to deal with almost recurring data?

I would say it all depends on what you want to do with the data.
If you want to look up all the possible occurences for an event, it
might be useful to have the simple solution you described. If you will
only look up the next n occurences starting from a given date, you

might

be better off storing the rule to derive the dates, and then calculate
them in your software, but this will make your software more

complicated

for sure (dealing with date arithmetics can be incredibly complex, I've
been there). I would go with the simple solution as long as there is
enough space in the DB to store all the occurences...

HTH,
Csaba.

On Tue, 2004-11-16 at 15:53, matthias@cmklein.de wrote:

I am creating a database which is supposed to contain many data

entries

(events) that differ only in the date they occur.

So let's say event 1 occurs every Monday, Tuesday and Sunday between
January 1st and May 30th 2005.

How do I store and manage such data in a meaningful way?

The simple idea would be to store the event itself in one table and

have

another table containing all the dates (all Mondays, Tuesdays and

Sundays

between 2005-01-01 and 2005-05-30) plus a foreign key to event_ID =>
(date, event_id).

The problem is that we are dealing with several tenthousand events,
resulting in several million single dates if I stored it in the

described

manner.

That is why I would like to know if there is a better way to store

and

manage such information?

Thanks

Matt

---------------------------(end of

broadcast)---------------------------

TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to

majordomo@postgresql.org)

---------------------------(end of

broadcast)---------------------------

TIP 7: don't forget to increase your free space map settings