Thread: How to deal with almost recurring data?

How to deal with almost recurring data?

From
matthias@cmklein.de
Date:
I am creating a database which is supposed to contain many data entries
(events) that differ only in the date they occur.

So let's say event 1 occurs every Monday, Tuesday and Sunday between
January 1st and May 30th 2005.

How do I store and manage such data in a meaningful way?

The simple idea would be to store the event itself in one table and have
another table containing all the dates (all Mondays, Tuesdays and Sundays
between 2005-01-01 and 2005-05-30) plus a foreign key to event_ID =>
(date, event_id).

The problem is that we are dealing with several tenthousand events,
resulting in several million single dates if I stored it in the described
manner.

That is why I would like to know if there is a better way to store and
manage such information?

Thanks

Matt

Re: How to deal with almost recurring data?

From
Csaba Nagy
Date:
I would say it all depends on what you want to do with the data.
If you want to look up all the possible occurences for an event, it
might be useful to have the simple solution you described. If you will
only look up the next n occurences starting from a given date, you might
be better off storing the rule to derive the dates, and then calculate
them in your software, but this will make your software more complicated
for sure (dealing with date arithmetics can be incredibly complex, I've
been there). I would go with the simple solution as long as there is
enough space in the DB to store all the occurences...

HTH,
Csaba.

On Tue, 2004-11-16 at 15:53, matthias@cmklein.de wrote:
> I am creating a database which is supposed to contain many data entries
> (events) that differ only in the date they occur.
>
> So let's say event 1 occurs every Monday, Tuesday and Sunday between
> January 1st and May 30th 2005.
>
> How do I store and manage such data in a meaningful way?
>
> The simple idea would be to store the event itself in one table and have
> another table containing all the dates (all Mondays, Tuesdays and Sundays
> between 2005-01-01 and 2005-05-30) plus a foreign key to event_ID =>
> (date, event_id).
>
> The problem is that we are dealing with several tenthousand events,
> resulting in several million single dates if I stored it in the described
> manner.
>
> That is why I would like to know if there is a better way to store and
> manage such information?
>
> Thanks
>
> Matt
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)


Re: How to deal with almost recurring data?

From
Richard Huxton
Date:
matthias@cmklein.de wrote:
> I am creating a database which is supposed to contain many data entries
> (events) that differ only in the date they occur.
>
> So let's say event 1 occurs every Monday, Tuesday and Sunday between
> January 1st and May 30th 2005.
>
> How do I store and manage such data in a meaningful way?
>
> The simple idea would be to store the event itself in one table and have
> another table containing all the dates (all Mondays, Tuesdays and Sundays
> between 2005-01-01 and 2005-05-30) plus a foreign key to event_ID =>
> (date, event_id).
>
> The problem is that we are dealing with several tenthousand events,
> resulting in several million single dates if I stored it in the described
> manner.

This seems like a reasonable approach. Don't forget that PostgreSQL is
designed to store large amounts of data.

Of course, if there is some pattern to the dates then you might be
better off storing the pattern rather than the individual dates.

--
   Richard Huxton
   Archonet Ltd

Re: How to deal with almost recurring data?

From
matthias@cmklein.de
Date:
Thanks for the answer.
If I go that way, though, I am afraid I will have very long query times.
And I have no idea how to handle large databases of that kind without
losing performance.

If I store the pattern - is there a way to still perform the query in the
database? Or do I have to do that in my software? (Matching the resultset
to the pattern)

Thanks

Matt


 --- Ursprüngliche Nachricht ---
Datum: 16.11.2004 16:42
Von: Richard Huxton <dev@archonet.com>
An: matthias@cmklein.de
Betreff: Re: [GENERAL] How to deal with almost recurring data?

> matthias@cmklein.de wrote:
> > I am creating a database which is supposed to contain many data
entries
> > (events) that differ only in the date they occur.
> >
> > So let's say event 1 occurs every Monday, Tuesday and Sunday between
> > January 1st and May 30th 2005.
> >
> > How do I store and manage such data in a meaningful way?
> >
> > The simple idea would be to store the event itself in one table and
have
> > another table containing all the dates (all Mondays, Tuesdays and
Sundays
> > between 2005-01-01 and 2005-05-30) plus a foreign key to event_ID =>

> > (date, event_id).
> >
> > The problem is that we are dealing with several tenthousand events,
> > resulting in several million single dates if I stored it in the
described
> > manner.
>
> This seems like a reasonable approach. Don't forget that PostgreSQL is
> designed to store large amounts of data.
>
> Of course, if there is some pattern to the dates then you might be
> better off storing the pattern rather than the individual dates.
>
> --
>    Richard Huxton
>    Archonet Ltd
>
> ---------------------------(end of
broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
>                http://archives.postgresql.org
>


Re: How to deal with almost recurring data?

From
matthias@cmklein.de
Date:
Thanks
The standard case is: I enter a date and 2 parameters of the event and I
get the result set.
Even though with Java date arithmetics are quite simple and I could
therefore extend my software in that fashion, I would much rather perform
all queries entirely in the database.
So in case I go about it the way I described, I will have 4 million
entries in the table that stores the dates and the references to the
event.
Now - how do I handle that database if I want to maintain very good query
times (<1-3 seconds)?
I have no idea how to handle a 1GB database in terms of query performance,
like what tricks to use, what to do and what to avoid.
Thanks

Matt


 --- Ursprüngliche Nachricht ---
Datum: 16.11.2004 16:32
Von: Csaba Nagy <nagy@ecircle-ag.com>
An: matthias@cmklein.de
Betreff: Re: [GENERAL] How to deal with almost recurring data?

> I would say it all depends on what you want to do with the data.
> If you want to look up all the possible occurences for an event, it
> might be useful to have the simple solution you described. If you will
> only look up the next n occurences starting from a given date, you
might
> be better off storing the rule to derive the dates, and then calculate
> them in your software, but this will make your software more
complicated
> for sure (dealing with date arithmetics can be incredibly complex, I've
> been there). I would go with the simple solution as long as there is
> enough space in the DB to store all the occurences...
>
> HTH,
> Csaba.
>
> On Tue, 2004-11-16 at 15:53, matthias@cmklein.de wrote:
> > I am creating a database which is supposed to contain many data
entries
> > (events) that differ only in the date they occur.
> >
> > So let's say event 1 occurs every Monday, Tuesday and Sunday between
> > January 1st and May 30th 2005.
> >
> > How do I store and manage such data in a meaningful way?
> >
> > The simple idea would be to store the event itself in one table and
have
> > another table containing all the dates (all Mondays, Tuesdays and
Sundays
> > between 2005-01-01 and 2005-05-30) plus a foreign key to event_ID =>

> > (date, event_id).
> >
> > The problem is that we are dealing with several tenthousand events,
> > resulting in several million single dates if I stored it in the
described
> > manner.
> >
> > That is why I would like to know if there is a better way to store
and
> > manage such information?
> >
> > Thanks
> >
> > Matt
> >
> > ---------------------------(end of
broadcast)---------------------------
> > TIP 2: you can get off all lists at once with the unregister command
> >     (send "unregister YourEmailAddressHere" to
majordomo@postgresql.org)
>
>
> ---------------------------(end of
broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings
>


Re: How to deal with almost recurring data?

From
Csaba Nagy
Date:
When I've said date arithmetics, I was referring to Java date
arithmetics... I've been there, and I can tell you that it's not
pleasant to have the application go haywire 2 times a year on
sommer/winter time switching, and this is not the only thing which can
go wrong - and you'll only see it after half a year when it makes your
production system unusable... that said, I managed finally to get it
right, but don't expect that Java date arithmetics will just work.
If you go for storing rules instead of dates, that will mean that you
will have to program around your queries, either in the client or in
some procedural language on the server. Java would be easier here I
guess, but make sure you test every possible date/hour/second for the
next 5 years :-)
Regarding speed for the solution storing dates, I guess it could work
fine if you have the right indexes... but I don't know too much about
this, I'm just pretty sure it will be up to your requirements if you
tune it properly.

Cheers,
Csaba.


On Tue, 2004-11-16 at 18:38, matthias@cmklein.de wrote:
> Thanks
> The standard case is: I enter a date and 2 parameters of the event and I
> get the result set.
> Even though with Java date arithmetics are quite simple and I could
> therefore extend my software in that fashion, I would much rather perform
> all queries entirely in the database.
> So in case I go about it the way I described, I will have 4 million
> entries in the table that stores the dates and the references to the
> event.
> Now - how do I handle that database if I want to maintain very good query
> times (<1-3 seconds)?
> I have no idea how to handle a 1GB database in terms of query performance,
> like what tricks to use, what to do and what to avoid.
> Thanks
>
> Matt
>
>
>  --- Ursprüngliche Nachricht ---
> Datum: 16.11.2004 16:32
> Von: Csaba Nagy <nagy@ecircle-ag.com>
> An: matthias@cmklein.de
> Betreff: Re: [GENERAL] How to deal with almost recurring data?
>
> > I would say it all depends on what you want to do with the data.
> > If you want to look up all the possible occurences for an event, it
> > might be useful to have the simple solution you described. If you will
> > only look up the next n occurences starting from a given date, you
> might
> > be better off storing the rule to derive the dates, and then calculate
> > them in your software, but this will make your software more
> complicated
> > for sure (dealing with date arithmetics can be incredibly complex, I've
> > been there). I would go with the simple solution as long as there is
> > enough space in the DB to store all the occurences...
> >
> > HTH,
> > Csaba.
> >
> > On Tue, 2004-11-16 at 15:53, matthias@cmklein.de wrote:
> > > I am creating a database which is supposed to contain many data
> entries
> > > (events) that differ only in the date they occur.
> > >
> > > So let's say event 1 occurs every Monday, Tuesday and Sunday between
> > > January 1st and May 30th 2005.
> > >
> > > How do I store and manage such data in a meaningful way?
> > >
> > > The simple idea would be to store the event itself in one table and
> have
> > > another table containing all the dates (all Mondays, Tuesdays and
> Sundays
> > > between 2005-01-01 and 2005-05-30) plus a foreign key to event_ID =>
> > > (date, event_id).
> > >
> > > The problem is that we are dealing with several tenthousand events,
> > > resulting in several million single dates if I stored it in the
> described
> > > manner.
> > >
> > > That is why I would like to know if there is a better way to store
> and
> > > manage such information?
> > >
> > > Thanks
> > >
> > > Matt
> > >
> > > ---------------------------(end of
> broadcast)---------------------------
> > > TIP 2: you can get off all lists at once with the unregister command
> > >     (send "unregister YourEmailAddressHere" to
> majordomo@postgresql.org)
> >
> >
> > ---------------------------(end of
> broadcast)---------------------------
> > TIP 7: don't forget to increase your free space map settings
> >
>