Thread: Efficient date range search?

Efficient date range search?

From
mvh@ix.netcom.com
Date:
Does anybode know a good (efficient) algorithm for the following?

Imagine that I have a lot of entries of the form (sorry if the SQL is
messed up):

CREATE TABLE "pets" (
    name VARCHAR(20);
    "born" timestamp;
    "died" timestamp;
);

and I have a LOT of pets (let's say millions) and some don't live too
long (mice, fruitflies, whatever), and some do (parrots, elephants).

I would like to make a query to say

on july 4 of last year, what pets were alive?

and I would like to make this query right to the minute

on july 4 of last year at 7:01 PM what pets were alive?

I can't figure out how to index or query this in a manner that isn't
going to devolve into a linear search, which would be too slow.

Anybody run into this problem before?  Is there a known algorithm to
solve it?  Can I twist the geographic data and algorithms around to
support this?

Thanks,

Mike H.


Re: Efficient date range search?

From
"Shridhar Daithankar"
Date:
On 4 Oct 2002 at 23:35, mvh@ix.netcom.com wrote:

> CREATE TABLE "pets" (
>     name VARCHAR(20);
>     "born" timestamp;
>     "died" timestamp;
> );
>
> and I have a LOT of pets (let's say millions) and some don't live too
> long (mice, fruitflies, whatever), and some do (parrots, elephants).
>
> I would like to make a query to say
>
> on july 4 of last year, what pets were alive?
>
> and I would like to make this query right to the minute
>
> on july 4 of last year at 7:01 PM what pets were alive?

Create an index on died field. And query like

select * from pets where died < "last year july 4 7:01 PM;

These will be alive pets then.. Should be pretty efficient.

Bye
 Shridhar

--
QOTD:    Money isn't everything, but at least it keeps the kids in touch.


Re: Efficient date range search?

From
Jean-Luc Lachance
Date:
If the pet is still alive today died would be NULL and the where clause
would not be true.

How about this:

On insert to pets, set the date to 9999-12-31.
On the deth of a pet update the died field.

Create an index on died.

select * from pets where died > {whatever date}

will return the pets that were alive on that date.


JLL


Shridhar Daithankar wrote:
>
> On 4 Oct 2002 at 23:35, mvh@ix.netcom.com wrote:
>
> > CREATE TABLE "pets" (
> >       name VARCHAR(20);
> >       "born" timestamp;
> >       "died" timestamp;
> > );
> >
> > and I have a LOT of pets (let's say millions) and some don't live too
> > long (mice, fruitflies, whatever), and some do (parrots, elephants).
> >
> > I would like to make a query to say
> >
> > on july 4 of last year, what pets were alive?
> >
> > and I would like to make this query right to the minute
> >
> > on july 4 of last year at 7:01 PM what pets were alive?
>
> Create an index on died field. And query like
>
> select * from pets where died < "last year july 4 7:01 PM;
>
> These will be alive pets then.. Should be pretty efficient.
>
> Bye
>  Shridhar
>
> --
> QOTD:   Money isn't everything, but at least it keeps the kids in touch.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

Re: Efficient date range search?

From
Alvaro Herrera
Date:
On Mon, Oct 07, 2002 at 12:11:35PM -0400, Jean-Luc Lachance wrote:
> Shridhar Daithankar wrote:
> > Create an index on died field. And query like
> >
> > select * from pets where died < "last year july 4 7:01 PM;

> If the pet is still alive today died would be NULL and the where clause
> would not be true.

In that case check for NULL explicitly,

select * from pets where died > [date] or died is null;

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Porque francamente, si para saber manejarse a uno mismo hubiera que
rendir examen... ¿Quién es el machito que tendría carnet?"  (Mafalda)

Re: Efficient date range search?

From
"Nigel J. Andrews"
Date:

SELECT *
 FROM pets
 WHERE
  born <= '2001-07-04 07:01:00+00'
 AND
  (
   died > '2001-07-04 07:01:00+00'
  OR
   died is NULL
  )

Efficient? Well that depends on data distribution, indexes and the 'goodness'
of choice by the planner. One presumes given the data set that can be rewritten
numerous ways to experiment on obtaining the best like spliting each half of
the died test into two queries combined using UNION.

--
Nigel J. Andrews
Director

---
Logictree Systems Limited
Computer Consultants


On Mon, 7 Oct 2002, Jean-Luc Lachance wrote:

> If the pet is still alive today died would be NULL and the where clause
> would not be true.
>
> How about this:
>
> On insert to pets, set the date to 9999-12-31.
> On the deth of a pet update the died field.
>
> Create an index on died.
>
> select * from pets where died > {whatever date}
>
> will return the pets that were alive on that date.
>
>
> JLL
>
>
> Shridhar Daithankar wrote:
> >
> > On 4 Oct 2002 at 23:35, mvh@ix.netcom.com wrote:
> >
> > > CREATE TABLE "pets" (
> > >       name VARCHAR(20);
> > >       "born" timestamp;
> > >       "died" timestamp;
> > > );
> > >
> > > and I have a LOT of pets (let's say millions) and some don't live too
> > > long (mice, fruitflies, whatever), and some do (parrots, elephants).
> > >
> > > I would like to make a query to say
> > >
> > > on july 4 of last year, what pets were alive?
> > >
> > > and I would like to make this query right to the minute
> > >
> > > on july 4 of last year at 7:01 PM what pets were alive?
> >
> > Create an index on died field. And query like
> >
> > select * from pets where died < "last year july 4 7:01 PM;
> >
> > These will be alive pets then.. Should be pretty efficient.
> >
> > Bye
> >  Shridhar
> >
> > --
> > QOTD:   Money isn't everything, but at least it keeps the kids in touch.
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
>



Re: Efficient date range search?

From
Jean-Luc Lachance
Date:
Alvaro Herrera wrote:
>
> On Mon, Oct 07, 2002 at 12:11:35PM -0400, Jean-Luc Lachance wrote:
> > Shridhar Daithankar wrote:
> > > Create an index on died field. And query like
> > >
> > > select * from pets where died < "last year july 4 7:01 PM;
>
> > If the pet is still alive today died would be NULL and the where clause
> > would not be true.
>
> In that case check for NULL explicitly,
>
> select * from pets where died > [date] or died is null;


Then you're back to whole table scan... :(

>
> --
> Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
> "Porque francamente, si para saber manejarse a uno mismo hubiera que
> rendir examen... ¿Quién es el machito que tendría carnet?"  (Mafalda)

Re: Efficient date range search?

From
Jean-Luc Lachance
Date:
And that is supposed to be more efficient then
select * from pets where died > {whatever date};

C'mon...


"Nigel J. Andrews" wrote:
>
> SELECT *
>  FROM pets
>  WHERE
>   born <= '2001-07-04 07:01:00+00'
>  AND
>   (
>    died > '2001-07-04 07:01:00+00'
>   OR
>    died is NULL
>   )
>
> Efficient? Well that depends on data distribution, indexes and the 'goodness'
> of choice by the planner. One presumes given the data set that can be rewritten
> numerous ways to experiment on obtaining the best like spliting each half of
> the died test into two queries combined using UNION.
>
> --
> Nigel J. Andrews
> Director
>
> ---
> Logictree Systems Limited
> Computer Consultants
>
> On Mon, 7 Oct 2002, Jean-Luc Lachance wrote:
>
> > If the pet is still alive today died would be NULL and the where clause
> > would not be true.
> >
> > How about this:
> >
> > On insert to pets, set the date to 9999-12-31.
> > On the deth of a pet update the died field.
> >
> > Create an index on died.
> >
> > select * from pets where died > {whatever date}
> >
> > will return the pets that were alive on that date.
> >
> >
> > JLL

Re: Efficient date range search?

From
Bruno Wolff III
Date:
On Mon, Oct 07, 2002 at 13:32:09 -0400,
  Jean-Luc Lachance <jllachan@nsd.ca> wrote:
> Alvaro Herrera wrote:
> >
> > On Mon, Oct 07, 2002 at 12:11:35PM -0400, Jean-Luc Lachance wrote:
> > > Shridhar Daithankar wrote:
> > > > Create an index on died field. And query like
> > > >
> > > > select * from pets where died < "last year july 4 7:01 PM;
> >
> > > If the pet is still alive today died would be NULL and the where clause
> > > would not be true.
> >
> > In that case check for NULL explicitly,
> >
> > select * from pets where died > [date] or died is null;
>
>
> Then you're back to whole table scan... :(

You could use 'infinity'::timestamp as a code for pets that are currently
alive instead of null.

Re: Efficient date range search?

From
Jean-Luc Lachance
Date:
DEFAULT 'infinity' is much better than my  DEFAULT '9999-12-31', I
agree.



Bruno Wolff III wrote:
>
> On Mon, Oct 07, 2002 at 13:32:09 -0400,
>   Jean-Luc Lachance <jllachan@nsd.ca> wrote:
> > Alvaro Herrera wrote:
> > >
> > > On Mon, Oct 07, 2002 at 12:11:35PM -0400, Jean-Luc Lachance wrote:
> > > > Shridhar Daithankar wrote:
> > > > > Create an index on died field. And query like
> > > > >
> > > > > select * from pets where died < "last year july 4 7:01 PM;
> > >
> > > > If the pet is still alive today died would be NULL and the where clause
> > > > would not be true.
> > >
> > > In that case check for NULL explicitly,
> > >
> > > select * from pets where died > [date] or died is null;
> >
> >
> > Then you're back to whole table scan... :(
>
> You could use 'infinity'::timestamp as a code for pets that are currently
> alive instead of null.

Re: Efficient date range search?

From
"Shridhar Daithankar"
Date:
On 7 Oct 2002 at 13:32, Jean-Luc Lachance wrote:

> > select * from pets where died > [date] or died is null;
> Then you're back to whole table scan... :(

Well, if there is an index on died, it's rather an indexed scan as opposed to a
sequential scan as you are mentioning(hopefully).

Indexed scan are usually pretty quick if you select small amount of data from
entire table set. e.g.  10 pets from a million tuples..

Bye
 Shridhar

--
character density, n.:    The number of very weird people in the office.


Re: Efficient date range search?

From
"Nigel J. Andrews"
Date:
Actually it is supposed to give correct results

As a bonus it follows standard DB practices rather than falling into the 9/9/99
trap which everyone was so worried about a couple of years ago.


On Mon, 7 Oct 2002, Jean-Luc Lachance wrote:

> And that is supposed to be more efficient then
> select * from pets where died > {whatever date};
>
> C'mon...
>
>
> "Nigel J. Andrews" wrote:
> >
> > SELECT *
> >  FROM pets
> >  WHERE
> >   born <= '2001-07-04 07:01:00+00'
> >  AND
> >   (
> >    died > '2001-07-04 07:01:00+00'
> >   OR
> >    died is NULL
> >   )