Thread: Efficient date range search?
Does anybode know a good (efficient) algorithm for the following? Imagine that I have a lot of entries of the form (sorry if the SQL is messed up): CREATE TABLE "pets" ( name VARCHAR(20); "born" timestamp; "died" timestamp; ); and I have a LOT of pets (let's say millions) and some don't live too long (mice, fruitflies, whatever), and some do (parrots, elephants). I would like to make a query to say on july 4 of last year, what pets were alive? and I would like to make this query right to the minute on july 4 of last year at 7:01 PM what pets were alive? I can't figure out how to index or query this in a manner that isn't going to devolve into a linear search, which would be too slow. Anybody run into this problem before? Is there a known algorithm to solve it? Can I twist the geographic data and algorithms around to support this? Thanks, Mike H.
On 4 Oct 2002 at 23:35, mvh@ix.netcom.com wrote: > CREATE TABLE "pets" ( > name VARCHAR(20); > "born" timestamp; > "died" timestamp; > ); > > and I have a LOT of pets (let's say millions) and some don't live too > long (mice, fruitflies, whatever), and some do (parrots, elephants). > > I would like to make a query to say > > on july 4 of last year, what pets were alive? > > and I would like to make this query right to the minute > > on july 4 of last year at 7:01 PM what pets were alive? Create an index on died field. And query like select * from pets where died < "last year july 4 7:01 PM; These will be alive pets then.. Should be pretty efficient. Bye Shridhar -- QOTD: Money isn't everything, but at least it keeps the kids in touch.
If the pet is still alive today died would be NULL and the where clause would not be true. How about this: On insert to pets, set the date to 9999-12-31. On the deth of a pet update the died field. Create an index on died. select * from pets where died > {whatever date} will return the pets that were alive on that date. JLL Shridhar Daithankar wrote: > > On 4 Oct 2002 at 23:35, mvh@ix.netcom.com wrote: > > > CREATE TABLE "pets" ( > > name VARCHAR(20); > > "born" timestamp; > > "died" timestamp; > > ); > > > > and I have a LOT of pets (let's say millions) and some don't live too > > long (mice, fruitflies, whatever), and some do (parrots, elephants). > > > > I would like to make a query to say > > > > on july 4 of last year, what pets were alive? > > > > and I would like to make this query right to the minute > > > > on july 4 of last year at 7:01 PM what pets were alive? > > Create an index on died field. And query like > > select * from pets where died < "last year july 4 7:01 PM; > > These will be alive pets then.. Should be pretty efficient. > > Bye > Shridhar > > -- > QOTD: Money isn't everything, but at least it keeps the kids in touch. > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
On Mon, Oct 07, 2002 at 12:11:35PM -0400, Jean-Luc Lachance wrote: > Shridhar Daithankar wrote: > > Create an index on died field. And query like > > > > select * from pets where died < "last year july 4 7:01 PM; > If the pet is still alive today died would be NULL and the where clause > would not be true. In that case check for NULL explicitly, select * from pets where died > [date] or died is null; -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "Porque francamente, si para saber manejarse a uno mismo hubiera que rendir examen... ¿Quién es el machito que tendría carnet?" (Mafalda)
SELECT * FROM pets WHERE born <= '2001-07-04 07:01:00+00' AND ( died > '2001-07-04 07:01:00+00' OR died is NULL ) Efficient? Well that depends on data distribution, indexes and the 'goodness' of choice by the planner. One presumes given the data set that can be rewritten numerous ways to experiment on obtaining the best like spliting each half of the died test into two queries combined using UNION. -- Nigel J. Andrews Director --- Logictree Systems Limited Computer Consultants On Mon, 7 Oct 2002, Jean-Luc Lachance wrote: > If the pet is still alive today died would be NULL and the where clause > would not be true. > > How about this: > > On insert to pets, set the date to 9999-12-31. > On the deth of a pet update the died field. > > Create an index on died. > > select * from pets where died > {whatever date} > > will return the pets that were alive on that date. > > > JLL > > > Shridhar Daithankar wrote: > > > > On 4 Oct 2002 at 23:35, mvh@ix.netcom.com wrote: > > > > > CREATE TABLE "pets" ( > > > name VARCHAR(20); > > > "born" timestamp; > > > "died" timestamp; > > > ); > > > > > > and I have a LOT of pets (let's say millions) and some don't live too > > > long (mice, fruitflies, whatever), and some do (parrots, elephants). > > > > > > I would like to make a query to say > > > > > > on july 4 of last year, what pets were alive? > > > > > > and I would like to make this query right to the minute > > > > > > on july 4 of last year at 7:01 PM what pets were alive? > > > > Create an index on died field. And query like > > > > select * from pets where died < "last year july 4 7:01 PM; > > > > These will be alive pets then.. Should be pretty efficient. > > > > Bye > > Shridhar > > > > -- > > QOTD: Money isn't everything, but at least it keeps the kids in touch. > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html >
Alvaro Herrera wrote: > > On Mon, Oct 07, 2002 at 12:11:35PM -0400, Jean-Luc Lachance wrote: > > Shridhar Daithankar wrote: > > > Create an index on died field. And query like > > > > > > select * from pets where died < "last year july 4 7:01 PM; > > > If the pet is still alive today died would be NULL and the where clause > > would not be true. > > In that case check for NULL explicitly, > > select * from pets where died > [date] or died is null; Then you're back to whole table scan... :( > > -- > Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) > "Porque francamente, si para saber manejarse a uno mismo hubiera que > rendir examen... ¿Quién es el machito que tendría carnet?" (Mafalda)
And that is supposed to be more efficient then select * from pets where died > {whatever date}; C'mon... "Nigel J. Andrews" wrote: > > SELECT * > FROM pets > WHERE > born <= '2001-07-04 07:01:00+00' > AND > ( > died > '2001-07-04 07:01:00+00' > OR > died is NULL > ) > > Efficient? Well that depends on data distribution, indexes and the 'goodness' > of choice by the planner. One presumes given the data set that can be rewritten > numerous ways to experiment on obtaining the best like spliting each half of > the died test into two queries combined using UNION. > > -- > Nigel J. Andrews > Director > > --- > Logictree Systems Limited > Computer Consultants > > On Mon, 7 Oct 2002, Jean-Luc Lachance wrote: > > > If the pet is still alive today died would be NULL and the where clause > > would not be true. > > > > How about this: > > > > On insert to pets, set the date to 9999-12-31. > > On the deth of a pet update the died field. > > > > Create an index on died. > > > > select * from pets where died > {whatever date} > > > > will return the pets that were alive on that date. > > > > > > JLL
On Mon, Oct 07, 2002 at 13:32:09 -0400, Jean-Luc Lachance <jllachan@nsd.ca> wrote: > Alvaro Herrera wrote: > > > > On Mon, Oct 07, 2002 at 12:11:35PM -0400, Jean-Luc Lachance wrote: > > > Shridhar Daithankar wrote: > > > > Create an index on died field. And query like > > > > > > > > select * from pets where died < "last year july 4 7:01 PM; > > > > > If the pet is still alive today died would be NULL and the where clause > > > would not be true. > > > > In that case check for NULL explicitly, > > > > select * from pets where died > [date] or died is null; > > > Then you're back to whole table scan... :( You could use 'infinity'::timestamp as a code for pets that are currently alive instead of null.
DEFAULT 'infinity' is much better than my DEFAULT '9999-12-31', I agree. Bruno Wolff III wrote: > > On Mon, Oct 07, 2002 at 13:32:09 -0400, > Jean-Luc Lachance <jllachan@nsd.ca> wrote: > > Alvaro Herrera wrote: > > > > > > On Mon, Oct 07, 2002 at 12:11:35PM -0400, Jean-Luc Lachance wrote: > > > > Shridhar Daithankar wrote: > > > > > Create an index on died field. And query like > > > > > > > > > > select * from pets where died < "last year july 4 7:01 PM; > > > > > > > If the pet is still alive today died would be NULL and the where clause > > > > would not be true. > > > > > > In that case check for NULL explicitly, > > > > > > select * from pets where died > [date] or died is null; > > > > > > Then you're back to whole table scan... :( > > You could use 'infinity'::timestamp as a code for pets that are currently > alive instead of null.
On 7 Oct 2002 at 13:32, Jean-Luc Lachance wrote: > > select * from pets where died > [date] or died is null; > Then you're back to whole table scan... :( Well, if there is an index on died, it's rather an indexed scan as opposed to a sequential scan as you are mentioning(hopefully). Indexed scan are usually pretty quick if you select small amount of data from entire table set. e.g. 10 pets from a million tuples.. Bye Shridhar -- character density, n.: The number of very weird people in the office.
Actually it is supposed to give correct results As a bonus it follows standard DB practices rather than falling into the 9/9/99 trap which everyone was so worried about a couple of years ago. On Mon, 7 Oct 2002, Jean-Luc Lachance wrote: > And that is supposed to be more efficient then > select * from pets where died > {whatever date}; > > C'mon... > > > "Nigel J. Andrews" wrote: > > > > SELECT * > > FROM pets > > WHERE > > born <= '2001-07-04 07:01:00+00' > > AND > > ( > > died > '2001-07-04 07:01:00+00' > > OR > > died is NULL > > )