Efficient date range search?

Started by Mike Harding · over 23 years ago · 11 messages · general
#1 Mike Harding
mvh@ix.netcom.com

Does anybody know a good (efficient) algorithm for the following?

Imagine that I have a lot of entries of the form (sorry if the SQL is
messed up):

CREATE TABLE "pets" (
"name" VARCHAR(20),
"born" timestamp,
"died" timestamp
);

and I have a LOT of pets (let's say millions) and some don't live too
long (mice, fruitflies, whatever), and some do (parrots, elephants).

I would like to make a query to say

on july 4 of last year, what pets were alive?

and I would like to make this query right to the minute

on july 4 of last year at 7:01 PM what pets were alive?

I can't figure out how to index or query this in a manner that isn't
going to devolve into a linear search, which would be too slow.

Anybody run into this problem before? Is there a known algorithm to
solve it? Can I twist the geographic data and algorithms around to
support this?

Thanks,

Mike H.

#2 Shridhar Daithankar
shridhar_daithankar@persistent.co.in
In reply to: Mike Harding (#1)
Re: Efficient date range search?

On 4 Oct 2002 at 23:35, mvh@ix.netcom.com wrote:

CREATE TABLE "pets" (
"name" VARCHAR(20),
"born" timestamp,
"died" timestamp
);

and I have a LOT of pets (let's say millions) and some don't live too
long (mice, fruitflies, whatever), and some do (parrots, elephants).

I would like to make a query to say

on july 4 of last year, what pets were alive?

and I would like to make this query right to the minute

on july 4 of last year at 7:01 PM what pets were alive?

Create an index on the died field, and query like

select * from pets where died > 'last year july 4 7:01 PM';

These will be the pets alive then. Should be pretty efficient.

Bye
Shridhar

--
QOTD: Money isn't everything, but at least it keeps the kids in touch.

#3 Jean-Luc Lachance
jllachan@nsd.ca
In reply to: Shridhar Daithankar (#2)
Re: Efficient date range search?

If the pet is still alive today, died would be NULL and the where clause
would not be true.

How about this:

On insert into pets, set died to 9999-12-31.
On the death of a pet, update the died field.

Create an index on died.

select * from pets where died > {whatever date}

will return the pets that were alive on that date.

JLL
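
The sentinel-date idea above can be tried out in a few lines. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, stores timestamps as ISO-8601 text (which sorts chronologically), and the sample rows, the index name pets_died_idx and the helper alive_at are made up for illustration. It also adds a born <= check, which the query above leaves out, so that pets born after the requested moment are not reported.

```python
import sqlite3

# Sentinel meaning "still alive" (assumption: ISO-8601 text timestamps,
# which compare correctly as plain strings).
STILL_ALIVE = "9999-12-31 00:00:00"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pets ("
    " name TEXT,"
    " born TEXT NOT NULL,"
    f" died TEXT NOT NULL DEFAULT '{STILL_ALIVE}')"
)
conn.execute("CREATE INDEX pets_died_idx ON pets (died)")

# Pets that have already died carry a real died timestamp.
conn.executemany(
    "INSERT INTO pets (name, born, died) VALUES (?, ?, ?)",
    [
        ("mouse", "2000-01-10 08:00:00", "2001-03-02 12:00:00"),
        ("fruitfly", "2001-07-01 09:00:00", "2001-07-20 09:00:00"),
    ],
)
# Living pets are inserted without died and pick up the sentinel default.
conn.execute("INSERT INTO pets (name, born) VALUES ('parrot', '1995-05-05 10:00:00')")
conn.execute("INSERT INTO pets (name, born) VALUES ('puppy', '2002-01-01 00:00:00')")


def alive_at(moment):
    """Return the names of pets born on or before `moment` and not yet dead."""
    rows = conn.execute(
        "SELECT name FROM pets WHERE born <= ? AND died > ? ORDER BY name",
        (moment, moment),
    )
    return [name for (name,) in rows]


alive = alive_at("2001-07-04 19:01:00")
print(alive)
```

Because no row has a NULL died value, the plain comparison died > ? covers living pets as well, with no extra IS NULL branch.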

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

#4 Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: Jean-Luc Lachance (#3)
Re: Efficient date range search?

On Mon, Oct 07, 2002 at 12:11:35PM -0400, Jean-Luc Lachance wrote:

Shridhar Daithankar wrote:

Create an index on died field. And query like

select * from pets where died > 'last year july 4 7:01 PM';

If the pet is still alive today died would be NULL and the where clause
would not be true.

In that case check for NULL explicitly,

select * from pets where died > [date] or died is null;

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Porque francamente, si para saber manejarse a uno mismo hubiera que
rendir examen... ¿Quién es el machito que tendría carnet?" (Mafalda)
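
To see why the explicit NULL check matters, here is a small sketch. It assumes Python's sqlite3 module as a stand-in for PostgreSQL (both follow the SQL rule that a comparison with NULL is never true), and the sample rows and the helper names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pets (name TEXT, born TEXT, died TEXT)")
conn.executemany(
    "INSERT INTO pets VALUES (?, ?, ?)",
    [
        ("mouse", "2000-01-10 08:00:00", "2001-03-02 12:00:00"),
        ("elephant", "1980-06-01 00:00:00", "2002-02-14 06:30:00"),
        ("parrot", "1995-05-05 10:00:00", None),  # still alive: died IS NULL
    ],
)

moment = "2001-07-04 19:01:00"


def names(sql):
    """Run a one-parameter query and return the matching names, sorted."""
    return sorted(name for (name,) in conn.execute(sql, (moment,)))


# NULL > anything is unknown, so the living parrot is silently dropped.
without_null_check = names("SELECT name FROM pets WHERE died > ?")

# The explicit IS NULL branch brings living pets back into the result.
with_null_check = names("SELECT name FROM pets WHERE died > ? OR died IS NULL")

print(without_null_check, with_null_check)
```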

#5 Nigel J. Andrews
nandrews@investsystems.co.uk
In reply to: Jean-Luc Lachance (#3)
Re: Efficient date range search?

SELECT *
FROM pets
WHERE
    born <= '2001-07-04 19:01:00+00'
    AND
    (
        died > '2001-07-04 19:01:00+00'
        OR
        died IS NULL
    )

Efficient? Well, that depends on data distribution, indexes and the 'goodness'
of the planner's choice. Given the data set, one presumes the query can be
rewritten in numerous ways to experiment with obtaining the best plan, such as
splitting each half of the died test into two queries combined using UNION.

--
Nigel J. Andrews
Director

---
Logictree Systems Limited
Computer Consultants
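
The UNION rewrite mentioned above might look like the following. This is a sketch under assumptions: Python's sqlite3 module stands in for PostgreSQL, the sample rows and the index name are invented, and UNION ALL is used instead of plain UNION because the two branches cannot return the same row (died is either a value or NULL), so no duplicate elimination is needed. Whether the split form is actually faster has to be measured on the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pets (name TEXT, born TEXT, died TEXT)")
conn.execute("CREATE INDEX pets_died_idx ON pets (died)")
conn.executemany(
    "INSERT INTO pets VALUES (?, ?, ?)",
    [
        ("mouse", "2000-01-10 08:00:00", "2001-03-02 12:00:00"),
        ("fruitfly", "2001-07-01 09:00:00", "2001-07-20 09:00:00"),
        ("parrot", "1995-05-05 10:00:00", None),
        ("puppy", "2002-01-01 00:00:00", None),
    ],
)

# Single query with an OR inside the WHERE clause.
OR_QUERY = """
SELECT name FROM pets
WHERE born <= :t AND (died > :t OR died IS NULL)
"""

# Same predicate, with each half of the died test in its own branch.
UNION_QUERY = """
SELECT name FROM pets WHERE born <= :t AND died > :t
UNION ALL
SELECT name FROM pets WHERE born <= :t AND died IS NULL
"""


def run(sql, t):
    """Execute a query with the named parameter :t and return sorted names."""
    return sorted(name for (name,) in conn.execute(sql, {"t": t}))


moment = "2001-07-04 19:01:00"
or_result = run(OR_QUERY, moment)
union_result = run(UNION_QUERY, moment)
print(or_result, union_result)
```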

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

#6 Jean-Luc Lachance
jllachan@nsd.ca
In reply to: Shridhar Daithankar (#2)
Re: Efficient date range search?

Alvaro Herrera wrote:

On Mon, Oct 07, 2002 at 12:11:35PM -0400, Jean-Luc Lachance wrote:

Shridhar Daithankar wrote:

Create an index on died field. And query like

select * from pets where died > 'last year july 4 7:01 PM';

If the pet is still alive today died would be NULL and the where clause
would not be true.

In that case check for NULL explicitly,

select * from pets where died > [date] or died is null;

Then you're back to whole table scan... :(

#7 Jean-Luc Lachance
jllachan@nsd.ca
In reply to: Nigel J. Andrews (#5)
Re: Efficient date range search?

And that is supposed to be more efficient than
select * from pets where died > {whatever date};

C'mon...

"Nigel J. Andrews" wrote:

SELECT *
FROM pets
WHERE
    born <= '2001-07-04 19:01:00+00'
    AND
    (
        died > '2001-07-04 19:01:00+00'
        OR
        died IS NULL
    )

#8 Bruno Wolff III
bruno@wolff.to
In reply to: Jean-Luc Lachance (#6)
Re: Efficient date range search?

On Mon, Oct 07, 2002 at 13:32:09 -0400,
Jean-Luc Lachance <jllachan@nsd.ca> wrote:

Alvaro Herrera wrote:

On Mon, Oct 07, 2002 at 12:11:35PM -0400, Jean-Luc Lachance wrote:

Shridhar Daithankar wrote:

Create an index on died field. And query like

select * from pets where died > 'last year july 4 7:01 PM';

If the pet is still alive today died would be NULL and the where clause
would not be true.

In that case check for NULL explicitly,

select * from pets where died > [date] or died is null;

Then you're back to whole table scan... :(

You could use 'infinity'::timestamp as a code for pets that are currently
alive instead of null.

#9 Jean-Luc Lachance
jllachan@nsd.ca
In reply to: Shridhar Daithankar (#2)
Re: Efficient date range search?

DEFAULT 'infinity' is much better than my DEFAULT '9999-12-31', I
agree.

Bruno Wolff III wrote:

You could use 'infinity'::timestamp as a code for pets that are currently
alive instead of null.

#10 Shridhar Daithankar
shridhar_daithankar@persistent.co.in
In reply to: Jean-Luc Lachance (#6)
Re: Efficient date range search?

On 7 Oct 2002 at 13:32, Jean-Luc Lachance wrote:

select * from pets where died > [date] or died is null;

Then you're back to whole table scan... :(

Well, if there is an index on died, it should (hopefully) be an index scan
rather than the sequential scan you are describing.

Index scans are usually pretty quick if you select a small amount of data from
the entire table, e.g. 10 pets from a million tuples.

Bye
Shridhar

--
character density, n.: The number of very weird people in the office.
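
One way to check whether an index is actually picked up is to ask the database for its query plan. The sketch below is an illustration only: it uses Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in, whereas on PostgreSQL the equivalent check would be EXPLAIN, and its planner may well choose differently depending on table statistics. The index name pets_died_idx and the helper plan are assumptions of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pets (name TEXT, born TEXT, died TEXT)")
conn.execute("CREATE INDEX pets_died_idx ON pets (died)")


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the last column holds the plan text


# Range condition on the indexed column.
indexed_plan = plan("SELECT * FROM pets WHERE died > '2001-07-04 19:01:00'")

for line in indexed_plan:
    print(line)
```

If the plan text names the index, the range condition is being answered from the index rather than by reading every row.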

#11 Nigel J. Andrews
nandrews@investsystems.co.uk
In reply to: Jean-Luc Lachance (#7)
Re: Efficient date range search?

Actually, it is supposed to give correct results.

As a bonus, it follows standard DB practices rather than falling into the 9/9/99
trap which everyone was so worried about a couple of years ago.

On Mon, 7 Oct 2002, Jean-Luc Lachance wrote:

And that is supposed to be more efficient than
select * from pets where died > {whatever date};

C'mon...