query for a time interval

Started by Markover 20 years ago12 messagesgeneral

sendmailtomark@yahoo.com

over 20 years ago

Hello everybody,

I'm looking for an elegant SQL statement that will work in
Postgresql, MySQL and ORACLE.
The query will be executed by Java client.

To have this query for Postgresql is priority number one.

In this query I try to get a list of message Ids that expired.

time_to_live is in seconds.

SELECT id
FROM mq
WHERE now - start_date > time_to_live;

I have a following table:

CREATE TABLE mq
{
msg_id INTEGER,
retry_date TIMESTAMP NOT NULL DEFAULT ('now'::text)::timestamp(1),
start_date TIMESTAMP NOT NULL DEFAULT ('now'::text)::timestamp(1),
time_to_live INTEGER
}

Thanks!
Mark.

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

SunWuKung

Balazs.Klein@axelero.hu

over 20 years ago

In reply to: Mark (#1)

view or index to optimize performance

Hello everybody,

I have a table that stores responses to questions in different
questionnaires. This table will grow to millions of rows.

My problem is that while most of the data in the table are rarely used in
queries one type of response will be used quite often: biodata - name,
gender, e-mail and this sort of. This data is also collected as responses to
questionnaires.

My question: what is the best strategy if I wanted to quickly retrieve
biodata from this table:

CREATE TABLE "public"."itemresponse" (
"testoccasionid" INTEGER NOT NULL,
"itemorder" SMALLINT NOT NULL,
"response" TEXT NOT NULL,
"bio" INTEGER DEFAULT 0 NOT NULL,
"datatype" SMALLINT NOT NULL,
CONSTRAINT "ItemResponseText_pk" PRIMARY KEY("testoccasionid",
"itemorder"),
CONSTRAINT "ItemResponseText_TestOccasionID_fkey" FOREIGN KEY
("testoccasionid")
REFERENCES "public"."testoccasion"("testoccasionid")
ON DELETE NO ACTION
ON UPDATE NO ACTION
NOT DEFERRABLE
) WITH OIDS;

I can store the fact that it is biodata in the bio field - it is biodata if
the value of that field is not 0 and I can index that field and simply use
that as one of the conditions in queries.

Or should I instead create a view that contains only the biodata and select
from that? But will postgres use the indexes than? Would that be a better
approach?

Thanks for the help.
SWK

Peter Eisentraut

peter_e@gmx.net

over 20 years ago

In reply to: SunWuKung (#2)

Re: view or index to optimize performance

Am Mittwoch, 21. Dezember 2005 21:27 schrieb Klein Balázs:

Or should I instead create a view that contains only the biodata and select
from that? But will postgres use the indexes than? Would that be a better
approach?

Whether the query is executed by a view or typed in in its full form by hand
is completely irrelevant to the question whether indexes are used or should
be created. Views do not optimize anything.

SunWuKung

Balazs.Klein@axelero.hu

over 20 years ago

In reply to: Peter Eisentraut (#3)

Re: view or index to optimize performance

I thought that if I used a view to retrieve data its content might be cached
so it would make the query faster.

Am Mittwoch, 21. Dezember 2005 21:27 schrieb Klein Balázs:

Or should I instead create a view that contains only the biodata and

select

from that? But will postgres use the indexes than? Would that be a better
approach?

Whether the query is executed by a view or typed in in its full form by hand

is completely irrelevant to the question whether indexes are used or should
be created. Views do not optimize anything.

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Richard Huxton

dev@archonet.com

over 20 years ago

In reply to: SunWuKung (#2)

Re: view or index to optimize performance

Klein Balázs wrote:

Hello everybody,

I have a table that stores responses to questions in different
questionnaires. This table will grow to millions of rows.

[snip]

I can store the fact that it is biodata in the bio field - it is biodata if
the value of that field is not 0 and I can index that field and simply use
that as one of the conditions in queries.

Or should I instead create a view that contains only the biodata and select
from that? But will postgres use the indexes than? Would that be a better
approach?

Create the index on the table and a view should use it.

You might like to read up on partial indexes where you can do something
like:

CREATE INDEX my_index ON itemresponse (testoccasionid,itemorder)
WHERE bio > 0;

So long as your query/view definition has WHERE bio > 0 in it then this
index can be used.

HTH
--
Richard Huxton
Archonet Ltd

Jim Nasby

Jim.Nasby@BlueTreble.com

over 20 years ago

In reply to: SunWuKung (#4)

Re: view or index to optimize performance

On Wed, Dec 21, 2005 at 10:49:29PM +0100, Klein Bal?zs wrote:

I thought that if I used a view to retrieve data its content might be cached
so it would make the query faster.

No. A view is essentially exactly the same as inserting the view
definition into the query that's using it. IE:

CREATE VIEW v AS SELECT * FROM t;

SELECT * FROM v becomes:

SELECT * FROM (SELECT * FROM t) v;

What you could do is partition the table so that critical information is
stored in a smaller table while everything else goes to a larger table.
You can then do a UNION ALL view on top of that to 'glue' the two tables
together. You can even define rules so that you can do updates on the
view. http://cvs.distributed.net/viewcvs.cgi/stats-sql/logdb/ has an
example that's similar to this. Note that you'll need an appropriate
index on the large table so that PostgreSQL can quickly tell it doesn't
contain values that are in the small table. Or, in 8.1 you could use a
constraint. You could also do this with inherited tables instead of
views.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

Jim Nasby

Jim.Nasby@BlueTreble.com

over 20 years ago

In reply to: Mark (#1)

Re: query for a time interval

On Wed, Dec 21, 2005 at 11:52:56AM -0800, Mark wrote:

SELECT id
FROM mq
WHERE now - start_date > time_to_live;

The problem is you can't use an index on this, because you'd need to
index on (now() - start_date), which obviously wouldn't work. Instead,
re-write the WHERE as:

WHERE start_date < now() - time_to_live
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

SunWuKung

Balazs.Klein@axelero.hu

over 20 years ago

In reply to: Jim Nasby (#6)

Re: view or index to optimize performance

thanks for the help

What you could do is partition the table so that critical information is
stored in a smaller table while everything else goes to a larger table.

I was thinking the other way round - maybe I can split the large table by
creating a materialized view. But than I read that it is maybe unnecessary
to create a materialized view because pg

"Materialized views sound a useful workaround, if your database doesn't have
a query cache. If you do have a query cache, then you already effectively
have eager or lazy materialized views (depending on your isolation level):
Just use your normal view (or query) and let the database figure it out."
Quote from Farce Pest in
http://spyced.blogspot.com/2005/05/materialized-views-in-postgresql.html

But later in the same blog it seems to indicate that there is a choice to
either use or not use the query cache of pg.

So I don't know now how this cache works and whether it could help me in
this.

SWK

Tom Lane

tgl@sss.pgh.pa.us

over 20 years ago

In reply to: SunWuKung (#8)

Re: view or index to optimize performance

=?iso-8859-1?Q?Klein_Bal=E1zs?= <Balazs.Klein@axelero.hu> writes:

But later in the same blog it seems to indicate that there is a choice to
either use or not use the query cache of pg.

Hm? There is no query cache in PG.

regards, tom lane

#10

Michael Fuhr

mike@fuhr.org

over 20 years ago

In reply to: Jim Nasby (#7)

Re: query for a time interval

On Thu, Dec 22, 2005 at 09:47:11AM -0600, Jim C. Nasby wrote:

On Wed, Dec 21, 2005 at 11:52:56AM -0800, Mark wrote:

SELECT id
FROM mq
WHERE now - start_date > time_to_live;

The problem is you can't use an index on this, because you'd need to
index on (now() - start_date), which obviously wouldn't work. Instead,
re-write the WHERE as:

WHERE start_date < now() - time_to_live

Unless I'm missing something that wouldn't use an index either,
because the planner wouldn't know what value to compare start_date
against without hitting each row to find that row's time_to_live.
But something like this should be able to use an expression index
on (start_date + time_to_live):

WHERE start_date + time_to_live < now()

--
Michael Fuhr

#11

Tom Lane

tgl@sss.pgh.pa.us

over 20 years ago

In reply to: Michael Fuhr (#10)

Re: query for a time interval

Michael Fuhr <mike@fuhr.org> writes:

Unless I'm missing something that wouldn't use an index either,
because the planner wouldn't know what value to compare start_date
against without hitting each row to find that row's time_to_live.
But something like this should be able to use an expression index
on (start_date + time_to_live):

WHERE start_date + time_to_live < now()

Or perhaps simpler, get rid of time_to_live in favor of an end_date
column, which could be indexed directly.

regards, tom lane

#12

Bruno Wolff III

bruno@wolff.to

over 20 years ago

In reply to: Mark (#1)

Re: query for a time interval

On Wed, Dec 21, 2005 at 11:52:56 -0800,
Mark <sendmailtomark@yahoo.com> wrote:

Hello everybody,

I'm looking for an elegant SQL statement that will work in
Postgresql, MySQL and ORACLE.
The query will be executed by Java client.

To have this query for Postgresql is priority number one.

In this query I try to get a list of message Ids that expired.

time_to_live is in seconds.

SELECT id
FROM mq
WHERE now - start_date > time_to_live;

An interval is not going to be comparable to an integer without casting.
If you use explicit casting the query probably won't be portable. If there
is some implicit cast that makes the above SQL valid, it is probably a cast
to text which won't do what you want.

I have a following table:

CREATE TABLE mq
{
msg_id INTEGER,
retry_date TIMESTAMP NOT NULL DEFAULT ('now'::text)::timestamp(1),
start_date TIMESTAMP NOT NULL DEFAULT ('now'::text)::timestamp(1),
time_to_live INTEGER
}

Can you make time_to_live an interval?

Show quoted text

Thanks!
Mark.

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match