Select "todays" timestamps in an index friendly way

Started by Lutz Hornover 7 years ago13 messagesgeneral
Jump to latest
#1Lutz Horn
lutz.horn@posteo.de

Hi,

I am looking for a way to select all timestamps that are "today" in an
index friendly way. This select should not depend on the concrete value
of "today".

Given a table

create temporary table t (
id SERIAL primary key,
ts timestamp not null default now()
);

with some data

insert into t (ts)
select ts
from generate_series(
'2018-01-01T00:00:01'::timestamp,
'2018-12-31T23:59:59'::timestamp,
'2 minutes')
as ts;

and an index

create index on t (ts, id);

I can of course make an explicit select for `ts` values that are
"today":

select ts, id
from t
where ts >= '2018-10-23T00:00:00'::timestamp
and ts <= '2018-10-23T23:59:59'::timestamp;

This uses an Bitmap Index Scan on `t_ts_id_idx`. Good.

But the where conditions depends on concrete values of "today" which
will not return the intended result if I execute it tomorrow. I will
have to change the where condition. Not good.

I am looking for a way to make the where condition independed of the
date of execution. I can create a function

create function is_today(timestamp) returns boolean as $$
select to_char(now(), 'YYYY-MM-DD') = to_char($1, 'YYYY-MM-DD');
$$ language sql;

that converts the timestamps to text. But using this function

select * from t where is_today(ts);

will not benefit from the existing index. A Seq Scan on `t` will be
used. Not good.

Is there a way to have both: be independed of the concrete value of
"today" *and* use the index on the timestamp column?

Lutz

#2Francisco Olarte
folarte@peoplecall.com
In reply to: Lutz Horn (#1)
Re: Select "todays" timestamps in an index friendly way

Hi Lutz.

On Tue, Oct 23, 2018 at 11:38 AM, Lutz Horn <lutz.horn@posteo.de> wrote:

I am looking for a way to select all timestamps that are "today" in an
index friendly way. This select should not depend on the concrete value
of "today".

Given a table
create temporary table t (
id SERIAL primary key,
ts timestamp not null default now()
);

with some data

insert into t (ts)
select ts
from generate_series(
'2018-01-01T00:00:01'::timestamp,
'2018-12-31T23:59:59'::timestamp,
'2 minutes')
as ts;

and an index

create index on t (ts, id);

I can of course make an explicit select for `ts` values that are
"today":

select ts, id
from t
where ts >= '2018-10-23T00:00:00'::timestamp
and ts <= '2018-10-23T23:59:59'::timestamp;

1st remark. Do NOT use closed interval for timestamps. Always use
half-open or you'll run into problems ( i.e., you are going to miss
2018-10-23T23:59:59.25 in that query ). For real like things ( which
timestamps are, they identify a point on the time line ) use half-open
( you can cover a line with non-overlapping half-open segments, not
with closed ones ).

I.e., your query will better be stated as

where ts >= '2018-10-23T00:00:00'::timestamp
and ts < '2018-10-24T00:00:00'::timestamp;

Which, as a nice bonus, can rely on the time part defaulting to 0:

where ts >= '2018-10-23'::timestamp
and ts < '2018-10-24'::timestamp;

and then be expressed in other ways, like

where ts >= '2018-10-23'::timestamp
and ts < ('2018-10-23'::timestamp + '1 day'::interval)

This uses an Bitmap Index Scan on `t_ts_id_idx`. Good.

But the where conditions depends on concrete values of "today" which
will not return the intended result if I execute it tomorrow. I will
have to change the where condition. Not good.

I am looking for a way to make the where condition independed of the
date of execution. I can create a function

create function is_today(timestamp) returns boolean as $$
select to_char(now(), 'YYYY-MM-DD') = to_char($1, 'YYYY-MM-DD');
$$ language sql;

This is not a good way to deal with timestamp values, they are just
numbers, play with them as such. Try using something like

date_trunc('day',now()) = date_trunc('day',$1)

which states your purposes more clearly.

that converts the timestamps to text. But using this function

select * from t where is_today(ts);

will not benefit from the existing index. A Seq Scan on `t` will be
used. Not good.

Is there a way to have both: be independed of the concrete value of
"today" *and* use the index on the timestamp column?

Well, if your definition of today is 'same value as now() when
truncated to days' we can use part of what I've written above,
1st calculate today and tomorrow with same timestamp arithmetic and date_trunc:

select now(), date_trunc('day',now()) as today,
date_trunc('day',now()+'1 day') as tomorrow;
now | today | tomorrow
-------------------------------+------------------------+------------------------
2018-10-23 11:58:01.699407+02 | 2018-10-23 00:00:00+02 | 2018-10-24 00:00:00+02
(1 row)

Then plug that result in your query ( using the half-open technique )
described above:

where ts >= date_trunc('day',now())
and ts < date_trunc('day',now()+'1 day') as tomorrow;

IIRC this should use the index, you can RTFM in case you prefer using
current_timestamp and her cousins, but bear in mind if you use
something like current_date you should convert it to timestamp, not
convert ts to date, to get easy index usage.

Francisco Olarte.

#3Thomas Kellerer
spam_eater@gmx.net
In reply to: Lutz Horn (#1)
Re: Select "todays" timestamps in an index friendly way

Lutz Horn schrieb am 23.10.2018 um 11:38:

I can of course make an explicit select for `ts` values that are
"today":

select ts, id
from t
where ts >= '2018-10-23T00:00:00'::timestamp
and ts <= '2018-10-23T23:59:59'::timestamp;

This uses an Bitmap Index Scan on `t_ts_id_idx`. Good.

But the where conditions depends on concrete values of "today" which
will not return the intended result if I execute it tomorrow. I will
have to change the where condition. Not good.

I am looking for a way to make the where condition independed of the
date of execution. I can create a function

create function is_today(timestamp) returns boolean as $$
select to_char(now(), 'YYYY-MM-DD') = to_char($1, 'YYYY-MM-DD');
$$ language sql;

that converts the timestamps to text. But using this function

select * from t where is_today(ts);

will not benefit from the existing index. A Seq Scan on `t` will be
used. Not good.

Is there a way to have both: be independed of the concrete value of
"today" *and* use the index on the timestamp column?

I typically use:

where ts >= date '2018-10-23'
and ts < date '2018-10-23' + 1

#4Lutz Horn
lutz.horn@posteo.de
In reply to: Francisco Olarte (#2)
Re: Select "todays" timestamps in an index friendly way

Hi Francisco,

On Tue, Oct 23, 2018 at 12:05:17PM +0200, Francisco Olarte wrote:

1st remark. Do NOT use closed interval for timestamps. Always use
half-open or you'll run into problems

Good point, thanks.

where ts >= date_trunc('day',now())
and ts < date_trunc('day',now()+'1 day') as tomorrow;

IIRC this should use the index

And it does! Thanks!

Lutz

#5Lutz Horn
lutz.horn@posteo.de
In reply to: Thomas Kellerer (#3)
Re: Select "todays" timestamps in an index friendly way

Hi Thomas,

On Tue, Oct 23, 2018 at 12:11:55PM +0200, Thomas Kellerer wrote:

I typically use:

where ts >= date '2018-10-23'
and ts < date '2018-10-23' + 1

But here the date is an explicit value. Francisco reworded my question:

if your definition of today is 'same value as now() when truncated to
days'

That's what I am (was, thanks to Francisco) looking for.

Lutz

#6Thomas Kellerer
spam_eater@gmx.net
In reply to: Lutz Horn (#5)
Re: Select "todays" timestamps in an index friendly way

Lutz Horn schrieb am 23.10.2018 um 12:19:

Hi Thomas,

On Tue, Oct 23, 2018 at 12:11:55PM +0200, Thomas Kellerer wrote:

I typically use:

where ts >= date '2018-10-23'
and ts < date '2018-10-23' + 1

But here the date is an explicit value. Francisco reworded my question:

if your definition of today is 'same value as now() when truncated to
days'

That's what I am (was, thanks to Francisco) looking for.

Then use current_date:

where ts >= current_date
and ts < current_date + 1

#7Hellmuth Vargas
hivs77@gmail.com
In reply to: Thomas Kellerer (#6)
Re: Select "todays" timestamps in an index friendly way

Hi

El mar., 23 de oct. de 2018 a la(s) 05:41, Thomas Kellerer (
spam_eater@gmx.net) escribió:

Lutz Horn schrieb am 23.10.2018 um 12:19:

Hi Thomas,

On Tue, Oct 23, 2018 at 12:11:55PM +0200, Thomas Kellerer wrote:

I typically use:

where ts >= date '2018-10-23'
and ts < date '2018-10-23' + 1

But here the date is an explicit value. Francisco reworded my question:

if your definition of today is 'same value as now() when truncated to
days'

That's what I am (was, thanks to Francisco) looking for.

Then use current_date:

where ts >= current_date
and ts < current_date + 1

this is equally valid?

where ts >= current_date

cordialmente:

Hellmuth Vargas

#8Francisco Olarte
folarte@peoplecall.com
In reply to: Hellmuth Vargas (#7)
Re: Select "todays" timestamps in an index friendly way

On Tue, Oct 23, 2018 at 3:05 PM, Hellmuth Vargas <hivs77@gmail.com> wrote:
...

Then use current_date:

where ts >= current_date
and ts < current_date + 1

this is equally valid?

where ts >= current_date

It'is not as the problem was stated. Although ts defaulted to now(),
and it is probably defaulted, nothing prohibits him from inserting
timestamps in the future.

Also, I'll point the table used in the sample ( bigserial+timestamp)
does not seem like a real one and the "timestamps in today" pattern
is commonly used in calendaring applications, which usually insert
appointments in the future and recover this way to print "todays
schedule".

Francisco Olarte.

#9Lutz Horn
lutz.horn@posteo.de
In reply to: Francisco Olarte (#8)
Re: Select "todays" timestamps in an index friendly way

On Tue, Oct 23, 2018 at 03:50:14PM +0200, Francisco Olarte wrote:

It'is not as the problem was stated. Although ts defaulted to now(),
and it is probably defaulted, nothing prohibits him from inserting
timestamps in the future.

Yes, this table is only used as an example for the technical question.
In my real use case there are columns like "due_date" which usually
contain future dates inserted by application code.

the "timestamps in today" pattern is commonly used in calendaring
applications, which usually insert appointments in the future and
recover this way to print "todays schedule".

Exactly. The application must be able to execute queries like "give me
all my tasks due today" without having to use a concrete value for
"today".

Lutz

#10Francisco Olarte
folarte@peoplecall.com
In reply to: Lutz Horn (#9)
Re: Select "todays" timestamps in an index friendly way

On Tue, Oct 23, 2018 at 3:57 PM, Lutz Horn <lutz.horn@posteo.de> wrote:

On Tue, Oct 23, 2018 at 03:50:14PM +0200, Francisco Olarte wrote:

It'is not as the problem was stated. Although ts defaulted to now(),
and it is probably defaulted, nothing prohibits him from inserting
timestamps in the future.

Yes, this table is only used as an example for the technical question.
In my real use case there are columns like "due_date" which usually
contain future dates inserted by application code.

If your real table uses dates instead of timestamps modify the code
accordingly, they are not the same ( dates are countable, instants in
time are not (they are in the computer, with finite precision, but you
see the difference )) Although I supose they really are timestamps, or
you would have just used "date_column=current_date".

the "timestamps in today" pattern is commonly used in calendaring
applications, which usually insert appointments in the future and
recover this way to print "todays schedule".

Exactly. The application must be able to execute queries like "give me
all my tasks due today" without having to use a concrete value for
"today".

Been there, done that. With an IBM 84 ( instructional use. It was, not
surprissingly, easier but slower, ).

Happy hacking.
Francisco Olarte.

#11Mike Rylander
mrylander@gmail.com
In reply to: Lutz Horn (#1)
Re: Select "todays" timestamps in an index friendly way

On Tue, Oct 23, 2018 at 5:38 AM Lutz Horn <lutz.horn@posteo.de> wrote:

Hi,

I am looking for a way to select all timestamps that are "today" in an
index friendly way. This select should not depend on the concrete value
of "today".

Per TFM, https://www.postgresql.org/docs/10/static/datatype-datetime.html
on table 8.13, you can use special input values:

SELECT * FROM t WHERE ts >= 'today'::timestamp AND ts < 'tomorrow'::timestamp;

HTH,

--
Mike Rylander
| Executive Director
| Equinox Open Library Initiative
| phone: 1-877-OPEN-ILS (673-6457)
| email: miker@equinoxinitiative.org
| web: http://equinoxinitiative.org

#12David Rowley
dgrowleyml@gmail.com
In reply to: Mike Rylander (#11)
Re: Select "todays" timestamps in an index friendly way

On 24 October 2018 at 07:14, Mike Rylander <mrylander@gmail.com> wrote:

On Tue, Oct 23, 2018 at 5:38 AM Lutz Horn <lutz.horn@posteo.de> wrote:

I am looking for a way to select all timestamps that are "today" in an
index friendly way. This select should not depend on the concrete value
of "today".

Per TFM, https://www.postgresql.org/docs/10/static/datatype-datetime.html
on table 8.13, you can use special input values:

SELECT * FROM t WHERE ts >= 'today'::timestamp AND ts < 'tomorrow'::timestamp;

Of course, you'd need to be careful never to use that in a view or
even a PREPAREd statement. Those abbreviations are evaluated when the
query is parsed. In those cases, you'd just get the results for
whatever day you did CREATE VIEW or PREPARE.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#13Steven Lembark
lembark@wrkhors.com
In reply to: Lutz Horn (#1)
Re: Select "todays" timestamps in an index friendly way

create temporary table t (
id SERIAL primary key,
ts timestamp not null default now()
);

* add date( ts ) as a field and index date = now()::date.

* Depending on the amount of data in your table the date
may not be seletive enough to be worth using, at which
point the index may be present and ignored. Only way to
be sure is analyze it.

* Might be worth looking at a partial index using >= 00:00:00
and < 24:00:00 (PG grocks the 2400 notation for "midnight at
the end of today) or

where ts && tsrange( ... 00:00:00, ... 24:00:00, '[)] )

Nice thing about the partial index is that you can create it
on all of the non-ts fields for fast lookup by whatever and
only index the portion for today.

* Think about using a materialized view rather than a temp
table. May prove simpler to query.

--
Steven Lembark 3920 10th Ave South
Workhorse Computing Birmingham, AL 35222
lembark@wrkhors.com +1 888 359 3508