Create index on the year of a date column

Started by Nick Barralmost 23 years ago10 messagesgeneral
Jump to latest
#1Nick Barr
nick.barr@webbased.co.uk

Hi,

I am trying to create an index on the year of a date field, to speed up
some queries. Table structure is as follows

------------------------------------------------------------------------
----
CREATE SEQUENCE "sm_item_id_seq" start 1 increment 1 maxvalue
9223372036854775807 minvalue 1 cache 1;
CREATE TABLE "sm_item" (
"item_id" int4 DEFAULT nextval('"sm_item_id_seq"'::text) NOT
NULL,
"item_created_date_start" date,
CONSTRAINT "sm_item_pkey" PRIMARY KEY ("item_id")
) WITHOUT OIDS;
------------------------------------------------------------------------
----

And I have tried the following to create the actual index

------------------------------------------------------------------------
----
create index sm_item_cdates_idx ON sm_item (extract(year from
item_created_date_start));
------------------------------------------------------------------------
----

The response I get from psql is

------------------------------------------------------------------------
----
sm_live=# create index sm_item_cdates_idx ON sm_item (extract(year from
item_created_date_start));
ERROR: parser: parse error at or near "(" at character 52
------------------------------------------------------------------------
----

This relates to the brackets surrounding the "year from
item_created_date_start" bit.

Am I doing anything blatantly wrong? Can I actually use the extract
function for an index? Would I still get a speed improvement if I were
to just index the whole of the field, rather than just the year?

An example query that I have been running is:

------------------------------------------------------------------------
----SELECT item_id, item_created_date_start FROM sm_item WHERE
extract(year FROM item_created_date_start) = 1685;
------------------------------------------------------------------------
----

Which of course has been using a seq scan, as there is absolutely no
index on this column as yet.

Kind Regards,

Nick Barr
WebBased Ltd.

This email and any attachments are confidential to the intended
recipient and may also be privileged. If you are not the intended
recipient please delete it from your system and notify the sender. You
should not copy it or use it for any purpose nor disclose or distribute
its contents to any other person.

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nick Barr (#1)
Re: Create index on the year of a date column

"Nick Barr" <nick.barr@webbased.co.uk> writes:

SELECT item_id, item_created_date_start FROM sm_item WHERE
extract(year FROM item_created_date_start) = 1685;

As of 7.4 you will actually be able to build an index on an expression
like that:

regression=# CREATE TABLE "sm_item" ("item_created_date_start" date);
CREATE TABLE
regression=# create index sm_item_cdates_idx ON sm_item ((extract(year from item_created_date_start)));
CREATE INDEX
regression=# explain SELECT * FROM sm_item WHERE
regression-# extract(year FROM item_created_date_start) = 1685;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Index Scan using sm_item_cdates_idx on sm_item (cost=0.00..17.09 rows=5 width=4)
Index Cond: (date_part('year'::text, (item_created_date_start)::timestamp without time zone) = 1685::double precision)
(2 rows)

But in current releases the best bet is a range inquiry using a plain
index:

regression=# create index fooi on sm_item (item_created_date_start);
CREATE INDEX
regression=# explain SELECT * FROM sm_item WHERE
regression-# item_created_date_start BETWEEN '1685-01-01' AND '1685-12-31';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Index Scan using fooi on sm_item (cost=0.00..17.08 rows=5 width=4)
Index Cond: ((item_created_date_start >= '1685-01-01'::date) AND (item_created_date_start <= '1685-12-31'::date))
(2 rows)

regards, tom lane

#3Oleg Bartunov
oleg@sai.msu.su
In reply to: Nick Barr (#1)
Re: Create index on the year of a date column

What if you create sql-function without arguments and use it to create
functional index ?

Oleg
On Thu, 5 Jun 2003, Nick Barr wrote:

Hi,

I am trying to create an index on the year of a date field, to speed up
some queries. Table structure is as follows

------------------------------------------------------------------------
----
CREATE SEQUENCE "sm_item_id_seq" start 1 increment 1 maxvalue
9223372036854775807 minvalue 1 cache 1;
CREATE TABLE "sm_item" (
"item_id" int4 DEFAULT nextval('"sm_item_id_seq"'::text) NOT
NULL,
"item_created_date_start" date,
CONSTRAINT "sm_item_pkey" PRIMARY KEY ("item_id")
) WITHOUT OIDS;
------------------------------------------------------------------------
----

And I have tried the following to create the actual index

------------------------------------------------------------------------
----
create index sm_item_cdates_idx ON sm_item (extract(year from
item_created_date_start));
------------------------------------------------------------------------
----

The response I get from psql is

------------------------------------------------------------------------
----
sm_live=# create index sm_item_cdates_idx ON sm_item (extract(year from
item_created_date_start));
ERROR: parser: parse error at or near "(" at character 52
------------------------------------------------------------------------
----

This relates to the brackets surrounding the "year from
item_created_date_start" bit.

Am I doing anything blatantly wrong? Can I actually use the extract
function for an index? Would I still get a speed improvement if I were
to just index the whole of the field, rather than just the year?

An example query that I have been running is:

------------------------------------------------------------------------
----SELECT item_id, item_created_date_start FROM sm_item WHERE
extract(year FROM item_created_date_start) = 1685;
------------------------------------------------------------------------
----

Which of course has been using a seq scan, as there is absolutely no
index on this column as yet.

Kind Regards,

Nick Barr
WebBased Ltd.

This email and any attachments are confidential to the intended
recipient and may also be privileged. If you are not the intended
recipient please delete it from your system and notify the sender. You
should not copy it or use it for any purpose nor disclose or distribute
its contents to any other person.

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

#4Mattias Kregert
mattias@kregert.se
In reply to: Nick Barr (#1)
Possible bug in CREATE INDEX? Was: Re: Create index on the year of a date column

I am trying to create an index on the year of a date field, to speed up
some queries. Table structure is as follows

[snip]

------------------------------------------------------------------------
And I have tried the following to create the actual index
------------------------------------------------------------------------
create index sm_item_cdates_idx ON sm_item (extract(year from
item_created_date_start));
------------------------------------------------------------------------
The response I get from psql is
------------------------------------------------------------------------
sm_live=# create index sm_item_cdates_idx ON sm_item (extract(year from
item_created_date_start));
ERROR: parser: parse error at or near "(" at character 52
------------------------------------------------------------------------

I think this is a bug. Some functions/builtins can not be used in a functional index. date_part() does not work either...

However, if you wrap it in another function it works like it should:
CREATE FUNCTION get_year (date) RETURNS double precision AS '
SELECT extract(year from $1)' LANGUAGE SQL IMMUTABLE;
CREATE INDEX sm_item_cdates_idx ON sm_item (get_year(item_created_date_start));

But I think it would be faster to simply put an index on the date, not the year part. A direct
index should be faster than a functional index.

Am I doing anything blatantly wrong? Can I actually use the extract
function for an index? Would I still get a speed improvement if I were
to just index the whole of the field, rather than just the year?
An example query that I have been running is:
------------------------------------------------------------------------
SELECT item_id, item_created_date_start FROM sm_item WHERE
extract(year FROM item_created_date_start) = 1685;
------------------------------------------------------------------------
Nick Barr

I am not sure the index code is intelligent enought to realize that the index can be used if you do the "extract()" thing, but it will work if you do it like this:
CREATE INDEX sm_item_cdates_idx ON sm_item (item_created_date_start);
SELECT item_id, item_created_date_start FROM sm_item
WHERE (item_created_date_start >= '1685-01-01' AND item_created_date_start <= '1685-12-31');

/Mattias

#5Nick Barr
nick.barr@webbased.co.uk
In reply to: Oleg Bartunov (#3)
Re: Create index on the year of a date column

Hi,

Thanks guys for all the responses, they are really helpful. To
summarise:

1. Use an index on the entire date column and use ranges to make use of
the index.
2. Create a dummy function that wraps the extract function call, and use
this as the index.
3. Create another column, for the year, filled in automatically using a
trigger and index that.
4. Wait until 7.4 which will allow me to do this.

Brilliant. I suspect I will probably choose the first one for now, and
see how that fares. I suspect, as per normal with Postgres, it will
absolutely fly ;-). However, if performance becomes more of an issue,
which it probably wont, then 7.4 will be out by then so I will use that.

Thanks again for your time and responses. Much appreciated

Kind Regards,

Nick Barr
WebBased Ltd.

This email and any attachments are confidential to the intended
recipient and may also be privileged. If you are not the intended
recipient please delete it from your system and notify the sender. You
should not copy it or use it for any purpose nor disclose or distribute
its contents to any other person.

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: 06 June 2003 06:29
To: Nick Barr
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Create index on the year of a date column

"Nick Barr" <nick.barr@webbased.co.uk> writes:

SELECT item_id, item_created_date_start FROM sm_item WHERE
extract(year FROM item_created_date_start) = 1685;

As of 7.4 you will actually be able to build an index on an expression
like that:

regression=# CREATE TABLE "sm_item" ("item_created_date_start" date);
CREATE TABLE
regression=# create index sm_item_cdates_idx ON sm_item ((extract(year
from item_created_date_start)));
CREATE INDEX
regression=# explain SELECT * FROM sm_item WHERE
regression-# extract(year FROM item_created_date_start) = 1685;
QUERY PLAN

------------------------------------------------------------------------
--

------------------------------------------------
Index Scan using sm_item_cdates_idx on sm_item (cost=0.00..17.09

rows=5

width=4)
Index Cond: (date_part('year'::text,
(item_created_date_start)::timestamp without time zone) = 1685::double
precision)
(2 rows)

But in current releases the best bet is a range inquiry using a plain
index:

regression=# create index fooi on sm_item (item_created_date_start);
CREATE INDEX
regression=# explain SELECT * FROM sm_item WHERE
regression-# item_created_date_start BETWEEN '1685-01-01' AND

'1685-12-

31';
QUERY PLAN

------------------------------------------------------------------------
--

Show quoted text

-------------------------------------------
Index Scan using fooi on sm_item (cost=0.00..17.08 rows=5 width=4)
Index Cond: ((item_created_date_start >= '1685-01-01'::date) AND
(item_created_date_start <= '1685-12-31'::date))
(2 rows)

regards, tom lane

#6Ron Johnson
ron.l.johnson@cox.net
In reply to: Nick Barr (#1)
Re: Create index on the year of a date column

On Thu, 2003-06-05 at 11:38, Nick Barr wrote:

Hi,

I am trying to create an index on the year of a date field, to speed up
some queries. Table structure is as follows

If, for example, you want all records where item_created_start_date
in calendar year 2001, why not create the index on i_c_s_d and say
WHERE item_created_start_date between '2001-01-01' and '2001-12-31'

-- 
+-----------------------------------------------------------+
| Ron Johnson, Jr.     Home: ron.l.johnson@cox.net          |
| Jefferson, LA  USA   http://members.cox.net/ron.l.johnson |
|                                                           |
| Regarding war zones: "There's nothing sacrosanct about a  |
| hotel with a bunch of journalists in it."                 |
|     Marine Lt. Gen. Bernard E. Trainor (Retired)          |
+-----------------------------------------------------------+
#7Bruno Wolff III
bruno@wolff.to
In reply to: Nick Barr (#1)
Re: Create index on the year of a date column

On Thu, Jun 05, 2003 at 17:38:21 +0100,
Nick Barr <nick.barr@webbased.co.uk> wrote:

Hi,

I am trying to create an index on the year of a date field, to speed up
some queries. Table structure is as follows

Wouldn't an index on the plain date work as well? You would have to
make queries for years be range queries instead of equal, but the index
could be used for other things (like ordering by date).

#8Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Nick Barr (#1)
Re: Create index on the year of a date column

On Thu, 5 Jun 2003, Nick Barr wrote:

CREATE SEQUENCE "sm_item_id_seq" start 1 increment 1 maxvalue
9223372036854775807 minvalue 1 cache 1;
CREATE TABLE "sm_item" (
"item_id" int4 DEFAULT nextval('"sm_item_id_seq"'::text) NOT
NULL,
"item_created_date_start" date,
CONSTRAINT "sm_item_pkey" PRIMARY KEY ("item_id")
) WITHOUT OIDS;
------------------------------------------------------------------------
----

And I have tried the following to create the actual index

------------------------------------------------------------------------
----
create index sm_item_cdates_idx ON sm_item (extract(year from
item_created_date_start));
------------------------------------------------------------------------

In 7.4, I believe you'll be allowed to say
sm_item((extract(year from item_created_date_start)))

For now, you'd need to make an immutable function that does the extract
year from $1 and use that in the index and query, although I agree with
the other response that it might just be better to index the whole value
and use range queries instead.

#9Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Mattias Kregert (#4)
Re: Possible bug in CREATE INDEX? Was: Re: Create index

On Fri, 6 Jun 2003, Mattias Kregert wrote:

I am trying to create an index on the year of a date field, to speed up
some queries. Table structure is as follows

[snip]

------------------------------------------------------------------------
And I have tried the following to create the actual index
------------------------------------------------------------------------
create index sm_item_cdates_idx ON sm_item (extract(year from
item_created_date_start));
------------------------------------------------------------------------
The response I get from psql is
------------------------------------------------------------------------
sm_live=# create index sm_item_cdates_idx ON sm_item (extract(year from
item_created_date_start));
ERROR: parser: parse error at or near "(" at character 52
------------------------------------------------------------------------

I think this is a bug. Some functions/builtins can not be used in a

It's not, just a limitation that hadn't been taken out yet (until 7.4
basically). The syntax for 7.3 says something to the effect
of func_name(column [, ...]) so all the arguments must be columns in the
table. This means that date_part and substring and other functions which
need constants to be useful or things that aren't strictly speaking
functions really can't be used without wrapping.

functional index. date_part() does not work either...

Sure it does, if you've got another column that has the timestamp unit
string in it, not that that's really useful as a meaningful index. ;)

#10Vincent Hikida
vhikida@inreach.com
In reply to: Nick Barr (#1)
Re: Create index on the year of a date column

regression=# explain SELECT * FROM sm_item WHERE
regression-# item_created_date_start BETWEEN '1685-01-01' AND

'1685-12-31';

QUERY PLAN

Hmm, if date includes the time, shouldn't the query be:

SELECT *
FROM sm_item
WHERE item_created_date_start >= '1685-01-01'
AND item_created_date_start < '1686-01-01'

Vincent Hikida,
Member of Technical Staff - Urbana Software, Inc.
"A Personalized Learning Experience"

www.UrbanaSoft.com