Index on points
Hello.
If I have a table like this
create table fleet ( ship_id integer, location point);
and fill it with a lot of ships and their locations and then want to
create an index on this to speed up operations on finding ships within
a certain region (let's say its a rectangular region), how do I do
this?
I tried:
CREATE INDEX my_index ON fleet USING gist ( box(location,location)); ?
and that command worked, but I have no idea why and what I have just
done. The docs are a little to advanced.
How should one do it?
On Thu, 2010-09-23 at 12:45 +0200, A B wrote:
Hello.
If I have a table like this
create table fleet ( ship_id integer, location point);
I recommend taking a look into PostGIS: http://postgis.org
Regards,
Jeff Davis
But how do I do it without Postgis?
Right now I have a table
fleet (id bigserial primary key, location point);
and I have filled it with random data and then tried selecting
explain analyze select count(1) from fleet where location <@
box(point(300,300),point(600,600));
to gather runningtime data, and then I have created and index (or I
think I have atleast)
CREATE INDEX fleet_location ON fleet USING GIST ( box(location,location) );
but I still get almost exaclty the same run time of the query
explain analyze select count(1) from fleet where location <@
box(point(300,300),point(600,600));
Aggregate (cost=100597.89..100597.90 rows=1 width=0) (actual
time=706.604..706.605 rows=1 loops=1)
-> Seq Scan on fleet (cost=0.00..94347.90 rows=2499996 width=0)
(actual time=0.252..701.624 rows=4533 loops=1)
Filter: (location <@ '(600,600),(300,300)'::box)
Total runtime: 706.675 ms
I have then tried to avoid the seq. scan by
set enable_seqscan=off;
set seq_page_cost=4000; (which would make it more expensive to scan,
wouldn't it?)
and the result is the same
Aggregate (cost=10127460749.89..10127460749.90 rows=1 width=0)
(actual time=799.077..799.078 rows=1 loops=1)
-> Seq Scan on fleet (cost=10000000000.00..10127454499.90
rows=2499996 width=0) (actual time=0.221..792.374 rows=4533 loops=1)
Filter: (location <@ '(600,600),(300,300)'::box)
Total runtime: 799.117 ms
So how do I create an index that gets used?
(I've run the queries a thousand times to make sure the total runtime
is consistent, and it is)
2010/9/23 Jeff Davis <pgsql@j-davis.com>:
Show quoted text
On Thu, 2010-09-23 at 12:45 +0200, A B wrote:
Hello.
If I have a table like this
create table fleet ( ship_id integer, location point);
I recommend taking a look into PostGIS: http://postgis.org
Regards,
Jeff Davis
On 23/09/10 11:45, A B wrote:
Hello.
If I have a table like this
create table fleet ( ship_id integer, location point);
and fill it with a lot of ships and their locations and then want to
create an index on this to speed up operations on finding ships within
a certain region (let's say its a rectangular region), how do I do
this?I tried:
CREATE INDEX my_index ON fleet USING gist ( box(location,location)); ?
That's the idea, but you'll need to be careful about how you're
searching against it. Remember, the index is on a box based on the
location, not the point location itself.
CREATE TABLE fleet (ship int, locn point);
INSERT INTO fleet SELECT (x*1000 + y), point(x,y)
FROM generate_series(0,999) x, generate_series(0,999) y;
CREATE INDEX fleet_locn_idx ON fleet USING gist( box(locn,locn) );
ANALYSE fleet;
EXPLAIN ANALYSE SELECT count(*) FROM fleet
WHERE box(locn,locn) <@ box '(10,10),(20,20)';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=2654.84..2654.85 rows=1 width=0) (actual
time=4.611..4.612 rows=1 loops=1)
-> Bitmap Heap Scan on fleet (cost=44.34..2652.33 rows=1000
width=0) (actual time=4.344..4.491 rows=121 loops=1)
Recheck Cond: (box(locn, locn) <@ '(20,20),(10,10)'::box)
-> Bitmap Index Scan on fleet_locn_idx (cost=0.00..44.09
rows=1000 width=0) (actual time=4.311..4.311 rows=121 loops=1)
Index Cond: (box(locn, locn) <@ '(20,20),(10,10)'::box)
Total runtime: 4.694 ms
(6 rows)
DROP INDEX fleet_locn_idx;
EXPLAIN ANALYSE SELECT count(*) FROM fleet WHERE box(locn,locn) <@ box
'(10,10),(20,20)';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Aggregate (cost=20885.50..20885.51 rows=1 width=0) (actual
time=551.756..551.757 rows=1 loops=1)
-> Seq Scan on fleet (cost=0.00..20883.00 rows=1000 width=0)
(actual time=5.142..551.624 rows=121 loops=1)
Filter: (box(locn, locn) <@ '(20,20),(10,10)'::box)
Total runtime: 551.831 ms
(4 rows)
--
Richard Huxton
Archonet Ltd
On Sat, 2010-09-25 at 09:18 +0200, A B wrote:
fleet (id bigserial primary key, location point);
...
CREATE INDEX fleet_location ON fleet USING GIST ( box(location,location) );
but I still get almost exaclty the same run time of the query
explain analyze select count(1) from fleet where location <@
box(point(300,300),point(600,600));Aggregate (cost=100597.89..100597.90 rows=1 width=0) (actual
time=706.604..706.605 rows=1 loops=1)
-> Seq Scan on fleet (cost=0.00..94347.90 rows=2499996 width=0)
(actual time=0.252..701.624 rows=4533 loops=1)
Filter: (location <@ '(600,600),(300,300)'::box)
Total runtime: 706.675 ms
...
So how do I create an index that gets used?
(I've run the queries a thousand times to make sure the total runtime
is consistent, and it is)
Looks like there's no entry in the box_ops opclass for point <@ box, but
there is an entry for box <@ box.
So, try:
explain analyze
select count(1) from fleet
where box(location,location) <@ box(point(300,300),point(600,600));
There's no reason that there couldn't be a point <@ box operator in the
opclass, but nobody really uses these geometric types that come with
core postgres (at least, not that I can tell). PostGIS is a dedicated
project that has removed most of the justification for trying to improve
the built-in geometric types. However, keep in mind that PostGIS is
under a different license (GPL, I think).
Regards,
Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes:
There's no reason that there couldn't be a point <@ box operator in the
opclass, but nobody really uses these geometric types that come with
core postgres (at least, not that I can tell).
Actually, as of 9.0 there is a point_ops opclass for GIST, with these
indexable operators:
^(point,point)
<<(point,point)
(point,point)
<^(point,point)
~=(point,point)
<@(point,box)
<@(point,polygon)
<@(point,circle)
I agree that for any more than light-duty geometric work, you ought
to look at PostGIS.
regards, tom lane
2010/9/25 Tom Lane <tgl@sss.pgh.pa.us>:
Jeff Davis <pgsql@j-davis.com> writes:
There's no reason that there couldn't be a point <@ box operator in the
opclass, but nobody really uses these geometric types that come with
core postgres (at least, not that I can tell).Actually, as of 9.0 there is a point_ops opclass for GIST, with these
indexable operators:>^(point,point)
<<(point,point)
>>(point,point)
<^(point,point)
~=(point,point)
<@(point,box)
<@(point,polygon)
<@(point,circle)I agree that for any more than light-duty geometric work, you ought
to look at PostGIS.regards, tom lane
Thank you Jeff for your reply, that solved the problem.
Tom, would you like to elaborate on that PostGIS should be used for
other than "light-duty" geometric work?
Is it speed, accuracy or features that is the difference?
For this project I think <@(point,box) is sufficient. What would it
take to motivate a switch to PostGIS for that?
Best wishes.
Sorry, Gmail made med confused, my biggest "thank you" was to Richard
Huxton, who showed me code that worked.
2010/9/26 A B <gentosaker@gmail.com>:
Show quoted text
2010/9/25 Tom Lane <tgl@sss.pgh.pa.us>:
Jeff Davis <pgsql@j-davis.com> writes:
There's no reason that there couldn't be a point <@ box operator in the
opclass, but nobody really uses these geometric types that come with
core postgres (at least, not that I can tell).Actually, as of 9.0 there is a point_ops opclass for GIST, with these
indexable operators:>^(point,point)
<<(point,point)
>>(point,point)
<^(point,point)
~=(point,point)
<@(point,box)
<@(point,polygon)
<@(point,circle)I agree that for any more than light-duty geometric work, you ought
to look at PostGIS.regards, tom lane
Thank you Jeff for your reply, that solved the problem.
Tom, would you like to elaborate on that PostGIS should be used for
other than "light-duty" geometric work?
Is it speed, accuracy or features that is the difference?
For this project I think <@(point,box) is sufficient. What would it
take to motivate a switch to PostGIS for that?Best wishes.