GiST rtree logic is not right
It occurred to me to wonder whether contrib/rtree_gist fixes the rtree
bug documented here:
http://archives.postgresql.org/pgsql-general/2004-03/msg01143.php
The answer is unfortunately "no". In the regression database,
install rtree_gist and do:
regression=# create table gist_emp4000 as select * from slow_emp4000;
SELECT
regression=# create index grect2ind ON gist_emp4000 USING gist (home_base);
CREATE INDEX
regression=# select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)';
count
-------
2144
(1 row)
The correct answer is
regression=# select count(*) from slow_emp4000 where home_base << '(35565,5404),(35546,5360)';
count
-------
2214
(1 row)
Now this is noticeably better than the rtree implementation, which finds
only 1363 rows, but broken is still broken :-(
This is doubtless not as high priority as the concurrency stuff you
are working on, but it'd be good to fix anyway. I was thinking of
proposing that we move rtree_gist into the core --- but the case for
it would be stronger if it worked correctly ...
regards, tom lane
I'll look at problem after GiST concurrency. Fixing rtree_gist is bug a fix, not
a new feature, so I'm not limited by 1 July.
This is doubtless not as high priority as the concurrency stuff you
are working on, but it'd be good to fix anyway. I was thinking of
proposing that we move rtree_gist into the core --- but the case for
it would be stronger if it worked correctly ...
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
I'll look at problem after GiST concurrency. Fixing
rtree_gist is bug a fix, not a new feature, so I'm not
limited by 1 July.
Wont fixing rtree(_gist) require initdb, since the behaviour of the
operators will change?
... John
Import Notes
Resolved by subject fallback
Hmm,
something is broken in HEAD. I recall it worked in STABLE, here is what I have
Welcome to psql 8.0.3, the PostgreSQL interactive terminal.
Type: \copyright for distribution terms
\h for help with SQL commands
\? for help with psql commands
\g or terminate with semicolon to execute query
\q to quit
regression=# create index grect2ind ON gist_emp4000 USING gist (home_base);
CREATE INDEX
regression=# select count(*) from gist_emp4000 where home_base << '(35565,5404)
,(35546,5360)';
count
-------
2214
(1 row)
btw, Teodor, I noticed gist_print from gevel package doesn't works in HEAD
regression=# select * from gist_print('grect2ind') as t(level int, a box) where
level =1;
ERROR: function return row and query-specified return row do not match
DETAIL: Returned row contains 3 attributes, but query expects 2.
Oleg
On Wed, 22 Jun 2005, Tom Lane wrote:
It occurred to me to wonder whether contrib/rtree_gist fixes the rtree
bug documented here:
http://archives.postgresql.org/pgsql-general/2004-03/msg01143.phpThe answer is unfortunately "no". In the regression database,
install rtree_gist and do:regression=# create table gist_emp4000 as select * from slow_emp4000;
SELECT
regression=# create index grect2ind ON gist_emp4000 USING gist (home_base);
CREATE INDEX
regression=# select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)';
count
-------
2144
(1 row)The correct answer is
regression=# select count(*) from slow_emp4000 where home_base << '(35565,5404),(35546,5360)';
count
-------
2214
(1 row)Now this is noticeably better than the rtree implementation, which finds
only 1363 rows, but broken is still broken :-(This is doubtless not as high priority as the concurrency stuff you
are working on, but it'd be good to fix anyway. I was thinking of
proposing that we move rtree_gist into the core --- but the case for
it would be stronger if it worked correctly ...regards, tom lane
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
Oleg Bartunov <oleg@sai.msu.su> writes:
something is broken in HEAD. I recall it worked in STABLE, here is what I have
No, actually it is broken in 8.0 too. The difference is that 8.0
defaults to not using an indexscan for this query. In 8.0 I see:
regression=# set enable_seqscan TO 1;
SET
regression=# explain select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)';
QUERY PLAN
---------------------------------------------------------------------
Aggregate (cost=65.53..65.53 rows=1 width=0)
-> Seq Scan on gist_emp4000 (cost=0.00..64.75 rows=310 width=0)
Filter: (home_base << '(35565,5404),(35546,5360)'::box)
(3 rows)
regression=# select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)';
count
-------
2214
(1 row)
regression=# set enable_seqscan TO 0;
SET
regression=# explain select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)';
QUERY PLAN
----------------------------------------------------------------------------------------
Aggregate (cost=112.96..112.96 rows=1 width=0)
-> Index Scan using grect2ind on gist_emp4000 (cost=0.00..112.18 rows=310 width=0)
Index Cond: (home_base << '(35565,5404),(35546,5360)'::box)
(3 rows)
regression=# select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)';
count
-------
2144
(1 row)
In CVS tip the default plan is:
regression=# explain select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)';
QUERY PLAN
--------------------------------------------------------------------------------
Aggregate (cost=35.74..35.74 rows=1 width=0)
-> Bitmap Heap Scan on gist_emp4000 (cost=5.08..34.96 rows=310 width=0)
Recheck Cond: (home_base << '(35565,5404),(35546,5360)'::box)
-> Bitmap Index Scan on grect2ind (cost=0.00..5.08 rows=310 width=0)
Index Cond: (home_base << '(35565,5404),(35546,5360)'::box)
(5 rows)
regression=# select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)';
count
-------
2144
(1 row)
regards, tom lane
"John Hansen" <john@geeknet.com.au> writes:
I'll look at problem after GiST concurrency. Fixing
rtree_gist is bug a fix, not a new feature, so I'm not
limited by 1 July.
Wont fixing rtree(_gist) require initdb, since the behaviour of the
operators will change?
Possibly, but we never guarantee no initdb until final release anyway.
Teodor is right that this is a bug fix and so can be postponed on its
own terms. But moving rtree_gist into the core looks like a feature
change to me, so if that's going to happen it has to happen before
1 July. It would be a lot easier to sell that if it gave the right
answers ;-)
regards, tom lane