Another Ltree/GiST problem
I've a table that stores hierarchical information in a ltree field for fast access, indexed with a GiST index.
Interesting Problem with ltree/GiST index: sometimes, the index will get corrupt, as seen by the examples below:
WRONG
db=# SELECT path from strukturelemente where path ~ '142.2330445.2330526.5358672.5358675.5358752.*';
path
------
(0 rows)
RIGHT
db=# SELECT path from strukturelemente where path ~ '*.2330445.2330526.5358672.5358675.5358752.*';
path
---------------------------------------------
142.2330445.2330526.5358672.5358675.5358752
(1 row)
db=# SELECT path from strukturelemente where path ~ '*.5358752.*'; path
---------------------------------------------
142.2330445.2330526.5358672.5358675.5358752
(1 row)
db=# EXPLAIN ANALYZE SELECT path from strukturelemente where path ~ '*.5358752.*';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on strukturelemente (cost=2.05..27.99 rows=14 width=45) (actual time=4.617..4.618 rows=1 loops=1)
Recheck Cond: (path ~ '*.5358752.*'::lquery)
-> Bitmap Index Scan on str_uk4 (cost=0.00..2.05 rows=14 width=0) (actual time=4.604..4.604 rows=1 loops=1)
Index Cond: (path ~ '*.5358752.*'::lquery)
Total runtime: 4.690 ms
(5 rows)
db=# EXPLAIN ANALYZE SELECT path from strukturelemente where path ~ '142.2330445.2330526.5358672.5358675.5358752.*';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on strukturelemente (cost=2.05..27.99 rows=14 width=45) (actual time=0.030..0.030 rows=0 loops=1)
Recheck Cond: (path ~ '142.2330445.2330526.5358672.5358675.5358752.*'::lquery)
-> Bitmap Index Scan on str_uk4 (cost=0.00..2.05 rows=14 width=0) (actual time=0.027..0.027 rows=0 loops=1)
Index Cond: (path ~ '142.2330445.2330526.5358672.5358675.5358752.*'::lquery)
Total runtime: 0.081 ms
(5 rows)
db=# SELECT path from strukturelemente where path ~ '142.2330445.2330526.5358672.5358675.*.5358752.*';
path
------
(0 rows)
When doing sequential scans (set enable_indexscan to off and set enable_bitmapscan to off) everything works as expected.
db=# SELECT path from strukturelemente where path ~ '142.2330445.2330526.5358672.*.5358675.5358752.*';
path
---------------------------------------------
142.2330445.2330526.5358672.5358675.5358752
(1 row)
After an update (and turning index access on), everything will work fine again:
UPDATE strukturelemente set id=id where id=5358752;
I get this:
db=# SELECT path from strukturelemente where path ~ '142.2330445.2330526.5358672.5358675.5358752.*';
path
---------------------------------------------
142.2330445.2330526.5358672.5358675.5358752
(1 row)
db=# SELECT path from strukturelemente where path ~ '142.2330445.2330526.5358672.5358675.*';
path
---------------------------------------------
142.2330445.2330526.5358672.5358675.5358752
(1 row)
Reindexing the index also fixes the problem, but regular reindexing cannot be a solution, since there is still a timeframe where wrong results are returned.
I've made a copy of the index-file when it was broken, after the update and after reindexing. Is there a tool for getting a human-readable dump of the index?
Using PostgreSQL 8.1.4 with ltree version from CVS (since there's another bug fixed in CVS). Problem occurs on a 2 node cluster of 2 Quad-CPU system (Xeon, 2 physical CPU's, 2 hyperthreading), one node running the database, the other one the webserver.
Any ideas?
Best regards,
Mario Weilguni
Reindexing the index also fixes the problem, but regular reindexing cannot be a solution, since there is still a timeframe where wrong results are returned.
I've made a copy of the index-file when it was broken, after the update and after reindexing. Is there a tool for getting a human-readable dump of the index?
You can play with gevel module
http://www.sai.msu.su/~megera/postgres/gist/
http://www.sai.msu.su/~megera/postgres/gist/gevel/README.gevel
Using PostgreSQL 8.1.4 with ltree version from CVS (since there's another bug fixed in CVS). Problem occurs on a 2 node cluster of 2 Quad-CPU system (Xeon, 2 physical CPU's, 2 hyperthreading), one node running the database, the other one the webserver.
Any ideas?
Oops, wait a bit: problem is founded in gist_between() call.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Oops, wait a bit: problem is founded in gist_between() call.
Fixed in 7.3 - HEAD versions. Thank you for feedback: those bugs can be found
only on high-loaded boxes.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/