BUG #6232: hstore operator ? no longer uses indexes
The following bug has been logged online:
Bug reference: 6232
Logged by: Pierre Ducroquet
Email address: p.psql@pinaraf.info
PostgreSQL version: 9.1.1
Operating system: Linux Debian, amd64
Description: hstore operator ? no longer uses indexes
Details:
The following code, when executed with postgresql 8.4 or 9.0, uses the
idx_toto_h index, while it is unable to do so under postgresql 9.1
-- Sample code
DROP TABLE toto;
CREATE TABLE toto (id integer, h hstore);
INSERT INTO toto SELECT cast( random() * 1000 as integer) as i,
hstore(cast(cast( random() * 1000 as integer) as
text), 'a')
FROM generate_series(1,100000);
CREATE INDEX idx_toto_h ON toto USING gist(h);
ANALYZE toto;
EXPLAIN ANALYZE SELECT * from toto where h ? '500';
-- PostgreSQL 9.0 output
QUERY PLAN
----------------------------------------------------------------------------
-------------------------------------------
Bitmap Heap Scan on toto (cost=5.05..271.70 rows=100 width=20) (actual
time=0.627..1.273 rows=89 loops=1)
Recheck Cond: (h ? '500'::text)
-> Bitmap Index Scan on idx_toto_h (cost=0.00..5.03 rows=100 width=0)
(actual time=0.554..0.554 rows=669 loops=1)
Index Cond: (h ? '500'::text)
Total runtime: 1.373 ms
(5 rows)
-- postgresql 9.1 output
QUERY PLAN
----------------------------------------------------------------------------
--------------------------
Seq Scan on toto (cost=0.00..1887.00 rows=100 width=20) (actual
time=0.433..57.834 rows=91 loops=1)
Filter: (h ? '500'::text)
Total runtime: 57.929 ms
(3 rows)
Thanks
On Wed, Sep 28, 2011 at 3:50 PM, Pierre Ducroquet <p.psql@pinaraf.info> wrote:
The following bug has been logged online:
Bug reference: 6232
Logged by: Pierre Ducroquet
Email address: p.psql@pinaraf.info
PostgreSQL version: 9.1.1
Operating system: Linux Debian, amd64
Description: hstore operator ? no longer uses indexes
Details:The following code, when executed with postgresql 8.4 or 9.0, uses the
idx_toto_h index, while it is unable to do so under postgresql 9.1-- Sample code
DROP TABLE toto;
CREATE TABLE toto (id integer, h hstore);
INSERT INTO toto SELECT cast( random() * 1000 as integer) as i,
hstore(cast(cast( random() * 1000 as integer) as
text), 'a')
FROM generate_series(1,100000);
CREATE INDEX idx_toto_h ON toto USING gist(h);
ANALYZE toto;
EXPLAIN ANALYZE SELECT * from toto where h ? '500';-- PostgreSQL 9.0 output
QUERY PLAN----------------------------------------------------------------------------
-------------------------------------------
Bitmap Heap Scan on toto (cost=5.05..271.70 rows=100 width=20) (actual
time=0.627..1.273 rows=89 loops=1)
Recheck Cond: (h ? '500'::text)
-> Bitmap Index Scan on idx_toto_h (cost=0.00..5.03 rows=100 width=0)
(actual time=0.554..0.554 rows=669 loops=1)
Index Cond: (h ? '500'::text)
Total runtime: 1.373 ms
(5 rows)-- postgresql 9.1 output
QUERY PLAN----------------------------------------------------------------------------
--------------------------
Seq Scan on toto (cost=0.00..1887.00 rows=100 width=20) (actual
time=0.433..57.834 rows=91 loops=1)
Filter: (h ? '500'::text)
Total runtime: 57.929 ms
(3 rows)
confirmed the problem (even with seq scan disable). note @> is still
working, and ? operator is still being loaded into the operator class.
hm.
merlin
"Pierre Ducroquet" <p.psql@pinaraf.info> writes:
[ the "hstore ? text" operator no longer matches an hstore GIST index ]
Hmm ... this doesn't seem to be specific to either hstore or GIST; it's
a collation problem. The index is marked as having no collation, which
is reasonable since hstore is a collation-less type. However, the ?
OpExpr gets marked as having "default" collation because it has one
collatable input, namely the text constant. And then,
match_clause_to_indexcol decides the clause doesn't match the index.
Not sure what to do about this. Is it okay to suppose that collation
can be ignored when matching to a collation-less index? If not, what's
the correct rule? I don't like the idea of concluding that hstore has
to be forcibly assigned a collation just because it has some operators
that accept text ...
regards, tom lane
On ons, 2011-09-28 at 22:42 -0400, Tom Lane wrote:
"Pierre Ducroquet" <p.psql@pinaraf.info> writes:
[ the "hstore ? text" operator no longer matches an hstore GIST index ]
Hmm ... this doesn't seem to be specific to either hstore or GIST; it's
a collation problem. The index is marked as having no collation, which
is reasonable since hstore is a collation-less type. However, the ?
OpExpr gets marked as having "default" collation because it has one
collatable input, namely the text constant. And then,
match_clause_to_indexcol decides the clause doesn't match the index.Not sure what to do about this. Is it okay to suppose that collation
can be ignored when matching to a collation-less index?
That sounds correct on first reading.
On Thu, Sep 29, 2011 at 4:22 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
Not sure what to do about this. Is it okay to suppose that collation
can be ignored when matching to a collation-less index?That sounds correct on first reading.
Doesn't this depend on the semantics of the ? operator?
Hypothetically if there was an operator like ?< which returned a list
of hstore keys that were < the argument then ?< would indeed depend on
the collation used even if hstore didn't do collations. If there was
an index type on hstore which could handle ?< then it would need to
have the right collation to be usable.
Of course we know ? doesn't depend on the collation but where is that
information? I suspect we should actually have an explicit flag for
each operator.
--
greg
Greg Stark <stark@mit.edu> writes:
On Thu, Sep 29, 2011 at 4:22 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
Not sure what to do about this. �Is it okay to suppose that collation
can be ignored when matching to a collation-less index?
That sounds correct on first reading.
Doesn't this depend on the semantics of the ? operator?
Well, yeah. But if the operator requires a particular collation, what's
it doing in an operator class for a collation-less indexed datatype?
I think the operator could actually depend on its input collation with
respect to some part of the processing it does on its collatable
right-hand input. But if the left-hand column (the indexed column) is
of a non-collatable type, it's hard to see how the operator could depend
on the index having a collation.
Also, I've now tested a patch along these lines and it passes core and
contrib regression tests, so there's not anything too obviously broken
about the idea.
regards, tom lane