index refuses to build

Started by Jean-Yves F. Barbierabout 14 years ago8 messages
#1Jean-Yves F. Barbier
12ukwn@gmail.com

Hi list,

I'm wrong somewhere, but where?:

CREATE INDEX tst1m_name_lu_ix ON tst1m(lower(unaccent(name)));
ERROR: functions in index expression must be marked IMMUTABLE

Decomposing it reveals that it is the 'unaccent' part that
blocks:

CREATE INDEX tst1m_name_lu_ix ON tst1m(unaccent(name));
ERROR: functions in index expression must be marked IMMUTABLE

From what I found on the web I also tried to cast it to text,
but the error's still here:(

JY
--
One of the signs of Napoleon's greatness is the fact that he once
had a publisher shot.
-- Siegfried Unseld

#2Merlin Moncure
mmoncure@gmail.com
In reply to: Jean-Yves F. Barbier (#1)
Re: index refuses to build

On Thu, Dec 29, 2011 at 5:10 PM, Jean-Yves F. Barbier <12ukwn@gmail.com> wrote:

Hi list,

I'm wrong somewhere, but where?:

CREATE INDEX tst1m_name_lu_ix ON tst1m(lower(unaccent(name)));
ERROR:  functions in index expression must be marked IMMUTABLE

Decomposing it reveals that it is the 'unaccent' part that
blocks:

CREATE INDEX tst1m_name_lu_ix ON tst1m(unaccent(name));
ERROR:  functions in index expression must be marked IMMUTABLE

From what I found on the web I also tried to cast it to text,
but the error's still here:(

your problem is the unaccent function. it's defined stable because
the rules function it depends on can change after the index is built
-- that would effectively introduce index corruption. it's possible
to bypass that restriction, but are you sure that's what you want to
do?

merlin

#3Jean-Yves F. Barbier
12ukwn@gmail.com
In reply to: Merlin Moncure (#2)
Re: index refuses to build

On Thu, 29 Dec 2011 17:16:22 -0600
Merlin Moncure <mmoncure@gmail.com> wrote:

Woops, sorry: reposting on the ML.

Decomposing it reveals that it is the 'unaccent' part that
blocks:

CREATE INDEX tst1m_name_lu_ix ON tst1m(unaccent(name));
ERROR:  functions in index expression must be marked IMMUTABLE

From what I found on the web I also tried to cast it to text,
but the error's still here:(

your problem is the unaccent function. it's defined stable because
the rules function it depends on can change after the index is built
-- that would effectively introduce index corruption. it's possible
to bypass that restriction, but are you sure that's what you want to
do?

Well, I don't know how to achieve what I want another way.

Overriding this restriction can effectively become a concern
as I use unaccent.rules (modified for fr and de) and I can't be
absolutely sure it won't evolve since the DB is to be used by
(mostly) CE people - so, if I don't have all modifications for
v1.0, there's a risk.

My goal is to have a case insensitive + unaccented index.

JY
--
As they say about Dungeons and Dragons, "Life's a die, and then you
bitch."

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Merlin Moncure (#2)
Re: [NOVICE] index refuses to build

Merlin Moncure <mmoncure@gmail.com> writes:

On Thu, Dec 29, 2011 at 5:10 PM, Jean-Yves F. Barbier <12ukwn@gmail.com> wrote:

CREATE INDEX tst1m_name_lu_ix ON tst1m(unaccent(name));
ERROR: functions in index expression must be marked IMMUTABLE

your problem is the unaccent function. it's defined stable because
the rules function it depends on can change after the index is built
-- that would effectively introduce index corruption. it's possible
to bypass that restriction, but are you sure that's what you want to
do?

Hmm ... it's clear why unaccent(text) is only stable, because it depends
on the current search_path to find the "unaccent" dictionary. But I
wonder whether it was an oversight that unaccent(regdictionary, text)
is stable and not immutable. We don't normally mark functions as stable
just because you could in principle change their behavior by altering
some outside-the-database configuration files.

regards, tom lane

#5Jean-Yves F. Barbier
12ukwn@gmail.com
In reply to: Jean-Yves F. Barbier (#1)
Re: index refuses to build [finally SOLVED, but still some questions]

On Fri, 30 Dec 2011 00:10:01 +0100
"Jean-Yves F. Barbier" <12ukwn@gmail.com> wrote:

Sooo, for those who are interested, this is how I did it:

I build my own function as:

CREATE FUNCTION erpunaccent(text) RETURNS text AS $$
SELECT unaccent($1);
$$ LANGUAGE sql IMMUTABLE;

indexed my test table with:

CREATE INDEX tst1m_name_lu_ix ON tst1m(lower(erpunaccent(name));

picked a known row:

SELECT * FROM tst1m WHERE id=33;
id | name | note
----+-------------------+-------------------------------------------
33 | oGvvÀtÖiÉsWMtWqma | CËÁANfKaáMàÑaiLd TtBGsCpwÉCKJrFëöTyPiTmèU
(1 ligne)

and launched the query:

SELECT * FROM tst1m WHERE lower(erpunaccent(name)) = 'ogvvatoieswmtwqma';
id | name | note
----+-------------------+-------------------------------------------
33 | oGvvÀtÖiÉsWMtWqma | CËÁANfKaáMàÑaiLd TtBGsCpwÉCKJrFëöTyPiTmèU
(1 ligne)

However, you must be very careful to use your function and not the
original one, otherwise the index' not used (it feels weird, as the
result's exactly the same! Tooom, whhhyyy??).

My function plan:

EXPLAIN ANALYZE SELECT * FROM tst1m WHERE lower(erpunaccent(name)) = 'ogvvatoieswmtwqma';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on tst1m (cost=16.40..1210.37 rows=500 width=100) (actual time=0.093..0.094 rows=1 loops=1)
Recheck Cond: (lower(erpunaccent((name)::text)) = 'ogvvatoieswmtwqma'::text)
-> Bitmap Index Scan on tst1m_name_lu_key (cost=0.00..16.27 rows=500 width=0) (actual time=0.074..0.074 rows=1 loops=1)
Index Cond: (lower(erpunaccent((name)::text)) = 'ogvvatoieswmtwqma'::text)
Total runtime: 0.177 ms
(5 lignes)

Original function plan:

EXPLAIN ANALYZE SELECT * FROM tst1m WHERE lower(unaccent(name)) = 'ogvvatoieswmtwqma';
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Seq Scan on tst1m (cost=0.00..3367.02 rows=500 width=100) (actual time=0.466..1162.568 rows=1 loops=1)
Filter: (lower(unaccent((name)::text)) = 'ogvvatoieswmtwqma'::text)
Total runtime: 1162.656 ms
(3 lignes)

There's a drawback though: a research with LIKE is much slower on this
index than on the regular index despite the fact it uses the new
index and I can't understand why (!??)
Except if the index doesn't contain data but is recalculated on the fly?

LIKE research using new index:

EXPLAIN ANALYZE SELECT * FROM tst1m WHERE lower(erpunaccent(name)) LIKE 'ogvvatoies%';
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Seq Scan on tst1m (cost=0.00..28117.27 rows=500 width=100) (actual time=1.284..3569.742 rows=1 loops=1)
Filter: (lower(erpunaccent((name)::text)) ~~ 'ogvvatoies%'::text)
Total runtime: 3569.815 ms
(3 lignes)

LIKE research using normal index:

EXPLAIN ANALYZE SELECT * FROM tst1m WHERE name LIKE 'oGvvÀtÖiÉ%';
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Seq Scan on tst1m (cost=0.00..2867.01 rows=10 width=100) (actual time=0.071..140.336 rows=1 loops=1)
Filter: ((name)::text ~~ 'oGvvÀtÖiÉ%'::text)
Total runtime: 140.418 ms
(3 lignes)

If some are interested in file unaccent.rules I can post it here (*nix UTF-8)

JY
--
Sometimes you get an almost irresistible urge to go on living.

#6Jean-Yves F. Barbier
12ukwn@gmail.com
In reply to: Jean-Yves F. Barbier (#5)
Re: index refuses to build [DEFINITELY SOLVED :-]

On Fri, 30 Dec 2011 07:34:28 +0100
"Jean-Yves F. Barbier" <12ukwn@gmail.com> wrote:

Ok, I found the answer on postgresql.fr forum and here the mod'op:

* Delete my old function that used TEXT for I/O,
* Recreate it using VARCHAR for I/O,
(not mandatory, as explain talks about ::text and test show the
same results w/ either TEXT or VARCHAR),
* Delete the index,
* Recreate it with some specialization salt:

CREATE INDEX tst1m_name_lu_key ON tst1m(lower(jyunaccent(name)) varchar_pattern_ops);

* Retest:

EXPLAIN ANALYZE SELECT * FROM tst1m WHERE lower(erpunaccent(name)) LIKE 'ogvvatoie%';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on tst1m (cost=17.90..1211.87 rows=500 width=100) (actual time=0.254..0.256 rows=1 loops=1)
Filter: (lower((erpunaccent(name))::text) ~~ 'ogvvatoie%'::text)
-> Bitmap Index Scan on tst1m_name_lu_key (cost=0.00..17.78 rows=500 width=0) (actual time=0.054..0.054 rows=1 loops=1)
Index Cond: ((lower((erpunaccent(name))::text) ~>=~ 'ogvvatoie'::text) AND (lower((erpunaccent(name))::text) ~<~ 'ogvvatoif'::text))
Total runtime: 0.338 ms
(5 lignes)

Which isn't bad on a table w/10,000 rows and a column randomly filled length [14-32].

Hehe.

--
I don't think it's worth washing hogs over.
-- Larry Wall in <199710060253.TAA09723@wall.org>

#7Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#4)
Re: [NOVICE] index refuses to build

On Thu, Dec 29, 2011 at 10:40:19PM -0500, Tom Lane wrote:

Merlin Moncure <mmoncure@gmail.com> writes:

On Thu, Dec 29, 2011 at 5:10 PM, Jean-Yves F. Barbier <12ukwn@gmail.com> wrote:

CREATE INDEX tst1m_name_lu_ix ON tst1m(unaccent(name));
ERROR: functions in index expression must be marked IMMUTABLE

your problem is the unaccent function. it's defined stable because
the rules function it depends on can change after the index is built
-- that would effectively introduce index corruption. it's possible
to bypass that restriction, but are you sure that's what you want to
do?

Hmm ... it's clear why unaccent(text) is only stable, because it depends
on the current search_path to find the "unaccent" dictionary. But I
wonder whether it was an oversight that unaccent(regdictionary, text)
is stable and not immutable. We don't normally mark functions as stable
just because you could in principle change their behavior by altering
some outside-the-database configuration files.

Should we change the function signature for unaccent(regdictionary,
text)?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#8Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#7)
Re: [HACKERS] index refuses to build

On Sun, Aug 26, 2012 at 09:47:01AM -0400, Bruce Momjian wrote:

On Thu, Dec 29, 2011 at 10:40:19PM -0500, Tom Lane wrote:

Merlin Moncure <mmoncure@gmail.com> writes:

On Thu, Dec 29, 2011 at 5:10 PM, Jean-Yves F. Barbier <12ukwn@gmail.com> wrote:

CREATE INDEX tst1m_name_lu_ix ON tst1m(unaccent(name));
ERROR: functions in index expression must be marked IMMUTABLE

your problem is the unaccent function. it's defined stable because
the rules function it depends on can change after the index is built
-- that would effectively introduce index corruption. it's possible
to bypass that restriction, but are you sure that's what you want to
do?

Hmm ... it's clear why unaccent(text) is only stable, because it depends
on the current search_path to find the "unaccent" dictionary. But I
wonder whether it was an oversight that unaccent(regdictionary, text)
is stable and not immutable. We don't normally mark functions as stable
just because you could in principle change their behavior by altering
some outside-the-database configuration files.

Should we change the function signature for unaccent(regdictionary,
text)?

Did we decide not to do this?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

--
Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice