LIKE vs regex queries

Started by Dan Grahamalmost 22 years ago2 messages

graham@molbio.uoregon.edu

almost 22 years ago

I have a database with about 250,000 entries in a table, PG 7.5, One
of the fields is text. LIKE queries on this field execute much faster
than the equivalent regex queries.

Is this what you would expect? Should I prefer LIKE to regex?
(I'm a regex fan, but the performance hit seems steep.)

I've pasted sample output in below.

================================================================
order=# explain select dnum from item where description LIKE '%Ushio%';
QUERY PLAN
----------------------------------------------------------
Seq Scan on item (cost=0.00..7330.40 rows=349 width=16)
Filter: (description ~~ '%Ushio%'::text)
(2 rows)

order=# select dnum from item where description LIKE '%Ushio%';
dnum
--------------
B521479
MB105921
MB109239
MB110491
MB111390
MB111983
MB112854
MB115020
MB115020
MB120247
MB121532
(11 rows)

Time: 855.540 ms

==================================================================

order=# explain select dnum from item where description ~ 'Ushio';
QUERY PLAN
----------------------------------------------------------
Seq Scan on item (cost=0.00..7330.40 rows=349 width=16)
Filter: (description ~ 'Ushio'::text)
(
order=# select dnum from item where description ~ 'Ushio';
dnum
--------------
B521479
MB105921
MB109239
MB110491
MB111390
MB111983
MB112854
MB115020
MB115020
MB120247
MB121532
(11 rows)

Time: 2409.043 ms
=========================================================================

Daniel Graham
graham@molbio.uoregon.edu

Fabien COELHO

coelho@cri.ensmp.fr

almost 22 years ago

In reply to: Dan Graham (#1)

Re: [PATCHES] LIKE vs regex queries

I have a database with about 250,000 entries in a table, PG 7.5, One
of the fields is text. LIKE queries on this field execute much faster
than the equivalent regex queries.

Is this what you would expect? Should I prefer LIKE to regex?
(I'm a regex fan, but the performance hit seems steep.)

I've pasted sample output in below.

order=# select dnum from item where description LIKE '%Ushio%';
Time: 855.540 ms

order=# select dnum from item where description ~ 'Ushio';
Time: 2409.043 ms

From the source code, the computed automaton for ~ seems to be cached, so
this is not an explanation. However the RE engine is quite generic.

The code for LIKE is quite hardwired to the special case, as there is only
_ and % to be taken care of.

So it seems that if a LIKE regular expression is enough, then it is a much
better choice. SIMILAR TO is translated to ~, so it should be slow.

--
Fabien Coelho - coelho@cri.ensmp.fr