Regular Expression for 'and' instead of 'or'

Started by Samuel J. Sutjionoabout 24 years ago10 messagesgeneral
Jump to latest
#1Samuel J. Sutjiono
ssutjiono@wc-group.com

This expression matches the word socks or shoes or nike in product category
where productdescr ~* '(socks|shoes|nike)'

Does anybody know what the expression should be if I want to do 'and' of those key words instead of 'or' ?

Thanks,
Sam

#2Fernando Schapachnik
fschapachnik@vianetworks.com.ar
In reply to: Samuel J. Sutjiono (#1)
Re: Regular Expression for 'and' instead of 'or'

En un mensaje anterior, Samuel J. Sutjiono escribi�:

This expression matches the word socks or shoes or nike in product category
where productdescr ~* '(socks|shoes|nike)'

Does anybody know what the expression should be if I want to do 'and' of those key words instead of 'or' ?

You have to use something like:

'(socks.*shoes.*nike)|(socks.*nike.*shoes)|...'

where ... is every other possible combination. Not pretty, but that's
a limitation of finite automata (or regular expressions, which are
the same).

Regards.

Fernando P. Schapachnik
Gerente de tecnolog�a de red
y sistemas de informaci�n
VIA NET.WORKS ARGENTINA S.A.
fschapachnik@vianetworks.com.ar
Tel.: (54-11) 4323-3381

#3Samuel J. Sutjiono
ssutjiono@wc-group.com
In reply to: Samuel J. Sutjiono (#1)
Re: [SQL] Regular Expression for 'and' instead of 'or'

Thank you very much Fernando. Do you know whether this expression (regex)
use index scan ?

Regards,
Sam
----- Original Message -----
From: "Fernando Schapachnik" <fschapachnik@vianetworks.com.ar>
To: "Samuel J. Sutjiono" <ssutjiono@wc-group.com>
Cc: <pgsql-sql@postgresql.org>; <pgsql-general@postgresql.org>
Sent: Friday, February 22, 2002 1:15 PM
Subject: Re: [SQL] [GENERAL] Regular Expression for 'and' instead of 'or'

En un mensaje anterior, Samuel J. Sutjiono escribi�:

This expression matches the word socks or shoes or nike in product

category

where productdescr ~* '(socks|shoes|nike)'

Does anybody know what the expression should be if I want to do 'and'

of those key words instead of 'or' ?

Show quoted text

You have to use something like:

'(socks.*shoes.*nike)|(socks.*nike.*shoes)|...'

where ... is every other possible combination. Not pretty, but that's
a limitation of finite automata (or regular expressions, which are
the same).

Regards.

Fernando P. Schapachnik
Gerente de tecnolog�a de red
y sistemas de informaci�n
VIA NET.WORKS ARGENTINA S.A.
fschapachnik@vianetworks.com.ar
Tel.: (54-11) 4323-3381

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

#4Fernando Schapachnik
fschapachnik@vianetworks.com.ar
In reply to: Samuel J. Sutjiono (#3)
Re: [SQL] Regular Expression for 'and' instead of 'or'

En un mensaje anterior, Samuel J. Sutjiono escribi�:

Thank you very much Fernando. Do you know whether this expression (regex)
use index scan ?

Not sure, but probably an EXPLAIN will tell.

Good luck.

Fernando P. Schapachnik
Gerente de tecnolog�a de red
y sistemas de informaci�n
VIA NET.WORKS ARGENTINA S.A.
fschapachnik@vianetworks.com.ar
Tel.: (54-11) 4323-3381

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Fernando Schapachnik (#2)
Re: Regular Expression for 'and' instead of 'or'

Fernando Schapachnik <fschapachnik@vianetworks.com.ar> writes:

You have to use something like:

'(socks.*shoes.*nike)|(socks.*nike.*shoes)|...'

where ... is every other possible combination. Not pretty, but that's
a limitation of finite automata (or regular expressions, which are
the same).

Seems a lot easier to do

WHERE field ~* 'socks' AND field ~* 'shoes' AND field ~* 'nike'

regards, tom lane

#6Peter Eisentraut
peter_e@gmx.net
In reply to: Samuel J. Sutjiono (#1)
Re: [SQL] Regular Expression for 'and' instead of 'or'

Samuel J. Sutjiono writes:

This expression matches the word socks or shoes or nike in product category
where productdescr ~* '(socks|shoes|nike)'

Does anybody know what the expression should be if I want to do 'and'
of those key words instead of 'or' ?

You probably want something like

productdescr ~* 'socks' AND productdescr ~* 'shoes' AND productdescr ~* 'nike'

I don't think it's possible to do this for the general case with just a
regular expression.

--
Peter Eisentraut peter_e@gmx.net

#7Medi Montaseri
medi@cybershell.com
In reply to: Samuel J. Sutjiono (#1)
Re: Regular Expression for 'and' instead of 'or'

Based on generic rules of Regular Expression, the default is a AND.
For example

RE-1RE-2RE-3 is really (RE-1) and (RE-2) and (RE-3). However Regular
Expression
also provides the "Alternation Metachar" to change it from a default AND
to a OR.

The problem however is if you have target-1 junk target-2 junk junk
target-3
Then the simple (target-1target-2target-3) would not work, as you are
implying
they are following each other. So the fix could be

productdescr ~* '(socks)' AND productdescr ~* '(shoes)' AND productdescr
~*(nike)'

Effectively you are creating a composite expression consisting of three
sub-expression.

"Samuel J. Sutjiono" wrote:

This expression matches the word socks or shoes or nike in product
categorywhere productdescr ~* '(socks|shoes|nike)' Does anybody know
what the expression should be if I want to do 'and' of those key
words instead of 'or' ? Thanks,Sam

--
-------------------------------------------------------------------------
Medi Montaseri medi@CyberShell.com
Unix Distributed Systems Engineer HTTP://www.CyberShell.com
CyberShell Engineering
-------------------------------------------------------------------------

#8Rajesh Kumar Mallah
mallah@trade-india.com
In reply to: Samuel J. Sutjiono (#1)
Re: Regular Expression for 'and' instead of 'or'

How about using the contrib/tsearch module
that uses gist indexes.

moreover it has builtin support for word stemming,morphology
and stuff. (to my knowledge)

you would ofcourse require to upgrade to PGSQL7.2

excrepts from README.tsearch contrib module.
================
and now you can search all titles with words 'patch' and 'gist':
select title from titles where titleidx ## 'patch&gist';

Here, ## is a new operation defined for type 'txtidx' which could use
index
(if exists) built on titleidx. This operator uses morphology to
expand query, i.e.
## 'patches&gist' will find titles with 'patch' and 'gist' also.
If you want to provide query as is, use operator @@ instead:
select title from titles where titleidx @@ 'patch&gist';

"Samuel J. Sutjiono" wrote:

Show quoted text

This expression matches the word socks or shoes or nike in product
categorywhere productdescr ~* '(socks|shoes|nike)' Does anybody know
what the expression should be if I want to do 'and' of those key
words instead of 'or' ? Thanks,Sam

#9Brian Knox
laotse@aol.net
In reply to: Tom Lane (#5)
Strange Postgresql Indexing Behavior

I'm having an interesting time trying to figure out some behavior with
postgresql indexes that I am trying to understand.

I have a table, historyticket. In that table, I have a column,
fk_opener_id, which is an integer column. I have an index (default btree
index) on the fk_opener_id column in that table.

When I select from this table with the following query:

select * from historyticket where fk_opener_id = ?

The query sometimes uses the index, and sometimes does a sequential scan.
I experimented for a little bit and found out that if the number of rows
that match the query is greater than a certain number (somewhere around
1000 rows from what I can tell) then the index is not used.

For example:

=============
testing=# explain select * from historyticket where fk_opener_id = 67;
NOTICE: QUERY PLAN:

Seq Scan on historyticket (cost=0.00..768.62 rows=1246 width=419)

testing=# select count(*) from historyticket where fk_opener_id = 67;
count
-------
1158
(1 row)
---------
testing=# explain select * from historyticket where fk_opener_id = 4;
NOTICE: QUERY PLAN:

Index Scan using fk_opener_id on historyticket (cost=0.00..179.47 rows=47
width=419)

testing=# select count(*) from historyticket where fk_opener_id = 4;
count
-------
79
(1 row)
===============

I did more queries and confirmed that when the number of rows returned is
below a certain number (I don't have enough data to determine the exact
number) the index is used, and when it is above a certain number, it is
not used.

Can anyone explain to me what is happening / why it is happening / how to
make the indexes work correctly?

Thanks.

Brian

#10Martijn van Oosterhout
kleptog@svana.org
In reply to: Brian Knox (#9)
Re: Strange Postgresql Indexing Behavior

On Wed, Mar 13, 2002 at 04:28:00PM -0500, Brian Knox wrote:

I'm having an interesting time trying to figure out some behavior with
postgresql indexes that I am trying to understand.

[snip]

I did more queries and confirmed that when the number of rows returned is
below a certain number (I don't have enough data to determine the exact
number) the index is used, and when it is above a certain number, it is
not used.

Can anyone explain to me what is happening / why it is happening / how to
make the indexes work correctly?

Well, checking a tuple from an index is more expensive than checking a tuple
from a sequential scan. So, if you want to select 50% of the table, it's
faster to read the whole table than it is to use the index.

The planner tries to guess where the break-even point is. Above, seq scan,
below index scan.

This is a FAQ, IIRC.

HTH,
--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/

Show quoted text

11/9/2001 - a new beginning or the beginning of the end?