phonetic and/or synonym search

Started by Frank Joerdens over 24 years ago | 3 messages | general
#1 Frank Joerdens
frank@joerdens.de

Does anyone know how to do a phonetic and/or synonym search (this would
be for the German language mostly)? What's the approach in theory?

Cheers, Frank

#2 Stuart Bishop
zen@shangri-la.dropbear.id.au
In reply to: Frank Joerdens (#1)
Re: phonetic and/or synonym search

On Sunday, October 7, 2001, at 01:47 AM, Frank Joerdens wrote:

> Does anyone know how to do a phonetic and/or synonym search (this would
> be for the German language mostly)? What's the approach in theory?

The first thing you need to do is track down an algorithm that converts
a word into a code representing how the word sounds. This is
language-specific, and all the ones I know of are English-specific or
surname-specific. These might do the job well enough in German, but you
would need to test. Examples of the algorithms you are after are soundex
(an implementation exists in the contrib directory of your PostgreSQL
source) and NYSIIS (I've got a C implementation for PostgreSQL if there
is interest). Once you have reduced each word to its code, you can test
whether two words sound alike by simply comparing the codes
(nysiis('katie') = nysiis('city')).
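
To make the "reduce to a code, then compare" idea concrete, here is a
minimal sketch of the common American Soundex rules in Python. It is
not the contrib implementation and its output may differ from it in
edge cases; the function name and the sample names are only
illustrations. Letters outside A-Z (umlauts, for instance) are simply
ignored here, which is one reason you would need to test it on German
data.

```python
# Minimal American Soundex sketch (illustrative, not the contrib code).
GROUPS = ("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R")
CODES = {ch: str(digit)
         for digit, group in enumerate(GROUPS, start=1)
         for ch in group}


def soundex(word: str) -> str:
    """Return a 4-character code: first letter plus up to three digits."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = [letters[0]]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            result.append(code)
        prev = code  # a vowel resets prev, so a repeated code counts again
    return "".join(result)[:4].ljust(4, "0")


def sounds_alike(a: str, b: str) -> bool:
    """Two words 'sound alike' when their codes are equal."""
    return soundex(a) == soundex(b)


print(soundex("Robert"), soundex("Rupert"))   # R163 R163
print(sounds_alike("Meier", "Mayer"))         # True
print(sounds_alike("Schmidt", "Schmitt"))     # True
print(sounds_alike("Meier", "Schmidt"))       # False
```

The comparison itself is trivial; all of the language-specific
knowledge lives in the coding function, so swapping in another
algorithm only means replacing soundex().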

One way to use this, if your value contains only one word, is simply to
create the relevant functional index on your table:

create index idx_blah on people (nysiis(surname))

You can then just use "nysiis('smith') = nysiis(surname)" in the WHERE
clause of your query.

If you have multiple words in your value, you need triggers to split the
phrases into words and store them in another table. You then use this
table to perform your phonetic searches.
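
A rough sketch of the word-table idea, using Python with an in-memory
SQLite database as a stand-in for PostgreSQL. Everything here is an
assumption for illustration: the table names (documents,
document_words), the helper functions, and the compact Soundex-style
coder. The splitting is done in application code where the approach
above would use a trigger.

```python
import re
import sqlite3

# Compact Soundex-style coder (illustrative; any phonetic function would do).
_GROUPS = ("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R")
_CODES = {ch: str(d) for d, grp in enumerate(_GROUPS, start=1) for ch in grp}


def phonetic_code(word):
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    out, prev = [letters[0]], _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue
        code = _CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        prev = code
    return "".join(out)[:4].ljust(4, "0")


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT);
    -- one row per distinct phonetic code occurring in a title
    CREATE TABLE document_words (document_id INTEGER, code TEXT);
    CREATE INDEX idx_document_words_code ON document_words (code);
""")


def add_document(title):
    """Insert a title and store the phonetic code of each of its words."""
    doc_id = conn.execute(
        "INSERT INTO documents (title) VALUES (?)", (title,)).lastrowid
    # This sketch only splits on ASCII letters.
    codes = {phonetic_code(w) for w in re.findall(r"[A-Za-z]+", title)}
    conn.executemany("INSERT INTO document_words VALUES (?, ?)",
                     [(doc_id, c) for c in sorted(codes)])
    return doc_id


def search(word):
    """Return titles containing a word that sounds like the given word."""
    rows = conn.execute(
        "SELECT DISTINCT d.title FROM documents d "
        "JOIN document_words w ON w.document_id = d.id "
        "WHERE w.code = ? ORDER BY d.id", (phonetic_code(word),))
    return [r[0] for r in rows]


add_document("Meier und Soehne Backwaren")
add_document("Autohaus Schmidt")
add_document("Hotel Adler")

print(search("Mayer"))    # ['Meier und Soehne Backwaren']
print(search("Schmitt"))  # ['Autohaus Schmidt']
```

The search only ever compares precomputed codes, so the index on the
code column does the work that the functional index does in the
single-word case.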

--
Stuart Bishop <zen@shangri-la.dropbear.id.au>

#3 Bruce Momjian
bruce@momjian.us
In reply to: Stuart Bishop (#2)
Re: phonetic and/or synonym search

We have /contrib/soundex and other items to help here. 7.2 will have
even more of them.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026