Full text search - How to build a filtering dictionary

Started by Antonio Franzoso, about 14 years ago. 3 messages. List: general
#1 Antonio Franzoso
antoniofranzoso@yahoo.it

Hi all,
I need to build a synonym dictionary that normalizes tokens the way a
filtering dictionary does. I've searched for a filtering dictionary
template but I haven't found one. Where can I find it? Or, if there is
no such template, how can I build a simple filtering dictionary that
just maps one term to another (in a synonym-dictionary-like way)?

Thanks in advance,
Antonio

#2 Oleg Bartunov
oleg@sai.msu.su
In reply to: Antonio Franzoso (#1)
Re: Full text search - How to build a filtering dictionary

Antonio,

you can look at the contrib/unaccent dictionary, which is a filtering
dictionary. I have a page about it: http://mira.sai.msu.su/~megera/wiki/unaccent

Oleg
(In reply to Antonio Franzoso's message #1 of Wed, 18 Jan 2012.)

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

#3 Antonio Franzoso
antoniofranzoso@yahoo.it
In reply to: Oleg Bartunov (#2)
Re: Full text search - How to build a filtering dictionary

Thanks for the reply.
Is there a simpler way? I only need a simple map (similar to a synonym
dictionary): set the TSL_FILTER flag when there is a mapping for a
token, and then pass the normalized token on to my own thesaurus
dictionary. I'm working on Windows and I have to write a C library to
do this (and I can't read the unaccent code because I only have it as a
DLL file).
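
For illustration, the mapping step described above might look roughly
like the following in C. This is only a standalone sketch, not PostgreSQL
extension code: the Lexeme struct and the TSL_FILTER value are local
stand-ins modelled on TSLexeme and TSL_FILTER from tsearch/ts_public.h,
and map_entries and filter_lexize are hypothetical names with made-up
sample data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for the flag of the same name in tsearch/ts_public.h. */
#define TSL_FILTER 0x0004

/* Local stand-in modelled on PostgreSQL's TSLexeme. */
typedef struct {
    uint16_t    nvariant;
    uint16_t    flags;
    const char *lexeme;
} Lexeme;

/* Hypothetical mapping table: unnormalized token -> normalized token. */
typedef struct {
    const char *from;
    const char *to;
} MapEntry;

static const MapEntry map_entries[] = {
    {"colour", "color"},
    {"grey",   "gray"},
    {"pgsql",  "postgresql"},
};

/*
 * Look up `token` in the map.
 * Returns 1 and fills `out` with the replacement, flagged TSL_FILTER,
 * when a mapping exists. Returns 0 and leaves `out` untouched otherwise,
 * so the caller can hand the original token to the next dictionary.
 */
int filter_lexize(const char *token, Lexeme *out)
{
    assert(token != NULL && out != NULL);

    size_t n = sizeof(map_entries) / sizeof(map_entries[0]);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(token, map_entries[i].from) == 0) {
            out->nvariant = 0;
            out->flags = TSL_FILTER;   /* mark as "replace token, keep going" */
            out->lexeme = map_entries[i].to;
            return 1;
        }
    }
    return 0;
}
```

In a real dictionary the lexize function would instead return a
palloc'd TSLexeme array terminated by an entry whose lexeme is NULL, or
NULL when the token is not recognized. The source of the unaccent
dictionary (contrib/unaccent/unaccent.c in the PostgreSQL source tree)
shows the full extension boilerplate.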
If there is no other solution, I thought I could integrate this
filtering step into the thesaurus, along these lines:

token: lemma, term1, term2,....

where token is the unnormalized term, lemma is an entry of the
thesaurus, and term1, term2,... are the terms associated with lemma in
the original thesaurus structure. What do you think of this solution?

(In reply to Oleg Bartunov's message #2 of 18/01/2012 17:40.)