Weird problem concerning tsearch functions built into postgres 8.3, assistance requested

Started by Andrew Edsonover 17 years ago2 messagesgeneral
Jump to latest
#1Andrew Edson
cheighlund@yahoo.com

I've been tasked with maintaining a set of postgres databases created by my predecessor in this position.  The original databases several years back were version 8.1.3, and used the tsearch2 functions to enable some client-program searches.

We've recently begun preparing to shift to 8.3 (I believe the current starter box we're putting together for cloning is running 8.3.1), and I've been having a bit of trouble with the built-in searching.  I think I've got it mostly fixed after following a few leads online, but I've run across one last little bit of problem that I can't figure out how to get around, and that doesn't make much sense to me.  Admittedly, I don't know very much about tsearch.

One of the tables we're using in the 8.1.3 setups currently running includes phone numbers as a searchable field (fti_phone), with the results of a select on the field generally looking like this: 'MMM':2 'NNNN':3 'MMM-NNNN':1.  MMM is the first three digits, NNNN is the fourth-seventh.

The weird part is this: On the old systems running 8.1.3, I can look up a record by
fti_phone using any of the three above items; first three, last four, or entire number including dash.  On the new system running 8.3.1, I can do a lookup by the first three or the last four and get the results I'm after, but any attempt to do a lookup by the entire MMM-NNNN version returns no records.

I saw nothing concerning this while I was looking for information on how to get the search functions properly working in postgres 8.3.1, nor have I specifically seen anything since running across that problem.  The latter, however, may simply be because I don't know how to properly phrase my searches.

Does anyone have any information they would be willing to share regarding this issue, or a link to a website which discusses it?  I would greatly appreciate any advice I may be given.

#2Teodor Sigaev
teodor@sigaev.ru
In reply to: Andrew Edson (#1)
Re: Weird problem concerning tsearch functions built into postgres 8.3, assistance requested

One of the tables we're using in the 8.1.3 setups currently running
includes phone numbers as a searchable field (fti_phone), with the
results of a select on the field generally looking like this: 'MMM':2
'NNNN':3 'MMM-NNNN':1. MMM is the first three digits, NNNN is the
fourth-seventh.

The weird part is this: On the old systems running 8.1.3, I can look up
a record by
fti_phone using any of the three above items; first three, last four, or
entire number including dash. On the new system running 8.3.1, I can do
a lookup by the first three or the last four and get the results I'm
after, but any attempt to do a lookup by the entire MMM-NNNN version
returns no records.

Parser was changed:
postgres=# select * from ts_debug('123-4567');
alias | description | token | dictionaries | dictionary | lexemes
-------+------------------+-------+--------------+------------+---------
uint | Unsigned integer | 123 | {simple} | simple | {123}
int | Signed integer | -4567 | {simple} | simple | {-4567}
(2 rows)
postgres=# select * from ts_debug('abc-defj');
alias | description | token | dictionaries
| dictionary | lexemes
-----------------+---------------------------------+----------+----------------+--------------+------------
asciihword | Hyphenated word, all ASCII | abc-defj | {english_stem}
| english_stem | {abc-defj}
hword_asciipart | Hyphenated word part, all ASCII | abc | {english_stem}
| english_stem | {abc}
blank | Space symbols | - | {}
| |
hword_asciipart | Hyphenated word part, all ASCII | defj | {english_stem}
| english_stem | {defj}

Parser in 8.1 threats any [alnum]+-[alnum]+ as a hyphenated word, but 8.3 treats
[digit]+-[digit]+ as two separated numbers.

So, you can play around pre-process texts before indexing or have a look on
regex dictionary (http://vo.astronet.ru/arxiv/dict_regex.html)
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/