Documentation bug in 8.3?
Reading through the text search data type docs:
http://www.postgresql.org/docs/8.3/static/datatype-textsearch.html#DATATYPE-TSVECTOR
it says:
Optionally, integer position(s) can be attached to any or all of the
lexemes:
SELECT 'a:1 fat:2 cat:3 sat:4 on:5 a:6 mat:7 and:8 ate:9 a:10 fat:11 rat:12'::tsvector;
tsvector
-------------------------------------------------------------------------------
'a':1,6,10 'on':5 'and':8 'ate':9 'cat':3 'fat':2,11 'mat':7 'rat':12 'sat':4
A position normally indicates the source word's location in the
document. Positional information can be used for proximity ranking.
Position values can range from 1 to 16383; larger numbers are silently
clamped to 16383. Duplicate position entries are discarded.
----------------------------------------
However in my testing of 8.3 duplicate position entries are not
discarded:
test=> SELECT 'a:1 b:1'::tsvector;
tsvector
-------------
'a':1 'b':1
(1 row)
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> writes:
> clamped to 16383. Duplicate position entries are discarded.
> ----------------------------------------
> However in my testing of 8.3 duplicate position entries are not
> discarded:
> test=> SELECT 'a:1 b:1'::tsvector;
> tsvector
> -------------
> 'a':1 'b':1
> (1 row)
Those aren't duplicates, because they're not attached to the same
lexeme. The comment is talking about this behavior:
regression=# SELECT 'a:1 a:1'::tsvector;
tsvector
----------
'a':1
(1 row)
regression=# SELECT 'a:1,2,1'::tsvector;
tsvector
----------
'a':1,2
(1 row)
regards, tom lane
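
To make the distinction above concrete, here is a small set of queries one could run in psql. The first two are taken directly from this thread. The last two are extra illustrations that are not from the original messages: they assume the rules quoted from the 8.3 documentation (duplicates are discarded per lexeme, and positions above 16383 are clamped), and the result comments show what those rules imply rather than captured output.

```sql
-- Same position on two different lexemes: nothing is discarded,
-- because each lexeme keeps its own position list.
SELECT 'a:1 b:1'::tsvector;          -- 'a':1 'b':1

-- Same lexeme given twice with the same position: the repeated
-- position is discarded and the lexeme appears once.
SELECT 'a:1 a:1'::tsvector;          -- 'a':1

-- Illustration (not from the thread): mixing both cases. Only the
-- repeated position on 'a' is dropped; 'b' keeps its position 1.
SELECT 'a:1 b:1 a:1 a:2'::tsvector;  -- 'a':1,2 'b':1

-- Illustration (not from the thread): a position above the documented
-- maximum is silently clamped to 16383.
SELECT 'a:20000'::tsvector;          -- 'a':16383
```

Read this way, "duplicate position entries" means a position repeated for the same lexeme, not a position number shared by different lexemes.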
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > clamped to 16383. Duplicate position entries are discarded.
> > ----------------------------------------
> > However in my testing of 8.3 duplicate position entries are not
> > discarded:
> > test=> SELECT 'a:1 b:1'::tsvector;
> > tsvector
> > -------------
> > 'a':1 'b':1
> > (1 row)
> Those aren't duplicates, because they're not attached to the same
> lexeme. The comment is talking about this behavior:
> regression=# SELECT 'a:1 a:1'::tsvector;
> tsvector
> ----------
> 'a':1
> (1 row)
> regression=# SELECT 'a:1,2,1'::tsvector;
> tsvector
> ----------
> 'a':1,2
> (1 row)
OK, thanks. I will clarify the documentation. Patch attached and
applied.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us