Documentation bug in 8.3?

Started by Bruce Momjian over 18 years ago | 3 messages | docs
#1 Bruce Momjian
bruce@momjian.us

Reading through the text search data type docs:

http://www.postgresql.org/docs/8.3/static/datatype-textsearch.html#DATATYPE-TSVECTOR

it says:

Optionally, integer position(s) can be attached to any or all of the
lexemes:

SELECT 'a:1 fat:2 cat:3 sat:4 on:5 a:6 mat:7 and:8 ate:9 a:10 fat:11 rat:12'::tsvector;
tsvector
-------------------------------------------------------------------------------
'a':1,6,10 'on':5 'and':8 'ate':9 'cat':3 'fat':2,11 'mat':7 'rat':12 'sat':4

A position normally indicates the source word's location in the
document. Positional information can be used for proximity ranking.
Position values can range from 1 to 16383; larger numbers are silently
clamped to 16383. Duplicate position entries are discarded.
----------------------------------------

However in my testing of 8.3 duplicate position entries are not
discarded:

test=> SELECT 'a:1 b:1'::tsvector;
tsvector
-------------
'a':1 'b':1
(1 row)

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#1)
Re: Documentation bug in 8.3?

Bruce Momjian <bruce@momjian.us> writes:

clamped to 16383. Duplicate position entries are discarded.
----------------------------------------

However in my testing of 8.3 duplicate position entries are not
discarded:

test=> SELECT 'a:1 b:1'::tsvector;
tsvector
-------------
'a':1 'b':1
(1 row)

Those aren't duplicates, because they're not attached to the same
lexeme. The comment is talking about this behavior:

regression=# SELECT 'a:1 a:1'::tsvector;
tsvector
----------
'a':1
(1 row)

regression=# SELECT 'a:1,2,1'::tsvector;
tsvector
----------
'a':1,2
(1 row)

regards, tom lane
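
The distinction above (duplicates are judged per lexeme, not across the whole vector) can be sketched as a small Python model. This is only an illustration, not PostgreSQL's parser: the function name normalize_tsvector is hypothetical, it handles only the bare lexeme:pos,pos tokens used in this thread (no quoting, weights, or escapes), and the length-then-text ordering is an assumption inferred from the 8.3 output quoted in the first message. The 16383 limit is the one stated in the quoted docs.

```python
from collections import defaultdict

MAX_POSITION = 16383  # upper bound quoted from the 8.3 docs in this thread


def normalize_tsvector(literal: str) -> str:
    """Toy model: merge positions per lexeme, clamp them, and drop repeats.

    Hypothetical helper for illustration only; it is not PostgreSQL code.
    """
    positions = defaultdict(set)
    for token in literal.split():
        lexeme, _, pos_list = token.partition(":")
        entry = positions[lexeme]  # also registers lexemes with no positions
        for pos in filter(None, pos_list.split(",")):
            # A set keyed by lexeme discards a repeated position for the
            # same lexeme, but keeps the same position on different lexemes.
            entry.add(min(int(pos), MAX_POSITION))
    # Assumption: order by length, then text, to mirror the 8.3 output above.
    ordered = sorted(positions, key=lambda lex: (len(lex), lex))
    parts = []
    for lex in ordered:
        pos = ",".join(str(p) for p in sorted(positions[lex]))
        parts.append(f"'{lex}':{pos}" if pos else f"'{lex}'")
    return " ".join(parts)
```

Under these assumptions, normalize_tsvector('a:1 b:1') keeps both entries because the two positions belong to different lexemes, while normalize_tsvector('a:1 a:1') and normalize_tsvector('a:1,2,1') each collapse the repeated position for 'a'.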

#3 Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#2)
Re: Documentation bug in 8.3?

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

clamped to 16383. Duplicate position entries are discarded.
----------------------------------------

However in my testing of 8.3 duplicate position entries are not
discarded:

test=> SELECT 'a:1 b:1'::tsvector;
tsvector
-------------
'a':1 'b':1
(1 row)

Those aren't duplicates, because they're not attached to the same
lexeme. The comment is talking about this behavior:

regression=# SELECT 'a:1 a:1'::tsvector;
tsvector
----------
'a':1
(1 row)

regression=# SELECT 'a:1,2,1'::tsvector;
tsvector
----------
'a':1,2
(1 row)

OK, thanks. I will clarify the documentation. Patch attached and
applied.

Attachments:

/rtmp/diff (text/x-diff, +2 -2)