[PATCH][DOC][MINOR] Fix incorrect lexeme limit in textsearch docs
Hello,
A minor doc patch for this page
https://www.postgresql.org/docs/current/textsearch-limitations.html
and this line
*- The number of lexemes must be less than 2^64*
Docs wrongly claim "lexemes must be < 2^64" but the actual constraint is
1 MB total storage (MAXSTRPOS), and no 2^64 check exists in the code.
From src/include/tsearch/ts_type.h:
#define MAXSTRPOS ( (1<<20) - 1) // 1,048,575 bytes
typedef struct {
int32 size; // number of lexemes
...
} TSVectorData;
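For illustration only (not part of the patch): a minimal sketch of why the storage cap, rather than any 2^64 check, bounds the lexeme count. It reuses the MAXSTRPOS value quoted above; the helpers `max_lexemes_of_len` and `lexemes_fit` are hypothetical names, not PostgreSQL functions, and the arithmetic ignores position data and alignment padding, which would only lower the bound further.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as the MAXSTRPOS definition quoted above: 1,048,575. */
#define MAXSTRPOS ((1 << 20) - 1)

/*
 * Hypothetical helper (not a PostgreSQL function): upper bound on how many
 * lexemes of a given byte length fit under MAXSTRPOS. Simplification:
 * ignores position data and alignment padding, which only lower the bound.
 */
static uint32_t
max_lexemes_of_len(uint32_t lexeme_len)
{
    assert(lexeme_len > 0);
    return (uint32_t) MAXSTRPOS / lexeme_len;
}

/* Hypothetical helper: would n lexemes of this length fit in one tsvector? */
static int
lexemes_fit(uint64_t n, uint32_t lexeme_len)
{
    return n <= max_lexemes_of_len(lexeme_len);
}
```

Under these assumptions, even 1-byte lexemes cap out at about 2^20 entries, and the count is held in an int32 field, so a 2^64 bound can never be the limit that applies.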
The attached patch:
- Removes the incorrect 2^64 claim
- Clarifies this means "distinct lexemes in a single tsvector value"
Thanks,
Dharin
Attachments:
0001-docs-Fix-incorrect-tsvector-lexeme-limit-in-textsear.patch (application/octet-stream, +2 -4)
Hello,
Gentle ping on the textsearch docs patch. Happy to address any feedback.
Thanks,
Dharin
On Sat, Dec 27, 2025 at 10:09 PM Dharin Shah <dharinshah95@gmail.com> wrote:
> Hello,
> A minor doc patch for this page
> https://www.postgresql.org/docs/current/textsearch-limitations.html
> and this line
> *- The number of lexemes must be less than 2^64*
> Docs wrongly claim "lexemes must be < 2^64" but the actual constraint is
> 1 MB total storage (MAXSTRPOS), and no 2^64 check exists in the code.
> From src/include/tsearch/ts_type.h:
> #define MAXSTRPOS ( (1<<20) - 1) // 1,048,575 bytes
> typedef struct {
> int32 size; // number of lexemes
> ...
> } TSVectorData;
> The attached patch:
> - Removes the incorrect 2^64 claim
> - Clarifies this means "distinct lexemes in a single tsvector value"
> Thanks,
> Dharin
Hi Dharin,
I looked at your patch, it looks good.
In the code, I couldn’t find any 2^64 bound on the lexeme count, so
removing that makes sense.
The added sentence about distinct lexeme count seems to overlap with the
existing description of tsvector limits, so I’m not sure it adds much new
information.
-Surya Poondla
On Fri, Jan 9, 2026 at 2:01 PM surya poondla <suryapoondla4@gmail.com> wrote:
> Hi Dharin,
> I looked at your patch, it looks good.
> In the code, I couldn’t find any 2^64 bound on the lexeme count, so
> removing that makes sense.
> The added sentence about distinct lexeme count seems to overlap with the
> existing description of tsvector limits, so I’m not sure it adds much new
> information.
> -Surya Poondla
+1 on this patch, I was also a bit confused on this part of the
documentation. All I could conclude was that the number of lexemes in a
tsvector was limited by existing tsvector limits. I agree with Surya's
comment about the overlap, I think this patch should only remove the line
about the 2^64 bound. Patch applies cleanly!
Adi Gollamudi
Sorry to drop the ball on this. I will get back to cleaning it up and
create a new patch.
Thanks,
Dharin
On Thu, Mar 26, 2026 at 11:02 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Aditya Gollamudi <adigollamudi@gmail.com> writes:
> > +1 on this patch, I was also a bit confused on this part of the
> > documentation. All I could conclude was that the number of lexemes in a
> > tsvector was limited by existing tsvector limits. I agree with Surya's
> > comment about the overlap, I think this patch should only remove the line
> > about the 2^64 bound. Patch applies cleanly!
>
> What I'm inclined to do is just drop that <listitem> entirely,
> since it's implied by the 1MB space limit. The patch as-submitted
> accomplishes about the same thing, but takes more words to do it.
>
> regards, tom lane
I wrote:
> What I'm inclined to do is just drop that <listitem> entirely,
> since it's implied by the 1MB space limit. The patch as-submitted
> accomplishes about the same thing, but takes more words to do it.
Pushed that way.
regards, tom lane