Add NOTICE about non-NFC-characters and clues for solution

Started by PG Bug reporting form, over 7 years ago, 1 message, docs
#1 PG Bug reporting form
noreply@postgresql.org

The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/11/unaccent.html
Description:

This looks like a bug, because `select unaccent('Iglésias')` returns the
accented "iglésias" again... The behavior is actually correct, because the
input is not in NFC form: length('Iglésias') is 9 instead of 8.
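
The length difference can be reproduced outside the database. The following is only a minimal Python sketch, and it assumes the problematic input is in decomposed form (a plain `e` followed by U+0301 COMBINING ACUTE ACCENT); the report does not state the exact code points, so that is a guess:

```python
import unicodedata

# Assumption: the input stores the accent as a separate combining character.
decomposed = "Igle\u0301sias"  # 'e' + U+0301 COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)  # 'e' + accent become U+00E9

print(len(decomposed))          # 9 code points
print(len(composed))            # 8 code points
print(decomposed == composed)   # False, although both render identically
```

Both strings look the same on screen, which is why the result appears to be a bug.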

The problem is not rare, as the page views of
https://stackoverflow.com/questions/24863716 show.
The solution is to feed the database with well-formed UTF-8 (NFC characters).
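
One possible way to do that is to normalize the text on the client side before it is loaded. This is a sketch under assumptions, not a recommended procedure: `to_nfc` and the sample `rows` are hypothetical names invented for illustration.

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Return text in NFC form (hypothetical helper, not part of PostgreSQL)."""
    return unicodedata.normalize("NFC", text)


# Hypothetical sample data: two decomposed strings and one plain ASCII string.
rows = ["Igle\u0301sias", "Jose\u0301", "plain"]

# Normalize every value before it is sent to the database.
clean = [to_nfc(r) for r in rows]

# Report which inputs were not already in NFC form.
changed = [r for r, c in zip(rows, clean) if r != c]
print(len(changed))
```

PostgreSQL 13 and later also provide a `normalize()` function on the server side; it is not available in the version 11 documentation referenced here.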

**SUGGESTION**: add a notice for readers about the apparent bug with non-NFC
input, showing examples and clues about solutions.

REF: https://en.wikipedia.org/wiki/Unicode_equivalence#Example