pgsql: Fix XML tag namespace change inadvertently missed from previous
Log Message:
-----------
Fix XML tag namespace change inadvertently missed from previous fix. Add
regression test for XML names and numeric entities.
Modified Files:
--------------
pgsql/src/backend/tsearch:
wparser_def.c (r1.11 -> r1.12)
(http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/tsearch/wparser_def.c?r1=1.11&r2=1.12)
pgsql/src/test/regress/expected:
tsearch.out (r1.9 -> r1.10)
(http://developer.postgresql.org/cvsweb.cgi/pgsql/src/test/regress/expected/tsearch.out?r1=1.9&r2=1.10)
pgsql/src/test/regress/sql:
tsearch.sql (r1.4 -> r1.5)
(http://developer.postgresql.org/cvsweb.cgi/pgsql/src/test/regress/sql/tsearch.sql?r1=1.4&r2=1.5)
adunstan@postgresql.org (Andrew Dunstan) writes:
Fix XML tag namespace change inadvertently missed from previous fix. Add
regression test for XML names and numeric entities.
Still one gripe:
regression=# select * from ts_debug(' &#x3bb; &#X3BB;');
  alias  |       description        |  token  | dictionaries | dictionary | lexemes
---------+--------------------------+---------+--------------+------------+---------
 blank   | Space symbols            |         | {}           |            |
 entity  | XML entity               | &#x3bb; | {}           |            |
 blank   | Space symbols            |         | {}           |            |
 blank   | Space symbols            | &#      | {}           |            |
 numword | Word, letters and digits | X3BB    | {simple}     | simple     | {x3bb}
 blank   | Space symbols            | ;       | {}           |            |
(6 rows)
Aren't hexadecimal entities supposed to be case-insensitive?
regards, tom lane
Tom Lane wrote:
adunstan@postgresql.org (Andrew Dunstan) writes:
Fix XML tag namespace change inadvertently missed from previous fix. Add
regression test for XML names and numeric entities.
Still one gripe:
regression=# select * from ts_debug(' &#x3bb; &#X3BB;');
  alias  |       description        |  token  | dictionaries | dictionary | lexemes
---------+--------------------------+---------+--------------+------------+---------
 blank   | Space symbols            |         | {}           |            |
 entity  | XML entity               | &#x3bb; | {}           |            |
 blank   | Space symbols            |         | {}           |            |
 blank   | Space symbols            | &#      | {}           |            |
 numword | Word, letters and digits | X3BB    | {simple}     | simple     | {x3bb}
 blank   | Space symbols            | ;       | {}           |            |
(6 rows)
Aren't hexadecimal entities supposed to be case-insensitive?
The 'x' must be lower case, the hex digits can be upper or lower. The
XML spec says:
CharRef ::= '&#' [0-9]+ ';'
| '&#x' [0-9a-fA-F]+ ';'
cheers
andrew
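To make the quoted CharRef production concrete, here is a minimal Python sketch of the strict XML rule. The names XML_CHARREF and is_xml_charref are hypothetical and chosen for illustration only; this is not the code in wparser_def.c, just the grammar above transcribed into a regular expression.

```python
import re

# Strict XML CharRef, transcribed from the production quoted above:
#   CharRef ::= '&#' [0-9]+ ';' | '&#x' [0-9a-fA-F]+ ';'
# Only a lower-case 'x' introduces the hexadecimal form.
XML_CHARREF = re.compile(r"&#(?:[0-9]+|x[0-9a-fA-F]+);")


def is_xml_charref(text: str) -> bool:
    """Return True if text is exactly one XML character reference."""
    return XML_CHARREF.fullmatch(text) is not None
```

Under this rule '&#x3bb;' and '&#x3BB;' are both accepted, while '&#X3BB;' is rejected, which is consistent with the ts_debug output above where the upper-case form is split into separate tokens.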
Andrew Dunstan <andrew@dunslane.net> writes:
Tom Lane wrote:
Aren't hexadecimal entities supposed to be case-insensitive?
The 'x' must be lower case, the hex digits can be upper or lower. The
XML spec says:
But we're also interested in parsing HTML, and upper case X is
allowed in HTML:
http://www.w3.org/TR/REC-html40/charset.html#h-5.3.1
regards, tom lane
I wrote:
Tom Lane wrote:
Aren't hexadecimal entities supposed to be case-insensitive?
The 'x' must be lower case, the hex digits can be upper or lower. The
XML spec says:
CharRef ::= '&#' [0-9]+ ';'
| '&#x' [0-9a-fA-F]+ ';'
But I also see that the HTML spec allows for 'X' as well as 'x', so I'll
change it.
cheers
andrew
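The relaxed rule agreed on here, accepting 'X' as well as 'x' so that HTML input is handled too, could be sketched as follows. HTML_CHARREF and decode_charref are hypothetical names, and the sketch only illustrates the intended matching behaviour; it is an assumption about the shape of the change, not the actual patch to the parser.

```python
import re

# Relaxed pattern: decimal digits, or hex digits introduced by 'x' or 'X'.
# Group 1 captures the decimal form, group 2 the hexadecimal form.
HTML_CHARREF = re.compile(r"&#(?:([0-9]+)|[xX]([0-9a-fA-F]+));")


def decode_charref(text):
    """Return the character a numeric reference names, or None if no match.

    No range check is done on the code point, so an out-of-range number
    raises ValueError from chr().
    """
    match = HTML_CHARREF.fullmatch(text)
    if match is None:
        return None
    decimal, hexadecimal = match.groups()
    if decimal is not None:
        return chr(int(decimal))
    return chr(int(hexadecimal, 16))
```

With this pattern both inputs from the ts_debug example, '&#x3bb;' and '&#X3BB;', name the same character, U+03BB.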