BUG #14151: Xml special symbols are not unescaped when gettting value of Xml via xpath
The following bug has been logged on the website:
Bug reference: 14151
Logged by: oli nic
Email address: onic@live.fr
PostgreSQL version: 9.5.3
Operating system: Windows
Description:
The following bug has been logged on the website and the mailing list.
Bug reference: 7971.
It workds perfectly in 8.x but not in 9.5.
I've found this bug has already been discussed in 2 threads with no result
:
/messages/by-id/BAY152-W51C845124155064D53F3CA8A8C0@phx.gbl
/messages/by-id/E1UHyUw-0001oj-HE@wrigleys.postgresql.org
SELECT unnest(xpath('/name/text()', xmlelement(name name, 'AT&T', null )))
This gives me 'AT&T' and
I have NO WAY inside pgsql to get 'At&t' value
Is there any status/progress about this bug?
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
onic@live.fr writes:
SELECT unnest(xpath('/name/text()', xmlelement(name name, 'AT&T', null )))
This gives me 'AT&T' and
AFAICS, that behavior is correct. xpath returns a value of type xml
(well, really xml[]) and 'AT&T' is not a legal value of that type;
only 'AT&T' is.
I have NO WAY inside pgsql to get 'At&t' value
I agree that there ought to be an "unescape" function that would convert
this back to 'AT&T', but the lack of one seems like a missing feature not
a bug. I'd wonder for example what an unescape function ought to do with
other markup such as '<name>'.
The last concrete discussion we had on this seems to have been
this thread:
/messages/by-id/C71079E6-12D8-4048-B8C5-18368936FD5D@phlo.org
which petered out for lack of anyone finding the time to investigate
the relevant standards.
regards, tom lane
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Thu, May 19, 2016 at 1:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
onic@live.fr writes:
SELECT unnest(xpath('/name/text()', xmlelement(name name, 'AT&T', null
)))
This gives me 'AT&T' and
AFAICS, that behavior is correct. xpath returns a value of type xml
(well, really xml[]) and 'AT&T' is not a legal value of that type;
only 'AT&T' is.I have NO WAY inside pgsql to get 'At&t' value
I agree that there ought to be an "unescape" function that would convert
this back to 'AT&T', but the lack of one seems like a missing feature not
a bug. I'd wonder for example what an unescape function ought to do with
other markup such as '<name>'.The last concrete discussion we had on this seems to have been
this thread:
/messages/by-id/C71079E6-12D8-4048-B8C5-18368936FD5D@phlo.org
which petered out for lack of anyone finding the time to investigate
the relevant standards.
This
2013
thread further discusses our existing problematic behavior.
/messages/by-id/25508.1383590668@sss.pgh.pa.us
I don't recall anything discussed in depth since then - just a bug report
or two from users.
/messages/by-id/CALr6pkhSe20gh5Hci1H=uT_7QE4av0m9h2eQMjqUX6D6AD9H1Q@mail.gmail.com
A basic entity decoder is fairly simply to write and doesn't require C code
- a single SQL function using replace will likely suffice.
The 2013 post I linked shows a potentially deeper consideration that we
need to take into account.
I'd challenge any community members who really want to see this get fixed
at least start things off by generating a set of queries and expected
outputs that can be commented upon and documented as being the desired
behavior with respect to both internal and external entities. At least
then we'd have clearly defined what the current and expected behaviors
are. With that in hand there is at least hope that someone less familiar
with xpath/xml generally, but familiar with PostgreSQL internals, would be
willing to take on the task of writing a patch that causes the tests to
pass. The existing test queries and resultant .out files should be
reasonably accessible even for someone not familiar with the code.
I'm inclined to think, however, that we will end up wanting an
"xpath_text(...)" function that returns text instead of xml. The
underlying problem here is that the xpath standard allows for different
kinds of outputs - which depend upon the xpath expression - while
PostgreSQL only allows for xml regardless of the underlying expression.
This dichotomy seems likely to be impossible to overcome without
introducing a new function.
Please, help us scratch your itch.
David J.