Help with namespaces in xpath (PostgreSQL 9.5.3)

Started by Allan Kamaualmost 10 years ago3 messagesgeneral
Jump to latest
#1Allan Kamau
kamauallan@gmail.com

I have an xml document from which I would like to extract the contents of
several elements.

I would like to use xpath to extract the contents of "name" from the xml
document shown below.

WITH x AS
(
SELECT
'<?xml version="1.0" encoding="UTF-8"?>
<uniprot xmlns="http://uniprot.org/uniprot&quot; xmlns:xsi="
http://www.w3.org/2001/XMLSchema-instance&quot; xsi:schemaLocation="
http://uniprot.org/uniprot http://www.uniprot.org/support/docs/uniprot.xsd&quot;&gt;
<entry dataset="Swiss-Prot" created="2009-12-15" modified="2016-05-11"
version="56">
<accession>A0JM59</accession>
<name>UBP20_XENTR</name>
</entry>
</uniprot>
'::xml AS d
)
SELECT (xpath('/uniprot/entry/name/text()',a.d))[1]::text AS uniprot_name
FROM
x AS a
;

The documentation for xpath() ("
https://www.postgresql.org/docs/9.5/static/functions-xml.html&quot;) describes
"xpath(xpath, xml [, nsarray]").

For the above xml document, what would be the two dimensional array
"nsarray" for the xpath() function?

-Allan.

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: Allan Kamau (#1)
Re: Help with namespaces in xpath (PostgreSQL 9.5.3)

On Sun, Jun 19, 2016 at 5:09 PM, Allan Kamau <kamauallan@gmail.com> wrote:

I have an xml document from which I would like to extract the contents of
several elements.

I would like to use xpath to extract the contents of "name" from the xml
document shown below.

WITH x AS
(
SELECT
'<?xml version="1.0" encoding="UTF-8"?>
<uniprot xmlns="http://uniprot.org/uniprot&quot; xmlns:xsi="
http://www.w3.org/2001/XMLSchema-instance&quot; xsi:schemaLocation="
http://uniprot.org/uniprot http://www.uniprot.org/support/docs/uniprot.xsd
">
<entry dataset="Swiss-Prot" created="2009-12-15" modified="2016-05-11"
version="56">
<accession>A0JM59</accession>
<name>UBP20_XENTR</name>
</entry>
</uniprot>
'::xml AS d
)
SELECT (xpath('/uniprot/entry/name/text()',a.d))[1]::text AS uniprot_name
FROM
x AS a
;

The documentation for xpath() ("
https://www.postgresql.org/docs/9.5/static/functions-xml.html&quot;) describes
"xpath(xpath, xml [, nsarray]").

For the above xml document, what would be the two dimensional array
"nsarray" for the xpath() function?

​Is there a specific part of the description and two examples that doesn't
make sense to you?

​Or more specifically, do you understand what namespaces are?​

ARRAY[
ARRAY['defaultns','http://uniprot.org/uniprot&#39;],
ARRAY['xsi','http://www.w3.org/2001/XMLSchema-instance&#39;]
]​

In effect when the xpath function parses the XML document it tosses away
all of the document-local namespace aliases and instead associated the full
namespace URI with each element (in the DOM). Since, in the xpath
expression, usually you'd want to refer to nodes in the DOM via their
namespace alias you need to tell the xpath function which aliases you
intend to use in the xpath and which full URI they correspond to.
Furthermore, there is not concept of a default namespace in the xpath
expression. So while you can simply copy-paste the aliases and URIs from
all of the non-default namespace aliases you must also choose a unique
alias for the default namespace in the original document.

In the above I've copied the alias and namespace URI for the named "xsi"
alias and gave the name "defaultns" to the original document's default
namespace URI.

David J.

#3Allan Kamau
kamauallan@gmail.com
In reply to: David G. Johnston (#2)
Re: Help with namespaces in xpath (PostgreSQL 9.5.3)

Thank you David.

-Allan.

On Mon, Jun 20, 2016 at 11:19 PM, David G. Johnston <
david.g.johnston@gmail.com> wrote:

Show quoted text

On Sun, Jun 19, 2016 at 5:09 PM, Allan Kamau <kamauallan@gmail.com> wrote:

I have an xml document from which I would like to extract the contents of
several elements.

I would like to use xpath to extract the contents of "name" from the xml
document shown below.

WITH x AS
(
SELECT
'<?xml version="1.0" encoding="UTF-8"?>
<uniprot xmlns="http://uniprot.org/uniprot&quot; xmlns:xsi="
http://www.w3.org/2001/XMLSchema-instance&quot; xsi:schemaLocation="
http://uniprot.org/uniprot
http://www.uniprot.org/support/docs/uniprot.xsd&quot;&gt;
<entry dataset="Swiss-Prot" created="2009-12-15" modified="2016-05-11"
version="56">
<accession>A0JM59</accession>
<name>UBP20_XENTR</name>
</entry>
</uniprot>
'::xml AS d
)
SELECT (xpath('/uniprot/entry/name/text()',a.d))[1]::text AS uniprot_name
FROM
x AS a
;

The documentation for xpath() ("
https://www.postgresql.org/docs/9.5/static/functions-xml.html&quot;)
describes "xpath(xpath, xml [, nsarray]").

For the above xml document, what would be the two dimensional array
"nsarray" for the xpath() function?

​Is there a specific part of the description and two examples that doesn't
make sense to you?

​Or more specifically, do you understand what namespaces are?​

ARRAY[
ARRAY['defaultns','http://uniprot.org/uniprot&#39;],
ARRAY['xsi','http://www.w3.org/2001/XMLSchema-instance&#39;]
]​

In effect when the xpath function parses the XML document it tosses away
all of the document-local namespace aliases and instead associated the full
namespace URI with each element (in the DOM). Since, in the xpath
expression, usually you'd want to refer to nodes in the DOM via their
namespace alias you need to tell the xpath function which aliases you
intend to use in the xpath and which full URI they correspond to.
Furthermore, there is not concept of a default namespace in the xpath
expression. So while you can simply copy-paste the aliases and URIs from
all of the non-default namespace aliases you must also choose a unique
alias for the default namespace in the original document.

In the above I've copied the alias and namespace URI for the named "xsi"
alias and gave the name "defaultns" to the original document's default
namespace URI.

David J.