XML import with DTD
Hi
I'm trying to use the XPath functionality of Postgres.
I can populate a text field (unparsed) with XML data but as far as I can
see the xpath() function [now] only works on the xml data type.
When I try to populate an xml field with XML data containing a DTD,
however, the parser chokes. If I strip the DTD, the parser chokes on
undefined entities which are defined in the DTD.
(I switched the app from MySQL to Postgres because, while MySQL works,
it returns matches in undelimited form, which is next to useless if, for
example, you return multiple attributes from a node.)
Does anyone know of a solution to this problem?
Windows 2000 Server
Postgres 8.4
Regards
Roy Walter
Post a snippet of the xml and xpath you are trying to use.
Scott
----- Original Message -----
From: "Roy Walter" <walt@brookhouse.co.uk>
To: pgsql-general@postgresql.org
Sent: Friday, July 10, 2009 7:49:00 AM GMT -08:00 US/Canada Pacific
Subject: [GENERAL] XML import with DTD
It's not an xpath problem; it's an XML import problem. Sorry if I wasn't
clear.
Consider the following example queries. This one works fine:
INSERT INTO wms_collection (docxml) VALUES (XMLPARSE(content(
'<?xml version="1.0" encoding="ISO-8859-1"?>
<shop>
<product>Shoes</product>
</shop>')))
This one does not:
INSERT INTO wms_collection (docxml) VALUES (XMLPARSE(content(
'<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE publicwhip
[
<!ENTITY ndash "–">
<!ENTITY mdash "—">
]>
<shop>
<product>Shoes</product>
</shop>')))
Both are valid XML but the second query fails as follows:
ERROR: invalid XML content
DETAIL: Entity: line 2: parser error : StartTag: invalid element name
<!DOCTYPE publicwhip
^
Entity: line 4: parser error : StartTag: invalid element name
<!ENTITY ndash "–">
^
Entity: line 5: parser error : StartTag: invalid element name
<!ENTITY mdash "—">
-- Roy
Roy Walter <walt@brookhouse.co.uk> writes:
This one does not:
INSERT INTO wms_collection (docxml) VALUES (XMLPARSE(content(
'<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE publicwhip
[
<!ENTITY ndash "–">
<!ENTITY mdash "—">
]>
<shop>
<product>Shoes</product>
</shop>')))
What I know about XML wouldn't fill a thimble, but shouldn't you say
DOCUMENT not CONTENT if you are trying to provide a complete document?
Doing that seems to make this work without error.
The fine manual states near the bottom of 8.13.1
http://www.postgresql.org/docs/8.4/static/datatype-xml.html
that CONTENT is less restrictive than DOCUMENT, but at least for
this specific point that seems not to be true.
regards, tom lane
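For reference, a minimal sketch of the change suggested here: the same statement as the failing one, with DOCUMENT in place of CONTENT. It assumes the wms_collection table and its docxml column of type xml, as used in the queries earlier in the thread. The xmloption lines at the end are an optional alternative for plain casts, not something proposed in the thread.

```sql
-- Parse the input as a complete document so the DOCTYPE and its
-- internal entity declarations are accepted.
-- Assumes the table from the thread: wms_collection(docxml xml).
INSERT INTO wms_collection (docxml) VALUES (XMLPARSE(DOCUMENT
'<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE publicwhip
[
<!ENTITY ndash "–">
<!ENTITY mdash "—">
]>
<shop>
<product>Shoes</product>
</shop>'));

-- Optional: plain casts such as '...'::xml follow the xmloption
-- setting, which defaults to CONTENT. Switching it for the session
-- makes casts parse their input as a document as well.
SET xmloption TO DOCUMENT;
```

The XMLPARSE form states the parsing mode per statement, so it does not depend on the session setting.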
Doh! That's it. Thanks a million.
-- Roy
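Once the document is stored in an xml column, the xpath() function from the original question can be run against it. A small sketch, assuming the same wms_collection table and the shop/product sample posted earlier in the thread:

```sql
-- xpath() takes the XPath expression first and the xml value second,
-- and returns an array of xml values (xml[]), one element per match.
SELECT xpath('/shop/product/text()', docxml) AS products
FROM wms_collection;
```

Because the result is an array, multiple matches stay separated from one another, which is the delimiting the original post was after.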