New XML section for documentation

Started by Bruce Momjian over 19 years ago, 30 messages
#1Bruce Momjian
bruce@momjian.us

Here is a new XML section for our SGML documentation. It explains the
various XML capabilities, if we support them, and how to use them.

Comments?

---------------------------------------------------------------------------

XML Document Support
====================
XML support is not one capability, but a variety of features supported
by a database. These capabilities include storage, import/export,
validation, indexing, efficiency of modification, searching,
transforming, and XML to SQL mapping. PostgreSQL supports some but
not all of these XML capabilities. Future releases of PostgreSQL will
continue to improve XML support.

Storage
-------
PostgreSQL stores XML documents as ordinary text documents. It does not
split apart XML documents into their component parts and store each
element separately. You can use middleware solutions to do that, but
once done, the data becomes relational and has to be processed
accordingly.

Import/Export
-------------
Because XML documents are stored as normal text documents, they can be
imported/exported with little complexity. A simple TEXT field can hold
up to 1 gigabyte of text, and large objects are available for larger
documents.

Validation
----------
/contrib/xml2 has a function called xml_valid() that can be used in
a CHECK constraint to enforce that a field contains valid XML. It
does not support validation against a specific XML schema. A
server-side language with XML capabilities could be used to do
schema-specific XML checks.

Indexing
--------
Because XML documents are stored as text, the full-text indexing tool
/contrib/tsearch2 can be used to index XML documents. Of course, the
searches are text searches, with no XML awareness, but tsearch2 can be
used with other XML capabilities to dramatically reduce the amount of
data processed at the XML level.

Modification
------------
If an UPDATE does not modify an XML field, the XML data is shared
between the old and new rows. However, if the UPDATE modifies an XML
field, a full modified copy of the XML field must be created internally.

Searching
---------
XPath searches are implemented using /contrib/xml2. It processes XML
text documents and returns results based on the requested query.

Transforming
------------
/contrib/xml2 supports XSL transformations.

XML to SQL Mapping
-------------------
This involves converting XML data to and from relational structures.
PostgreSQL has no internal support for such mapping, and relies on
external tools to do such conversions.

Missing Features
----------------
o XQuery
o SQL/XML syntax (ISO/IEC 9075-14)
o XML data type optimized for XML storage

See also http://www.rpbourret.com/xml/XMLAndDatabases.htm

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#2David Fetter
david@fetter.org
In reply to: Bruce Momjian (#1)
Re: [HACKERS] New XML section for documentation

On Fri, Aug 25, 2006 at 07:46:57PM -0400, Bruce Momjian wrote:

> Here is a new XML section for our SGML documentation. It explains
> the various XML capabilities, if we support them, and how to use
> them.
>
> Comments?

This looks hauntingly similar to Peter's presentation at the
conference. :) I'd add a http://wiscorp.com/SQLStandards.html to the
reference section.

Speaking of other parts of the SQL:2003 standard, how about one
section each that mentions them? There's

Part 4: SQL/PSM (Persistent Stored Modules)
Part 9: SQL/MED (Management of External Data) (my favorite)
Part 10: SQL/OLB (Object Language Binding)
Part 11: SQL/Schemata
Part 13: SQL/JRT (Java Routines and Types)

Cheers,
D
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778 AIM: dfetter666
Skype: davidfetter

Remember to vote!

#3Bruce Momjian
bruce@momjian.us
In reply to: David Fetter (#2)
Re: [HACKERS] New XML section for documentation

David Fetter wrote:
> On Fri, Aug 25, 2006 at 07:46:57PM -0400, Bruce Momjian wrote:
> > Here is a new XML section for our SGML documentation. It explains
> > the various XML capabilities, if we support them, and how to use
> > them.
> >
> > Comments?
>
> This looks hauntingly similar to Peter's presentation at the

I used the XML/SQL and validation part from his talk, but the rest was
from earlier email discussions.

> conference. :) I'd add a http://wiscorp.com/SQLStandards.html to the

This seems to be the best URL, but it seems too detailed:

http://wiscorp.com/H2-2005-197-SC32N1293-WG3_Presentation_for_SC32_20050418.pdf

> reference section.
>
> Speaking of other parts of the SQL:2003 standard, how about one
> section each that mentions them? There's
>
> Part 4: SQL/PSM (Persistent Stored Modules)
> Part 9: SQL/MED (Management of External Data) (my favorite)
> Part 10: SQL/OLB (Object Language Binding)
> Part 11: SQL/Schemata
> Part 13: SQL/JRT (Java Routines and Types)

I don't know anything about them.


#4Euler Taveira de Oliveira
In reply to: Bruce Momjian (#1)
Re: [HACKERS] New XML section for documentation

Bruce Momjian wrote:

> Here is a new XML section for our SGML documentation. It explains the
> various XML capabilities, if we support them, and how to use them.
>
> Comments?

+1. Users often ask this in the mailing lists. Where do you want to
put this? I'll suggest: FAQ. What do you all think?

> Missing Features
> ----------------
> o XQuery
> o SQL/XML syntax (ISO/IEC 9075-14)
> o XML data type optimized for XML storage

Another section in TODO?

--
Euler Taveira de Oliveira
http://www.timbira.com/

#5Bruce Momjian
bruce@momjian.us
In reply to: Euler Taveira de Oliveira (#4)
Re: [HACKERS] New XML section for documentation

Euler Taveira de Oliveira wrote:

> Bruce Momjian wrote:
> > Here is a new XML section for our SGML documentation. It explains the
> > various XML capabilities, if we support them, and how to use them.
> >
> > Comments?
>
> +1. Users often ask this in the mailing lists. Where do you want to
> put this? I'll suggest: FAQ. What do you all think?

Our main documentation. Once it is there, people will find it there
rather than in the FAQ.

> > Missing Features
> > ----------------
> > o XQuery
> > o SQL/XML syntax (ISO/IEC 9075-14)
> > o XML data type optimized for XML storage
>
> Another section in TODO?

Perhaps, yea.


#6Magnus Hagander
mha@sollentuna.net
In reply to: Bruce Momjian (#1)
Re: New XML section for documentation

> Indexing
> --------
> Because XML documents are stored as text, the full-text indexing tool
> /contrib/tsearch2 can be used to index XML documents. Of
> course, the searches are text searches, with no XML
> awareness, but tsearch2 can be used with other XML
> capabilities to dramatically reduce the amount of data
> processed at the XML level.

You can also use a functional index and /contrib/xml2 to do limited
XPath indexing. (Can't make it "subtree-aware" for example, unless you
are willing to change your queries, but you can index specific xpath
nodes).

//Magnus

#7Peter Eisentraut
peter_e@gmx.net
In reply to: Bruce Momjian (#1)
Re: New XML section for documentation

Bruce Momjian wrote:

> XML Document Support
> ====================
> XML support is not one capability, but a variety of features
> supported by a database.

database system

> Storage
> -------
> PostgreSQL stores XML documents as ordinary text documents.

It is "possible" to do that, but this sounds like it's done
automatically or implicitly. Maybe:

"PostgreSQL does not have a specialized XML data type. The recommended
way is to store XML documents as text."

> Import/Export
> -------------
> Because XML documents are stored as normal text documents, they can
> be imported/exported with little complexity.

Import/export refers to exporting schema data with XML decorations. Of
course you can export column data trivially, but that's not what this
is about.

> Validation
> ----------
> /contrib/xml2 has a function called xml_valid() that can be used in
> a CHECK constraint to enforce that a field contains valid XML. It
> does not support validation against a specific XML schema.

Then this is not validation but only checking for well-formedness. The
xml2 README says so, in fact.

> Indexing
> --------

I think the expression index capability combined with contrib/xml2 is
more relevant here than the full-text search capability.

> Transforming
> ------------
> /contrib/xml2 supports XSL transformations.

That's XSLT.

> XML to SQL Mapping
> -------------------
> This involves converting XML data to and from relational structures.
> PostgreSQL has no internal support for such mapping, and relies on
> external tools to do such conversions.

Are there instances of such tools?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#8Nikolay Samokhvalov
samokhvalov@gmail.com
In reply to: Peter Eisentraut (#7)
Re: New XML section for documentation

On 8/26/06, Peter Eisentraut <peter_e@gmx.net> wrote:

> Bruce Momjian wrote:
> > Validation
> > ----------
> > /contrib/xml2 has a function called xml_valid() that can be used in
> > a CHECK constraint to enforce that a field contains valid XML. It
> > does not support validation against a specific XML schema.
>
> Then this is not validation but only checking for well-formedness. The
> xml2 README says so, in fact.

Exactly. contrib/xml2 mixes up the terms here. xml_valid() should be
a different function, one that takes two arguments, an XML value and
the corresponding XML schema, and validates the XML data. Actually, the
latest version of the SQL/XML standard includes such a function
(XMLVALIDATE).

If you decide to mention contrib/xml2 in the docs, I would suggest
a patch for this module. The patch renames that function to
xml_check() and adds xml_array() (an item from the current TODO).
Or is it too late for 8.2?

Also, I would add a short introduction to XML terms (from the XML
standards) to this documentation section.

--
Best regards,
Nikolay

#9Nikolay Samokhvalov
samokhvalov@gmail.com
In reply to: Nikolay Samokhvalov (#8)
Re: New XML section for documentation

On 8/26/06, Nikolay Samokhvalov <samokhvalov@gmail.com> wrote:
> [...]
>
> If you decide to mention contrib/xml2 in the docs, I would suggest
> a patch for this module. The patch renames that function to
> xml_check() and adds xml_array() (an item from the current TODO).
> Or is it too late for 8.2?
>
> [...]

Typo :-( I mean "xpath_array()"

--
Best regards,
Nikolay

#10David Fetter
david@fetter.org
In reply to: Bruce Momjian (#3)
Re: [HACKERS] New XML section for documentation

On Fri, Aug 25, 2006 at 08:37:19PM -0400, Bruce Momjian wrote:

> David Fetter wrote:
> > On Fri, Aug 25, 2006 at 07:46:57PM -0400, Bruce Momjian wrote:
> > > Here is a new XML section for our SGML documentation. It
> > > explains the various XML capabilities, if we support them, and
> > > how to use them.
> > >
> > > Comments?
> >
> > This looks hauntingly similar to Peter's presentation at the
>
> I used the XML/SQL and validation part from his talk, but the rest
> was from earlier email discussions.

Reuse is good :)

> > conference. :) I'd add a http://wiscorp.com/SQLStandards.html to the
>
> This seems to be the best URL, but it seems too detailed:
>
> http://wiscorp.com/H2-2005-197-SC32N1293-WG3_Presentation_for_SC32_20050418.pdf

I'd just put the http://wiscorp.com/SQLStandards.html URL in, as it
contains several references in varying levels of detail.

> > reference section.
> >
> > Speaking of other parts of the SQL:2003 standard, how about one
> > section each that mentions them? There's
> >
> > Part 4: SQL/PSM (Persistent Stored Modules)
> > Part 9: SQL/MED (Management of External Data) (my favorite)
> > Part 10: SQL/OLB (Object Language Binding)
> > Part 11: SQL/Schemata
> > Part 13: SQL/JRT (Java Routines and Types)
>
> I don't know anything about them.

We claim SQL standard compliance, so since those are part of SQL:2003,
we probably ought to mention them. SQL/PSM is a programming language
that lives inside the database, and DB2 and MySQL have it. SQL/MED
lets people talk to other data stores. SQL/OLB appears to be derived
from equel, which we have as ecpg. SQL/Schemata contains the
information schema. SQL/JRT appears to bear some similarity to
PL/Java and PL/J.

Cheers,
D

#11Bruce Momjian
bruce@momjian.us
In reply to: Peter Eisentraut (#7)
Re: New XML section for documentation

Peter Eisentraut wrote:

> Bruce Momjian wrote:
> > XML Document Support
> > ====================
> > XML support is not one capability, but a variety of features
> > supported by a database.
>
> database system

Done.

> > Storage
> > -------
> > PostgreSQL stores XML documents as ordinary text documents.
>
> It is "possible" to do that, but this sounds like it's done
> automatically or implicitly. Maybe:
>
> "PostgreSQL does not have a specialized XML data type. The recommended
> way is to store XML documents as text."

Clarified.

> > Import/Export
> > -------------
> > Because XML documents are stored as normal text documents, they can
> > be imported/exported with little complexity.
>
> Import/export refers to exporting schema data with XML decorations. Of
> course you can export column data trivially, but that's not what this
> is about.

OK, section redone.

> > Validation
> > ----------
> > /contrib/xml2 has a function called xml_valid() that can be used in
> > a CHECK constraint to enforce that a field contains valid XML. It
> > does not support validation against a specific XML schema.
>
> Then this is not validation but only checking for well-formedness. The
> xml2 README says so, in fact.

I made it clear in the section that the XML syntax was being checked,
not validation against a schema. You want Check and Validation
sections?

> > Indexing
> > --------
>
> I think the expression index capability combined with contrib/xml2 is
> more relevant here than the full-text search capability.

Agreed, added.

> > Transforming
> > ------------
> > /contrib/xml2 supports XSL transformations.
>
> That's XSLT.

OK.

> > XML to SQL Mapping
> > -------------------
> > This involves converting XML data to and from relational structures.
> > PostgreSQL has no internal support for such mapping, and relies on
> > external tools to do such conversions.
>
> Are there instances of such tools?

Well, it seems EMS has a product that does it, and I assume other XML
tools have database interfaces. Also, psql can do it if you want to
convert XHTML to XML, so I mentioned that too.


#12Bruce Momjian
bruce@momjian.us
In reply to: Magnus Hagander (#6)
Re: New XML section for documentation

Magnus Hagander wrote:

> > Indexing
> > --------
> > Because XML documents are stored as text, the full-text indexing tool
> > /contrib/tsearch2 can be used to index XML documents. Of
> > course, the searches are text searches, with no XML
> > awareness, but tsearch2 can be used with other XML
> > capabilities to dramatically reduce the amount of data
> > processed at the XML level.
>
> You can also use a functional index and /contrib/xml2 to do limited
> XPath indexing. (Can't make it "subtree-aware" for example, unless you
> are willing to change your queries, but you can index specific xpath
> nodes).

Good point, added.


#13Bruce Momjian
bruce@momjian.us
In reply to: Nikolay Samokhvalov (#8)
Re: New XML section for documentation

Nikolay Samokhvalov wrote:

> On 8/26/06, Peter Eisentraut <peter_e@gmx.net> wrote:
> > Bruce Momjian wrote:
> > > Validation
> > > ----------
> > > /contrib/xml2 has a function called xml_valid() that can be used in
> > > a CHECK constraint to enforce that a field contains valid XML. It
> > > does not support validation against a specific XML schema.
> >
> > Then this is not validation but only checking for well-formedness. The
> > xml2 README says so, in fact.
>
> Exactly. contrib/xml2 mixes up the terms here. xml_valid() should be
> a different function, one that takes two arguments, an XML value and
> the corresponding XML schema, and validates the XML data. Actually, the
> latest version of the SQL/XML standard includes such a function
> (XMLVALIDATE).

I understand, but do we want to break backward compatibility to rename
it? We could create an xml_check, and keep xml_valid as a
single-argument function, and implement schema checks as a two-parameter
function, but that seems odd too.

> If you decide to mention contrib/xml2 in the docs, I would suggest
> a patch for this module. The patch renames that function to
> xml_check() and adds xml_array() (an item from the current TODO).
> Or is it too late for 8.2?

Hard to say. What does xml_array do? We are more lenient about
/contrib additions after feature freeze, but it is pretty late. Aren't
you working on updating the new XML syntax support in the backend? Are
you done with that patch?

> Also, I would add a short introduction to XML terms (from the XML
> standards) to this documentation section.

OK, but which terms? I only see XML and XSLT, and I documented those on
first mention in the newest version.


#14Bruce Momjian
bruce@momjian.us
In reply to: David Fetter (#10)
Re: [HACKERS] New XML section for documentation

David Fetter wrote:

> On Fri, Aug 25, 2006 at 08:37:19PM -0400, Bruce Momjian wrote:
> > David Fetter wrote:
> > > On Fri, Aug 25, 2006 at 07:46:57PM -0400, Bruce Momjian wrote:
> > > > Here is a new XML section for our SGML documentation. It
> > > > explains the various XML capabilities, if we support them, and
> > > > how to use them.
> > > >
> > > > Comments?
> > >
> > > This looks hauntingly similar to Peter's presentation at the
> >
> > I used the XML/SQL and validation part from his talk, but the rest
> > was from earlier email discussions.
>
> Reuse is good :)
>
> > > conference. :) I'd add a http://wiscorp.com/SQLStandards.html to the
> >
> > This seems to be the best URL, but it seems too detailed:
> >
> > http://wiscorp.com/H2-2005-197-SC32N1293-WG3_Presentation_for_SC32_20050418.pdf
>
> I'd just put the http://wiscorp.com/SQLStandards.html URL in, as it
> contains several references in varying levels of detail.

OK, added.

> > > reference section.
> > >
> > > Speaking of other parts of the SQL:2003 standard, how about one
> > > section each that mentions them? There's
> > >
> > > Part 4: SQL/PSM (Persistent Stored Modules)
> > > Part 9: SQL/MED (Management of External Data) (my favorite)
> > > Part 10: SQL/OLB (Object Language Binding)
> > > Part 11: SQL/Schemata
> > > Part 13: SQL/JRT (Java Routines and Types)
> >
> > I don't know anything about them.
>
> We claim SQL standard compliance, so since those are part of SQL:2003,
> we probably ought to mention them. SQL/PSM is a programming language
> that lives inside the database, and DB2 and MySQL have it. SQL/MED
> lets people talk to other data stores. SQL/OLB appears to be derived
> from equel, which we have as ecpg. SQL/Schemata contains the
> information schema. SQL/JRT appears to bear some similarity to
> PL/Java and PL/J.

I think the big question is whether we are ever going to implement
these. I think we need to decide that before I mention them.


#15Bruce Momjian
bruce@momjian.us
In reply to: Nikolay Samokhvalov (#8)
Re: New XML section for documentation

Updated XML documentation based on feedback. Comments?

---------------------------------------------------------------------------

XML Document Support
====================
XML (eXtensible Markup Language) support is not one capability, but a
variety of features supported by a database system. These capabilities
include storage, import/export, validation, indexing, efficiency of
modification, searching, transforming, and XML to SQL mapping.
PostgreSQL supports some but not all of these XML capabilities. Future
releases of PostgreSQL will continue to improve XML support.

Storage
-------

PostgreSQL does not have a specialized XML data type. Users should
store XML documents in ordinary TEXT fields. If you need the document
split apart into its component parts so each element is stored
separately, you must use a middleware solution to do that, but once
done, the data becomes relational and has to be processed accordingly.

Import/Export
-------------
There is no facility for mapping XML to relational tables. An external
tool must be used for this. One simple way to export XML is to use psql
in HTML mode ("\pset format html"), and convert the XHTML to XML using
an external tool.
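
As an illustration of the "external tool" step, here is a minimal Python
sketch that turns an HTML table, as psql might print it in HTML mode,
into row-oriented XML. The sample markup, the function name
html_table_to_xml, and the <rows>/<row> element names are assumptions
made for this example. It also assumes the psql output parses as
well-formed XML (no HTML-only entities such as &nbsp;) and that column
names are legal XML element names.

```python
import xml.etree.ElementTree as ET

# Assumed sample of psql HTML-mode output; real output may differ in
# attributes and whitespace between versions.
PSQL_HTML = """<table border="1">
  <tr>
    <th align="center">id</th>
    <th align="center">name</th>
  </tr>
  <tr valign="top">
    <td align="right">1</td>
    <td align="left">alpha</td>
  </tr>
  <tr valign="top">
    <td align="right">2</td>
    <td align="left">beta</td>
  </tr>
</table>"""


def html_table_to_xml(html: str, row_tag: str = "row") -> str:
    """Convert a single well-formed HTML table into <rows><row>...</row></rows>."""
    table = ET.fromstring(html)
    rows = table.findall("tr")
    # The first <tr> holds the column headers; they become element names.
    headers = [th.text for th in rows[0].findall("th")]
    result = ET.Element("rows")
    for tr in rows[1:]:
        row = ET.SubElement(result, row_tag)
        for name, td in zip(headers, tr.findall("td")):
            ET.SubElement(row, name).text = td.text
    return ET.tostring(result, encoding="unicode")


print(html_table_to_xml(PSQL_HTML))
```

A real converter would also have to deal with NULLs, escaping, and column
names that are not valid element names; the sketch ignores those cases.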

Validation
----------
/contrib/xml2 has a function called xml_valid() that can be used in
a CHECK constraint to enforce that a field contains valid XML. It
does not support validation against a specific XML schema. A
server-side language with XML capabilities could be used to do
schema-specific XML checks.
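
For readers unsure what such a check does, the following Python sketch
shows the kind of test involved, as it might be written in a server-side
language such as PL/Python. The function name is_well_formed is
hypothetical and is not part of /contrib/xml2. Like the check described
above, it only tests whether the text parses as XML; it does not
validate the document against a DTD or schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: str) -> bool:
    """Return True if the text parses as XML (well-formedness only)."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


# Matching tags parse; a mismatched closing tag does not.
print(is_well_formed("<book><title>PostgreSQL</title></book>"))
print(is_well_formed("<book><title>PostgreSQL</book>"))
```

Schema-specific validation would need an XML library with DTD or XML
Schema support, which the Python standard library does not provide.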

Indexing
--------
/contrib/xml2 functions can be used in expression indexes to index
specific XML fields. To index the full contents of XML documents, the
full-text indexing tool /contrib/tsearch2 can be used. Of course,
tsearch2 indexes have no XML awareness so additional /contrib/xml2
checks should be added to queries.
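
The idea behind indexing a specific XML field can be sketched outside the
database: evaluate one path expression per document ahead of time and
keep a lookup from the extracted value to the documents that contain it.
The Python below is only an analogy for an expression index, not how
PostgreSQL implements one; the sample documents and the name build_index
are made up for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical stored documents, keyed by row id.
documents = {
    1: "<book><author>Smith</author><title>A</title></book>",
    2: "<book><author>Jones</author><title>B</title></book>",
    3: "<book><author>Smith</author><title>C</title></book>",
}


def build_index(docs, path):
    """Map the text of the first node matching `path` to the ids of its documents."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        node = ET.fromstring(text).find(path)
        if node is not None:
            index[node.text].append(doc_id)
    return index


author_index = build_index(documents, "./author")
print(author_index["Smith"])
```

Only lookups on the indexed path benefit; a query on a different path
still has to parse every document, which matches the limitation noted
for expression indexes.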

Modification
------------
If an UPDATE does not modify an XML field, the XML data is shared
between the old and new rows. However, if the UPDATE modifies an XML
field, a full modified copy of the XML field must be created internally.

Searching
---------
XPath searches are implemented using /contrib/xml2. It processes XML
text documents and returns results based on the requested query.
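
To show what "returns results based on the requested query" means, here
is a small Python sketch of an XPath-style search over a text document.
It uses the limited XPath subset in Python's xml.etree.ElementTree, not
/contrib/xml2; the sample document and the function name xpath_strings
are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document stored as plain text.
doc = """<library>
  <book year="2005"><title>First</title></book>
  <book year="2006"><title>Second</title></book>
</library>"""


def xpath_strings(document, path):
    """Parse the text and return the text content of every node matching path."""
    root = ET.fromstring(document)
    return [el.text for el in root.findall(path)]


print(xpath_strings(doc, "./book/title"))
print(xpath_strings(doc, "./book[@year='2006']/title"))
```

Note that the whole document is parsed on every call, which is why
combining such searches with an index matters for larger tables.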

Transforming
------------
/contrib/xml2 supports XSLT (Extensible Stylesheet Language Transformations).

XML to SQL Mapping
-------------------
This involves converting XML data to and from relational structures.
PostgreSQL has no internal support for such mapping, and relies on
external tools to do such conversions.

Missing Features
----------------
o XQuery
o SQL/XML syntax (ISO/IEC 9075-14)
o XML data type optimized for XML storage

See also http://www.rpbourret.com/xml/XMLAndDatabases.htm for an
overview of XML use in databases, and http://wiscorp.com/SQLStandards.html
for the XML standards.


#16David Fetter
david@fetter.org
In reply to: Bruce Momjian (#14)
Re: [HACKERS] New XML section for documentation

On Sat, Aug 26, 2006 at 12:48:32PM -0400, Bruce Momjian wrote:

> David Fetter wrote:
> > On Fri, Aug 25, 2006 at 08:37:19PM -0400, Bruce Momjian wrote:
> > > > Speaking of other parts of the SQL:2003 standard, how about one
> > > > section each that mentions them? There's
> > > >
> > > > Part 4: SQL/PSM (Persistent Stored Modules)
> > > > Part 9: SQL/MED (Management of External Data) (my favorite)
> > > > Part 10: SQL/OLB (Object Language Binding)
> > > > Part 11: SQL/Schemata
> > > > Part 13: SQL/JRT (Java Routines and Types)
> > >
> > > I don't know anything about them.
> >
> > We claim SQL standard compliance, so since those are part of
> > SQL:2003, we probably ought to mention them. SQL/PSM is a
> > programming language that lives inside the database, and DB2 and
> > MySQL have it. SQL/MED lets people talk to other data stores.
> > SQL/OLB appears to be derived from equel, which we have as ecpg.
> > SQL/Schemata contains the information schema. SQL/JRT appears to
> > bear some similarity to PL/Java and PL/J.
>
> I think the big question is whether we are ever going to implement
> these? I think we need to decide that before I mention them.

The SQL/Schemata thing is already in. I think we should at least
mention which features that we already have are from what part of the
standard. As far as the rest of the standard goes, we might want to
mention whether we've even considered any of each piece in the TODO
list, and what sub-pieces, if any, are already included/scheduled/too
silly to contemplate :)

Cheers,
D

#17Bruce Momjian
bruce@momjian.us
In reply to: David Fetter (#16)
Re: [HACKERS] New XML section for documentation

David Fetter wrote:

> On Sat, Aug 26, 2006 at 12:48:32PM -0400, Bruce Momjian wrote:
> > David Fetter wrote:
> > > On Fri, Aug 25, 2006 at 08:37:19PM -0400, Bruce Momjian wrote:
> > > > > Speaking of other parts of the SQL:2003 standard, how about one
> > > > > section each that mentions them? There's
> > > > >
> > > > > Part 4: SQL/PSM (Persistent Stored Modules)
> > > > > Part 9: SQL/MED (Management of External Data) (my favorite)
> > > > > Part 10: SQL/OLB (Object Language Binding)
> > > > > Part 11: SQL/Schemata
> > > > > Part 13: SQL/JRT (Java Routines and Types)
> > > >
> > > > I don't know anything about them.
> > >
> > > We claim SQL standard compliance, so since those are part of
> > > SQL:2003, we probably ought to mention them. SQL/PSM is a
> > > programming language that lives inside the database, and DB2 and
> > > MySQL have it. SQL/MED lets people talk to other data stores.
> > > SQL/OLB appears to be derived from equel, which we have as ecpg.
> > > SQL/Schemata contains the information schema. SQL/JRT appears to
> > > bear some similarity to PL/Java and PL/J.
> >
> > I think the big question is whether we are ever going to implement
> > these? I think we need to decide that before I mention them.
>
> The SQL/Schemata thing is already in. I think we should at least

Uh, what is the SQL/Schemata? Are you sure it is in CVS?

> mention which features that we already have are from what part of the
> standard. As far as the rest of the standard goes, we might want to
> mention whether we've even considered any of each piece in the TODO
> list, and what sub-pieces, if any, are already included/scheduled/too
> silly to contemplate :)

Well, this seems like something that belongs in our chapter on how we
support the SQL standard.


#18David Fetter
david@fetter.org
In reply to: Bruce Momjian (#17)
Re: [HACKERS] New XML section for documentation

On Sat, Aug 26, 2006 at 01:16:06PM -0400, Bruce Momjian wrote:

> David Fetter wrote:
> > On Sat, Aug 26, 2006 at 12:48:32PM -0400, Bruce Momjian wrote:
> > > David Fetter wrote:
> > > > On Fri, Aug 25, 2006 at 08:37:19PM -0400, Bruce Momjian wrote:
> > > > > > Speaking of other parts of the SQL:2003 standard, how about one
> > > > > > section each that mentions them? There's
> > > > > >
> > > > > > Part 4: SQL/PSM (Persistent Stored Modules)
> > > > > > Part 9: SQL/MED (Management of External Data) (my favorite)
> > > > > > Part 10: SQL/OLB (Object Language Binding)
> > > > > > Part 11: SQL/Schemata
> > > > > > Part 13: SQL/JRT (Java Routines and Types)
> > > > >
> > > > > I don't know anything about them.
> > > >
> > > > We claim SQL standard compliance, so since those are part of
> > > > SQL:2003, we probably ought to mention them. SQL/PSM is a
> > > > programming language that lives inside the database, and DB2 and
> > > > MySQL have it. SQL/MED lets people talk to other data stores.
> > > > SQL/OLB appears to be derived from equel, which we have as ecpg.
> > > > SQL/Schemata contains the information schema. SQL/JRT appears to
> > > > bear some similarity to PL/Java and PL/J.
> > >
> > > I think the big question is whether we are ever going to implement
> > > these? I think we need to decide that before I mention them.
> >
> > The SQL/Schemata thing is already in. I think we should at least
>
> Uh, what is the SQL/Schemata? Are you sure it is in CVS?

It contains the information schema, among other things. We've had the
information schema for a while. :)

> > mention which features that we already have are from what part of
> > the standard. As far as the rest of the standard goes, we might
> > want to mention whether we've even considered any of each piece in
> > the TODO list, and what sub-pieces, if any, are already
> > included/scheduled/too silly to contemplate :)
>
> Well, this seems like something that belongs in our chapter on how
> we support the SQL standard.

I'm not too fussy about where it first goes in. Just *that* it goes
in somewhere. I'll be happy to start the needed patches. :)

Cheers,
D

#19Bruce Momjian
bruce@momjian.us
In reply to: David Fetter (#18)
Re: [HACKERS] New XML section for documentation

David Fetter wrote:

> > > mention which features that we already have are from what part of
> > > the standard. As far as the rest of the standard goes, we might
> > > want to mention whether we've even considered any of each piece in
> > > the TODO list, and what sub-pieces, if any, are already
> > > included/scheduled/too silly to contemplate :)
> >
> > Well, this seems like something that belongs in our chapter on how
> > we support the SQL standard.
>
> I'm not too fussy about where it first goes in. Just *that* it goes
> in somewhere. I'll be happy to start the needed patches. :)

OK, I think the SGML docs are the proper place.


#20Peter Eisentraut
peter_e@gmx.net
In reply to: David Fetter (#10)
Re: [HACKERS] New XML section for documentation

David Fetter wrote:

> We claim SQL standard compliance,

No, we don't. And SQL conformance doesn't require you to implement all
parts anyway.

> so since those are part of
> SQL:2003, we probably ought to mention them. SQL/PSM is a
> programming language that lives inside the database, and DB2 and
> MySQL have it. SQL/MED lets people talk to other data stores.
> SQL/OLB appears to be derived from equel, which we have as ecpg.
> SQL/Schemata contains the information schema. SQL/JRT appears to
> bear some similarity to PL/Java and PL/J.

It's pretty useless to talk about stuff that we don't have yet. The
point of the XML section is that we have a number of things, and users
are having trouble (understandably) fitting them together.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#21Peter Eisentraut
peter_e@gmx.net
In reply to: Bruce Momjian (#11)
Re: New XML section for documentation

Bruce Momjian wrote:

> I made it clear in the section that the XML syntax was being checked,
> not validation against a schema. You want Check and Validation
> sections?

"Valid" and "well-formed" have very specific distinct meanings in XML.
(Note that "check" doesn't have any meaning there.) We will eventually
want a method to verify both the validity and the well-formedness.

I think that having a function called xml_valid check for
well-formedness is an outright bug and needs to be fixed.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#22Peter Eisentraut
peter_e@gmx.net
In reply to: David Fetter (#16)
Re: [HACKERS] New XML section for documentation

David Fetter wrote:

> The SQL/Schemata thing is already in. I think we should at least
> mention which features that we already have are from what part of the
> standard.

We do. Read the documentation.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#23Joshua D. Drake
jd@commandprompt.com
In reply to: Peter Eisentraut (#20)
Re: [HACKERS] New XML section for documentation

Peter Eisentraut wrote:

> David Fetter wrote:
> > We claim SQL standard compliance,
>
> No, we don't. And SQL conformance doesn't require you to implement all
> parts anyway.
>
> > so since those are part of
> > SQL:2003, we probably ought to mention them. SQL/PSM is a
> > programming language that lives inside the database, and DB2 and
> > MySQL have it. SQL/MED lets people talk to other data stores.
> > SQL/OLB appears to be derived from equel, which we have as ecpg.
> > SQL/Schemata contains the information schema. SQL/JRT appears to
> > bear some similarity to PL/Java and PL/J.
>
> It's pretty useless to talk about stuff that we don't have yet. The
> point of the XML section is that we have a number of things, and users
> are having trouble (understandably) fitting them together.

As separate sections I agree with Peter. However it would be a good idea
to have a section that talks about Potential and/or Upcoming technologies.

All of the above could be covered under that.

Joshua D. Drake

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

#24David Fetter
david@fetter.org
In reply to: Peter Eisentraut (#20)
Re: [HACKERS] New XML section for documentation

On Sat, Aug 26, 2006 at 08:38:43PM +0200, Peter Eisentraut wrote:

David Fetter wrote:

We claim SQL standard compliance,

No, we don't. And SQL conformance doesn't require you to implement
all parts anyway.

Right. It'd be nice to be able to tell what level of conformance we
have to which parts of the standard.

so since those are part of SQL:2003, we probably ought to mention
them. SQL/PSM is a programming language that lives inside the
database, and DB2 and MySQL have it. SQL/MED lets people talk to
other data stores. SQL/OLB appears to be derived from equel,
which we have as ecpg. SQL/Schemata contains the information
schema. SQL/JRT appears to bear some similarity to PL/Java and
PL/J.

It's pretty useless to talk about stuff that we don't have yet.

I think it's useful to mention what's arriving, what's being worked
on, and what's not even being contemplated in the long term.

The point of the XML section is that we have a number of things, and
users are having trouble (understandably) fitting them together.

Similar troubles apply--on a smaller scale--to the information schema,
SQL/OLB, SQL/JRT, etc.

Cheers,
D
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778 AIM: dfetter666
Skype: davidfetter

Remember to vote!

#25Joshua D. Drake
jd@commandprompt.com
In reply to: David Fetter (#16)
Re: [HACKERS] New XML section for documentation

bear some similarity to PL/Java and PL/J.

I think the big question is whether we are ever going to implement
these? I think we need to decide that before I mention them.

The SQL/Schemata thing is already in. I think we should at least
mention which features that we already have are from what part of the
standard.

I also see PSM and OLB as a target.

Joshua D. Drake

As far as the rest of the standard goes, we might want to
mention whether we've even considered any of each piece in the TODO
list, and what sub-pieces, if any, are already included/scheduled/too
silly to contemplate :)

Cheers,
D


#26Peter Eisentraut
peter_e@gmx.net
In reply to: David Fetter (#24)
Re: [HACKERS] New XML section for documentation

David Fetter wrote:

I think it's useful to mention what's arriving, what's being worked
on, and what's not even being contemplated in the long term.

We don't even have a roadmap of any kind, so the last thing we can do is
put claims of that sort in the documentation.

Similar troubles apply--on a smaller scale--to the information
schema, SQL/OLB, SQL/JRT, etc.

The information schema is quite extensively documented. If you have
something to add on OLB and JRT, then let's hear your suggestions.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#27Nikolay Samokhvalov
samokhvalov@gmail.com
In reply to: Peter Eisentraut (#21)
Re: [HACKERS] New XML section for documentation

On 8/26/06, Peter Eisentraut <peter_e@gmx.net> wrote:

Bruce Momjian wrote:

I made it clear in the section that the XML syntax was being checked,
not validation against a schema. You want Check and Validation
sections?

"Valid" and "well-formed" have very specific distinct meanings in XML.
(Note that "check" doesn't have any meaning there.) We will eventually
want a method to verify both the validity and the well-formedness.

I think that a function called xml_valid that checks for well-formedness is
an outright bug and needs to be fixed.

That's exactly what I'm talking about. xml_valid() is the wrong name and
it may confuse people.
I want to add that, with an XML section in the documentation, this bug
becomes more significant.

Bruce suggested using overloading to keep backward compatibility: in other
words, a 1-arg function for checking well-formedness and a 2-arg
function for validation. That's bad too:
- two _different_ actions for one function => another source of confusion
- I (as a user) would think that the 1-arg function is designed for
validation in cases where the XML document contains a reference to a
DTD (as an example).

I am in favor of fixing it by renaming, breaking backward compatibility.
Later it will be more painful.

BTW, what is the deadline for changes (additions) to the docs? I would add
general XML terms (such as what XML is, what a well-formed document is,
what validation is; a short overview of XML standards and SQL/XML as a
part of SQL:200n, etc.). Maybe also something about the contrib/xml2
installation process, since XSLT support actually requires an additional
library. Moreover, if the SQL/XML patch is accepted, it will need a few
words too.
--
Best regards,
Nikolay

#28Bruce Momjian
bruce@momjian.us
In reply to: Nikolay Samokhvalov (#27)
Re: New XML section for documentation

I have added a modified version of this to the SGML documentation,
under data types.

---------------------------------------------------------------------------

bruce wrote:

Here is a new XML section for our SGML documentation. It explains the
various XML capabilities, if we support them, and how to use them.

Comments?

---------------------------------------------------------------------------

XML Document Support
====================
XML support is not one capability, but a variety of features supported
by a database. These capabilities include storage, import/export,
validation, indexing, efficiency of modification, searching,
transforming, and XML to SQL mapping. PostgreSQL supports some but
not all of these XML capabilities. Future releases of PostgreSQL will
continue to improve XML support.

Storage
-------
PostgreSQL stores XML documents as ordinary text documents. It does not
split apart XML documents into their component parts and store each
element separately. You can use middle-ware solutions to do that, but
once done, the data becomes relational and has to be processed
accordingly.

Import/Export
-------------
Because XML documents are stored as normal text documents, they can be
imported/exported with little complexity. A simple TEXT field can hold
up to 1 gigabyte of text, and large objects are available for larger
documents.

Validation
----------
/contrib/xml2 has a function called xml_valid() that can be used in
a CHECK constraint to enforce that a field contains valid XML. It
does not support validation against a specific XML schema. A
server-side language with XML capabilities could be used to do
schema-specific XML checks.

Indexing
--------
Because XML documents are stored as text, the full-text indexing tool
/contrib/tsearch2 can be used to index XML documents. Of course, the
searches are text searches, with no XML awareness, but tsearch2 can be
used with other XML capabilities to dramatically reduce the amount of
data processed at the XML level.

Modification
------------
If an UPDATE does not modify an XML field, the XML data is shared
between the old and new rows. However, if the UPDATE modifies an XML
field, a full modified copy of the XML field must be created internally.

Searching
---------
XPath searches are implemented using /contrib/xml2. It processes XML
text documents and returns results based on the requested query.

Transforming
------------
/contrib/xml2 supports XSL transformations.

XML to SQL Mapping
-------------------
This involves converting XML data to and from relational structures.
PostgreSQL has no internal support for such mapping, and relies on
external tools to do such conversions.

Missing Features
----------------
o XQuery
o SQL/XML syntax (ISO/IEC 9075-14)
o XML data type optimized for XML storage

See also http://www.rpbourret.com/xml/XMLAndDatabases.htm

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nikolay Samokhvalov (#27)
Re: [HACKERS] New XML section for documentation

"Nikolay Samokhvalov" <samokhvalov@gmail.com> writes:

On 8/26/06, Peter Eisentraut <peter_e@gmx.net> wrote:

"Valid" and "well-formed" have very specific distinct meanings in XML.
(Note that "check" doesn't have any meaning there.) We will eventually
want a method to verify both the validity and the well-formedness.

I think that a function called xml_valid that checks for well-formedness is
an outright bug and needs to be fixed.

That's exactly what I'm talking about. xml_valid() is the wrong name and
it may confuse people.

Bruce suggested using overloading to keep backward compatibility: in other
words, a 1-arg function for checking well-formedness and a 2-arg
function for validation. That's bad too:

ISTM the right answer is to add xml_is_well_formed() in this release
and have xml_valid as an alias for it, with documentation explaining
that xml_valid is deprecated and will be removed in the next release.
Then we can add a proper validity-checking function too.

Nikolay submitted a patch later
http://archives.postgresql.org/pgsql-patches/2006-09/msg00123.php
that does part of this and can easily be adapted to add the alias.
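
The alias arrangement described above can be sketched in a few lines of
Python. This is only an illustration of the pattern, not the contrib/xml2
code: the real functions are SQL-callable, and the DeprecationWarning here
is an assumption added for the sketch, not something the proposal specifies.

```python
import warnings
import xml.etree.ElementTree as ET


def xml_is_well_formed(doc: str) -> bool:
    """Accurately named check: does doc parse as XML at all?"""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


def xml_valid(doc: str) -> bool:
    """Deprecated alias kept for backward compatibility; same behavior."""
    # The warning is an assumption of this sketch, not part of the proposal.
    warnings.warn(
        "xml_valid is deprecated; use xml_is_well_formed",
        DeprecationWarning,
        stacklevel=2,
    )
    return xml_is_well_formed(doc)
```

Existing callers of xml_valid keep working for one release, and the name
is freed up afterwards for a real validity check.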

His patch also adds an xpath_array() function --- what do people
think about that? It's well past feature freeze ... now we've always
been laxer about contrib than the core code, but still I'm inclined
to say that that function should wait for 8.3.

Comments? It's time to get a move on with resolving this.

regards, tom lane

#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#29)
Re: [HACKERS] New XML section for documentation

I wrote:

ISTM the right answer is to add xml_is_well_formed() in this release
and have xml_valid as an alias for it, with documentation explaining
that xml_valid is deprecated and will be removed in the next release.

Not hearing any objection, I've done this.

His patch also adds an xpath_array() function --- what do people
think about that? It's well past feature freeze ... now we've always
been laxer about contrib than the core code, but still I'm inclined
to say that that function should wait for 8.3.

I didn't add xpath_array(), but am still open to doing it if there
is any consensus in favor of it.

regards, tom lane