need xmllint on borka

Started by Peter Eisentrautover 11 years ago20 messages
#1Peter Eisentraut
peter_e@gmx.net
1 attachment(s)

I have been working on making the DocBook XML output valid. The first
part was bb4eefe7bf518e42c73797ea37b033a5d8a8e70a, I now have the rest
ready, but I'll spare you the mostly mechanical 200kB patch for now. In
addition, I'd like to add the attached patch with an xmllint call to
make sure things stay valid.

But we don't have xmllint installed on borka, where we build the
releases. Could someone please install it?

Attachments:

xmllint.patchtext/x-patch; charset=UTF-8; name=xmllint.patchDownload
--- a/doc/src/sgml/Makefile
+++ b/doc/src/sgml/Makefile
@@ -76,6 +76,7 @@ override SPFLAGS += -wall -wno-unused-param -wno-empty -wfully-tagged
 man distprep-man: man-stamp
 
 man-stamp: stylesheet-man.xsl postgres.xml
+	$(XMLLINT) --noout --valid postgres.xml
 	$(XSLTPROC) $(XSLTPROCFLAGS) $(XSLTPROC_MAN_FLAGS) $^
 	touch $@
 
@@ -252,11 +253,13 @@ endif
 xslthtml: xslthtml-stamp
 
 xslthtml-stamp: stylesheet.xsl postgres.xml
+	$(XMLLINT) --noout --valid postgres.xml
 	$(XSLTPROC) $(XSLTPROCFLAGS) $(XSLTPROC_HTML_FLAGS) $^
 	cp $(srcdir)/stylesheet.css html/
 	touch $@
 
 htmlhelp: stylesheet-hh.xsl postgres.xml
+	$(XMLLINT) --noout --valid postgres.xml
 	$(XSLTPROC) $(XSLTPROCFLAGS) $^
 
 %-A4.fo.tmp: stylesheet-fo.xsl %.xml
@@ -279,6 +282,7 @@ XMLLINT = xmllint
 
 epub: postgres.epub
 postgres.epub: postgres.xml
+	$(XMLLINT) --noout --valid $<
 	$(DBTOEPUB) $<
 
 
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#1)
Re: need xmllint on borka

Peter Eisentraut <peter_e@gmx.net> writes:

I have been working on making the DocBook XML output valid. The first
part was bb4eefe7bf518e42c73797ea37b033a5d8a8e70a, I now have the rest
ready, but I'll spare you the mostly mechanical 200kB patch for now. In
addition, I'd like to add the attached patch with an xmllint call to
make sure things stay valid.

But we don't have xmllint installed on borka, where we build the
releases. Could someone please install it?

-1. Doesn't this break "make man" for *any* hacker who doesn't have
xmllint installed?

Please change the patch so that it runs xmllint if available, rather
than turning it into a de facto requirement for anybody who works on
documentation. Or if you think you can pass a vote to require it,
I'd suggest that the patch had better include documentation additions
listing this as a required build tool.

(The subtext here is that borka is absolutely not an acceptable place
to encounter documentation build failures. By the time we're at that
stage of the release cycle, I don't really care what xmllint might
have to say; there isn't going to be time to make it happy.)

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#2)
Re: need xmllint on borka

Tom Lane wrote:

(The subtext here is that borka is absolutely not an acceptable place
to encounter documentation build failures. By the time we're at that
stage of the release cycle, I don't really care what xmllint might
have to say; there isn't going to be time to make it happy.)

Borka is what runs the guaibasaurus animal, so failures would show up in
buildfarm ...

If the xmllint check could be made optional, I guess we could have the
failures show up in buildfarm but avoid having them cause a problem for
"make dist" when releases are created.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Peter Eisentraut (#1)
Re: need xmllint on borka

Peter Eisentraut wrote:

I have been working on making the DocBook XML output valid. The first
part was bb4eefe7bf518e42c73797ea37b033a5d8a8e70a, I now have the rest
ready, but I'll spare you the mostly mechanical 200kB patch for now. In
addition, I'd like to add the attached patch with an xmllint call to
make sure things stay valid.

But we don't have xmllint installed on borka, where we build the
releases. Could someone please install it?

xmllint installed on borka.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#2)
Re: need xmllint on borka

On Thu, 2014-05-01 at 22:51 -0400, Tom Lane wrote:

But we don't have xmllint installed on borka, where we build the
releases. Could someone please install it?

-1. Doesn't this break "make man" for *any* hacker who doesn't have
xmllint installed?

Please change the patch so that it runs xmllint if available, rather
than turning it into a de facto requirement for anybody who works on
documentation.

The intention is that we enforce that the documentation is correctly
formatted. Enforcing that only when the required tool is installed is
the same as not enforcing it and annoying those who happen to have the
tool installed.

Or if you think you can pass a vote to require it,
I'd suggest that the patch had better include documentation additions
listing this as a required build tool.

Agreed. I have committed the SGML changes that make things valid now,
but I will postpone the xmllint addition until the 9.5 branch, complete
with more documentation.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#5)
Re: need xmllint on borka

Peter Eisentraut <peter_e@gmx.net> writes:

On Thu, 2014-05-01 at 22:51 -0400, Tom Lane wrote:

-1. Doesn't this break "make man" for *any* hacker who doesn't have
xmllint installed?

The intention is that we enforce that the documentation is correctly
formatted. Enforcing that only when the required tool is installed is
the same as not enforcing it and annoying those who happen to have the
tool installed.

Okay, but if that's the intent I suggest that the check needs to happen
somewhere more prominent than in nondefault build targets. Personally,
I run "make man" about once a release cycle, if that often; and I'm not
sure I've ever built any of the other targets modified in this patch.
If you want to enforce correctness then I think you should put the xmllint
call into the "make all" path, which would render all of the targets
touched here rather redundant.

Also, in the vein of "is this a full-scale build requirement or not",
why is the pathname of xmllint just hard-coded into the makefile and
not something that's checked for by configure?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#5)
1 attachment(s)
run xmllint during build (was Re: need xmllint on borka)

On 5/6/14 10:20 PM, Peter Eisentraut wrote:

Agreed. I have committed the SGML changes that make things valid now,
but I will postpone the xmllint addition until the 9.5 branch, complete
with more documentation.

Per the above announcement, here is an updated patch, also with more
documentation and explanations.

It would especially be valuable if someone with a different-than-mine OS
would verify whether they can install xmllint according to the
documentation, or update the documentation if not.

Attachments:

0001-doc-Check-DocBook-XML-validity-during-the-build.patchtext/x-patch; name=0001-doc-Check-DocBook-XML-validity-during-the-build.patchDownload
>From 27f1fabb4afb32ec44373100ab438277ebdac806 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter_e@gmx.net>
Date: Thu, 14 Aug 2014 21:48:44 -0400
Subject: [PATCH] doc: Check DocBook XML validity during the build

Building the documentation with XSLT does not check the DTD, like a
DSSSL build would.  One can often get away with having invalid XML, but
the stylesheets might then create incorrect output, as they are not
designed to handle that.  Therefore, check the validity of the XML
against the DTD, using xmllint, during the build.

Add xmllint detection to configure, and add some documentation.

xmllint comes with libxml2, which is already in use, but it might be in
a separate package, such as libxml2-utils on Debian.
---
 configure                  | 43 +++++++++++++++++++++++++++++++++++++++++++
 configure.in               |  1 +
 doc/src/sgml/Makefile      |  9 ++++++++-
 doc/src/sgml/docguide.sgml | 19 +++++++++++++++++--
 src/Makefile.global.in     |  1 +
 5 files changed, 70 insertions(+), 3 deletions(-)

diff --git a/configure b/configure
index 0f435b5..57c2455 100755
--- a/configure
+++ b/configure
@@ -630,6 +630,7 @@ vpath_build
 PROVE
 OSX
 XSLTPROC
+XMLLINT
 COLLATEINDEX
 DOCBOOKSTYLE
 have_docbook
@@ -14415,6 +14416,48 @@ fi
 
 
 fi
+for ac_prog in xmllint
+do
+  # Extract the first word of "$ac_prog", so it can be a program name with args.
+set dummy $ac_prog; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_prog_XMLLINT+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  if test -n "$XMLLINT"; then
+  ac_cv_prog_XMLLINT="$XMLLINT" # Let the user override the test.
+else
+as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+    for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+    ac_cv_prog_XMLLINT="$ac_prog"
+    $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
+    break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+fi
+fi
+XMLLINT=$ac_cv_prog_XMLLINT
+if test -n "$XMLLINT"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $XMLLINT" >&5
+$as_echo "$XMLLINT" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+
+  test -n "$XMLLINT" && break
+done
+
 for ac_prog in xsltproc
 do
   # Extract the first word of "$ac_prog", so it can be a program name with args.
diff --git a/configure.in b/configure.in
index f8a4507..0d963c6 100644
--- a/configure.in
+++ b/configure.in
@@ -1864,6 +1864,7 @@ PGAC_PROG_JADE
 PGAC_CHECK_DOCBOOK(4.2)
 PGAC_PATH_DOCBOOK_STYLESHEETS
 PGAC_PATH_COLLATEINDEX
+AC_CHECK_PROGS(XMLLINT, xmllint)
 AC_CHECK_PROGS(XSLTPROC, xsltproc)
 AC_CHECK_PROGS(OSX, [osx sgml2xml sx])
 
diff --git a/doc/src/sgml/Makefile b/doc/src/sgml/Makefile
index 271c700..d2517f2 100644
--- a/doc/src/sgml/Makefile
+++ b/doc/src/sgml/Makefile
@@ -40,6 +40,10 @@ ifndef OSX
 OSX = osx
 endif
 
+ifndef XMLLINT
+XMLLINT = xmllint
+endif
+
 ifndef XSLTPROC
 XSLTPROC = xsltproc
 endif
@@ -76,6 +80,7 @@ override SPFLAGS += -wall -wno-unused-param -wno-empty -wfully-tagged
 man distprep-man: man-stamp
 
 man-stamp: stylesheet-man.xsl postgres.xml
+	$(XMLLINT) --noout --valid postgres.xml
 	$(XSLTPROC) $(XSLTPROCFLAGS) $(XSLTPROC_MAN_FLAGS) $^
 	touch $@
 
@@ -252,11 +257,13 @@ endif
 xslthtml: xslthtml-stamp
 
 xslthtml-stamp: stylesheet.xsl postgres.xml
+	$(XMLLINT) --noout --valid postgres.xml
 	$(XSLTPROC) $(XSLTPROCFLAGS) $(XSLTPROC_HTML_FLAGS) $^
 	cp $(srcdir)/stylesheet.css html/
 	touch $@
 
 htmlhelp: stylesheet-hh.xsl postgres.xml
+	$(XMLLINT) --noout --valid postgres.xml
 	$(XSLTPROC) $(XSLTPROCFLAGS) $^
 
 %-A4.fo.tmp: stylesheet-fo.xsl %.xml
@@ -266,7 +273,6 @@ htmlhelp: stylesheet-hh.xsl postgres.xml
 	$(XSLTPROC) $(XSLTPROCFLAGS) --stringparam paper.type USletter -o $@ $^
 
 FOP = fop
-XMLLINT = xmllint
 
 # reformat FO output so that locations of errors are easier to find
 %.fo: %.fo.tmp
@@ -279,6 +285,7 @@ XMLLINT = xmllint
 
 epub: postgres.epub
 postgres.epub: postgres.xml
+	$(XMLLINT) --noout --valid $<
 	$(DBTOEPUB) $<
 
 
diff --git a/doc/src/sgml/docguide.sgml b/doc/src/sgml/docguide.sgml
index 816375f..2fe325e 100644
--- a/doc/src/sgml/docguide.sgml
+++ b/doc/src/sgml/docguide.sgml
@@ -149,6 +149,20 @@ <title>Tool Sets</title>
     </varlistentry>
 
     <varlistentry>
+     <term><ulink url="http://xmlsoft.org/">Libxml2</ulink> for <command>xmllint</command></term>
+     <listitem>
+      <para>
+       This library and the <command>xmllint</command> tool it contains are
+       used for processing XML.  Many developers will already
+       have <application>Libxml2</application> installed, because it is also
+       used when building the PostgreSQL code.  Note, however,
+       that <command>xmllint</command> might need to be installed from a
+       separate subpackage.
+      </para>
+     </listitem>
+     </varlistentry>
+
+    <varlistentry>
      <term><ulink url="http://xmlsoft.org/XSLT/">Libxslt</ulink> for <command>xsltproc</command></term>
      <listitem>
       <para>
@@ -201,7 +215,8 @@ <title><productname>Linux</productname> <acronym>RPM</acronym> Installation</tit
     installing, or the following packages:
     <filename>sgml-common</filename>, <filename>docbook</filename>,
     <filename>stylesheets</filename>, <filename>openjade</filename>
-    (or <filename>jade</filename>).  You may also need <filename>sgml-tools</filename>
+    (or <filename>jade</filename>).  You may also need <filename>sgml-tools</filename>,
+    either <filename>libxml2</filename> or <filename>libxml2-utils</filename>,
     and either <filename>xsltproc</filename> or <filename>libxslt</>.  If your
     distributor does not provide these then you should be able to make
     use of the packages from some other, reasonably compatible vendor.
@@ -273,7 +288,7 @@ <title>Debian Packages</title>
     available for <productname>Debian GNU/Linux</productname>.
     To install, simply use:
 <programlisting>
-apt-get install docbook docbook-dsssl docbook-xsl openjade1.3 opensp xsltproc
+apt-get install docbook docbook-dsssl docbook-xsl libxml2-utils openjade1.3 opensp xsltproc
 </programlisting>
    </para>
   </sect2>
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index ecd4015..b3b5df5 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -342,6 +342,7 @@ DOCBOOKSTYLE	= @DOCBOOKSTYLE@
 JADE			= @JADE@
 NSGMLS			= @NSGMLS@
 OSX				= @OSX@
+XMLLINT			= @XMLLINT@
 XSLTPROC		= @XSLTPROC@
 
 # Code coverage
-- 
2.0.4

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#7)
Re: run xmllint during build (was Re: need xmllint on borka)

Peter Eisentraut <peter_e@gmx.net> writes:

It would especially be valuable if someone with a different-than-mine OS
would verify whether they can install xmllint according to the
documentation, or update the documentation if not.

FWIW, xmllint appears to be part of the base libxml2 package on Red Hat
(checked on RHEL6 and Fedora rawhide).

I'm not sure I'd phrase it like this:

+       This library and the <command>xmllint</command> tool it contains are
+       used for processing XML.  Many developers will already

The library surely does not contain the tool; more like vice versa.
Perhaps "provides" would be a better verb.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Peter Eisentraut (#7)
Re: run xmllint during build (was Re: need xmllint on borka)

Hello Peter,

Agreed. I have committed the SGML changes that make things valid now,
but I will postpone the xmllint addition until the 9.5 branch, complete
with more documentation.

Per the above announcement, here is an updated patch, also with more
documentation and explanations.

It would especially be valuable if someone with a different-than-mine OS
would verify whether they can install xmllint according to the
documentation, or update the documentation if not.

Tested on Ubuntu trusty.

Patched applies with some minor warning on current head, and works, that
is "xmllint" is run for some of the targets (epub, man).

However, it seems that it is not run for target "html", nor for pdf/ps
targets. I'm wondering whether it is an oversight or if there is a reason
for that. Maybe the intention is that an explicit "htmlhelp" is expected
to do the checking, but then why force it for man and epub? It seems to me
that it is not consistent.

Also, a general comment, which is independent of this patch: I found the
documentation build especially not resilient, with a lack of clear error
messages when something is broken. Basically, if configure does not found
something for the doc (openjade, osx, xmllint, ...) it does not complain.
That is fine with me, people would not always want to build the doc anyway
as it is available online. However, the Makefile in doc/src/sgml overrides
the not found commands (ifndef JADE JADE=..., etc), and proceed to
unhelpful and unclear errors later on. ISTM that it may be more helful to
do:

ifndef JADE
#error "no jade found on your system, cannot generate the documention"
endif

Rather than overriding with "JADE=jade" if jade was not there when
configuring.

This xmllint addition is done in the same spirit. I would suggest at the
minimum to check that xmllint is available before running it, or to ignore
the call if not available, something like:

type -p $(XMLLINT) || { echo error no $(XMLLINT)... ; exit 1 ; }
$(XMLLINT) ...

or

-type -p $(XMLLINT) && $(XMLLINT) ...

And I would prefer that a straightforward error is generated when
commands or styles are not found, in general.

Also, a detail, the Makefile style is not homogeneous:

ifndef XSLTPROC
XSLTPROC = xsltproc
endif

DBTOEPUB ?= dbtoepub

Why not XSLTPROC ?= xsltproc and the like?

--
Fabien.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Fabien COELHO (#9)
Re: run xmllint during build (was Re: need xmllint on borka)

Also, a general comment, which is independent of this patch: I found the
documentation build especially not resilient, with a lack of clear error
messages when something is broken. Basically, if configure does not found
something for the doc (openjade, osx, xmllint, ...) it does not complain.
That is fine with me, people would not always want to build the doc anyway as
it is available online. However, the Makefile in doc/src/sgml overrides the
not found commands (ifndef JADE JADE=..., etc), and proceed to unhelpful and
unclear errors later on. ISTM that it may be more helful to do:

To be more constructive:

Maybe all commands could have a check counterpart added to the
dependencies, so as to fail gracefully only if needed, something like:

.check_XXX:
if type -p $(XXX) > /dev/null ; then touch $@ ; else \
echo "command $(XXX) not found"; exit 1 ; \
fi

foo: .check_XXX
$(XXX) ...

I'm not sure how to check for the docbook style availability though.

--
Fabien.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Peter Eisentraut
peter_e@gmx.net
In reply to: Fabien COELHO (#9)
Re: run xmllint during build (was Re: need xmllint on borka)

On 8/21/14 5:12 AM, Fabien COELHO wrote:

However, it seems that it is not run for target "html", nor for pdf/ps
targets. I'm wondering whether it is an oversight or if there is a
reason for that. Maybe the intention is that an explicit "htmlhelp" is
expected to do the checking, but then why force it for man and epub? It
seems to me that it is not consistent.

It is only run when the build is via XML/XSLT, not via SGML/DSSSL
(because the SGML/DSSSL build already checks the validity anyway).

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Peter Eisentraut
peter_e@gmx.net
In reply to: Fabien COELHO (#9)
Re: run xmllint during build (was Re: need xmllint on borka)

On 8/21/14 5:12 AM, Fabien COELHO wrote:

Also, a general comment, which is independent of this patch: I found the
documentation build especially not resilient, with a lack of clear error
messages when something is broken. Basically, if configure does not
found something for the doc (openjade, osx, xmllint, ...) it does not
complain. That is fine with me, people would not always want to build
the doc anyway as it is available online. However, the Makefile in
doc/src/sgml overrides the not found commands (ifndef JADE JADE=...,
etc), and proceed to unhelpful and unclear errors later on. ISTM that it
may be more helful to do:

ifndef JADE
#error "no jade found on your system, cannot generate the documention"
endif

We could use $(missing) for that, which is already used for bison, flex,
and perl.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#11)
Re: run xmllint during build (was Re: need xmllint on borka)

Peter Eisentraut <peter_e@gmx.net> writes:

On 8/21/14 5:12 AM, Fabien COELHO wrote:

However, it seems that it is not run for target "html", nor for pdf/ps
targets. I'm wondering whether it is an oversight or if there is a
reason for that. Maybe the intention is that an explicit "htmlhelp" is
expected to do the checking, but then why force it for man and epub? It
seems to me that it is not consistent.

It is only run when the build is via XML/XSLT, not via SGML/DSSSL
(because the SGML/DSSSL build already checks the validity anyway).

I'm confused. I thought the point of this change was mostly that the
SGML toolchain is less strict than the XML toolchain, and you wanted
to have the more-strict checks applied whenever possible.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#12)
Re: run xmllint during build (was Re: need xmllint on borka)

Peter Eisentraut <peter_e@gmx.net> writes:

We could use $(missing) for that, which is already used for bison, flex,
and perl.

+1 ... I'm surprised it's not like that already. Fabien's quite right to
complain that the Makefile has no business inserting defaults for things
configure couldn't find.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#13)
Re: run xmllint during build (was Re: need xmllint on borka)

On 8/21/14 4:00 PM, Tom Lane wrote:

It is only run when the build is via XML/XSLT, not via SGML/DSSSL
(because the SGML/DSSSL build already checks the validity anyway).

I'm confused. I thought the point of this change was mostly that the
SGML toolchain is less strict than the XML toolchain, and you wanted
to have the more-strict checks applied whenever possible.

The SGML tool chain is less strict about what it considers valid, but
the XML toolchain doesn't check at all unless we run xmllint, it just
produces garbage when the input is invalid.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Peter Eisentraut (#15)
Re: run xmllint during build (was Re: need xmllint on borka)

It is only run when the build is via XML/XSLT, not via SGML/DSSSL
(because the SGML/DSSSL build already checks the validity anyway).

I'm confused. I thought the point of this change was mostly that the
SGML toolchain is less strict than the XML toolchain, and you wanted
to have the more-strict checks applied whenever possible.

The SGML tool chain is less strict about what it considers valid, but
the XML toolchain doesn't check at all unless we run xmllint, it just
produces garbage when the input is invalid.

This suggests that "xmllint" should always be run, because it is more
strict than the other, so it is safer if the source has passed it and is
validated, whatever the tool chain used afterwards?

--
Fabien.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Peter Eisentraut (#12)
Re: run xmllint during build (was Re: need xmllint on borka)

ISTM that it may be more helful to do:

ifndef JADE
#error "no jade found on your system, cannot generate the documention"
endif

We could use $(missing) for that, which is already used for bison, flex,
and perl.

I'm fine with "$(missing)" or whatever else that provides a clear message.
Oops, not "#error", but "$(error ...)", I was mixing cpp & make above...

However the example in the doc Makefile for "collageindex.pl" is on the
heavy side, as it suggests that every use of such commands should be put
in ifdef/else/endif. ISTM that a dependence-based solution would be
simpler.

--
Fabien.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Peter Eisentraut
peter_e@gmx.net
In reply to: Fabien COELHO (#17)
1 attachment(s)
Re: run xmllint during build (was Re: need xmllint on borka)

On 8/21/14 4:28 PM, Fabien COELHO wrote:

I'm fine with "$(missing)" or whatever else that provides a clear message.

I've committed the $(missing) use separately, and rebased this patch on
top of that.

Attachments:

v2-0001-doc-Check-DocBook-XML-validity-during-the-build.patchtext/x-patch; name=v2-0001-doc-Check-DocBook-XML-validity-during-the-build.patchDownload
>From cb2787b2473186239eb23e7320654513df8e8fb7 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter_e@gmx.net>
Date: Sat, 13 Sep 2014 20:42:16 -0400
Subject: [PATCH v2] doc: Check DocBook XML validity during the build

Building the documentation with XSLT does not check the DTD, like a
DSSSL build would.  One can often get away with having invalid XML, but
the stylesheets might then create incorrect output, as they are not
designed to handle that.  Therefore, check the validity of the XML
against the DTD, using xmllint, during the build.

Add xmllint detection to configure, and add some documentation.

xmllint comes with libxml2, which is already in use, but it might be in
a separate package, such as libxml2-utils on Debian.
---
 configure                  | 43 +++++++++++++++++++++++++++++++++++++++++++
 configure.in               |  1 +
 doc/src/sgml/Makefile      |  9 ++++++++-
 doc/src/sgml/docguide.sgml | 19 +++++++++++++++++--
 src/Makefile.global.in     |  1 +
 5 files changed, 70 insertions(+), 3 deletions(-)

diff --git a/configure b/configure
index 4918f95..873bcff 100755
--- a/configure
+++ b/configure
@@ -630,6 +630,7 @@ vpath_build
 PROVE
 OSX
 XSLTPROC
+XMLLINT
 DBTOEPUB
 COLLATEINDEX
 DOCBOOKSTYLE
@@ -14449,6 +14450,48 @@ fi
   test -n "$DBTOEPUB" && break
 done
 
+for ac_prog in xmllint
+do
+  # Extract the first word of "$ac_prog", so it can be a program name with args.
+set dummy $ac_prog; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_prog_XMLLINT+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  if test -n "$XMLLINT"; then
+  ac_cv_prog_XMLLINT="$XMLLINT" # Let the user override the test.
+else
+as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+    for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+    ac_cv_prog_XMLLINT="$ac_prog"
+    $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
+    break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+fi
+fi
+XMLLINT=$ac_cv_prog_XMLLINT
+if test -n "$XMLLINT"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $XMLLINT" >&5
+$as_echo "$XMLLINT" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+
+  test -n "$XMLLINT" && break
+done
+
 for ac_prog in xsltproc
 do
   # Extract the first word of "$ac_prog", so it can be a program name with args.
diff --git a/configure.in b/configure.in
index 1392277..e3b17ea 100644
--- a/configure.in
+++ b/configure.in
@@ -1859,6 +1859,7 @@ PGAC_CHECK_DOCBOOK(4.2)
 PGAC_PATH_DOCBOOK_STYLESHEETS
 PGAC_PATH_COLLATEINDEX
 AC_CHECK_PROGS(DBTOEPUB, dbtoepub)
+AC_CHECK_PROGS(XMLLINT, xmllint)
 AC_CHECK_PROGS(XSLTPROC, xsltproc)
 AC_CHECK_PROGS(OSX, [osx sgml2xml sx])
 
diff --git a/doc/src/sgml/Makefile b/doc/src/sgml/Makefile
index 1d42be8..8bdd26c 100644
--- a/doc/src/sgml/Makefile
+++ b/doc/src/sgml/Makefile
@@ -44,6 +44,10 @@ ifndef OSX
 OSX = $(missing) osx
 endif
 
+ifndef XMLLINT
+XMLLINT = $(missing) xmllint
+endif
+
 ifndef XSLTPROC
 XSLTPROC = $(missing) xsltproc
 endif
@@ -78,6 +82,7 @@ override SPFLAGS += -wall -wno-unused-param -wno-empty -wfully-tagged
 man distprep-man: man-stamp
 
 man-stamp: stylesheet-man.xsl postgres.xml
+	$(XMLLINT) --noout --valid postgres.xml
 	$(XSLTPROC) $(XSLTPROCFLAGS) $(XSLTPROC_MAN_FLAGS) $^
 	touch $@
 
@@ -254,11 +259,13 @@ endif
 xslthtml: xslthtml-stamp
 
 xslthtml-stamp: stylesheet.xsl postgres.xml
+	$(XMLLINT) --noout --valid postgres.xml
 	$(XSLTPROC) $(XSLTPROCFLAGS) $(XSLTPROC_HTML_FLAGS) $^
 	cp $(srcdir)/stylesheet.css html/
 	touch $@
 
 htmlhelp: stylesheet-hh.xsl postgres.xml
+	$(XMLLINT) --noout --valid postgres.xml
 	$(XSLTPROC) $(XSLTPROCFLAGS) $^
 
 %-A4.fo.tmp: stylesheet-fo.xsl %.xml
@@ -268,7 +275,6 @@ htmlhelp: stylesheet-hh.xsl postgres.xml
 	$(XSLTPROC) $(XSLTPROCFLAGS) --stringparam paper.type USletter -o $@ $^
 
 FOP = fop
-XMLLINT = xmllint
 
 # reformat FO output so that locations of errors are easier to find
 %.fo: %.fo.tmp
@@ -281,6 +287,7 @@ XMLLINT = xmllint
 
 epub: postgres.epub
 postgres.epub: postgres.xml
+	$(XMLLINT) --noout --valid $<
 	$(DBTOEPUB) $<
 
 
diff --git a/doc/src/sgml/docguide.sgml b/doc/src/sgml/docguide.sgml
index 943ab5a..614a76c 100644
--- a/doc/src/sgml/docguide.sgml
+++ b/doc/src/sgml/docguide.sgml
@@ -149,6 +149,20 @@ <title>Tool Sets</title>
     </varlistentry>
 
     <varlistentry>
+     <term><ulink url="http://xmlsoft.org/">Libxml2</ulink> for <command>xmllint</command></term>
+     <listitem>
+      <para>
+       This library and the <command>xmllint</command> tool it contains are
+       used for processing XML.  Many developers will already
+       have <application>Libxml2</application> installed, because it is also
+       used when building the PostgreSQL code.  Note, however,
+       that <command>xmllint</command> might need to be installed from a
+       separate subpackage.
+      </para>
+     </listitem>
+     </varlistentry>
+
+    <varlistentry>
      <term><ulink url="http://xmlsoft.org/XSLT/">Libxslt</ulink> for <command>xsltproc</command></term>
      <listitem>
       <para>
@@ -201,7 +215,8 @@ <title><productname>Linux</productname> <acronym>RPM</acronym> Installation</tit
     installing, or the following packages:
     <filename>sgml-common</filename>, <filename>docbook</filename>,
     <filename>stylesheets</filename>, <filename>openjade</filename>
-    (or <filename>jade</filename>).  You may also need <filename>sgml-tools</filename>
+    (or <filename>jade</filename>).  You may also need <filename>sgml-tools</filename>,
+    either <filename>libxml2</filename> or <filename>libxml2-utils</filename>,
     and either <filename>xsltproc</filename> or <filename>libxslt</>.  If your
     distributor does not provide these then you should be able to make
     use of the packages from some other, reasonably compatible vendor.
@@ -273,7 +288,7 @@ <title>Debian Packages</title>
     available for <productname>Debian GNU/Linux</productname>.
     To install, simply use:
 <programlisting>
-apt-get install docbook docbook-dsssl docbook-xsl openjade1.3 opensp xsltproc
+apt-get install docbook docbook-dsssl docbook-xsl libxml2-utils openjade1.3 opensp xsltproc
 </programlisting>
    </para>
   </sect2>
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index 39a6175..07c5b27 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -353,6 +353,7 @@ DOCBOOKSTYLE	= @DOCBOOKSTYLE@
 JADE			= @JADE@
 NSGMLS			= @NSGMLS@
 OSX				= @OSX@
+XMLLINT			= @XMLLINT@
 XSLTPROC		= @XSLTPROC@
 
 # Code coverage
-- 
1.8.5.2 (Apple Git-48)

#19Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Peter Eisentraut (#18)
Re: run xmllint during build (was Re: need xmllint on borka)

Hello Peter,

I've committed the $(missing) use separately,

That was simple and is a definite improvement.

Tiny detail: the new DBTOEPUB macro definition in "src/Makefile.global.in"
lacks another tab to be nicely aligned with the other definitions.

and rebased this patch on top of that.

Applied and tested, everything looks fine.

The only remaining question is whether the xmllint check should always be
called. You stated that it was stricter than sgml processing, so I would
think it worth to always call it, but this is really a marginal
preference. I think it is okay if some slaves in the build farm do build
the various targets.

--
Fabien.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Peter Eisentraut
peter_e@gmx.net
In reply to: Fabien COELHO (#19)
Re: run xmllint during build (was Re: need xmllint on borka)

On 9/14/14 3:34 AM, Fabien COELHO wrote:

and rebased this patch on top of that.

Applied and tested, everything looks fine.

The only remaining question is whether the xmllint check should always
be called. You stated that it was stricter than sgml processing, so I
would think it worth to always call it, but this is really a marginal
preference. I think it is okay if some slaves in the build farm do build
the various targets.

Committed.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers