pgsql: Build HTML documentation using XSLT stylesheets by default

Started by Peter Eisentraut over 9 years ago, 31 messages, hackers
#1Peter Eisentraut
peter_e@gmx.net

Build HTML documentation using XSLT stylesheets by default

The old DSSSL build is still available for a while using the make target
"oldhtml".

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/e36ddab11735052841b4eff96642187ec9a8a7bc

Modified Files
--------------
doc/src/sgml/Makefile | 8 ++++----
doc/src/sgml/stylesheet.css | 50 +++++++++++++++++----------------------------
2 files changed, 23 insertions(+), 35 deletions(-)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

#2Magnus Hagander
magnus@hagander.net
In reply to: Peter Eisentraut (#1)
Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default

This seems to have broken our website build a bit. If you check
https://www.postgresql.org/docs/devel/static/index.html, you'll notice a
bunch of bad characters.

AFAICT this is because the output is now UTF8 and it used to be LATIN1. The
current output actually has it in the html tags that it's utf8, but since
the old one had no tags specifying its encoding we hardcoded it to LATIN1.

I assume we shall expect it to always be UTF8 from now on, and just find a
way for the docs loader script for the website to properly detect when we
switched over? Probably by just looking for that specific <?xml tag on the
first line.
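
A minimal sketch of that detection in Python. The website's loader script is not shown in this thread, so the function names and the LATIN1 fallback here are hypothetical. It assumes that pages from the XSLT build start with an `<?xml` declaration and are UTF8, and that anything else is old-style LATIN1 output.

```python
def guess_doc_encoding(first_line: bytes) -> str:
    """Guess the encoding of a built docs page from its first line.

    Assumption: XSLT-built pages begin with an XML declaration and are
    UTF-8; pages from the old build do not and are LATIN1.
    """
    if first_line.lstrip().startswith(b"<?xml"):
        return "utf-8"
    return "latin-1"


def decode_page(raw: bytes) -> str:
    """Decode a whole page using the encoding guessed from its first line."""
    first_line = raw.split(b"\n", 1)[0]
    return raw.decode(guess_doc_encoding(first_line))
```

Keying on the first line keeps the check cheap, but it would misclassify any page whose declaration is preceded by other content.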

Is this change something that might break something else, though?

//Magnus

On Wed, Nov 16, 2016 at 8:06 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

Build HTML documentation using XSLT stylesheets by default

The old DSSSL build is still available for a while using the make target
"oldhtml".

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/e36ddab11735052841b4eff96642187ec9a8a7bc

Modified Files
--------------
doc/src/sgml/Makefile | 8 ++++----
doc/src/sgml/stylesheet.css | 50 +++++++++++++++++----------------------------
2 files changed, 23 insertions(+), 35 deletions(-)


#3Peter Eisentraut
peter_e@gmx.net
In reply to: Magnus Hagander (#2)
Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default

On 11/16/16 1:38 AM, Magnus Hagander wrote:

AFAICT this is because the output is now UTF8 and it used to be LATIN1.
The current output actually has it in the html tags that it's utf8, but
since the old one had no tags specifying its encoding we hardcoded it
to LATIN1.

The old output has this:

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">

This has always been the case, AFAICT.

Btw., shouldn't the output web site pages have encoding declarations?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Magnus Hagander
magnus@hagander.net
In reply to: Peter Eisentraut (#3)
Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default

On Wed, Nov 16, 2016 at 3:02 PM, Peter Eisentraut <
peter.eisentraut@2ndquadrant.com> wrote:

On 11/16/16 1:38 AM, Magnus Hagander wrote:

AFAICT this is because the output is now UTF8 and it used to be LATIN1.
The current output actually has it in the html tags that it's utf8, but
since the old one had no tags specifying its encoding we hardcoded it
to LATIN1.

The old output has this:

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">

This has always been the case, AFAICT.

Oh, it's there. It's just not on one line and not at the beginning, so I
missed it :)
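
A declaration that is split over several lines and sits below the top of the file is easy to miss with a line-based check. The following Python sketch searches the start of the page instead. The name `find_meta_charset` and the 4096-byte window are assumptions for illustration, not part of any existing loader script.

```python
import re

# Matches charset=... inside a <meta ...> tag. [^>] also matches newlines,
# so the tag may be wrapped; IGNORECASE covers the old upper-case markup.
META_CHARSET = re.compile(
    rb"<meta\b[^>]*?charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", re.IGNORECASE
)


def find_meta_charset(raw: bytes, limit: int = 4096):
    """Return the charset named in a <meta> tag near the top of the page, or None."""
    match = META_CHARSET.search(raw[:limit])
    return match.group(1).decode("ascii").lower() if match else None
```

With the old output's `<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">` tag, this returns `iso-8859-1` whether or not the tag is wrapped across lines.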

Btw., shouldn't the output web site pages have encoding declarations?

That gets sent in the http header, doesn't it?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#5Erik Rijkers
er@xs4all.nl
In reply to: Peter Eisentraut (#1)
Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default

On 2016-11-16 08:06, Peter Eisentraut wrote:

Build HTML documentation using XSLT stylesheets by default

The old DSSSL build is still available for a while using the make target
"oldhtml".

This xslt build takes 8+ minutes, compared to barely 1 minute for
'oldhtml'.

I'd say that is a strong disadvantage.

I hope 'for a while' will mean 'for a long time to come' or even
'forever.'

Erik Rijkers


#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Erik Rijkers (#5)
Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default

Erik Rijkers <er@xs4all.nl> writes:

This xslt build takes 8+ minutes, compared to barely 1 minute for
'oldhtml'.

I'm just discovering the same.

I'd say that is a strong disadvantage.

I'd say that is flat out unacceptable. I won't ever use this toolchain
if it's that much slower than the old way. What was the improvement
we were hoping for, again?

regards, tom lane


#7Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#6)
Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default

On 11/16/2016 09:46 AM, Tom Lane wrote:

Erik Rijkers <er@xs4all.nl> writes:

This xslt build takes 8+ minutes, compared to barely 1 minute for
'oldhtml'.

I'm just discovering the same.

I'd say that is a strong disadvantage.

I'd say that is flat out unacceptable. I won't ever use this toolchain
if it's that much slower than the old way. What was the improvement
we were hoping for, again?

On the buildfarm crake has gone from about 2 minutes to about 3.5
minutes to run "make doc". That's not good but it's not an eight-fold
increase either.

cheers

andrew


#8Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#6)
Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default

On Wed, Nov 16, 2016 at 9:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Erik Rijkers <er@xs4all.nl> writes:

This xslt build takes 8+ minutes, compared to barely 1 minute for
'oldhtml'.

I'm just discovering the same.

I'd say that is a strong disadvantage.

I'd say that is flat out unacceptable. I won't ever use this toolchain
if it's that much slower than the old way. What was the improvement
we were hoping for, again?

Gosh, and I thought the existing toolchain was already ridiculously
slow. Couldn't somebody write a Perl script that generated the HTML
documentation from the SGML in, like, a second? I mean, we're
basically just mapping one set of markup tags to another set of markup
tags. And splitting up some files for the HTML version. And adding
some boilerplate. But none of that sounds like it should be all that
hard.

I am reminded of the saying that XML is like violence -- if it doesn't
solve your problem, you're not using enough of it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#9Robert Haas
robertmhaas@gmail.com
In reply to: Andrew Dunstan (#7)
Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default

On Wed, Nov 16, 2016 at 10:16 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

On the buildfarm crake has gone from about 2 minutes to about 3.5 minutes to
run "make doc". That's not good but it's not an eight-fold increase either.

On my MacBook, "time make docs" as of e36ddab11735052841b4eff96642187ec9a8a7bc:

real 2m17.871s
user 2m15.505s
sys 0m2.238s

And as of 4ecd1974377ffb4d6d72874ba14fcd23965b1792:

real 1m47.696s
user 1m47.085s
sys 0m1.145s

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#10Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Peter Eisentraut (#1)
Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default

Peter Eisentraut wrote:

Build HTML documentation using XSLT stylesheets by default

"make check" still uses DSSSL though. Is that intentional? Is it going
to be changed?

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#11Peter Eisentraut
peter_e@gmx.net
In reply to: Erik Rijkers (#5)
Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default

On 11/16/16 6:29 AM, Erik Rijkers wrote:

On 2016-11-16 08:06, Peter Eisentraut wrote:

Build HTML documentation using XSLT stylesheets by default

The old DSSSL build is still available for a while using the make target
"oldhtml".

This xslt build takes 8+ minutes, compared to barely 1 minute for
'oldhtml'.

I have committed another patch to improve the build performance a bit.
Could you check again?

On my machine and on the build farm, the performance now almost matches
the DSSSL build.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#12Peter Eisentraut
peter_e@gmx.net
In reply to: Magnus Hagander (#4)
Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default

On 11/16/16 6:09 AM, Magnus Hagander wrote:

Btw., shouldn't the output web site pages have encoding declarations?

That gets sent in the http header, doesn't it?

That's probably alright, but it would be nicer if the documents were
self-contained.
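
To make the two sources of encoding information concrete, here is a small Python sketch. It reads the charset from a Content-Type header value using the standard library's `email.message.Message`, and falls back to an in-document declaration when the header carries none. The function names and the precedence order are assumptions for illustration, not a description of how the website actually behaves.

```python
from email.message import Message


def charset_from_content_type(header_value: str):
    """Extract the charset parameter from a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


def effective_charset(header_value, in_document_charset):
    """Prefer the HTTP header; fall back to the document's own declaration."""
    from_header = charset_from_content_type(header_value) if header_value else None
    return from_header or in_document_charset
```

A page saved to disk or opened from a tarball has no header, so only the in-document declaration remains; that is the case a self-contained document covers.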

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#13Peter Eisentraut
peter_e@gmx.net
In reply to: Alvaro Herrera (#10)
Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default

On 11/16/16 12:38 PM, Alvaro Herrera wrote:

"make check" still uses DSSSL though. Is that intentional? Is it going
to be changed?

It doesn't use DSSSL. It uses nsgmls to parse the SGML, which is a
different thing that will be addressed in a separate step.

So, yes, but later.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#14Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#6)
Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default

On 11/16/16 6:46 AM, Tom Lane wrote:

What was the improvement we were hoping for, again?

Get off an ancient and unmaintained tool chain.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#15Erik Rijkers
er@xs4all.nl
In reply to: Peter Eisentraut (#11)
Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default

On 2016-11-16 21:59, Peter Eisentraut wrote:

On 11/16/16 6:29 AM, Erik Rijkers wrote:

This xslt build takes 8+ minutes, compared to barely 1 minute for
'oldhtml'.

I have committed another patch to improve the build performance a bit.
Could you check again?

It is indeed better (three minutes off, nice) but still:
real 5m21.348s -- for 'make -j 8 html'
versus
real 1m8.502s -- for 'make -j 8 oldhtml'
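
For comparing such figures across machines, a ratio is easier to read than raw times. This Python sketch parses the `real` values as printed by `time` and computes the slowdown. The helper names are hypothetical; the two inputs are the figures quoted just above.

```python
import re


def parse_real_time(text: str) -> float:
    """Convert a `time` value such as '5m21.348s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized time value: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def slowdown(new: str, old: str) -> float:
    """Return how many times longer `new` took than `old`."""
    return parse_real_time(new) / parse_real_time(old)


# 'make -j 8 html' versus 'make -j 8 oldhtml', as reported above.
print(f"{slowdown('5m21.348s', '1m8.502s'):.2f}x")  # prints 4.69x
```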

Centos 6.6 - I suppose it's getting a bit old, I don't know if that's
the cause of the discrepancy with others' measurements.

Obviously as long as 'oldhtml' is possible I won't complain.

thanks,

Erik Rijkers


#16Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Peter Eisentraut (#11)
Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default

Peter Eisentraut wrote:

This xslt build takes 8+ minutes, compared to barely 1 minute for
'oldhtml'.

I have committed another patch to improve the build performance a bit.
Could you check again?

After the optimization, on my laptop it takes 2:31 with the new system
and 1:58 with the old one. If it can be made faster, all the better,
but at this level I'm okay.

Now admittedly this conversion didn't do one bit towards the goal I
wanted to achieve: that each doc source file ended up as a valid XML
file that could be processed separately with tools like xml2po. They
are still SGML only -- in particular no doctype declaration and
incomplete closing tags.
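
As an illustration of the gap, the following Python sketch checks whether a fragment parses as standalone XML. It tests well-formedness only, not validity against a DTD. The helper name is hypothetical and the fragments in the usage note are made-up samples, not lines from the actual doc sources.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(text: str) -> bool:
    """Return True if `text` parses as a standalone XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True
```

A fragment such as `<para>Use <literal>make html</literal>.</para>` passes, while the same text with an SGML-style empty end tag, `<literal>make html</>`, is rejected by the XML parser.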

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Erik Rijkers (#15)
Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default

Erik Rijkers <er@xs4all.nl> writes:

On 2016-11-16 21:59, Peter Eisentraut wrote:

I have committed another patch to improve the build performance a bit.
Could you check again?

It is indeed better (three minutes off, nice) but still:
real 5m21.348s -- for 'make -j 8 html'
versus
real 1m8.502s -- for 'make -j 8 oldhtml'

Yeah, I get about the same.

Centos 6.6 - I suppose it's getting a bit old, I don't know if that's
the cause of the discrepancy with others' measurements.

... and on the same toolchain. Probably the answer is "install a newer
toolchain", but from what I understand, there's a whole lot of work there
if your platform vendor doesn't supply it already packaged.

regards, tom lane


#18Peter Eisentraut
peter_e@gmx.net
In reply to: Alvaro Herrera (#16)
Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default

On 11/16/16 1:23 PM, Alvaro Herrera wrote:

Now admittedly this conversion didn't do one bit towards the goal I
wanted to achieve: that each doc source file ended up as a valid XML
file that could be processed separately with tools like xml2po. They
are still SGML only -- in particular no doctype declaration and
incomplete closing tags.

Yes, that is one of the upcoming steps. But we need to do the current
thing first.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#19Peter Eisentraut
peter_e@gmx.net
In reply to: Erik Rijkers (#15)
Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default

On 11/16/16 1:14 PM, Erik Rijkers wrote:

real 5m21.348s -- for 'make -j 8 html'
versus
real 1m8.502s -- for 'make -j 8 oldhtml'

Centos 6.6 - I suppose it's getting a bit old, I don't know if that's
the cause of the discrepancy with others' measurements.

I tested the build on a variety of operating systems, including that
one, with different tool chain versions and I am getting consistent
performance. So the above is unclear to me at the moment.

For the heck of it, run this

xsltproc --nonet --stringparam pg.version '10devel' stylesheet.xsl postgres.xml

to make sure it's not downloading something from the network.
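
For repeated measurements, the same invocation can be wrapped in a small timing helper. This Python sketch builds the argument list for the command above and measures wall-clock time with `subprocess`. The function names and default arguments are hypothetical; only the xsltproc options come from the command quoted here.

```python
import subprocess
import time


def build_xsltproc_cmd(stylesheet="stylesheet.xsl", source="postgres.xml", version="10devel"):
    """Assemble the xsltproc invocation quoted above as an argument list."""
    return ["xsltproc", "--nonet", "--stringparam", "pg.version", version, stylesheet, source]


def time_command(cmd, cwd=None):
    """Run a command and return (exit status, wall-clock seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(cmd, cwd=cwd, stdout=subprocess.DEVNULL)
    return completed.returncode, time.perf_counter() - start
```

Calling `time_command(build_xsltproc_cmd(), cwd="doc/src/sgml")` from the top of a source tree would time the run, assuming xsltproc is installed and postgres.xml has already been generated there. Because `--nonet` refuses network fetches, a failure here rather than a slow success would suggest the stylesheets were being fetched remotely.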

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#20Erik Rijkers
er@xs4all.nl
In reply to: Peter Eisentraut (#19)
Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default

On 2016-11-17 02:15, Peter Eisentraut wrote:

On 11/16/16 1:14 PM, Erik Rijkers wrote:

real 5m21.348s -- for 'make -j 8 html'
versus
real 1m8.502s -- for 'make -j 8 oldhtml'

Centos 6.6 - I suppose it's getting a bit old, I don't know if that's
the cause of the discrepancy with others' measurements.

I tested the build on a variety of operating systems, including that
one, with different tool chain versions and I am getting consistent
performance. So the above is unclear to me at the moment.

For the heck of it, run this

xsltproc --nonet --stringparam pg.version '10devel' stylesheet.xsl postgres.xml

to make sure it's not downloading something from the network.

$ time xsltproc --nonet --stringparam pg.version '10devel' stylesheet.xsl postgres.xml
real 5m43.776s

$ ( cd /home/aardvark/pg_stuff/pg_sandbox/pgsql.HEAD/doc/src/sgml; time make oldhtml )
real 1m14.152s

(I did clean out in between)


#21Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#11)
#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#21)
#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#22)
#24Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#23)
#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#24)
#26Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#25)
#27Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Pavel Stehule (#26)
#28Alexander Lakhin
exclusion@gmail.com
In reply to: Alvaro Herrera (#27)
#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alexander Lakhin (#28)
#30Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#25)
#31Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#30)