Release note bloat is getting out of hand
I noticed that the release notes now constitute 25% of our SGML
documentation, by line count at least:
[postgres@sss1 sgml]$ wc *.sgml ref/*.sgml | tail -1
336338 1116259 11124003 total
[postgres@sss1 sgml]$ wc release-*.sgml | tail -1
85139 267417 2516545 total
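As a quick check of the 25% figure, the two line counts reported by `wc` above can be divided directly. This is only a sketch of the arithmetic; the variable names are illustrative, and the numbers are copied from the output above.

```python
# Line counts copied from the two `wc ... | tail -1` runs above.
total_sgml_lines = 336338    # wc *.sgml ref/*.sgml
release_note_lines = 85139   # wc release-*.sgml

# Share of all SGML lines that belong to the release notes.
share = release_note_lines / total_sgml_lines
print(f"release notes: {share:.1%} of SGML lines")  # release notes: 25.3% of SGML lines
```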
Another way to measure it is that Appendix E (release notes) is
up to 270 subsections:
http://www.postgresql.org/docs/devel/static/release.html
That's starting to seem a bit excessive. And it's only going to get
worse, because each set of minor releases adds hundreds if not thousands
of lines here; for example the current set of release note additions
weighs in at
doc/src/sgml/release-9.0.sgml | 641 +++++++++++++++
doc/src/sgml/release-9.1.sgml | 725 +++++++++++++++++
doc/src/sgml/release-9.2.sgml | 832 +++++++++++++++++++
doc/src/sgml/release-9.3.sgml | 1746 ++++++++++++++++++++++++++++++++++++++++
doc/src/sgml/release-9.4.sgml | 674 ++++++++++++++++
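To put a single number on "hundreds if not thousands of lines", the per-file counts in the diffstat above can be summed. A minimal sketch; the dictionary name is illustrative and the values are copied from the diffstat.

```python
# Lines added per file, copied from the diffstat above.
added_lines = {
    "release-9.0.sgml": 641,
    "release-9.1.sgml": 725,
    "release-9.2.sgml": 832,
    "release-9.3.sgml": 1746,
    "release-9.4.sgml": 674,
}

# Total growth of the release notes from this one set of minor releases.
total_added = sum(added_lines.values())
print(total_added)  # 4618
```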
I think it's time we changed the policy of including all release notes
back to the beginning in Appendix E. I seem to recall we debated this
once before, and decided that we liked having all that project history
visible. But Release 6.0 is old enough to vote as of last week, so really
we no longer need to prove anything about project stability/longevity.
I propose that we go over to a policy of keeping in HEAD only release
notes for actively maintained branches, and that each back branch should
retain notes only for branches that were actively maintained when it split
off from HEAD. This would keep about five years worth of history in
Appendix E, which should be a roughly stable amount of text.
(Note I'm *not* proposing applying this policy in time for this week's
releases. There's plenty of time to think about it.)
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Tom Lane wrote:
I propose that we go over to a policy of keeping in HEAD only release
notes for actively maintained branches, and that each back branch should
retain notes only for branches that were actively maintained when it split
off from HEAD. This would keep about five years worth of history in
Appendix E, which should be a roughly stable amount of text.
+1
Given the ready web access we provide to documentation for unsupported
releases, requiring constant recompilation of static material seems
wasteful.
Maybe a release history page and a note to look at the website would be a
nice addition but removing the detailed release notes would not cause
information to be lost.
David J.
--
View this message in context: http://postgresql.nabble.com/Release-note-bloat-is-getting-out-of-hand-tp5836330p5836346.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
On 02/01/2015 08:10 PM, Tom Lane wrote:
I propose that we go over to a policy of keeping in HEAD only release
notes for actively maintained branches, and that each back branch should
retain notes only for branches that were actively maintained when it split
off from HEAD. This would keep about five years worth of history in
Appendix E, which should be a roughly stable amount of text.
I'd like to keep a complete, downloadable version of the release notes
somewhere on the website; it's helpful to have "one big file" for
searches. It doesn't need to be in our core docs, though.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
On Sun, Feb 1, 2015 at 11:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I think it's time we changed the policy of including all release notes
back to the beginning in Appendix E. I seem to recall we debated this
once before, and decided that we liked having all that project history
visible. But Release 6.0 is old enough to vote as of last week, so really
we no longer need to prove anything about project stability/longevity.
I propose that we go over to a policy of keeping in HEAD only release
notes for actively maintained branches, and that each back branch should
retain notes only for branches that were actively maintained when it split
off from HEAD. This would keep about five years worth of history in
Appendix E, which should be a roughly stable amount of text.
-1. I find it very useful to be able to go back through all the
release notes using grep, and have done so on multiple occasions. It
sounds like this policy would make that harder, and I don't see what
we get out of it. It doesn't bother me that the SGML documentation
of the release notes is big; disk space is cheap.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Feb 2, 2015 at 9:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Sun, Feb 1, 2015 at 11:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I propose that we go over to a policy of keeping in HEAD only release
notes for actively maintained branches, and that each back branch should
retain notes only for branches that were actively maintained when it split
off from HEAD. This would keep about five years worth of history in
Appendix E, which should be a roughly stable amount of text.
-1. I find it very useful to be able to go back through all the
release notes using grep, and have done so on multiple occasions. It
sounds like this policy would make that harder, and I don't see what
we get out of it. It doesn't bother me that the SGML documentation
of the release notes is big; disk space is cheap.
FWIW, -0.5. I think that we should keep documentation down to the
oldest version supported by binary tools; I am referring particularly
to pg_dump, which supports servers down to 7.0. Such information may be
useful for a dump/restore upgrade.
--
Michael
Robert Haas <robertmhaas@gmail.com> writes:
On Sun, Feb 1, 2015 at 11:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I propose that we go over to a policy of keeping in HEAD only release
notes for actively maintained branches, and that each back branch should
retain notes only for branches that were actively maintained when it split
off from HEAD. This would keep about five years worth of history in
Appendix E, which should be a roughly stable amount of text.
-1. I find it very useful to be able to go back through all the
release notes using grep, and have done so on multiple occasions. It
sounds like this policy would make that harder, and I don't see what
we get out of it. It doesn't bother me that the SGML documentation
of the release notes is big; disk space is cheap.
Disk space isn't the only consideration here; if it were I'd not be
concerned about this. Processing time is an issue, and so is distribution
size, and so is the length of the manual if someone decides to print it
on dead trees. I also live in fear of the day that we hit some hard-to-
change internal limit in TeX.
Personally, what I grep when I'm looking for historical info is "git log"
output, which will certainly not be getting any shorter.
regards, tom lane
On Mon, Feb 2, 2015 at 3:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Sun, Feb 1, 2015 at 11:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I propose that we go over to a policy of keeping in HEAD only release
notes for actively maintained branches, and that each back branch should
retain notes only for branches that were actively maintained when it split
off from HEAD. This would keep about five years worth of history in
Appendix E, which should be a roughly stable amount of text.
-1. I find it very useful to be able to go back through all the
release notes using grep, and have done so on multiple occasions. It
sounds like this policy would make that harder, and I don't see what
we get out of it. It doesn't bother me that the SGML documentation
of the release notes is big; disk space is cheap.
Disk space isn't the only consideration here; if it were I'd not be
concerned about this. Processing time is an issue, and so is distribution
size, and so is the length of the manual if someone decides to print it
on dead trees. I also live in fear of the day that we hit some hard-to-
change internal limit in TeX.
Yeah, the PDF size is definitely something to consider in this context. And
the limits.
But if we can find some good way to "archive" or preserve them *outside the
main docs* that should solve this problem, no? We could keep them in SGML
even, but make sure they are not actually included in the build? Would
still be useful for developers there...
Or if we could find a way to do like Josh says - archive them separately
and publish a separate download. We could even keep it in a separate git
repo if we have to, with a "migrate" job to run on a major release?
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes:
Yeah, the PDF size is definitely something to consider in this context. And
the limits.
But if we can find some good way to "archive" or preserve them *outside the
main docs* that should solve this problem, no? We could keep them in SGML
even, but make sure they are not actually included in the build? Would
still be useful for developers there...
Or if we could find a way to do like Josh says - archive them separately
and publish a separate download. We could even keep it in a separate git
repo if we have to, with a "migrate" job to run on a major release?
Yeah, seems like this and Josh's request could both be addressed fine
with a separate document.
I could live with keeping the ancient-branch release note SGML files
around in HEAD --- I'd hoped to reduce the size of tarballs a bit, but the
savings by that measure would only be a few percent (at present anyway).
What's more important is to get them out of the main documentation build.
So how about cutting the main doc build down to last-five-branches,
and adding a non-default make target that produces a separate document
consisting of (only) the complete release note history?
regards, tom lane
On Mon, Feb 2, 2015 at 10:43 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Magnus Hagander <magnus@hagander.net> writes:
Yeah, the PDF size is definitely something to consider in this context. And
the limits.
But if we can find some good way to "archive" or preserve them *outside the
main docs* that should solve this problem, no? We could keep them in SGML
even, but make sure they are not actually included in the build? Would
still be useful for developers there...
Or if we could find a way to do like Josh says - archive them separately
and publish a separate download. We could even keep it in a separate git
repo if we have to, with a "migrate" job to run on a major release?
Yeah, seems like this and Josh's request could both be addressed fine
with a separate document.
I could live with keeping the ancient-branch release note SGML files
around in HEAD --- I'd hoped to reduce the size of tarballs a bit, but the
savings by that measure would only be a few percent (at present anyway).
What's more important is to get them out of the main documentation build.
So how about cutting the main doc build down to last-five-branches,
and adding a non-default make target that produces a separate document
consisting of (only) the complete release note history?
The last 5 branches only takes us back to 9.0, which isn't very far.
I would want to have at least the 8.x branches in the SGML build, and
maybe the 7.x branches as well. I would be happy to drop anything
pre-7.x from the docs build and just let the people who care look at
the SGML. You seem to be assuming that nobody spends much time
looking at the release notes for older branches, but that is certainly
false in my own case.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 02/02/2015 07:54 AM, Robert Haas wrote:
I could live with keeping the ancient-branch release note SGML files
around in HEAD --- I'd hoped to reduce the size of tarballs a bit, but the
savings by that measure would only be a few percent (at present anyway).
What's more important is to get them out of the main documentation build.
So how about cutting the main doc build down to last-five-branches,
and adding a non-default make target that produces a separate document
consisting of (only) the complete release note history?
The last 5 branches only takes us back to 9.0, which isn't very far.
I would want to have at least the 8.x branches in the SGML build, and
maybe the 7.x branches as well. I would be happy to drop anything
pre-7.x from the docs build and just let the people who care look at
the SGML. You seem to be assuming that nobody spends much time
looking at the release notes for older branches, but that is certainly
false in my own case.
I was suggesting having a separate "historical release notes" tarball,
actually. If that's in SGML, and can be built using our doc tools, we
haven't lost anything and we've reduced the size of the distribution
tarball.
One of the things I've been tinkering with for a while is a better
searchable version of the release notes. The problem I keep running
into is that it's very difficult to write an error-free importer from
the present SGML file; there's just too much variation in how certain
things are recorded, and SGML just isn't a database import format.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Josh Berkus <josh@agliodbs.com> writes:
On 02/02/2015 07:54 AM, Robert Haas wrote:
The last 5 branches only takes us back to 9.0, which isn't very far.
I would want to have at least the 8.x branches in the SGML build, and
maybe the 7.x branches as well. I would be happy to drop anything
pre-7.x from the docs build and just let the people who care look at
the SGML. You seem to be assuming that nobody spends much time
looking at the release notes for older branches, but that is certainly
false in my own case.
I was suggesting having a separate "historical release notes" tarball,
actually. If that's in SGML, and can be built using our doc tools, we
haven't lost anything and we've reduced the size of the distribution
tarball.
That was pretty much my point as well. Sure, we can keep all the notes
online somewhere; that doesn't mean they have to be in the standard
distribution tarball, nor in the standard documentation build.
One of the things I've been tinkering with for a while is a better
searchable version of the release notes. The problem I keep running
into is that it's very difficult to write an error-free importer from
the present SGML file; there's just too much variation in how certain
things are recorded, and SGML just isn't a database import format.
The existing release notes are not conveniently searchable, for sure;
they're not in a single file, and they don't show up on a single page
on the Web, and I've never seen a PDF-searching tool that didn't suck.
So I'm bemused by Robert's insistence that he wants that format to support
searches. As I said, I find it far more convenient to search the output
of "git log" and/or src/tools/git_changelog --- I keep text files of those
around for exactly that purpose.
regards, tom lane
On Mon, Feb 2, 2015 at 3:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
The existing release notes are not conveniently searchable, for sure;
they're not in a single file, and they don't show up on a single page
on the Web, and I've never seen a PDF-searching tool that didn't suck.
So I'm bemused by Robert's insistence that he wants that format to support
searches. As I said, I find it far more convenient to search the output
of "git log" and/or src/tools/git_changelog --- I keep text files of those
around for exactly that purpose.
I normally search in one of two ways. Sometimes I grep the sgml;
other times, I go to, say,
http://www.postgresql.org/docs/devel/static/release-9-4.html and then
edit the URL to take me back to 9.3, 9.2, 9.1, etc. It's true that
'git log' is often the place to go searching for stuff, but there are
times when it's easier to find out what release introduced a feature
by looking at the release notes, and it's certainly more useful if you
want to send a link to someone who is not git-aware illustrating the
results of your search.
Well, maybe I'm the only one who is doing this and it's not worth
worrying about it just for me. But I do it, all the same.
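The grep-style search described above could be sketched roughly as follows. This is an illustration only: `find_in_release_notes` is a hypothetical helper, not anything in the PostgreSQL tree, and it assumes the notes live in files named `release-*.sgml` in one directory, as in the diffstat earlier in the thread.

```python
import re
from pathlib import Path


def find_in_release_notes(sgml_dir, term):
    """Return {file name: [line numbers]} for release-*.sgml files that mention term.

    Hypothetical helper: a case-insensitive, literal-text search, roughly
    what `grep -in term release-*.sgml` would report.
    """
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    hits = {}
    for path in sorted(Path(sgml_dir).glob("release-*.sgml")):
        with path.open(encoding="utf-8", errors="replace") as f:
            lines = [n for n, text in enumerate(f, start=1) if pattern.search(text)]
        if lines:
            hits[path.name] = lines
    return hits
```

Calling `find_in_release_notes("doc/src/sgml", "pg_upgrade")` from the top of a source checkout would list every release file that mentions the term, which is one way to narrow down the release that introduced a feature. The search term here is just an example.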
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas wrote:
but there are times when it's easier to find out what release
introduced a feature by looking at the release notes, and it's
certainly more useful if you want to send a link to someone who
is not git-aware illustrating the results of your search.
Well, maybe I'm the only one who is doing this and it's not worth
worrying about it just for me. But I do it, all the same.
I do this *all the time*. Please don't mess with the release notes.
Except to put them all on one page for easy searching. That would
be awesome.
--
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201502021555
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
On 02/02/2015 09:38 PM, Robert Haas wrote:
Well, maybe I'm the only one who is doing this and it's not worth
worrying about it just for me. But I do it, all the same.
I do the latter quite often: link people to old release notes. For me it
would be fine to remove them from tarballs as long as they are still on
the website.
--
Andreas Karlsson
Robert Haas <robertmhaas@gmail.com> writes:
On Mon, Feb 2, 2015 at 3:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
So I'm bemused by Robert's insistence that he wants that format to support
searches. As I said, I find it far more convenient to search the output
of "git log" and/or src/tools/git_changelog --- I keep text files of those
around for exactly that purpose.
I normally search in one of two ways. Sometimes I grep the sgml;
other times, I go to, say,
http://www.postgresql.org/docs/devel/static/release-9-4.html and then
edit the URL to take me back to 9.3, 9.2, 9.1, etc. It's true that
'git log' is often the place to go searching for stuff, but there are
times when it's easier to find out what release introduced a feature
by looking at the release notes, and it's certainly more useful if you
want to send a link to someone who is not git-aware illustrating the
results of your search.
Well, maybe I'm the only one who is doing this and it's not worth
worrying about it just for me. But I do it, all the same.
I'm not out to take away a feature you need. I'm just wondering why it
has to be supported in exactly the way it's done now. Wouldn't a
separately maintained release-notes-only document serve the purpose fine?
regards, tom lane
On February 2, 2015 9:38:43 PM CET, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Feb 2, 2015 at 3:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
The existing release notes are not conveniently searchable, for sure;
they're not in a single file, and they don't show up on a single page
on the Web, and I've never seen a PDF-searching tool that didn't suck.
So I'm bemused by Robert's insistence that he wants that format to support
searches. As I said, I find it far more convenient to search the output
of "git log" and/or src/tools/git_changelog --- I keep text files of those
around for exactly that purpose.
I normally search in one of two ways. Sometimes I grep the sgml;
other times, I go to, say,
http://www.postgresql.org/docs/devel/static/release-9-4.html and then
edit the URL to take me back to 9.3, 9.2, 9.1, etc.
FWIW I do the same. Git log is great if you want all detail. But often enough the more condensed format of the release notes is helpful. Say, a customer has problems after migrating to a new version. It's quite a bit faster to read the section about incompatibilities than travel through the git log.
There's a reason the release notes exist. Given that they're apparently useful, it doesn't seem strange that devs sometimes read them...
---
Please excuse brevity and formatting - I am writing this on my mobile phone.
On 02/02/2015 07:54 AM, Robert Haas wrote:
The last 5 branches only takes us back to 9.0, which isn't very far.
I would want to have at least the 8.x branches in the SGML build, and
maybe the 7.x branches as well. I would be happy to drop anything
pre-7.x from the docs build and just let the people who care look at
the SGML. You seem to be assuming that nobody spends much time
looking at the release notes for older branches, but that is certainly
false in my own case.
It seems to me that the docs that are shipped should only contain
information in regards to supported versions. Frankly there is no reason
to ship any release notes except for the version that they are shipping
with (e.g., there is no reason for 9.0 to be in 9.1). It is just bloat at
that point when we can point everyone to the website or ftp site.
JD
--
Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, @cmdpromptinc
"If we send our children to Caesar for their education, we should
not be surprised when they come back as Romans."
On 2/1/15 11:10 PM, Tom Lane wrote:
I think it's time we changed the policy of including all release notes
back to the beginning in Appendix E.
I share the sentiment that the release notes *seem* too big, but the
subsequent discussion shows that it's not clear why that's really a
problem. Exactly what problem are we trying to fix?
On Mon, Feb 2, 2015 at 6:40 PM, Peter Eisentraut wrote:
On 2/1/15 11:10 PM, Tom Lane wrote:
I think it's time we changed the policy of including all release notes
back to the beginning in Appendix E.
I share the sentiment that the release notes *seem* too big, but the
subsequent discussion shows that it's not clear why that's really a
problem. Exactly what problem are we trying to fix?
We'd get a lines-of-code decrease which would translate into an improvement
in the make process time; most noticeable for someone doing a doc-only
build, multiple times, to see how a doc change looks. No time percentage
has been provided yet but the goal seems reasonable in theory.
David J.
--
View this message in context: http://postgresql.nabble.com/Release-note-bloat-is-getting-out-of-hand-tp5836330p5836473.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
On 02/02/2015 05:39 PM, Peter Eisentraut wrote:
On 2/1/15 11:10 PM, Tom Lane wrote:
I think it's time we changed the policy of including all release notes
back to the beginning in Appendix E.
I share the sentiment that the release notes *seem* too big, but the
subsequent discussion shows that it's not clear why that's really a
problem. Exactly what problem are we trying to fix?
At a rough count of lines, the release notes for unsupported versions
are about 18% of documentation overall (47K out of 265K lines). So
they're not insubstantial. Compared to the total size of the tarball,
though ...
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com