Thread: Release note bloat is getting out of hand

Release note bloat is getting out of hand

From
Tom Lane
Date:
I noticed that the release notes now constitute 25% of our SGML
documentation, by line count at least:

[postgres@sss1 sgml]$ wc *.sgml ref/*.sgml | tail -1
  336338  1116259 11124003 total
[postgres@sss1 sgml]$ wc release-*.sgml | tail -1
   85139   267417  2516545 total
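
As a quick check of the 25% figure, here is a minimal sketch that uses only the two line counts reported by `wc` above. The variable names are illustrative and nothing here comes from the PostgreSQL tree itself.

```python
# Line counts copied from the two `wc` outputs above (first column = lines).
total_sgml_lines = 336338     # wc *.sgml ref/*.sgml
release_note_lines = 85139    # wc release-*.sgml

# Fraction of all SGML documentation lines that are release notes.
share = release_note_lines / total_sgml_lines
print(f"release notes: {share:.1%} of SGML lines")  # prints "release notes: 25.3% of SGML lines"
```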

Another way to measure it is that Appendix E (release notes) is
up to 270 subsections:
http://www.postgresql.org/docs/devel/static/release.html

That's starting to seem a bit excessive.  And it's only going to get
worse, because each set of minor releases adds hundreds if not thousands
of lines here; for example the current set of release note additions
weighs in at
 doc/src/sgml/release-9.0.sgml |  641 +++++++++++++++
 doc/src/sgml/release-9.1.sgml |  725 +++++++++++++++++
 doc/src/sgml/release-9.2.sgml |  832 +++++++++++++++++++
 doc/src/sgml/release-9.3.sgml | 1746 ++++++++++++++++++++++++++++++++++++++++
 doc/src/sgml/release-9.4.sgml |  674 ++++++++++++++++

I think it's time we changed the policy of including all release notes
back to the beginning in Appendix E.  I seem to recall we debated this
once before, and decided that we liked having all that project history
visible.  But Release 6.0 is old enough to vote as of last week, so really
we no longer need to prove anything about project stability/longevity.

I propose that we go over to a policy of keeping in HEAD only release
notes for actively maintained branches, and that each back branch should
retain notes only for branches that were actively maintained when it split
off from HEAD.  This would keep about five years worth of history in
Appendix E, which should be a roughly stable amount of text.

(Note I'm *not* proposing applying this policy in time for this week's
releases.  There's plenty of time to think about it.)
        regards, tom lane



Re: Release note bloat is getting out of hand

From
David G Johnston
Date:
Tom Lane-2 wrote
> I propose that we go over to a policy of keeping in HEAD only release
> notes for actively maintained branches, and that each back branch should
> retain notes only for branches that were actively maintained when it split
> off from HEAD.  This would keep about five years worth of history in
> Appendix E, which should be a roughly stable amount of text.

+1

Given the ready web access we provide to documentation for unsupported
releases, requiring constant recompilation of static material seems
wasteful.

Maybe a release history page and a note to look at the website would be a
nice addition but removing the detailed release notes would not cause
information to be lost.

David J.






Re: Release note bloat is getting out of hand

From
Josh Berkus
Date:
On 02/01/2015 08:10 PM, Tom Lane wrote:
> I propose that we go over to a policy of keeping in HEAD only release
> notes for actively maintained branches, and that each back branch should
> retain notes only for branches that were actively maintained when it split
> off from HEAD.  This would keep about five years worth of history in
> Appendix E, which should be a roughly stable amount of text.

I'd like to keep a complete, downloadable version of the release notes
somewhere on the website; it's helpful to have "one big file" for
searches.  It doesn't need to be in our core docs, though.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: Release note bloat is getting out of hand

From
Robert Haas
Date:
On Sun, Feb 1, 2015 at 11:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think it's time we changed the policy of including all release notes
> back to the beginning in Appendix E.  I seem to recall we debated this
> once before, and decided that we liked having all that project history
> visible.  But Release 6.0 is old enough to vote as of last week, so really
> we no longer need to prove anything about project stability/longevity.
>
> I propose that we go over to a policy of keeping in HEAD only release
> notes for actively maintained branches, and that each back branch should
> retain notes only for branches that were actively maintained when it split
> off from HEAD.  This would keep about five years worth of history in
> Appendix E, which should be a roughly stable amount of text.

-1.  I find it very useful to be able to go back through all the
release notes using grep, and have done so on multiple occasions.  It
sounds like this policy would make that harder, and I don't see what
we get out of it.  It doesn't bother me that the SGML documentation
of the release notes is big; disk space is cheap.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Release note bloat is getting out of hand

From
Michael Paquier
Date:
On Mon, Feb 2, 2015 at 9:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sun, Feb 1, 2015 at 11:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I propose that we go over to a policy of keeping in HEAD only release
>> notes for actively maintained branches, and that each back branch should
>> retain notes only for branches that were actively maintained when it split
>> off from HEAD.  This would keep about five years worth of history in
>> Appendix E, which should be a roughly stable amount of text.
>
> -1.  I find it very useful to be able to go back through all the
> release notes using grep, and have done so on multiple occasions.  It
> sounds like this policy would make that harder, and I don't see what
> we get out of it.  It doesn't bother me that the SGML documentation
> of the release notes is big; disk space is cheap.

FWIW, -0.5. I think that we should keep documentation down to the
oldest version supported by binary tools; I am referring particularly
to pg_dump, which supports servers down to 7.0. Such information may be
useful for a dump/restore upgrade.
-- 
Michael



Re: Release note bloat is getting out of hand

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Sun, Feb 1, 2015 at 11:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I propose that we go over to a policy of keeping in HEAD only release
>> notes for actively maintained branches, and that each back branch should
>> retain notes only for branches that were actively maintained when it split
>> off from HEAD.  This would keep about five years worth of history in
>> Appendix E, which should be a roughly stable amount of text.

> -1.  I find it very useful to be able to go back through all the
> release notes using grep, and have done so on multiple occasions.  It
> sounds like this policy would make that harder, and I don't see what
> we get out of it.  It doesn't bother me that the SGML documentation
> of the release notes is big; disk space is cheap.

Disk space isn't the only consideration here; if it were I'd not be
concerned about this.  Processing time is an issue, and so is distribution
size, and so is the length of the manual if someone decides to print it
on dead trees.  I also live in fear of the day that we hit some hard-to-
change internal limit in TeX.

Personally, what I grep when I'm looking for historical info is "git log"
output, which will certainly not be getting any shorter.
        regards, tom lane



Re: Release note bloat is getting out of hand

From
Magnus Hagander
Date:
On Mon, Feb 2, 2015 at 3:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Sun, Feb 1, 2015 at 11:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I propose that we go over to a policy of keeping in HEAD only release
>>> notes for actively maintained branches, and that each back branch should
>>> retain notes only for branches that were actively maintained when it split
>>> off from HEAD.  This would keep about five years worth of history in
>>> Appendix E, which should be a roughly stable amount of text.
>
>> -1.  I find it very useful to be able to go back through all the
>> release notes using grep, and have done so on multiple occasions.  It
>> sounds like this policy would make that harder, and I don't see what
>> we get out of it.  It doesn't bother me that the SGML documentation
>> of the release notes is big; disk space is cheap.
>
> Disk space isn't the only consideration here; if it were I'd not be
> concerned about this.  Processing time is an issue, and so is distribution
> size, and so is the length of the manual if someone decides to print it
> on dead trees.  I also live in fear of the day that we hit some hard-to-
> change internal limit in TeX.

Yeah, the PDF size is definitely something to consider in this context. And
the limits.

But if we can find some good way to "archive" or preserve them *outside the
main docs* that should solve this problem, no? We could keep them in SGML
even, but make sure they are not actually included in the build? Would
still be useful for developers there...

Or if we could find a way to do like Josh says - archive them separately
and publish a separate download. We could even keep it in a separate git
repo if we have to, with a "migrate" job to run on a major release?

Re: Release note bloat is getting out of hand

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> Yeah, the PDF size is definitely something to consider in this context. And
> the limits.

> But if we can find some good way to "archive" or preserve them *outside the
> main docs* that should solve this problem, no? We could keep them in SGML
> even, but make sure they are not actually included in the build? Would
> still be useful for developers there...

> Or if we could find a way to do like Josh says - archive them separately
> and publish a separate download. We could even keep it in a separate git
> repo if we have to, with a "migrate" job to run on a major release?

Yeah, seems like this and Josh's request could both be addressed fine
with a separate document.

I could live with keeping the ancient-branch release note SGML files
around in HEAD --- I'd hoped to reduce the size of tarballs a bit, but the
savings by that measure would only be a few percent (at present anyway).
What's more important is to get them out of the main documentation build.
So how about cutting the main doc build down to last-five-branches,
and adding a non-default make target that produces a separate document
consisting of (only) the complete release note history?
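
A minimal sketch of the "last-five-branches" split, assuming release-note files follow the `release-X.Y.sgml` naming seen in the diffstat earlier in the thread. The helper names `branch_key` and `split_release_notes` are hypothetical, the file list is only an illustrative subset of branches, and none of this reflects how the actual doc build is wired up.

```python
import re

def branch_key(filename):
    """Turn 'release-9.0.sgml' into a sortable tuple such as (9, 0)."""
    match = re.fullmatch(r"release-(\d+)\.(\d+)\.sgml", filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename}")
    return int(match.group(1)), int(match.group(2))

def split_release_notes(filenames, keep=5):
    """Return (main_build, archive_only): the newest `keep` branches, then the rest."""
    ordered = sorted(filenames, key=branch_key, reverse=True)
    return ordered[:keep], ordered[keep:]

# Illustrative subset of branches; the real tree holds many more files.
files = [f"release-{v}.sgml"
         for v in ("8.2", "8.3", "8.4", "9.0", "9.1", "9.2", "9.3", "9.4")]

main_build, archive_only = split_release_notes(files)
print("main doc build:", main_build)
print("separate history document only:", archive_only)
```

With `keep=5` and 9.4 as the newest branch, the oldest file left in the main build is `release-9.0.sgml`; everything older would appear only in the separate full-history document.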
        regards, tom lane



Re: Release note bloat is getting out of hand

From
Robert Haas
Date:
On Mon, Feb 2, 2015 at 10:43 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> Yeah, the PDF size is definitely something to consider in this context. And
>> the limits.
>
>> But if we can find some good way to "archive" or preserve them *outside the
>> main docs* that should solve this problem, no? We could keep them in SGML
>> even, but make sure they are not actually included in the build? Would
>> still be useful for developers there...
>
>> Or if we could find a way to do like Josh says - archive them separately
>> and publish a separate download. We could even keep it in a separate git
>> repo if we have to, with a "migrate" job to run on a major release?
>
> Yeah, seems like this and Josh's request could both be addressed fine
> with a separate document.
>
> I could live with keeping the ancient-branch release note SGML files
> around in HEAD --- I'd hoped to reduce the size of tarballs a bit, but the
> savings by that measure would only be a few percent (at present anyway).
> What's more important is to get them out of the main documentation build.
> So how about cutting the main doc build down to last-five-branches,
> and adding a non-default make target that produces a separate document
> consisting of (only) the complete release note history?

The last 5 branches only takes us back to 9.0, which isn't very far.
I would want to have at least the 8.x branches in the SGML build, and
maybe the 7.x branches as well.  I would be happy to drop anything
pre-7.x from the docs build and just let the people who care look at
the SGML.  You seem to be assuming that nobody spends much time
looking at the release notes for older branches, but that is certainly
false in my own case.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Release note bloat is getting out of hand

From
Josh Berkus
Date:
On 02/02/2015 07:54 AM, Robert Haas wrote:
>> I could live with keeping the ancient-branch release note SGML files
>> around in HEAD --- I'd hoped to reduce the size of tarballs a bit, but the
>> savings by that measure would only be a few percent (at present anyway).
>> What's more important is to get them out of the main documentation build.
>> So how about cutting the main doc build down to last-five-branches,
>> and adding a non-default make target that produces a separate document
>> consisting of (only) the complete release note history?
> The last 5 branches only takes us back to 9.0, which isn't very far.
> I would want to have at least the 8.x branches in the SGML build, and
> maybe the 7.x branches as well.  I would be happy to drop anything
> pre-7.x from the docs build and just let the people who care look at
> the SGML.  You seem to be assuming that nobody spends much time
> looking at the release notes for older branches, but that is certainly
> false in my own case.

I was suggesting having a separate "historical release notes" tarball,
actually.  If that's in SGML, and can be built using our doc tools, we
haven't lost anything and we've reduced the size of the distribution
tarball.

One of the things I've been tinkering with for a while is a better
searchable version of the release notes.  The problem I keep running
into is that it's very difficult to write an error-free importer from
the present SGML file; there's just too much variation in how certain
things are recorded, and SGML just isn't a database import format.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: Release note bloat is getting out of hand

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> On 02/02/2015 07:54 AM, Robert Haas wrote:
>> The last 5 branches only takes us back to 9.0, which isn't very far.
>> I would want to have at least the 8.x branches in the SGML build, and
>> maybe the 7.x branches as well.  I would be happy to drop anything
>> pre-7.x from the docs build and just let the people who care look at
>> the SGML.  You seem to be assuming that nobody spends much time
>> looking at the release notes for older branches, but that is certainly
>> false in my own case.

> I was suggesting having a separate "historical release notes" tarball,
> actually.  If that's in SGML, and can be built using our doc tools, we
> haven't lost anything and we've reduced the size of the distribution
> tarball.

That was pretty much my point as well.  Sure, we can keep all the notes
online somewhere; that doesn't mean they have to be in the standard
distribution tarball, nor in the standard documentation build.

> One of the things I've been tinkering with for a while is a better
> searchable version of the release notes.  The problem I keep running
> into is that it's very difficult to write an error-free importer from
> the present SGML file; there's just too much variation in how certain
> things are recorded, and SGML just isn't a database import format.

The existing release notes are not conveniently searchable, for sure;
they're not in a single file, and they don't show up on a single page
on the Web, and I've never seen a PDF-searching tool that didn't suck.
So I'm bemused by Robert's insistence that he wants that format to support
searches.  As I said, I find it far more convenient to search the output
of "git log" and/or src/tools/git_changelog --- I keep text files of those
around for exactly that purpose.
        regards, tom lane



Re: Release note bloat is getting out of hand

From
Robert Haas
Date:
On Mon, Feb 2, 2015 at 3:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The existing release notes are not conveniently searchable, for sure;
> they're not in a single file, and they don't show up on a single page
> on the Web, and I've never seen a PDF-searching tool that didn't suck.
> So I'm bemused by Robert's insistence that he wants that format to support
> searches.  As I said, I find it far more convenient to search the output
> of "git log" and/or src/tools/git_changelog --- I keep text files of those
> around for exactly that purpose.

I normally search in one of two ways.  Sometimes I grep the sgml;
other times, I go to, say,
http://www.postgresql.org/docs/devel/static/release-9-4.html and then
edit the URL to take me back to 9.3, 9.2, 9.1, etc.  It's true that
'git log' is often the place to go searching for stuff, but there are
times when it's easier to find out what release introduced a feature
by looking at the release notes, and it's certainly more useful if you
want to send a link to someone who is not git-aware illustrating the
results of your search.
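
The first of those two search styles, grepping the SGML, can be approximated in a few lines. This is only a sketch: the function name `find_in_release_notes` is hypothetical, and the `release-*.sgml` pattern and the `doc/src/sgml` directory are taken from the file names quoted earlier in the thread.

```python
from pathlib import Path

def find_in_release_notes(sgml_dir, term):
    """Return the names of release-*.sgml files in sgml_dir that mention term.

    The match is a plain case-insensitive substring search, roughly what
    `grep -il term release-*.sgml` would report.
    """
    hits = []
    for path in sorted(Path(sgml_dir).glob("release-*.sgml")):
        text = path.read_text(encoding="utf-8", errors="replace")
        if term.lower() in text.lower():
            hits.append(path.name)
    return hits

# Example call from the top of a source checkout (search term is up to you):
#   find_in_release_notes("doc/src/sgml", "some feature name")
```

Sorting the hits by file name puts the per-branch files in a predictable order, which helps when the goal is to spot the earliest release that mentions a feature.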

Well, maybe I'm the only one who is doing this and it's not worth
worrying about it just for me.  But I do it, all the same.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Release note bloat is getting out of hand

From
"Greg Sabino Mullane"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160


Robert Haas wrote:
> but there are times when it's easier to find out what release 
> introduced a feature by looking at the release notes, and it's 
> certainly more useful if you want to send a link to someone who 
> is not git-aware illustrating the results of your search.
>
> Well, maybe I'm the only one who is doing this and it's not worth
> worrying about it just for me.  But I do it, all the same.

I do this *all the time*. Please don't mess with the release notes.
Except to put them all on one page for easy searching. That would 
be awesome.

- -- 
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201502021555
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAlTP5EQACgkQvJuQZxSWSsj13QCfTrKBKDlOm0E5K4+2ib7F8Tjl
w5QAoOY3vX9tUb1KUxk3VaW+k71vrW7m
=y+SU
-----END PGP SIGNATURE-----





Re: Release note bloat is getting out of hand

From
Andreas Karlsson
Date:
On 02/02/2015 09:38 PM, Robert Haas wrote:
> Well, maybe I'm the only one who is doing this and it's not worth
> worrying about it just for me.  But I do it, all the same.

I do the latter quite often: link people to old release notes. For me it 
would be fine to remove them from tarballs as long as they are still on 
the website.

-- 
Andreas Karlsson



Re: Release note bloat is getting out of hand

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Feb 2, 2015 at 3:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> So I'm bemused by Robert's insistence that he wants that format to support
>> searches.  As I said, I find it far more convenient to search the output
>> of "git log" and/or src/tools/git_changelog --- I keep text files of those
>> around for exactly that purpose.

> I normally search in one of two ways.  Sometimes I grep the sgml;
> other times, I go to, say,
> http://www.postgresql.org/docs/devel/static/release-9-4.html and then
> edit the URL to take me back to 9.3, 9.2, 9.1, etc.  It's true that
> 'git log' is often the place to go searching for stuff, but there are
> times when it's easier to find out what release introduced a feature
> by looking at the release notes, and it's certainly more useful if you
> want to send a link to someone who is not git-aware illustrating the
> results of your search.

> Well, maybe I'm the only one who is doing this and it's not worth
> worrying about it just for me.  But I do it, all the same.

I'm not out to take away a feature you need.  I'm just wondering why it
has to be supported in exactly the way it's done now.  Wouldn't a
separately maintained release-notes-only document serve the purpose fine?
        regards, tom lane



Re: Release note bloat is getting out of hand

From
Andres Freund
Date:
On February 2, 2015 9:38:43 PM CET, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Feb 2, 2015 at 3:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The existing release notes are not conveniently searchable, for sure;
>> they're not in a single file, and they don't show up on a single page
>> on the Web, and I've never seen a PDF-searching tool that didn't suck.
>> So I'm bemused by Robert's insistence that he wants that format to support
>> searches.  As I said, I find it far more convenient to search the output
>> of "git log" and/or src/tools/git_changelog --- I keep text files of those
>> around for exactly that purpose.
>
> I normally search in one of two ways.  Sometimes I grep the sgml;
> other times, I go to, say,
> http://www.postgresql.org/docs/devel/static/release-9-4.html and then
> edit the URL to take me back to 9.3, 9.2, 9.1, etc.

FWIW I do the same. Git log is great if you want all detail. But often
enough the more condensed format of the release notes is helpful. Say, a
customer has problems after migrating to a new version. It's quite a bit
faster to read the section about incompatibilities than travel through the
git log.

There's a reason the release notes exist. Given that they're apparently
useful, it doesn't seem strange that devs sometimes read them...



--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.



Re: Release note bloat is getting out of hand

From
"Joshua D. Drake"
Date:
On 02/02/2015 07:54 AM, Robert Haas wrote:

> The last 5 branches only takes us back to 9.0, which isn't very far.
> I would want to have at least the 8.x branches in the SGML build, and
> maybe the 7.x branches as well.  I would be happy to drop anything
> pre-7.x from the docs build and just let the people who care look at
> the SGML.  You seem to be assuming that nobody spends much time
> looking at the release notes for older branches, but that is certainly
> false in my own case.

It seems to me that the docs that are shipped should only contain 
information in regards to supported versions. Frankly there is no reason 
to ship any release notes except for the version that they are shipping 
with (e.g; there is no reason for 9.0 to be in 9.1). It is just bloat at 
that point when we can point everyone to the website or ftp site.

JD




-- 
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, @cmdpromptinc
"If we send our children to Caesar for their education, we should
not be surprised when they come back as Romans."



Re: Release note bloat is getting out of hand

From
Peter Eisentraut
Date:
On 2/1/15 11:10 PM, Tom Lane wrote:
> I think it's time we changed the policy of including all release notes
> back to the beginning in Appendix E.

I share the sentiment that the release notes *seem* too big, but the
subsequent discussion shows that it's not clear why that's really a
problem.  Exactly what problem are we trying to fix?




Re: Release note bloat is getting out of hand

From
David G Johnston
Date:
On Mon, Feb 2, 2015 at 6:40 PM, Peter Eisentraut-2 [via PostgreSQL] <[hidden email]> wrote:
> On 2/1/15 11:10 PM, Tom Lane wrote:
>> I think it's time we changed the policy of including all release notes
>> back to the beginning in Appendix E.
>
> I share the sentiment that the release notes *seem* too big, but the
> subsequent discussion shows that it's not clear why that's really a
> problem.  Exactly what problem are we trying to fix?

We'd get a lines-of-code decrease, which would translate into an
improvement in the make process time; most noticeable for someone doing a
doc-only build, multiple times, to see how a doc change looks.  No time
percentage has been provided yet, but the goal seems reasonable in theory.

David J.
 



Re: Release note bloat is getting out of hand

From
Josh Berkus
Date:
On 02/02/2015 05:39 PM, Peter Eisentraut wrote:
> On 2/1/15 11:10 PM, Tom Lane wrote:
>> I think it's time we changed the policy of including all release notes
>> back to the beginning in Appendix E.
> 
> I share the sentiment that the release notes *seem* too big, but the
> subsequent discussion shows that it's not clear why that's really a
> problem.  Exactly what problem are we trying to fix?

At a rough count of lines, the release notes for unsupported versions
are about 18% of documentation overall (47K out of 265K lines).  So
they're not insubstantial.  Compared to the total size of the tarball,
though ...

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: Release note bloat is getting out of hand

From
Jim Nasby
Date:
On 2/2/15 3:10 PM, Andres Freund wrote:
> On February 2, 2015 9:38:43 PM CET, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Mon, Feb 2, 2015 at 3:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> The existing release notes are not conveniently searchable, for sure;
>>> they're not in a single file, and they don't show up on a single page
>>> on the Web, and I've never seen a PDF-searching tool that didn't suck.
>>> So I'm bemused by Robert's insistence that he wants that format to support
>>> searches.  As I said, I find it far more convenient to search the output
>>> of "git log" and/or src/tools/git_changelog --- I keep text files of those
>>> around for exactly that purpose.
>>
>> I normally search in one of two ways.  Sometimes I grep the sgml;
>> other times, I go to, say,
>> http://www.postgresql.org/docs/devel/static/release-9-4.html and then
>> edit the URL to take me back to 9.3, 9.2, 9.1, etc.
>
> FWIW I do the same. Git log is great if you want all detail. But often
> enough the more condensed format of the release notes is helpful. Say, a
> customer has problems after migrating to a new version. It's quite a bit
> faster to read the section about incompatibilities than travel through the
> git log.

This wouldn't prevent that; you could still point them to 
http://www.postgresql.org/docs/7.1/static/release-0-01.html

> There's a reason the release notes exist. Given that they're apparently
> useful, it doesn't seem strange that devs sometimes read them...

Sure, but devs have any number of other ways to get at this info, and 
in a fashion that's actually *more* useful to them. Several people have 
asked for a single grep-able file, for example. ISTM that keeping such a 
file around in the source (and perhaps in /src instead of /doc) should 
be easy.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: Release note bloat is getting out of hand

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> On 02/02/2015 05:39 PM, Peter Eisentraut wrote:
>> I share the sentiment that the release notes *seem* too big, but the
>> subsequent discussion shows that it's not clear why that's really a
>> problem.  Exactly what problem are we trying to fix?

> At a rough count of lines, the release notes for unsupported versions
> are about 18% of documentation overall (47K out of 265K lines).  So
> they're not insubstantial.  Compared to the total size of the tarball,
> though ...

It would not make that much of a difference in tarball size, agreed.
It *would* make a difference in the build time and output size of the
SGML docs --- as I mentioned at the outset, the release notes currently
account for 25% of the SGML source linecount.

Now, that's probably still only marginally a problem, but my real
point is that this is not sustainable.  The release notes are growing
faster than the rest of the docs.  This isn't so obvious if you compare
adjacent release branches, but over a slightly longer timescale it is.
A quick "wc -l" in my current git checkouts gives

Release    release-*.sgml    all .sgml    Percent

8.3             37770          204060       18.5
9.0             59318          250493       23.7
HEAD            85672          336874       25.4
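
The Percent column, and the claim that the release notes are growing faster than the rest of the docs, can be rechecked from the table alone. The following is a minimal sketch; the growth comparison between 8.3 and HEAD is derived here from the table's numbers and was not stated in this form in the thread.

```python
# (branch, release-*.sgml lines, all .sgml lines), copied from the table above.
counts = [
    ("8.3", 37770, 204060),
    ("9.0", 59318, 250493),
    ("HEAD", 85672, 336874),
]

# Release notes as a percentage of all SGML lines, per branch.
shares = {branch: round(100.0 * notes / total, 1)
          for branch, notes, total in counts}
print(shares)  # {'8.3': 18.5, '9.0': 23.7, 'HEAD': 25.4}

# Growth from 8.3 to HEAD: release notes versus everything else.
_, notes_old, total_old = counts[0]
_, notes_new, total_new = counts[-1]
notes_growth = notes_new / notes_old - 1
rest_growth = (total_new - notes_new) / (total_old - notes_old) - 1
print(f"release notes grew {notes_growth:.0%}, other docs grew {rest_growth:.0%}")
```

By these counts the release notes slightly more than doubled between 8.3 and HEAD, while the remaining documentation grew by roughly half.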

We can stick our heads in the sand for a while longer yet, but
eventually this is going to have to be dealt with.
        regards, tom lane



Re: Release note bloat is getting out of hand

From
Kevin Grittner
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Josh Berkus <josh@agliodbs.com> writes:
>> On 02/02/2015 05:39 PM, Peter Eisentraut wrote:
>>> I share the sentiment that the release notes *seem* too big, but the
>>> subsequent discussion shows that it's not clear why that's really a
>>> problem.  Exactly what problem are we trying to fix?
>>
>> At a rough count of lines, the release notes for unsupported versions
>> are about 18% of documentation overall (47K out of 265K lines).  So
>> they're not insubstantial.  Compared to the total size of the tarball,
>> though ...
>
> It would not make that much of a difference in tarball size, agreed.
> It *would* make a difference in the build time and output size of the
> SGML docs --- as I mentioned at the outset, the release notes currently
> account for 25% of the SGML source linecount.

I run `make -s -j4 world` on my i7 fairly often, and it is often
the doc build that I wind up waiting for at the end.

FWIW, my preference would be that unless you choose a special "all
release notes" build, it only builds release notes for supported
versions and only up to the point that the branch being built split
off.  That way, you don't have to work at it to see whether the
release notes from an older branch duplicate the same bug fixes as
a later branch that is listed.  I think it makes sense to put the
"all release notes" page up on our web site.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Release note bloat is getting out of hand

From
Andrew Dunstan
Date:
On 02/03/2015 08:55 AM, Kevin Grittner wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Josh Berkus <josh@agliodbs.com> writes:
>>> On 02/02/2015 05:39 PM, Peter Eisentraut wrote:
>>>> I share the sentiment that the release notes *seem* too big, but the
>>>> subsequent discussion shows that it's not clear why that's really a
>>>> problem.  Exactly what problem are we trying to fix?
>>> At a rough count of lines, the release notes for unsupported versions
>>> are about 18% of documentation overall (47K out of 265K lines).  So
>>> they're not insubstantial.  Compared to the total size of the tarball,
>>> though ...
>> It would not make that much of a difference in tarball size, agreed.
>> It *would* make a difference in the build time and output size of the
>> SGML docs --- as I mentioned at the outset, the release notes currently
>> account for 25% of the SGML source linecount.
> I run `make -s -j4 world` on my i7 fairly often, and it is often
> the doc build that I wind up waiting for at the end.
>
>

I realize this is slightly OT, but I wonder if it might be worth having 
targets that build and install everything but the docs.

cheers

andrew



Re: Release note bloat is getting out of hand

From
Robert Haas
Date:
On Mon, Feb 2, 2015 at 4:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Mon, Feb 2, 2015 at 3:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> So I'm bemused by Robert's insistence that he wants that format to support
>>> searches.  As I said, I find it far more convenient to search the output
>>> of "git log" and/or src/tools/git_changelog --- I keep text files of those
>>> around for exactly that purpose.
>
>> I normally search in one of two ways.  Sometimes I grep the sgml;
>> other times, I go to, say,
>> http://www.postgresql.org/docs/devel/static/release-9-4.html and then
>> edit the URL to take me back to 9.3, 9.2, 9.1, etc.  It's true that
>> 'git log' is often the place to go searching for stuff, but there are
>> times when it's easier to find out what release introduced a feature
>> by looking at the release notes, and it's certainly more useful if you
>> want to send a link to someone who is not git-aware illustrating the
>> results of your search.
>
>> Well, maybe I'm the only one who is doing this and it's not worth
>> worrying about it just for me.  But I do it, all the same.
>
> I'm not out to take away a feature you need.  I'm just wondering why it
> has to be supported in exactly the way it's done now.  Wouldn't a
> separately maintained release-notes-only document serve the purpose fine?

Well, I'd like to have all the release notes on the web site as they
are now, and I'd like to have all the SGML in the build tree as it is
now.  If someone wants to create a make docs-fast target that omits
older release notes, that's obviously not a problem.  What I don't
want to do is have information that is easily available now (put the
right two digits into the URL bar and you are done) suddenly require
more thought to get at (OK, so back through 9.0 is on the web site,
and then there's this other place you can go to get the older stuff,
if you can remember where it is).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Release note bloat is getting out of hand

From
Magnus Hagander
Date:
On Tue, Feb 3, 2015 at 3:51 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Feb 2, 2015 at 4:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> On Mon, Feb 2, 2015 at 3:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> So I'm bemused by Robert's insistence that he wants that format to support
>>>> searches.  As I said, I find it far more convenient to search the output
>>>> of "git log" and/or src/tools/git_changelog --- I keep text files of those
>>>> around for exactly that purpose.
>>
>>> I normally search in one of two ways.  Sometimes I grep the sgml;
>>> other times, I go to, say,
>>> http://www.postgresql.org/docs/devel/static/release-9-4.html and then
>>> edit the URL to take me back to 9.3, 9.2, 9.1, etc.  It's true that
>>> 'git log' is often the place to go searching for stuff, but there are
>>> times when it's easier to find out what release introduced a feature
>>> by looking at the release notes, and it's certainly more useful if you
>>> want to send a link to someone who is not git-aware illustrating the
>>> results of your search.
>>
>>> Well, maybe I'm the only one who is doing this and it's not worth
>>> worrying about it just for me.  But I do it, all the same.
>>
>> I'm not out to take away a feature you need.  I'm just wondering why it
>> has to be supported in exactly the way it's done now.  Wouldn't a
>> separately maintained release-notes-only document serve the purpose fine?
>
> Well, I'd like to have all the release notes on the web site as they
> are now, and I'd like to have all the SGML in the build tree as it is
> now.  If someone wants to create a make docs-fast target that omits
> older release notes, that's obviously not a problem.  What I don't
> want to do is have information that is easily available now (put the
> right two digits into the URL bar and you are done) suddenly require
> more thought to get at (OK, so back through 9.0 is on the web site,
> and then there's this other place you can go to get the older stuff,
> if you can remember where it is).


Not sure how feasible that is through build targets, but perhaps we
could also include it in the HTML docs for the website per above, but
skip it in the PDFs, which would make them a *lot* easier to deal with,
I bet.
 
--

Re: Release note bloat is getting out of hand

From
David Fetter
Date:
On Tue, Feb 03, 2015 at 09:08:45AM -0500, Andrew Dunstan wrote:
> 
> On 02/03/2015 08:55 AM, Kevin Grittner wrote:
> >Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>Josh Berkus <josh@agliodbs.com> writes:
> >>>On 02/02/2015 05:39 PM, Peter Eisentraut wrote:
> >>>>I share the sentiment that the release notes *seem* too big, but the
> >>>>subsequent discussion shows that it's not clear why that's really a
> >>>>problem.  Exactly what problem are we trying to fix?
> >>>At a rough count of lines, the release notes for unsupported versions
> >>>are about 18% of documentation overall (47K out of 265K lines).  So
> >>>they're not insubstantial.  Compared to the total size of the tarball,
> >>>though ...
> >>It would not make that much of a difference in tarball size, agreed.
> >>It *would* make a difference in the build time and output size of the
> >>SGML docs --- as I mentioned at the outset, the release notes currently
> >>account for 25% of the SGML source linecount.
> >I run `make -s -j4 world` on my i7 fairly often, and it is often
> >the doc build that I wind up waiting for at the end.
> 
> I realize this is slightly OT, but I wonder if it might be worth having
> targets that build and install everything but the docs.

That'd be great.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Release note bloat is getting out of hand

From
David Fetter
Date:
On Mon, Feb 02, 2015 at 08:56:19PM -0000, Greg Sabino Mullane wrote:
> Robert Haas wrote:
> > but there are times when it's easier to find out what release 
> > introduced a feature by looking at the release notes, and it's 
> > certainly more useful if you want to send a link to someone who 
> > is not git-aware illustrating the results of your search.
> >
> > Well, maybe I'm the only one who is doing this and it's not worth
> > worrying about it just for me.  But I do it, all the same.
> 
> I do this *all the time*. Please don't mess with the release notes.
> Except to put them all on one page for easy searching. That would 
> be awesome.

Truly awesome.  When supporting older versions, being able to find
precisely when a feature was introduced in a single search through a
file, or "find in page," for those using web browsers, would really
smooth off some burrs.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
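
A minimal sketch of the "single search" idea, assuming the text of each
release's notes has already been loaded into a dict keyed by version
string. The function names and the sample data below are hypothetical
and deliberately generic; they are not taken from the PostgreSQL tree.

```python
def version_key(version):
    """Turn '1.10' into (1, 10) so versions sort numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def first_release_mentioning(notes_by_version, term):
    """Return the earliest version whose notes mention term, or None."""
    needle = term.lower()
    hits = [
        version
        for version, text in notes_by_version.items()
        if needle in text.lower()
    ]
    return min(hits, key=version_key) if hits else None


# Made-up excerpts; real input would be the text of the release-*.sgml
# files or of a single "all release notes" page.
NOTES = {
    "1.10": "Improve widget caching",
    "1.9": "Add widget caching",
    "1.2": "Initial gadget support",
}

if __name__ == "__main__":
    print(first_release_mentioning(NOTES, "widget caching"))
```

A plain substring match is only a stand-in for "find in page"; it says
where a term first appears, which is not always where the feature was
introduced.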



Re: Release note bloat is getting out of hand

From
Marko Tiikkaja
Date:
On 2/3/15 4:04 PM, David Fetter wrote:
> On Mon, Feb 02, 2015 at 08:56:19PM -0000, Greg Sabino Mullane wrote:
>> Robert Haas wrote:
>>> but there are times when it's easier to find out what release
>>> introduced a feature by looking at the release notes, and it's
>>> certainly more useful if you want to send a link to someone who
>>> is not git-aware illustrating the results of your search.
>>>
>>> Well, maybe I'm the only one who is doing this and it's not worth
>>> worrying about it just for me.  But I do it, all the same.
>>
>> I do this *all the time*. Please don't mess with the release notes.
>> Except to put them all on one page for easy searching. That would
>> be awesome.
>
> Truly awesome.  When supporting older versions, being able to find
> precisely when a feature was introduced in a single search through a
> file, or "find in page," for those using web browsers, would really
> smooth off some burrs.

And now that we're on the subject of ponies, it would be nice if the 
relevant git hashes were included as well.


.m



Re: Release note bloat is getting out of hand

From
Peter Eisentraut
Date:
On 2/3/15 10:11 AM, Marko Tiikkaja wrote:
> And now that we're on the subject of ponies, it would be nice if the
> relevant git hashes were included as well.

That's probably not going to happen.  A release-note entry is often the
combination of many commits, and accurately tracking those is a lot of work.





Re: Release note bloat is getting out of hand

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> On 2/3/15 10:11 AM, Marko Tiikkaja wrote:
>> And now that we're on the subject of ponies, it would be nice if the
>> relevant git hashes were included as well.

> That's probably not going to happen.  A release-note entry is often the
> combination of many commits, and accurately tracking those is a lot of work.

Well, actually, modern release note entries tend to look like

<!--
Author: Andres Freund <andres@anarazel.de>
Branch: master [3fabed070] 2015-01-07 00:19:37 +0100
Branch: REL9_4_STABLE [7da102154] 2015-01-07 00:24:58 +0100
Author: Andres Freund <andres@anarazel.de>
Branch: master [31912d01d] 2015-01-07 00:18:00 +0100
Branch: REL9_4_STABLE [84911ff51] 2015-01-07 00:24:47 +0100
-->
   <listitem>
    <para>
     Assorted fixes for logical decoding (Andres Freund)
    </para>
   </listitem>

because they're prepared on the basis of src/tools/git_changelog output
and so having the git hashes is merely a matter of not deleting that
info after we're done writing the user-visible text.  So I could imagine
some tool that presents that info.  I'm not volunteering to write it
though ...
        regards, tom lane
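
A minimal sketch of what "some tool that presents that info" could look
like, assuming entries follow the layout quoted above: an SGML comment
with `Branch:` lines immediately followed by a `<listitem>`. The name
`extract_entries` and both regular expressions are illustrative
assumptions, not an existing PostgreSQL tool.

```python
import re

# Sample copied from the entry quoted above.
SAMPLE = """<!--
Author: Andres Freund <andres@anarazel.de>
Branch: master [3fabed070] 2015-01-07 00:19:37 +0100
Branch: REL9_4_STABLE [7da102154] 2015-01-07 00:24:58 +0100
Author: Andres Freund <andres@anarazel.de>
Branch: master [31912d01d] 2015-01-07 00:18:00 +0100
Branch: REL9_4_STABLE [84911ff51] 2015-01-07 00:24:47 +0100
-->
   <listitem>
    <para>
     Assorted fixes for logical decoding (Andres Freund)
    </para>
   </listitem>
"""

# A comment (never spanning past its own "-->") directly followed by an item.
ENTRY_RE = re.compile(
    r"<!--((?:(?!-->).)*)-->\s*<listitem>\s*<para>(.*?)</para>",
    re.DOTALL,
)
# "Branch: <name> [<abbreviated hash>] ..." lines inside the comment.
BRANCH_RE = re.compile(r"^Branch:\s+(\S+)\s+\[([0-9a-f]+)\]", re.MULTILINE)


def extract_entries(sgml):
    """Return a list of (summary, [(branch, hash), ...]) tuples."""
    entries = []
    for comment, para in ENTRY_RE.findall(sgml):
        commits = BRANCH_RE.findall(comment)
        if commits:
            summary = " ".join(para.split())  # collapse SGML indentation
            entries.append((summary, commits))
    return entries


if __name__ == "__main__":
    for summary, commits in extract_entries(SAMPLE):
        print(summary)
        for branch, commit in commits:
            print(f"  {branch}: {commit}")
```

This only handles the simple case of a comment directly followed by one
item with a single paragraph; entries without a comment are skipped.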



Re: Release note bloat is getting out of hand

From
Peter Eisentraut
Date:
On 2/3/15 12:32 AM, Tom Lane wrote:
> It would not make that much of a difference in tarball size, agreed.
> It *would* make a difference in the build time and output size of the
> SGML docs --- as I mentioned at the outset, the release notes currently
> account for 25% of the SGML source linecount.

We could make the release notes a separate top-level document.  Then it
could be built in parallel with the documentation.  There are only two
links from the rest of the docs into the release notes, so there
wouldn't be much harm.

Doing this would also solve an unrelated issue:  Sometimes, when I run a
spell or markup checker over the documentation, the old release notes
tend to get flagged for many things, because we don't edit those after
the release.  Separating the actively maintained documentation from the
it-is-what-it-is release notes would make that distinction clearer.

The web site would need to be reorganized slightly, but I can imagine
that in the long run this could also be a win.

Note also that you only need to present the release notes from the
latest stable release branch on the web site, as opposed to
documentation for each branch.




Re: Release note bloat is getting out of hand

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Note also that you only need to present the release notes from the
> latest stable release branch on the web site, as opposed to
> documentation for each branch.

Yeah, JD suggested the same upthread.  If we went over to a separate
document containing all the historical notes, then it would make sense
for the main documentation to contain only release notes for the current
branch, which would further reduce its build time.  My thread-starting
proposal of keeping the last five branches was based on the assumption
that we didn't need any whole-history document, but if we're keeping one
separately then this seems to make the most sense.
        regards, tom lane



Re: Release note bloat is getting out of hand

From
Steven Lembark
Date:
> -1.  I find it very useful to be able to go back through all the
> release notes using grep, and have done so on multiple occasions.  It
> sounds like this policy would make that harder, and I don't see what
> we get out of it.  It doesn't bother me that the SGML documentation
> of the release notes is big; disk space is cheap.

Put the full release notes in a tarball on the website, distribute
the relevant portions with the distro?

A single wget "ftp://foo/bar-current.tar.gz" will pull the whole thing
for full history; though if you're grepping it, having a set of files
with version-specific changes works equally well.

-- 
Steven Lembark                                             3646 Flora Pl
Workhorse Computing                                   St Louis, MO 63110
lembark@wrkhors.com                                      +1 888 359 3508



Re: Release note bloat is getting out of hand

From
Steven Lembark
Date:
> Disk space isn't the only consideration here; if it were I'd not be
> concerned about this.  Processing time is an issue, and so is distribution
> size, and so is the length of the manual if someone decides to print it
> on dead trees.  I also live in fear of the day that we hit some hard-to-
> change internal limit in TeX.
> 
> Personally, what I grep when I'm looking for historical info is "git log"
> output, which will certainly not be getting any shorter.

Fifteen years ago distributing it all made sense: not everyone had
access to get the docs and they were manageable and didn't take too
long to download at 9600Kb (N71...).

Today most users will have direct internet access and the whole
content is intimidating (let alone heavy if you try to print it).
Searching the docs actually becomes more difficult due to the sheer
volume of matches from outdated packages.

A majority of users these days will look up anything they need via 
Google (Duck, Y!...) rather than search the original content anyway: 
having it available via web actually makes it more searchable in that 
case. Left to my own devices I'd rather have the current major version
locally for the rare times my connection is down when I have to 
restore a database and be able to search version-specific content 
via the net otherwise.

-- 
Steven Lembark                                             3646 Flora Pl
Workhorse Computing                                   St Louis, MO 63110
lembark@wrkhors.com                                      +1 888 359 3508



Re: Release note bloat is getting out of hand

From
Steven Lembark
Date:
> searchable version of the release notes.

Would be a wonderful thing if it happened.

Segregating the content by version would help -- finding lots of notes
about version 7 & 8 when I'm running 9.3/4 helps not at all.



-- 
Steven Lembark                                             3646 Flora Pl
Workhorse Computing                                   St Louis, MO 63110
lembark@wrkhors.com                                      +1 888 359 3508



Re: Release note bloat is getting out of hand

From
Andrew Dunstan
Date:
On 02/03/2015 10:00 AM, David Fetter wrote:
> On Tue, Feb 03, 2015 at 09:08:45AM -0500, Andrew Dunstan wrote:
>> On 02/03/2015 08:55 AM, Kevin Grittner wrote:
>>> I run `make -s -j4 world` on my i7 fairly often, and it is often
>>> the doc build that I wind up waiting for at the end.
>> I realize this is slightly OT, but I wonder if it might be worth having
>> targets that build and install everything but the docs.
> That'd be great.
>



Here's a tiny patch for that

cheers

andrew





Attachment



Re: Release note bloat is getting out of hand

From
Peter Eisentraut
Date:
On 2/4/15 2:24 PM, Andrew Dunstan wrote:
> 
> On 02/03/2015 10:00 AM, David Fetter wrote:
>> On Tue, Feb 03, 2015 at 09:08:45AM -0500, Andrew Dunstan wrote:
>>> On 02/03/2015 08:55 AM, Kevin Grittner wrote:
>>>> I run `make -s -j4 world` on my i7 fairly often, and it is often
>>>> the doc build that I wind up waiting for at the end.
>>> I realize this is slightly OT, but I wonder if it might be worth having
>>> targets that build and install everything but the docs.
>> That'd be great.

> Here's a tiny patch for that

Not excited about that name.  (Does "bin" include "lib?")

If we're reshuffling, how about renaming world to all, and adding
all-no-doc or something similarly explicit?

Or maybe use a make variable, like NO_DOC.  I think that's preferable to
adding more targets.




Re: Release note bloat is getting out of hand

From
Peter Eisentraut
Date:
On 2/3/15 11:53 AM, Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
>> Note also that you only need to present the release notes from the
>> latest stable release branch on the web site, as opposed to
>> documentation for each branch.
> 
> Yeah, JD suggested the same upthread.  If we went over to a separate
> document containing all the historical notes, then it would make sense
> for the main documentation to contain only release notes for the current
> branch, which would further reduce its build time.  My thread-starting
> proposal of keeping the last five branches was based on the assumption
> that we didn't need any whole-history document, but if we're keeping one
> separately then this seems to make the most sense.

I think that's not what I was trying to say.  My proposal would be to
leave the source code in each branch exactly the same, but redefine the
documentation build to build two separate documents: one with the
documentation without any release notes, and one with all the release notes.




Re: Release note bloat is getting out of hand

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> On 2/4/15 2:24 PM, Andrew Dunstan wrote:
>> On 02/03/2015 10:00 AM, David Fetter wrote:
>>> On Tue, Feb 03, 2015 at 09:08:45AM -0500, Andrew Dunstan wrote:
>>>> I realize this is slightly OT, but I wonder if it might be worth having
>>>> targets that build and install everything but the docs.

>> Here's a tiny patch for that

> Not excited about that name.  (Does "bin" include "lib?")

Yeah, doesn't seem quite right to me either.

> If we're reshuffling, how about renaming world to all, and adding
> all-no-doc or something similarly explicit?

-1 for renaming any existing targets; that's quite likely to cause
packagers pain, for little gain IMO.

> Or maybe use a make variable, like NO_DOC.  I think that's preferable to
> adding more targets.

Unless we can come up with a new target name that obviously means
"world minus docs", the make-variable idea may be the best.
        regards, tom lane



Re: Release note bloat is getting out of hand

From
Andrew Dunstan
Date:
On 02/04/2015 06:53 PM, Tom Lane wrote:
>> Or maybe use a make variable, like NO_DOC.  I think that's preferable to
>> adding more targets.
> Unless we can come up with a new target name that obviously means
> "world minus docs", the make-variable idea may be the best.
>
>             


I'm not terribly keen on this. If you don't like "binworld", how about 
"world-no-docs"?

cheers

andrew



Re: Release note bloat is getting out of hand

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> I'm not terribly keen on this. If you don't like "binworld", how about 
> "world-no-docs"?

[ shrug... ]  Doesn't bother me particularly.
        regards, tom lane



Re: Release note bloat is getting out of hand

From
Peter Eisentraut
Date:
On 2/4/15 8:20 PM, Andrew Dunstan wrote:
> 
> On 02/04/2015 06:53 PM, Tom Lane wrote:
>>> Or maybe use a make variable, like NO_DOC.  I think that's preferable to
>>> adding more targets.
>> Unless we can come up with a new target name that obviously means
>> "world minus docs", the make-variable idea may be the best.

> I'm not terribly keen on this. If you don't like "binworld", how about
> "world-no-docs"?

I think using options of some kind instead of top-level targets is
preferable.

If we add world-no-docs, should we also add install-world-no-docs,
installdirs-world-no-docs, uninstall-world-no-docs, check-world-no-docs,
installcheck-world-no-docs, clean-no-docs, distclean-no-docs, etc.?
This would get out of hand.

Also, it's harder to port things like that to other build systems,
including the secondary ones we already have.

We already have configure options to decide that we don't want to deal
with part of the tree.  (There is no make world-no-python.)  We used to
have support in configure to not build part of the docs.  We could
resurrect that if that's what people want.  I'd actually prefer that
even over a make variable.