Thread: [PATCH] Various documentation typo/grammar fixes

[PATCH] Various documentation typo/grammar fixes

From
Marti Raudsepp
Date:
Hi

I ran the Topy typo-fixer (https://github.com/intgr/topy , using rules
developed for Wikipedia) over the PostgreSQL documentation directory.
Attached is a patch with hand-picked fixes that are clearly an
improvement.

There are also some changes that I didn't include that seem somewhat
opinionated, for example:
* "eg" -> "e.g."
* "etc" -> "etc."
* "the exact same" -> "exactly the same"
* "newly-received" -> "newly received"
* "recently-used" -> "recently used"

Does anyone have an opinion about these? Should I also include them in
the patch?

Regards,
Marti

Attachment

Re: [PATCH] Various documentation typo/grammar fixes

From
Kevin Grittner
Date:
Marti Raudsepp <marti@juffo.org> wrote:

> I ran the Topy typo-fixer (https://github.com/intgr/topy , using rules
> developed for Wikipedia) over the PostgreSQL documentation directory.
> Attached is a patch with hand-picked fixes that are clearly an
> improvement.
>
> There are also some changes that I didn't include that seem somewhat
> opinionated, for example:
> * "eg" -> "e.g."
> * "etc" -> "etc."

These two seem to me like they should be changed; otherwise they
just don't look correct.

> * "the exact same" -> "exactly the same"

The old version sounds a bit colloquial to my ear, so it seems like
it may be worth changing.  The problem is, this change doesn't seem
like it would generally be a direct substitution; it may require a
major reorganization of the sentence.  I think the resulting
sentences need to be compared to see which is more clear and
concise.

> * "newly-received" -> "newly received"
> * "recently-used" -> "recently used"

I think these depend on context.  To me, the former scans better
when the combination of words is used as an adjective.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [PATCH] Various documentation typo/grammar fixes

From
Marti Raudsepp
Date:
On Mon, Aug 25, 2014 at 4:46 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
>> * "eg" -> "e.g."
>> * "etc" -> "etc."
> These two seem to me like they should be changed; otherwise they
> just don't look correct.

Oh, right you are; I looked it up on some grammar forums and there
seems to be no question about this. I attached these changes as patch
0002.

(For the record I'm not a native English speaker, much less using
Latin in English sentences ;)

>> * "the exact same" -> "exactly the same"
> The old version sounds a bit colloquial to my ear, so it seems like
> it may be worth changing.

>> * "newly-received" -> "newly received"
>> * "recently-used" -> "recently used"
> I think these depend on context.  To me, the former scans better
> when the combination of words is used as an adjective.

Attached these as patch 0003 so someone who feels more confident can
take a look.

Regards,
Marti

Attachment

Re: [PATCH] Various documentation typo/grammar fixes

From
Kevin Grittner
Date:
Marti Raudsepp <marti@juffo.org> wrote:

> I ran the Topy typo-fixer (https://github.com/intgr/topy , using rules
> developed for Wikipedia) over the PostgreSQL documentation directory.
> Attached is a patch with hand-picked fixes that are clearly an
> improvement.

For the initial patch, I agree with all except the first change of
"others'" to "other's" -- the start of the sentence uses "All", so
there is clearly more than one "other", so the apostrophe
belongs after the "s".  The rest of the 0001- patch all look like
improvements to me.

I think most or all of the places that the 0002- patch touches (i.e.
and e.g.) need a change to look right, but I'm not sure this goes far
enough.  In style guides I've had to use, these generally should
follow punctuation (left parenthesis, colon, semi-colon, comma, or
em dash), and be separated from any following text by punctuation
-- usually a comma.

In patch 0003- I agree that the change to "exactly the same" reads
better.  I'm torn on changing the hyphens to spaces.  I probably
wouldn't change them, but I wouldn't squawk if others preferred to
do so.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [PATCH] Various documentation typo/grammar fixes

From
Marti Raudsepp
Date:
On Mon, Aug 25, 2014 at 10:13 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
> For the initial patch, I agree with all except the first change of
> "others'" to "other's" -- the start of the sentence uses "All", so
> there is clearly more than one "other", so the apostrophe
> belongs after the "s".

Not arguing against this, but all the grammar advice I've read
suggests that "each others' " is always wrong. For example
http://jakubmarian.com/each-others-vs-each-others-in-english/

To illustrate this better, supposedly the "other" can be swapped out
for "statement":

All the statements are executed with the same snapshot, so they cannot
see each statements' effects / each statement's effects

What do you think?

> I think most or all of the places that the 0002- patch touches (i.e.
> and e.g.) need a change to look right, but I'm not sure this goes far
> enough.  In style guides I've had to use, these generally should
> follow punctuation (left parenthesis, colon, semi-colon, comma, or
> em dash), and be separated from any following text by punctuation
> -- usually a comma.

This matches my research too. I wrote two rules to add commas where no
punctuation was being used:

<Typo word="Add comma after i.e. & e.g."
      find="\b([Ii]\.e|[Ee]\.g)\.? ?(\s|$)"
      replace="$1.,$2" />
<Typo word="Add comma before i.e. & e.g."
      find="([a-z])(\s+([Ii]\.e|[Ee]\.g))\b"
      replace="$1,$2" />

Attached is the updated 0002 patch. I skimmed over all the differences
and they looked good to me, but with 124 more lines touched, it's
possible that I missed something.
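
For anyone who wants to try the two rules above outside Topy, here is a
rough Python sketch. It assumes the rules behave like ordinary regex
substitutions applied in the order shown; add_commas is a hypothetical
helper name, and the "$1" / "$2" backreferences are written as "\1" /
"\2" for Python's re module.

```python
import re

# The two patterns are copied from the Typo rules quoted above.
ADD_COMMA_AFTER = re.compile(r"\b([Ii]\.e|[Ee]\.g)\.? ?(\s|$)")
ADD_COMMA_BEFORE = re.compile(r"([a-z])(\s+([Ii]\.e|[Ee]\.g))\b")


def add_commas(text):
    """Apply both comma rules in order (hypothetical helper, not Topy's API)."""
    # Rule 1: ensure "i.e" / "e.g" is followed by "." and a comma.
    text = ADD_COMMA_AFTER.sub(r"\1.,\2", text)
    # Rule 2: add a comma between a lowercase letter and a following i.e./e.g.
    text = ADD_COMMA_BEFORE.sub(r"\1,\2", text)
    return text


if __name__ == "__main__":
    print(add_commas("a format e.g. JSON"))
    print(add_commas("(i.e. the default)"))
```

Text that already has the commas is left alone by both patterns, so
re-running the helper over already-fixed text should not stack up extra
punctuation.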

> In patch 0003- I agree that the change to "exactly the same" reads
> better.  I'm torn on changing the hyphens to spaces.  I probably
> wouldn't change them

Agreed. I'll move the "exactly the same" hunk to patch 0001 once we
settle the "each other" question.

Thanks,
Marti

Attachment

Re: [PATCH] Various documentation typo/grammar fixes

From
Kevin Grittner
Date:
Marti Raudsepp <marti@juffo.org> wrote:
> Kevin Grittner <kgrittn@ymail.com> wrote:

>>  For the initial patch, I agree with all except the first change of
>>  "others'" to "other's" -- the start of the sentence uses "All", so
>>  there is clearly more than one "other", so the apostrophe
>>  belongs after the "s".
>
> Not arguing against this, but all the grammar advice I've read
> suggests that "each others' " is always wrong. For example
> http://jakubmarian.com/each-others-vs-each-others-in-english/

Hmm.  I checked with the Chicago Manual of Style, which is my
preferred source, and it agrees that "each others'" is always
wrong.  More than that, it says that traditionalists use "each
other" when exactly two are involved and "one another" when more
than two are involved.  So to be totally proper, that instance of
"each others'" should be "one another's".  ("One anothers'" is also
always wrong.)

>>  I think most or all of the places that the 0002- patch touches (i.e.
>>  and e.g.) need a change to look right, but I'm not sure this goes far
>>  enough.  In style guides I've had to use, these generally should
>>  follow punctuation (left parenthesis, colon, semi-colon, comma, or
>>  em dash), and be separated from any following text by punctuation
>>  -- usually a comma.
>
> This matches my research too. I wrote two rules to add commas where no
> punctuation was being used:

> Attached is the updated 0002 patch. I skimmed over all the differences
> and they looked good to me, but with 124 more lines touched, it's
> possible that I missed something.

I'll need a little more time to digest all of those in detail, too.
It sounds like we agree.  Where a couple occurrences of "etc" are
being used to abbreviate a code example we might have a special case, but
otherwise these are likely to all be the way I'm used to seeing
them.

>>  In patch 0003- I agree that the change to "exactly the same" reads
>>  better.  I'm torn on changing the hyphens to spaces.  I probably
>>  wouldn't change them
>
> Agreed. I'll move the "exactly the same" hunk to patch 0001 once
> we settle the "each other" question.

Sounds like a good idea.  Each patch file is then dealing with a
distinct set of issues.



--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [PATCH] Various documentation typo/grammar fixes

From
Marti Raudsepp
Date:
On Tue, Aug 26, 2014 at 12:59 AM, Kevin Grittner <kgrittn@ymail.com> wrote:
> More than that, it says that traditionalists use "each
> other" when exactly two are involved and "one another" when more
> than two are involved.  So to be totally proper, that instance of
> "each others'" should be "one another's".  ("One anothers'" is also
> always wrong.)

Ok, changed to "one another's". (I will say it sounds a bit odd to me,
but I suppose it is more correct).

All updated patches attached, per our previous discussion. Patches 1-2
are ready, 3 can be discarded unless someone else chimes in.

Do you think it's worth doing a run over developer documentation
(README files) and code comments too, or is that a waste of time?

Regards,
Marti

Attachment

Re: [PATCH] Various documentation typo/grammar fixes

From
Kevin Grittner
Date:
Marti Raudsepp <marti@juffo.org> wrote:

> All updated patches attached, per our previous discussion.
> Patches 1-2 are ready, 3 can be discarded unless someone else
> chimes in.

Patch 1 pushed, with each fix back-patched as far as the error
exists in supported releases.  I held off on patch 2 -- partly
because I spotted at least one case where things weren't quite
right, partly because there's so much I haven't had time to go over
it in sufficient detail, and partly because commas are omitted so
consistently where most style guides want them that I thought
someone might want to argue that this was an intentional style
choice and should be preserved.  I'm also not sure that it should
be back-patched as the outright errors were -- it does seem like
more of a style issue than most of the things in patch 1 were,
which included clear misspellings, accidentally repeated words, and
incorrect choices of indefinite articles.

The remaining error I mentioned above was an occurrence of a
parenthetical remark which started with "e.g.," and ended with
", etc.)".  The Chicago Manual of Style says that you can use one
or the other, but using both is redundant and should be avoided.
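
A parenthetical of that shape could probably be flagged mechanically
rather than by eye. What follows is only a minimal sketch, assuming a
plain regular expression over a single string is good enough; the
pattern and the find_redundant name are hypothetical and are not part
of the patch or of Topy.

```python
import re

# Hypothetical pattern: a parenthetical that opens with "e.g." and closes
# with "etc." (nested parentheses are deliberately not handled).
REDUNDANT = re.compile(r"\(\s*e\.g\.,?[^()]*?,?\s*etc\.?\s*\)", re.IGNORECASE)


def find_redundant(text):
    """Return each parenthetical that uses both "e.g." and "etc."."""
    return [m.group(0) for m in REDUNDANT.finditer(text)]


if __name__ == "__main__":
    sample = "Some types (e.g., int, text, etc.) are built in."
    for hit in find_redundant(sample):
        print(hit)
```

Each hit would still need a human to decide which of the two
abbreviations to drop.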

> Do you think it's worth doing a run over developer documentation
> (README files) and code comments too, or is that a waste of time?

Some README files could use a lot of attention; I'm just not sure
that fixing a few misspellings, repeated words, or grammar errors
makes enough of a dent in what is needed to be worth it.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [PATCH] Various documentation typo/grammar fixes

From
Tom Lane
Date:
Kevin Grittner <kgrittn@ymail.com> writes:
> Patch 1 pushed, with each fix back-patched as far as the error
> exists in supported releases.  I held off on patch 2 -- partly
> because I spotted at least one case where things weren't quite
> right, partly because there's so much I haven't had time to go over
> it in sufficient detail, and partly because commas are omitted so
> consistently where most style guides want them that I thought
> someone might want to argue that this was an intentional style
> choice and should be preserved.

I would argue against applying patch 2 at all.  I think it's what
John McIntyre (http://www.baltimoresun.com/news/language-blog/)
would call peeverism.  If there are any places where the extra
commas/periods actually add anything to clarity, then sure, change
those places --- but doing it only because some style guide tells
you to is not the way to approach the issue.  We're writing English
not C code here, and so there is no single standard of correctness.

            regards, tom lane


Re: [PATCH] Various documentation typo/grammar fixes

From
David G Johnston
Date:
Tom Lane-2 wrote
> Kevin Grittner <kgrittn@> writes:
>> Patch 1 pushed, with each fix back-patched as far as the error
>> exists in supported releases.  I held off on patch 2 -- partly
>> because I spotted at least one case where things weren't quite
>> right, partly because there's so much I haven't had time to go over
>> it in sufficient detail, and partly because commas are omitted so
>> consistently where most style guides want them that I thought
>> someone might want to argue that this was an intentional style
>> choice and should be preserved.
>
> I would argue against applying patch 2 at all.  I think it's what
> John McIntyre (http://www.baltimoresun.com/news/language-blog/)
> would call peeverism.  If there are any places where the extra
> commas/periods actually add anything to clarity, then sure, change
> those places --- but doing it only because some style guide tells
> you to is not the way to approach the issue.  We're writing English
> not C code here, and so there is no single standard of correctness.

Writing C code, aside from purely syntactic concerns, does not abide by a
single standard of style correctness either, yet this project does have a
style guide that is to be adhered to.  I do not think it to be a bad thing
to accept a style guide as reference for the documentation as well.

My main objections are:

1) we are simply treating symptoms
2) we are introducing a decent amount of not-meaning-enhancing change

But if one is going to adopt a laissez-faire attitude then whether or not
patch 2 (and 3) is applied should make little difference.

And given the surgical nature of the update in question the commit history
of the documentation is practically unaffected.

As a matter of internal consistency, and also for adherence to "best
practices", I'd vote to apply all three patches once fully reviewed - though
by the same logic above if we miss a few items the overall impact on the
quality of the documentation will still be positive and the incorrect
locations will likely still be understandable.

I don't expect writers to be able to keep straight all of the style rules
that we'd like to try and adhere to but having good existing documentation
will make it more likely for new content to conform.  The goal would be to
let them focus on writing but then have some kind of editing process - which
is what is going on now - to aid in polishing the final publication.

David J.



Re: [PATCH] Various documentation typo/grammar fixes

From
Marti Raudsepp
Date:
On Sat, Aug 30, 2014 at 8:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I would argue against applying patch 2 at all.

I don't feel strongly about this, I originally didn't plan to submit
this patch at all. But I don't agree with:

> doing it only because some style guide tells
> you to is not the way to approach the issue.  We're writing English
> not C code here, and so there is no single standard of correctness.

There isn't a "single standard", but there are many style guides about
English and pretty much all of them agree on the usage of periods in
"e.g.", "i.e.", "etc." and "et al."

With regards to commas following e.g. and i.e., this article surveyed
6 different English style guides and just 1 recommended not using the
comma: http://www.quickanddirtytips.com/education/grammar/ie-versus-eg?page=2

Regards,
Marti


Re: [PATCH] Various documentation typo/grammar fixes

From
Alvaro Herrera
Date:
Kevin Grittner wrote:

> In patch 0003- I agree that the change to "exactly the same" reads
> better.  I'm torn on changing the hyphens to spaces.  I probably
> wouldn't change them, but I wouldn't squawk if others preferred to
> do so.

Apparently adverb hyphenation discussion plagues the web.  One
reasonably-serious page I found in a very quick search is
http://www.dailywritingtips.com/adverbs-and-hyphens/
which cites the Chicago Manual of Style, among others.

There are lots of places in our docs that use hyphens after the "-ly"
suffix, so I think patch 3 is either too short because it should change
more of them, or too long because we should leave this alone.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


Re: [PATCH] Various documentation typo/grammar fixes

From
Kevin Grittner
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> One reasonably-serious page

... and with that, I guess you've expressed a reasonably clear
preference.  ;-)

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [PATCH] Various documentation typo/grammar fixes

From
Kevin Grittner
Date:
Marti Raudsepp <marti@juffo.org> wrote:
> On Sat, Aug 30, 2014 at 8:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I would argue against applying patch 2 at all.
>
> I don't feel strongly about this, I originally didn't plan to submit
> this patch at all. But I don't agree with:
>
>> doing it only because some style guide tells
>> you to is not the way to approach the issue.  We're writing English
>> not C code here, and so there is no single standard of correctness.
>
> There isn't a "single standard", but there are many style guides
> about English and pretty much all of them agree on the usage of
> periods in "e.g.", "i.e.", "etc." and "et al."

To me, omitting the dots in any of those looks like a misspelling.
I think we should fix those.

Also, it seems odd to so strictly enforce formatting rules in C
code (where it makes no semantic difference) but blow off style
issues, and even correctness of English usage, in the
documentation.

> With regards to commas following e.g. and i.e., this article surveyed
> 6 different English style guides and just 1 recommended not using the
> comma: http://www.quickanddirtytips.com/education/grammar/ie-versus-eg?page=2

Commas after "i.e." and "e.g." are less clearly a correctness issue
and getting more into style questions, so I wouldn't feel too bad
about letting those go where it doesn't cause confusion or too much
of a "double take" on reading.  Note that the cited page summarizes
the positions of these documents on the topic with phrases like "is
usually used", "preferable/optional", "makes good sense", and
"should be".  Only "The Columbia Guide to Standard American
English" actually said it was "required".

Starting a parenthetical clause with "e.g." and ending it with
", etc." also looks wrong to me.  My inclination is to pick one;
otherwise I find it distracting or confusing and tend to go back
over it one or two extra times to make sure I'm understanding.

There's at least one place I spotted "e.g." where it seemed to me
that the "example" was really a restatement in other terms, so it
seemed like it should have been "i.e." -- I would be inclined to
scan for more of those and present that as a separate patch, since
it's less mechanical than the others.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company