Thread: [PATCH] Various documentation typo/grammar fixes
Hi,

I ran the Topy typo-fixer (https://github.com/intgr/topy , using rules developed for Wikipedia) over the PostgreSQL documentation directory. Attached is a patch with hand-picked fixes that are clearly an improvement.

There are also some changes that I didn't include that seem somewhat opinionated, for example:
* "eg" -> "e.g."
* "etc" -> "etc."
* "the exact same" -> "exactly the same"
* "newly-received" -> "newly received"
* "recently-used" -> "recently used"

Does anyone have an opinion about these? Should I also include them in the patch?

Regards,
Marti
Attachment
Marti Raudsepp <marti@juffo.org> wrote:
> I ran the Topy typo-fixer (https://github.com/intgr/topy , using rules
> developed for Wikipedia) over the PostgreSQL documentation directory.
> Attached is a patch with hand-picked fixes that are clearly an
> improvement.
>
> There are also some changes that I didn't include that seem somewhat
> opinionated, for example:
> * "eg" -> "e.g."
> * "etc" -> "etc."

These two seem to me like they should be changed; otherwise they just don't look correct.

> * "the exact same" -> "exactly the same"

The old version sounds a bit colloquial to my ear, so it seems like it may be worth changing. The problem is, this change doesn't seem like it would generally be a direct substitution; it may require a major reorganization of the sentence. I think the resulting sentences need to be compared to see which is more clear and concise.

> * "newly-received" -> "newly received"
> * "recently-used" -> "recently used"

I think these depend on context. To me, the former scans better when the combination of words is used as an adjective.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Aug 25, 2014 at 4:46 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
>> * "eg" -> "e.g."
>> * "etc" -> "etc."
> These two seem to me like they should be changed; otherwise they
> just don't look correct.

Oh, right you are; I looked it up on some grammar forums and there seems to be no question about this. I attached these changes as patch 0002. (For the record I'm not a native English speaker, much less using Latin in English sentences ;)

>> * "the exact same" -> "exactly the same"
> The old version sounds a bit colloquial to my ear, so it seems like
> it may be worth changing.

>> * "newly-received" -> "newly received"
>> * "recently-used" -> "recently used"
> I think these depend on context. To me, the former scans better
> when the combination of words is used as an adjective.

Attached these as patch 0003 so someone who feels more confident can take a look.

Regards,
Marti
Attachment
Marti Raudsepp <marti@juffo.org> wrote:
> I ran the Topy typo-fixer (https://github.com/intgr/topy , using rules
> developed for Wikipedia) over the PostgreSQL documentation directory.
> Attached is a patch with hand-picked fixes that are clearly an
> improvement.

For the initial patch, I agree with all except the first change of "others'" to "other's" -- the start of the sentence uses "All", so there is clearly more than one "other", so the apostrophe belongs after the "s".

The rest of the 0001- patch all look like improvements to me.

I think most or all of the places that the 0002- patch (i.e. and e.g.) need a change to look right, but I'm not sure this goes far enough. In style guides I've had to use these generally should follow punctuation (left parenthesis, colon, semi-colon, comma, or em dash), and be separated from any following text by punctuation -- usually a comma.

In patch 0003- I agree that the change to "exactly the same" reads better. I'm torn on changing the hyphens to spaces. I probably wouldn't change them, but I wouldn't squawk if others preferred to do so.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Aug 25, 2014 at 10:13 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
> For the initial patch, I agree with all except the first change of
> "others'" to "other's" -- the start of the sentence uses "All", so
> there is clearly more than one "other", so the apostrophe
> belongs after the "s".

Not arguing against this, but all the grammar advice I've read suggests that "each others' " is always wrong. For example http://jakubmarian.com/each-others-vs-each-others-in-english/

To illustrate this better, supposedly the "other" can be swapped out for "statement":
All the statements are executed with the same snapshot, so they cannot see each statements' effects / each statement's effects

What do you think?

> I think most or all of the places that the 0002- patch (i.e. and
> e.g.) need a change to look right, but I'm not sure this goes far
> enough. In style guides I've had to use these generally should
> follow punctuation (left parenthesis, colon, semi-colon, comma, or
> em dash), and be separated from any following text by punctuation
> -- usually a comma.

This matches my research too. I wrote two rules to add commas where no punctuation was being used:

<Typo word="Add comma after i.e. & e.g." find="\b([Ii]\.e|[Ee]\.g)\.? ?(\s|$)" replace="$1.,$2" />
<Typo word="Add comma before i.e. & e.g." find="([a-z])(\s+([Ii]\.e|[Ee]\.g))\b" replace="$1,$2" />

Attached is the updated 0002 patch. I skimmed over all the differences and they looked good to me, but with 124 more lines touched, it's possible that I missed something.

> In patch 0003- I agree that the change to "exactly the same" reads
> better. I'm torn on changing the hyphens to spaces. I probably
> wouldn't change them

Agreed. I'll move the "exactly the same" hunk to patch 0001 once we settle the "each other" question.

Thanks,
Marti
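For anyone who wants to try the two comma rules from the message above outside Topy, here is a minimal sketch in Python. It assumes the rules translate directly to the standard `re` module (with `$1` written as `\1`) and that they are applied in the order listed; the name `add_commas` is made up for illustration and is not part of Topy or of the patch.

```python
import re

# The two rules quoted in the message above, with the patterns copied
# verbatim and only the replacement syntax changed ($1 -> \1).
COMMA_AFTER = re.compile(r"\b([Ii]\.e|[Ee]\.g)\.? ?(\s|$)")
COMMA_BEFORE = re.compile(r"([a-z])(\s+([Ii]\.e|[Ee]\.g))\b")


def add_commas(text: str) -> str:
    """Apply both rules in the order they were listed (hypothetical helper)."""
    # Rule 1: make sure "i.e"/"e.g" ends in a period followed by a comma.
    text = COMMA_AFTER.sub(r"\1.,\2", text)
    # Rule 2: add a comma before "i.e"/"e.g" when it directly follows a word.
    text = COMMA_BEFORE.sub(r"\1,\2", text)
    return text


if __name__ == "__main__":
    for sample in (
        "the default i.e. the first one",
        "values (e.g. 42)",
        "sizes, e.g., 8kB",
    ):
        print(add_commas(sample))
```

With this sketch, "the default i.e. the first one" becomes "the default, i.e., the first one", while text that already carries both commas is left alone. It says nothing about the cases that needed manual review in the real patch.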
Attachment
Marti Raudsepp <marti@juffo.org> wrote:
> Kevin Grittner <kgrittn@ymail.com> wrote:
>> For the initial patch, I agree with all except the first change of
>> "others'" to "other's" -- the start of the sentence uses "All", so
>> there is clearly more than one "other", so the apostrophe
>> belongs after the "s".
>
> Not arguing against this, but all the grammar advice I've read
> suggests that "each others' " is always wrong. For example
> http://jakubmarian.com/each-others-vs-each-others-in-english/

Hmm. I checked with the Chicago Manual of Style, which is my preferred source, and it agrees that "each others'" is always wrong. More than that, it says that traditionalists use "each other" when exactly two are involved and "one another" when more than two are involved. So to be totally proper, that instance of "each others'" should be "one another's". ("One anothers'" is also always wrong.)

>> I think most or all of the places that the 0002- patch (i.e. and
>> e.g.) need a change to look right, but I'm not sure this goes far
>> enough. In style guides I've had to use these generally should
>> follow punctuation (left parenthesis, colon, semi-colon, comma, or
>> em dash), and be separated from any following text by punctuation
>> -- usually a comma.
>
> This matches my research too. I wrote two rules to add commas where no
> punctuation was being used:
> Attached is the updated 0002 patch. I skimmed over all the differences
> and they looked good to me, but with 124 more lines touched, it's
> possible that I missed something.

I'll need a little more time to digest all of those in detail, too. It sounds like we agree. Where a couple occurrences of "etc" are being used to abbreviate a code example we might have a special case, but otherwise these are likely to all be the way I'm used to seeing them.

>> In patch 0003- I agree that the change to "exactly the same" reads
>> better. I'm torn on changing the hyphens to spaces. I probably
>> wouldn't change them
>
> Agreed. I'll move the "exactly the same" hunk to patch 0001 once
> we settle the "each other" question.

Sounds like a good idea. Each patch file is then dealing with a distinct set of issues.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Aug 26, 2014 at 12:59 AM, Kevin Grittner <kgrittn@ymail.com> wrote:
> More than that, it says that traditionalists use "each
> other" when exactly two are involved and "one another" when more
> than two are involved. So to be totally proper, that instance of
> "each others'" should be "one another's". ("One anothers'" is also
> always wrong.)

Ok, changed to "one another's". (I will say it sounds a bit odd to me, but I suppose it is more correct).

All updated patches attached, per our previous discussion. Patches 1-2 are ready, 3 can be discarded unless someone else chimes in.

Do you think it's worth doing a run over developer documentation (README files) and code comments too, or is that a waste of time?

Regards,
Marti
Attachment
Marti Raudsepp <marti@juffo.org> wrote:
> All updated patches attached, per our previous discussion.
> Patches 1-2 are ready, 3 can be discarded unless someone else
> chimes in.

Patch 1 pushed, with each fix back-patched as far as the error exists in supported releases. I held off on patch 2 -- partly because I spotted at least one case where things weren't quite right, partly because there's so much I haven't had time to go over it in sufficient detail, and partly because commas are omitted so consistently where most style guides want them that I thought someone might want to argue that this was an intentional style choice and should be preserved. I'm also not sure that it should be back-patched as the outright errors were -- it does seem like more of a style issue than most of the things in patch 1 were, which included clear misspellings, accidentally repeated words, and incorrect choices of indefinite articles.

The remaining error I mentioned above was an occurrence of a parenthetical remark which started with "e.g.," and ended with ", etc.)". The Chicago Manual of Style says that you can use one or the other, but using both is redundant and should be avoided.

> Do you think it's worth doing a run over developer documentation
> (README files) and code comments too, or is that a waste of time?

Some README files could use a lot of attention; I'm just not sure that fixing a few misspellings, repeated words, or grammar errors makes enough of a dent in what is needed to be worth it.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Kevin Grittner <kgrittn@ymail.com> writes:
> Patch 1 pushed, with each fix back-patched as far as the error
> exists in supported releases. I held off on patch 2 -- partly
> because I spotted at least one case where things weren't quite
> right, partly because there's so much I haven't had time to go over
> it in sufficient detail, and partly because commas are omitted so
> consistently where most style guides want them that I thought
> someone might want to argue that this was an intentional style
> choice and should be preserved.

I would argue against applying patch 2 at all. I think it's what John McIntyre (http://www.baltimoresun.com/news/language-blog/) would call peeverism. If there are any places where the extra commas/periods actually add anything to clarity, then sure, change those places --- but doing it only because some style guide tells you to is not the way to approach the issue. We're writing English not C code here, and so there is no single standard of correctness.

regards, tom lane
Tom Lane-2 wrote
> Kevin Grittner <kgrittn@> writes:
>> Patch 1 pushed, with each fix back-patched as far as the error
>> exists in supported releases. I held off on patch 2 -- partly
>> because I spotted at least one case where things weren't quite
>> right, partly because there's so much I haven't had time to go over
>> it in sufficient detail, and partly because commas are omitted so
>> consistently where most style guides want them that I thought
>> someone might want to argue that this was an intentional style
>> choice and should be preserved.
>
> I would argue against applying patch 2 at all. I think it's what
> John McIntyre (http://www.baltimoresun.com/news/language-blog/)
> would call peeverism. If there are any places where the extra
> commas/periods actually add anything to clarity, then sure, change
> those places --- but doing it only because some style guide tells
> you to is not the way to approach the issue. We're writing English
> not C code here, and so there is no single standard of correctness.

Writing C code, aside from purely syntactic concerns, does not abide by a single standard of style correctness either, yet this project does have a style guide that is to be adhered to. I do not think it to be a bad thing to accept a style guide as reference for the documentation as well.

My main objections are:
1) we are simply treating symptoms
2) we are introducing a decent amount of not-meaning-enhancing change

But if one is going to adopt a laissez-faire attitude then whether or not patch 2 (and 3) is applied should make little difference. And given the surgical nature of the update in question the commit history of the documentation is practically unaffected.

As a matter of internal consistency, and also for adherence to "best practices", I'd vote to apply all three patches once fully reviewed - though by the same logic above if we miss a few items the overall impact on the quality of the documentation will still be positive and the incorrect locations will likely still be understandable.

I don't expect writers to be able to keep straight all of the style rules that we'd like to try and adhere to but having good existing documentation will make it more likely for new content to conform. The goal would be to let them focus on writing but then have some kind of editing process - which is what is going on now - to aid in polishing the final publication.

David J.
On Sat, Aug 30, 2014 at 8:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I would argue against applying patch 2 at all.

I don't feel strongly about this, I originally didn't plan to submit this patch at all. But I don't agree with:

> doing it only because some style guide tells
> you to is not the way to approach the issue. We're writing English
> not C code here, and so there is no single standard of correctness.

There isn't a "single standard", but there are many style guides about English and pretty much all of them agree on the usage of periods in "e.g.", "i.e.", "etc." and "et al."

With regards to commas following e.g. and i.e., this article surveyed 6 different English style guides and just 1 recommended not using the comma: http://www.quickanddirtytips.com/education/grammar/ie-versus-eg?page=2

Regards,
Marti
Kevin Grittner wrote:
> In patch 0003- I agree that the change to "exactly the same" reads
> better. I'm torn on changing the hyphens to spaces. I probably
> wouldn't change them, but I wouldn't squawk if others preferred to
> do so.

Apparently adverb hyphenation discussion plagues the web. One reasonably-serious page I found in a very quick search is http://www.dailywritingtips.com/adverbs-and-hyphens/ which cites the Chicago Manual of Style, among others.

There are lots of places in our docs that use hyphens after the "-ly" suffix, so I think patch 3 is either too short because it should change more of them, or too long because we should leave this alone.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> One reasonably-serious page

... and with that, I guess you've expressed a reasonably clear preference. ;-)

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Marti Raudsepp <marti@juffo.org> wrote:
> On Sat, Aug 30, 2014 at 8:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I would argue against applying patch 2 at all.
>
> I don't feel strongly about this, I originally didn't plan to submit
> this patch at all. But I don't agree with:
>
>> doing it only because some style guide tells
>> you to is not the way to approach the issue. We're writing English
>> not C code here, and so there is no single standard of correctness.
>
> There isn't a "single standard", but there are many style guides about
> English and pretty much all of them agree on the usage of periods in
> "e.g.", "i.e.", "etc." and "et al."

To me, omitting the dots in any of those looks like a misspelling. I think we should fix those. Also, it seems odd to so strictly enforce formatting rules in C code (where it makes no semantic difference) but blow off style issues, and even correctness of English usage, in the documentation.

> With regards to commas following e.g. and i.e., this article surveyed
> 6 different English style guides and just 1 recommended not using the
> comma: http://www.quickanddirtytips.com/education/grammar/ie-versus-eg?page=2

Commas after "i.e." and "e.g." are less clearly a correctness issue and getting more into style questions, so I wouldn't feel too bad about letting those go where it doesn't cause confusion or too much of a "double take" on reading. Note that the cited page summarizes the positions of these documents on the topic with phrases like "is usually used", "preferable/optional", "makes good sense", and "should be". Only "The Columbia Guide to Standard American English" actually said it was "required".

Starting a parenthetical clause with "e.g." and ending it with ", etc." also looks wrong to me. My inclination is to pick one; otherwise I find it distracting or confusing and tend to go back over it one or two extra times to make sure I'm understanding.

There's at least one place I spotted "e.g." where it seemed to me that the "example" was really a restatement in other terms, so it seemed like it should have been "i.e." -- I would be inclined to scan for more of those and present that as a separate patch, since it's less mechanical than the others.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company