Re: Cleanup of syntax.sgml - Mailing list pgsql-docs

From David G. Johnston
Subject Re: Cleanup of syntax.sgml
Msg-id CAKFQuwYxK-ctDWYpT0VEQJ6Yaz+TcF0CoSiXR=szbgdraqnZ0g@mail.gmail.com
List pgsql-docs
On Fri, Jun 20, 2025 at 12:33 PM Joshua Drake <jd@commandprompt.com> wrote:
> To make it more consumable.

Overall I'm good with the attempt to trim, and with most of the changes, but I feel it tries too hard and ends up being too "matter-of-fact"; the conjunctions that exist make reading a wall of text easier.  I agree that some of them could be removed as being more judgemental than mechanical.
 
Reviewing this reminds me we are inconsistent regarding "key word" vs. "keyword".

"We advise users who to read this chapter carefully  ..." ? botched surgery on this one

Not sure I agree with removing the comment regarding "end of the input stream".

I think I'm ok with leaving token separation unspecified here, especially since it isn't totally accurate (at least with regard to "special character symbols", which are often grouped together).

Why leave "(syntactically)" in parentheses?  Also, you got rid of the word "input" in SQL input above but left it here.  I think leaving "SQL input consists of..." is better.

For the examples, I would put "values" on its own line.  And I would add a delete command on the same line as the update command.  Then just describe that.

Select...;
update...; delete...;
insert ...
values ...;
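
Spelled out, that layout might look something like the following.  This is only a sketch: my_table and its columns are assumed to match the existing example, and the DELETE statement and its WHERE clause are made up for illustration.

```sql
-- One command on a line
SELECT * FROM my_table;
-- Two commands sharing a line
UPDATE my_table SET a = 5; DELETE FROM my_table WHERE a = 5;
-- One command spanning two lines
INSERT INTO my_table
VALUES (3, 'hi there');
```

That covers one command per line, several commands on a line, and one command split across lines, which is all the description needs to say.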

I really don't like the re-wording regarding comments.

"But for the <command>UPDATE</command> command always ..." ? another botched surgery
I'm not sure what the entire paragraph really gives the reader though, besides a pointer to the reference chapter.  It needs more pruning than given here IMO.


I feel that if we want to enhance clarity about where we differ from the standard, we should use callouts for those items instead of burying the information in walls of text, like the point about accepting dollar signs in unquoted identifiers.


-    A convention often used is to write key words in upper
+    The recommened convention is to write key words in upper  [recommended needs a d]
Both should be avoided.  We can say "It is the convention in this documentation to write key words in upper case and names in lower case."  Let other places than our syntax reference speak to real-world conventions besides ours.

Where we introduce "quoted identifiers", link to the description of the formal syntax; then it's ok to remove discussions of minutiae like including double quotes in a quoted identifier.

punctuation:
+    Inside the quotes, Unicode characters can be specified in escaped
+    form by writing a backslash followed by the four-digit hexadecimal
+    code point number or[,] alternatively[,] a backslash followed by a plus
+    sign [(+)] followed by a six-digit hexadecimal code point number.
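
For reference, a minimal example of the two escape forms as I understand them, shown here in a string constant (this assumes standard_conforming_strings is on, which is the default):

```sql
-- \0061 is the four-digit form and \+000061 is the six-digit form;
-- both denote the letter "a"
SELECT U&'d\0061t\+000061';  -- yields: data
```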


I've kind of grown fond of "This slightly bizarre behavior"... ;)


+     If you can use Unicode escapes or the alternative Unicode escape syntax,
+     explained in <xref linkend="sql-syntax-strings-uescape"/>; then the server

Prefer the existing.  This lacks commas or other ways to make it read well.  Removing "useful" judgement is probably sufficient.  Or maybe try a different approach.

I concur we should remove the discussion regarding the GUCs at this point.

Maybe also include the correct way of writing the U & 'foo' operation in the ambiguity discussion?

"optional tag of zero or more characters" is redundant.  Optional is sufficient.

But much more concisely:
"""
A dollar-quoted string surrounds the content with user-specified tags of the form  $label$ instead of quotation marks.  The label may be the empty string.  For example, here are two different ways...
"""

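The two ways in that example, assuming we keep the existing "Dianne's horse" illustration (the tag name is arbitrary):

```sql
-- An empty label and a named label; both denote the same string
SELECT $$Dianne's horse$$ = $SomeTag$Dianne's horse$SomeTag$;  -- true
```
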
-     used without needing to be escaped.  Indeed, no characters inside
+     used without needing to be escaped. No characters inside
-     Here, the sequence <literal>$q$[\t\r\n\v\\]$q$</literal> represents a
+     The sequence <literal>$q$[\t\r\n\v\\]$q$</literal> represents a
-     <productname>PostgreSQL</productname>.  But since the sequence does not match
+     <productname>PostgreSQL</productname>.  Since the sequence does not match
Removing the word "Indeed, " isn't an improvement.  I get the desire to remove the "commentary" filler fragments but this one isn't a judgement but a highlight and seems quite appropriate.  Same goes for removing "Here" and "But" - conjunctions are good.


"Bit-string constants is a string constant with a  "  plural needs "are", not "is"

-     described below.  Note that any leading plus or minus sign is not actually
+     described below.  Any leading plus or minus sign is not considered part of
"Note" is also a perfectly fine conjunction, and you haven't claimed your fixes are to bring things in line with a style guideline, which I don't think exists at this level of specificity.

-     These are some examples of valid non-decimal integer constants:
+     Examples of valid non-decimal integer constants:
Status quo preferred.


Note, the stuff I'm not calling out does seem ok to remove in context.

     A comment is removed from the input stream before further syntax
-    analysis and is effectively replaced by whitespace.
+    analysis and is replaced by whitespace.

This seems repetitive with an earlier change.  Also, is a 20-character comment replaced with 20 spaces?  Why "whitespace" and not "space character" or "nothing"?

[For example,] If you define a <quote>+</quote> operator  -- this is an example so the conjunction is valid.  Though the trailing ", no matter what yours does." seems unnecessary.

Removing legacy comment regarding 9.5 makes sense.

David J.

