Thread: Should we put command options in alphabetical order in the doc?

Should we put command options in alphabetical order in the doc?

From

David Rowley

Date:

18 April 2023, 05:44:39

Over on [1], Peter mentions that we might want to consider putting the
VACUUM options into some order that's better than the apparently random
order that they're currently in.

VACUUM is certainly one command that's grown a fairly good number of
options over the years and it appears we've not given much
consideration to the order in which they're listed in the documentation.

It's not just VACUUM that has this issue.  I see 6 commands using the
following text:

$ git grep "option</replaceable> can be one of"
src/sgml/ref/analyze.sgml: ...
src/sgml/ref/cluster.sgml: ...
src/sgml/ref/copy.sgml: ...
src/sgml/ref/explain.sgml: ...
src/sgml/ref/reindex.sgml: ...
src/sgml/ref/vacuum.sgml: ...

(maybe there's more we should consider adjusting?)

Likely if we do opt to put these options in a more well-defined order,
we should apply that to at least the 6 commands listed above.

For the case of reindex.sgml, I do see that the existing parameter
order lists INDEX | TABLE | SCHEMA | DATABASE | SYSTEM first which is
the target of the reindex. I wondered if that was worth keeping. I'm
just thinking that since all of these are under the "Parameters"
heading that we should class them all as equals and just make the
order alphabetical. I feel that if we don't do that then the order to
add any new parameters is just not going to be obvious and we'll end
up with things getting out of order again quite quickly.
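
As a rough illustration of that last point: under a purely alphabetical
rule, the slot for a new parameter follows from a binary search rather than
from a judgement call. A minimal Python sketch, assuming a plain list of
names stands in for a "Parameters" list (the helper name slot_for is
hypothetical and not part of the patch):

```python
import bisect

# Alphabetically ordered option names; a stand-in for a "Parameters" list.
options = ["ANALYZE", "FREEZE", "FULL", "VERBOSE"]

def slot_for(sorted_names, new_name):
    """Return the index at which new_name keeps sorted_names alphabetical."""
    return bisect.bisect_left(sorted_names, new_name)

# Adding PARALLEL: the position is fully determined by the existing order.
index = slot_for(options, "PARALLEL")
options.insert(index, "PARALLEL")
print(index, options)
```

Any two authors applying this rule would place the new entry at the same
position, which is the property being argued for here.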

I've attached a patch which makes the changes as I propose them.

David

[1] https://postgr.es/m/16845cb1-b228-e157-f293-5892bced9253@enterprisedb.com

Attachment

alphabetical_order_for_parameter_names.patch

Re: Should we put command options in alphabetical order in the doc?

From

Peter Geoghegan

Date:

18 April 2023, 06:53:17

On Mon, Apr 17, 2023 at 10:45 PM David Rowley <dgrowleyml@gmail.com> wrote:
> For the case of reindex.sgml, I do see that the existing parameter
> order lists INDEX | TABLE | SCHEMA | DATABASE | SYSTEM first which is
> the target of the reindex. I wondered if that was worth keeping. I'm
> just thinking that since all of these are under the "Parameters"
> heading that we should class them all as equals and just make the
> order alphabetical. I feel that if we don't do that then the order to
> add any new parameters is just not going to be obvious and we'll end
> up with things getting out of order again quite quickly.

I don't think that alphabetical order makes much sense. Surely some
parameters are more important than others. Surely there is some kind
of natural grouping that makes somewhat more sense than alphabetical
order.

Take the VACUUM command. Right now FULL, FREEZE, and VERBOSE all come
first. Those options are approximately the most important options --
especially VERBOSE. But your patch places VERBOSE dead last.

--
Peter Geoghegan

Re: Should we put command options in alphabetical order in the doc?

From

David Rowley

Date:

18 April 2023, 23:17:52

On Tue, 18 Apr 2023 at 18:53, Peter Geoghegan <pg@bowt.ie> wrote:
> Take the VACUUM command. Right now FULL, FREEZE, and VERBOSE all come
> first. Those options are approximately the most important options --
> especially VERBOSE. But your patch places VERBOSE dead last.

Hmm, how can we verify that the options are kept in order of
importance? What guidance can we provide to developers adding options
about where they should slot in the new option to the docs?

"Importance order" just seems horribly subjective to me.  I'd be
interested to know if you could tell me if SKIP_LOCKED has more
importance than INDEX_CLEANUP, for example. If you can, it would seem
like trying to say apples are more important than oranges, or
vice-versa.
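
For contrast, an alphabetical rule can be checked mechanically. The sketch
below is only an illustration of that idea, assuming the option names have
already been extracted into a list; the function name first_out_of_order is
hypothetical, and nothing like it exists in the tree as far as this thread
says.

```python
def first_out_of_order(names):
    """Return the first adjacent pair that breaks alphabetical order, or None."""
    for left, right in zip(names, names[1:]):
        if left > right:
            return (left, right)
    return None

# The three options that currently come first for VACUUM, per the quoted text.
print(first_out_of_order(["FULL", "FREEZE", "VERBOSE"]))   # ('FULL', 'FREEZE')
print(first_out_of_order(["FREEZE", "FULL", "VERBOSE"]))   # None
```

No comparable check can be written for an importance order without first
agreeing on a ranking.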

David

Re: Should we put command options in alphabetical order in the doc?

From

Peter Geoghegan

Date:

18 April 2023, 23:30:06

On Tue, Apr 18, 2023 at 4:18 PM David Rowley <dgrowleyml@gmail.com> wrote:
> "Importance order" just seems horribly subjective to me.

Alphabetical order seems objectively bad. At least to me.

> I'd be interested to know if you could tell me if SKIP_LOCKED has more
> importance than INDEX_CLEANUP, for example. If you can, it would seem
> like trying to say apples are more important than oranges, or
> vice-versa.

I don't accept your premise that the only thing that matters (or the
most important thing) is adherence to some unambiguous and consistent
order.

--
Peter Geoghegan

Re: Should we put command options in alphabetical order in the doc?

From

Peter Geoghegan

Date:

19 April 2023, 01:05:15

On Tue, Apr 18, 2023 at 4:30 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > I'd be interested to know if you could tell me if SKIP_LOCKED has more
> > importance than INDEX_CLEANUP, for example. If you can, it would seem
> > like trying to say apples are more important than oranges, or
> > vice-versa.
>
> I don't accept your premise that the only thing that matters (or the
> most important thing) is adherence to some unambiguous and consistent
> order.

In the case of VACUUM, the current devel order is:

FULL, FREEZE, VERBOSE, ANALYZE, DISABLE_PAGE_SKIPPING, SKIP_LOCKED,
INDEX_CLEANUP, PROCESS_MAIN, PROCESS_TOAST,
TRUNCATE, PARALLEL, SKIP_DATABASE_STATS, ONLY_DATABASE_STATS, BUFFER_USAGE_LIMIT

I think that this order is far superior to alphabetical order, which
is tantamount to random order. The first 4 items are indeed the really
important ones to users, in my experience.
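
To make the comparison concrete, the following Python sketch sorts the list
above and reports where the first four options would land. It assumes a
plain code-point sort of the option names, which may differ in detail from
what the patch does.

```python
# Current devel order of the VACUUM options, as listed above.
devel_order = [
    "FULL", "FREEZE", "VERBOSE", "ANALYZE", "DISABLE_PAGE_SKIPPING",
    "SKIP_LOCKED", "INDEX_CLEANUP", "PROCESS_MAIN", "PROCESS_TOAST",
    "TRUNCATE", "PARALLEL", "SKIP_DATABASE_STATS", "ONLY_DATABASE_STATS",
    "BUFFER_USAGE_LIMIT",
]

alphabetical = sorted(devel_order)

# 1-based position of each of the first four options after sorting.
positions = {name: alphabetical.index(name) + 1 for name in devel_order[:4]}
print(positions)
```

Under these assumptions FULL, FREEZE, VERBOSE and ANALYZE end up at
positions 5, 4, 14 and 1 of 14.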

I do have some minor quibbles beyond that, though. These are:

* PARALLEL deserves to be at the start, maybe 4th or 5th overall.

* DISABLE_PAGE_SKIPPING should be later, since it's really only a
testing option that probably never proved useful in production. In
particular, it has little business being before SKIP_LOCKED, which is
much more important and relevant.

* TRUNCATE and INDEX_CLEANUP are similar options, and ought to be side
by side. I would put PROCESS_MAIN and PROCESS_TOAST after those two
for the same reason.

While I'm certain that nobody will agree with me on every little
detail, I have to imagine that most would find my preferred ordering
quite understandable and unsurprising, at a high level -- this is not
a hopelessly idiosyncratic ranking, that could just as easily have
been generated by a PRNG. People may not easily agree that "apples are
more important than oranges, or vice-versa", but what does it matter?
I've really only put each option into buckets of items with *roughly*
the same importance. All of the details beyond that don't matter to
me, at all.

--
Peter Geoghegan

Re: Should we put command options in alphabetical order in the doc?

From

Alvaro Herrera

Date:

19 April 2023, 08:47:47

On 2023-Apr-18, Peter Geoghegan wrote:

> While I'm certain that nobody will agree with me on every little
> detail, I have to imagine that most would find my preferred ordering
> quite understandable and unsurprising, at a high level -- this is not
> a hopelessly idiosyncratic ranking, that could just as easily have
> been generated by a PRNG. People may not easily agree that "apples are
> more important than oranges, or vice-versa", but what does it matter?
> I've really only put each option into buckets of items with *roughly*
> the same importance. All of the details beyond that don't matter to
> me, at all.

I agree with you that roughly bucketing items is a good approach.
Within each bucket we can then sort alphabetically.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"If you have nothing to say, maybe you need just the right tool to help you
not say it."                   (New York Times, about Microsoft PowerPoint)

Re: Should we put command options in alphabetical order in the doc?

From

Peter Eisentraut

Date:

19 April 2023, 08:52:06

On 19.04.23 01:30, Peter Geoghegan wrote:
>> I'd be interested to know if you could tell me if SKIP_LOCKED has more
>> importance than INDEX_CLEANUP, for example. If you can, it would seem
>> like trying to say apples are more important than oranges, or
>> vice-versa.
> 
> I don't accept your premise that the only thing that matters (or the
> most important thing) is adherence to some unambiguous and consistent
> order.

My thinking is, if I want to look up FREEZE on the VACUUM man page, I 
would welcome some easily identifiable way of locating it.  At that 
point, I don't know whether FREEZE is important or what kind of option 
it is.  For reference material, easy lookup should be a priority.  For a 
narrative chapter on VACUUM, you can introduce the options in any other 
suitable order.

Re: Should we put command options in alphabetical order in the doc?

From

Daniel Gustafsson

Date:

19 April 2023, 09:07:04

> On 19 Apr 2023, at 10:52, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:

> For reference material, easy lookup should be a priority.

+1. Alphabetical ordering is consistent with POLA (the principle of least
astonishment).

> For a narrative chapter on VACUUM, you can introduce the options in any other
> suitable order.


I would even phrase it such that in this case one *should* present the options
in the order most suitable to educate the reader.

--
Daniel Gustafsson

Re: Should we put command options in alphabetical order in the doc?

From

Peter Geoghegan

Date:

19 April 2023, 18:38:48

On Wed, Apr 19, 2023 at 3:04 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> > While I'm certain that nobody will agree with me on every little
> > detail, I have to imagine that most would find my preferred ordering
> > quite understandable and unsurprising, at a high level -- this is not
> > a hopelessly idiosyncratic ranking, that could just as easily have
> > been generated by a PRNG. People may not easily agree that "apples are
> > more important than oranges, or vice-versa", but what does it matter?
> > I've really only put each option into buckets of items with *roughly*
> > the same importance. All of the details beyond that don't matter to
> > me, at all.
>
> I agree with you that roughly bucketing items is a good approach.
> Within each bucket we can then sort alphabetically.

I think of these buckets as working at a logarithmic scale. The FULL,
FREEZE, VERBOSE, and ANALYZE options are multiple orders of magnitude
more important than most of the other options, and maybe one order of
magnitude more important than the PARALLEL, TRUNCATE, and
INDEX_CLEANUP options. With differences that big, you have a structure
that generalizes across all users quite well. This doesn't seem
particularly subjective.
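
A sketch of how such tiers could combine with the alphabetical-within-bucket
idea quoted above. Tier membership follows this message; the tier numbers,
the default tier and the function name bucketed are hypothetical choices
made only for the illustration.

```python
# Hypothetical tier numbers; membership follows the message above.
TIER = {
    "FULL": 0, "FREEZE": 0, "VERBOSE": 0, "ANALYZE": 0,
    "PARALLEL": 1, "TRUNCATE": 1, "INDEX_CLEANUP": 1,
}
DEFAULT_TIER = 2  # every option not named above

def bucketed(names):
    """Sort by tier first, then alphabetically within each tier."""
    return sorted(names, key=lambda n: (TIER.get(n, DEFAULT_TIER), n))

print(bucketed(["SKIP_LOCKED", "VERBOSE", "TRUNCATE", "ANALYZE", "PARALLEL", "FULL"]))
```

Only the tier assignment needs judgement; the position inside a tier is
then mechanical.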

--
Peter Geoghegan

Re: Should we put command options in alphabetical order in the doc?

From

Melanie Plageman

Date:

19 April 2023, 21:33:47

On Wed, Apr 19, 2023 at 2:39 PM Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Wed, Apr 19, 2023 at 3:04 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> > > While I'm certain that nobody will agree with me on every little
> > > detail, I have to imagine that most would find my preferred ordering
> > > quite understandable and unsurprising, at a high level -- this is not
> > > a hopelessly idiosyncratic ranking, that could just as easily have
> > > been generated by a PRNG. People may not easily agree that "apples are
> > > more important than oranges, or vice-versa", but what does it matter?
> > > I've really only put each option into buckets of items with *roughly*
> > > the same importance. All of the details beyond that don't matter to
> > > me, at all.
> >
> > I agree with you that roughly bucketing items is a good approach.
> > Within each bucket we can then sort alphabetically.
>
> I think of these buckets as working at a logarithmic scale. The FULL,
> FREEZE, VERBOSE, and ANALYZE options are multiple orders of magnitude
> more important than most of the other options, and maybe one order of
> magnitude more important than the PARALLEL, TRUNCATE, and
> INDEX_CLEANUP options. With differences that big, you have a structure
> that generalizes across all users quite well. This doesn't seem
> particularly subjective.

I actually favor query/command order followed by alphabetical order for
most of the commands David included in his patch.

Of course the parameter argument types, like boolean and integer, should
be grouped together separate from the main parameters. David fit this
into the alphabetical paradigm by doing uppercase alphabetical followed
by lowercase alphabetical. There are some specific cases where I think
this isn't working quite as intended in his patch. I've called those out
in my command-by-command code review below.
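
For reference, an uppercase-then-lowercase layout is what a plain code-point
sort produces, because ASCII uppercase letters compare lower than lowercase
ones. The short Python sketch below shows this with a few VACUUM parameter
names; it is not a claim about how the patch was actually prepared.

```python
# Mixed keyword options (uppercase) and placeholders (lowercase).
names = ["table_name", "VERBOSE", "boolean", "SKIP_LOCKED", "column_name", "ANALYZE"]

# Default string comparison is by code point, so uppercase sorts first.
ordered = sorted(names)
print(ordered)
# ['ANALYZE', 'SKIP_LOCKED', 'VERBOSE', 'boolean', 'column_name', 'table_name']
```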

I actually think we should consider having a single location which
defines all argument types for all SQL command parameters. Then we
wouldn't need to define them for each command. We could simply link to
the definition from the synopsis. That would clean up these lists quite
a bit. Perhaps there is some variation from command to command in the
actual definitions, though (I haven't checked). I would be happy to try
and write this patch if folks are interested in the idea.

As for alphabetical ordering vs importance ordering: while I do think
that if a user does not know what parameter they are looking for, an
alphabetical ordering is unhelpful, I also think the primary issue with
grouping them by "importance" is that it is difficult to maintain. Doing
so requires a discussion of importance for every new option added. That
seems like an annoying bit of overhead to give ourselves. Having a
subjective ordering seems worse than having a rule-based ordering. I
think command/query order followed by alphabetical order is a reasonable
rule-based ordering.

I went and took a look at some of the other SQL commands' documentation
and noticed that they are all pretty different (for good reason).

ALTER ROLE parameters [1], for example, have a seemingly meaningless
order except for the fact that there are pairs of parameters. SUPERUSER
and NOSUPERUSER, INHERIT and NOINHERIT, etc. It might be a bit odd for
these to follow an absolute alphabetical ordering rule.

Many of the CREATE type SQL commands don't really have this problem
because there are only one or two options within each section of the
command and otherwise the order the parameters must appear in the query
dictates their order [2].

Others, like EXPLAIN [3], for example, obviously benefit from an
alphabetical ordering of parameters -- which David has done in the
patch. I think most of the commands that David has patched here are
good candidates for alphabetical ordering.

Below I've reviewed each command in the patch specifically:

For ANALYZE, I think this looks good in its new alphabetized form.
Though table_name is alphabetically last for the lower case parameters
and thus doesn't pose an issue, if it were alphabetically earlier, I
would still favor putting it at the end to keep query order first and
alphabetical order second.

For CLUSTER, I think alphabetical order isn't working well. I think we
should maintain query order followed by alphabetical order. Even though
table_name is optional, in the event that it is included, it would
precede index_name. So, perhaps the order should be VERBOSE, boolean,
table_name, index_name -- which pretty much cancels out alphabetizing.

For COPY, I think the new ordering has some issues. table_name
is no longer first even though for COPY FROM it is required before the
other parameters. I think this is confusing. Perhaps the options should
be after the other parameters are defined. I think having the options
alphabetized at the end of the others would be nice. So, my suggested
ordering is table_name, column_name, filename, PROGRAM, STDIN, STDOUT,
then the WITH options alphabetically, WHERE, and then the parameter
argument types alphabetically. The last one (where to put the parameter
argument types) I'm not so sure about.

EXPLAIN looks good to me as is.

For REINDEX, I would again suggest a query ordering followed by
alphabetical ordering. CONCURRENTLY, TABLESPACE, VERBOSE, DATABASE,
INDEX, SCHEMA, SYSTEM, TABLE, name, then all of the parameter argument
types alphabetically. (Also, you can put CONCURRENTLY in two different
places in the REINDEX command?)
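
The REINDEX suggestion can be read as "groups in query order, alphabetical
inside each group". A minimal Python sketch of that reading follows; the
group labels are hypothetical, and the parameter argument types are left
out for brevity.

```python
from itertools import chain

# Groups listed in query order; members are deliberately given unsorted.
groups = [
    ("options", ["VERBOSE", "CONCURRENTLY", "TABLESPACE"]),
    ("targets", ["TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE"]),
    ("object name", ["name"]),
]

# Keep the group order, sort alphabetically within each group.
ordered = list(chain.from_iterable(sorted(members) for _, members in groups))
print(ordered)
```

This reproduces the order proposed above, from CONCURRENTLY through name.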

For VACUUM, I'd perhaps suggest the options in alphabetical order
followed by table_name and then column_name and then putting the
parameter argument types at the end alphabetically.

Of course, we could decide VACUUM is special and group its options by
importance because this is especially helpful for users. I think that
there are other SQL commands whose options' importance is not
particularly worth debating.

I do think we should consider deprecating and dropping documentation of
the options that are supported without parentheses (relevant to commands
like ANALYZE, CLUSTER, VACUUM, and others). It is fine if we keep the
code to make ANALYZE VERBOSE work, but I don't think it is useful to
keep that documented. That is not a concern of this patch, however.

- Melanie

[1] https://www.postgresql.org/docs/devel/sql-alterrole.html
[2] https://www.postgresql.org/docs/devel/sql-createindex.html
[3] https://www.postgresql.org/docs/devel/sql-explain.html

Re: Should we put command options in alphabetical order in the doc?

From

Tom Lane

Date:

19 April 2023, 21:45:41

Melanie Plageman <melanieplageman@gmail.com> writes:
> I do think we should consider deprecating and dropping documentation of
> the options that are supported without parentheses (relevant to commands
> like ANALYZE, CLUSTER, VACUUM, and others). It is fine if we keep the
> code to make ANALYZE VERBOSE work, but I don't think it is useful to
> keep that documented. That is not a concern of this patch, however.

I doubt it's a great idea to de-document syntax that's still allowed
and will still be widely used for years to come; that just promotes
confusion.  However, we could do something similar to what we did
for COPY years ago, and move the un-parenthesized syntax to the
"Compatibility" section.

            regards, tom lane

Re: Should we put command options in alphabetical order in the doc?

From

Peter Geoghegan

Date:

19 April 2023, 23:46:11

On Wed, Apr 19, 2023 at 2:33 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> As for alphabetical ordering vs importance ordering: while I do think
> that if a user does not know what parameter they are looking for, an
> alphabetical ordering is unhelpful, I also think the primary issue with
> grouping them by "importance" is that it is difficult to maintain. Doing
> so requires a discussion of importance for every new option added.

Not really. It's a matter that requires some amount of individual
judgement, in some cases. It may require effort, but I think that
that's likely to be worth it.

I won't be the one that quibbles over every little thing.

> For VACUUM, I'd perhaps suggest the options in alphabetical order
> followed by table_name and then column_name and then putting the
> parameter argument types at the end alphabetically.
>
> Of course, we could decide VACUUM is special and group its options by
> importance because this is especially helpful for users. I think that
> there are other SQL commands whose options' importance is not
> particularly worth debating.

That's very likely true -- it may be that most individual commands
really wouldn't be any worse off if they just used a standard
alphabetical order. I agree that consistency can be a virtue. But it's
not the highest virtue. There will be a number of important
exceptions, which will have outsized impact. VACUUM, ANALYZE, maybe
CREATE INDEX. So if there is going to be a new standard, there should
also be significant wiggle-room. Kind of like with the guidelines for
rmgr desc authors discussion.

--
Peter Geoghegan

Re: Should we put command options in alphabetical order in the doc?

From

Melanie Plageman

Date:

20 April 2023, 12:37:52

On Wed, Apr 19, 2023 at 05:33:47PM -0400, Melanie Plageman wrote:
> On Wed, Apr 19, 2023 at 2:39 PM Peter Geoghegan <pg@bowt.ie> wrote:
> >
> > On Wed, Apr 19, 2023 at 3:04 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> > > > While I'm certain that nobody will agree with me on every little
> > > > detail, I have to imagine that most would find my preferred ordering
> > > > quite understandable and unsurprising, at a high level -- this is not
> > > > a hopelessly idiosyncratic ranking, that could just as easily have
> > > > been generated by a PRNG. People may not easily agree that "apples are
> > > > more important than oranges, or vice-versa", but what does it matter?
> > > > I've really only put each option into buckets of items with *roughly*
> > > > the same importance. All of the details beyond that don't matter to
> > > > me, at all.
> > >
> > > I agree with you that roughly bucketing items is a good approach.
> > > Within each bucket we can then sort alphabetically.
> >
> > I think of these buckets as working at a logarithmic scale. The FULL,
> > FREEZE, VERBOSE, and ANALYZE options are multiple orders of magnitude
> > more important than most of the other options, and maybe one order of
> > magnitude more important than the PARALLEL, TRUNCATE, and
> > INDEX_CLEANUP options. With differences that big, you have a structure
> > that generalizes across all users quite well. This doesn't seem
> > particularly subjective.
> 
> I actually favor query/command order followed by alphabetical order for
> most of the commands David included in his patch.
> 
> Of course the parameter argument types, like boolean and integer, should
> be grouped together separate from the main parameters. David fit this
> into the alphabetical paradigm by doing uppercase alphabetical followed
> by lowercase alphabetical. There are some specific cases where I think
> this isn't working quite as intended in his patch. I've called those out
> in my command-by-command code review below.
> 
> I actually think we should consider having a single location which
> defines all argument types for all SQL command parameters. Then we
> wouldn't need to define them for each command. We could simply link to
> the definition from the synopsis. That would clean up these lists quite
> a bit. Perhaps there is some variation from command to command in the
> actual definitions, though (I haven't checked). I would be happy to try
> and write this patch if folks are interested in the idea.

I looked into this and it isn't a good idea. Out of the 183 SQL
commands, really only ANALYZE, VACUUM, COPY, CLUSTER, EXPLAIN, and
REINDEX have parameter argument types that are context-independent. And
out of those, boolean is the only type shared by all. VACUUM is the only
one with more than one parameter argument "type". So, it is basically
just a bad idea. Oh well...

- Melanie

Re: Should we put command options in alphabetical order in the doc?

From

David Rowley

Date:

20 April 2023, 12:40:48

On Wed, 19 Apr 2023 at 22:04, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2023-Apr-18, Peter Geoghegan wrote:
>
> > While I'm certain that nobody will agree with me on every little
> > detail, I have to imagine that most would find my preferred ordering
> > quite understandable and unsurprising, at a high level -- this is not
> > a hopelessly idiosyncratic ranking, that could just as easily have
> > been generated by a PRNG. People may not easily agree that "apples are
> > more important than oranges, or vice-versa", but what does it matter?
> > I've really only put each option into buckets of items with *roughly*
> > the same importance. All of the details beyond that don't matter to
> > me, at all.
>
> I agree with you that roughly bucketing items is a good approach.
> Within each bucket we can then sort alphabetically.

If these "buckets" were subcategories, then it might be ok. I see "man
grep" categorises the command line options and then sorts
alphabetically within the category. If we could come up with a way of
categorising the options then this would satisfy what Melanie
mentioned about having the argument types listed separately. However,
I'm really not sure which categories we could have.  I really don't
have any concrete ideas here, but I'll attempt to at least start
something:

Behavioral:
ANALYZE
DISABLE_PAGE_SKIPPING
FREEZE
FULL
INDEX_CLEANUP
ONLY_DATABASE_STATS
PROCESS_MAIN
PROCESS_TOAST
SKIP_DATABASE_STATS
SKIP_LOCKED
TRUNCATE

Resource Usage:
BUFFER_USAGE_LIMIT
PARALLEL

Informational:
VERBOSE

Option Parameters:
boolean
column_name
integer
size
table_name

I'm just not sure if we have enough options to have a need to
categorise them.  Also, going by the categories I attempted to come up
with, it just feels like "Behavioral" contains too many and
"Informational" is likely only ever going to contain VERBOSE. So I'm
not very happy with them.
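
The imbalance is easy to see when the draft categories are written down as
data. The Python sketch below uses only the category names and members from
the list above (argument types omitted); the variable names are
hypothetical.

```python
# Draft categories for the VACUUM options, as listed above.
categories = {
    "Behavioral": [
        "ANALYZE", "DISABLE_PAGE_SKIPPING", "FREEZE", "FULL", "INDEX_CLEANUP",
        "ONLY_DATABASE_STATS", "PROCESS_MAIN", "PROCESS_TOAST",
        "SKIP_DATABASE_STATS", "SKIP_LOCKED", "TRUNCATE",
    ],
    "Resource Usage": ["BUFFER_USAGE_LIMIT", "PARALLEL"],
    "Informational": ["VERBOSE"],
}

# Number of options per category.
sizes = {title: len(members) for title, members in categories.items()}

# True when every category is alphabetical internally.
each_sorted = all(members == sorted(members) for members in categories.values())

print(sizes, each_sorted)
```

With these lists, 11 of the 14 options fall under "Behavioral", which is
the concern raised above.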

I'm not really feeling excited enough about this to even come up with
a draft patch. I thought I'd send this out anyway to see if anyone can
think of anything better.

FWIW, vacuumdb --help has its options in alphabetical order using the
abbreviated form of the option.

David

Re: Should we put command options in alphabetical order in the doc?

From

Daniel Gustafsson

Date:

20 April 2023, 12:57:46

> On 20 Apr 2023, at 14:40, David Rowley <dgrowleyml@gmail.com> wrote:

> I see "man grep" categorises the command line options and then sorts
> alphabetically within the category.

On FreeBSD and macOS "man grep" lists all options alphabetically.

> FWIW, vacuumdb --help has its options in alphabetical order using the
> abbreviated form of the option.

It does (as most of our binaries do) group "Connection options" separately
though, and in initdb --help and pg_dump --help we have other groupings as
well.

--
Daniel Gustafsson