WIP: Analyze whether our docs need more granular refentries. - Mailing list pgsql-hackers

From Corey Huinker
Subject WIP: Analyze whether our docs need more granular refentries.
Date
Msg-id CADkLM=ecedUyx9uFgQA=Bg4-kE3i7KFA6UUEhFmrvxCPrsim0w@mail.gmail.com
List pgsql-hackers

While reviewing another patch, I noticed that the documentation had an xref to a fairly large page (create_table.sgml). I wondered whether that link was chosen because the original author genuinely felt the entire page was relevant, or merely because a more granular link did not exist at the time and the link has been carried forward ever since while the referenced page grew in complexity.

In the interest of narrowing the problem down to a manageable size, I wrote a script (attached) to find all xrefs and rank them by criteria[1] that I believe hint at the possibility that the xrefs should be more granular than they are.
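As a rough illustration of the first step (this is not the attached script, just a minimal sketch), counting xref targets could look like the following. The function name, the regex, and the sample strings are all assumptions made for the example; a real run would read the .sgml files from doc/src/sgml instead.

```python
import re
from collections import Counter

# Hypothetical pattern: matches <xref linkend="some-id"/> with either quote style.
XREF_RE = re.compile(r'<xref\s+linkend\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def count_xref_targets(sgml_texts):
    """Count how often each linkend is referenced across the given SGML strings."""
    counts = Counter()
    for text in sgml_texts:
        counts.update(XREF_RE.findall(text))
    return counts


# Made-up sample input, standing in for the contents of real .sgml files.
sample = [
    'See <xref linkend="sql-createtable"/> and <xref linkend="app-psql"/>.',
    'Also <xref linkend="sql-createtable"/>.',
]
xref_counts = count_xref_targets(sample)
```

The resulting counts correspond to the link_count column in the output below.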

I intend to use the script output below as a guide for manually reviewing the references and seeing if there are opportunities to guide the reader to the relevant section of those pages.

In case anyone is curious, here is the top of the script output:

file_name                          link_name                     link_count  line_count  num_refentries
---------------------------------  ----------------------------  ----------  ----------  --------------
ref/psql-ref.sgml                  app-psql                      20          5215        1            
ecpg.sgml                          ecpg-sql-allocate-descriptor  4           10101       17            
ref/create_table.sgml              sql-createtable               23          2437        1            
ref/select.sgml                    sql-select                    23          2207        1            
ref/create_function.sgml           sql-createfunction            30          935         1            
ref/alter_table.sgml               sql-altertable                12          1776        1            
ref/pg_dump.sgml                   app-pgdump                    11          1545        1            
ref/pg_basebackup.sgml             app-pgbasebackup              11          1008        1            
ref/create_type.sgml               sql-createtype                10          1029        1            
ref/create_index.sgml              sql-createindex               9           999         1            
ref/postgres-ref.sgml              app-postgres                  10          845         1            
ref/copy.sgml                      sql-copy                      7           1081        1            
ref/create_role.sgml               sql-createrole                13          511         1            
ref/grant.sgml                     sql-grant                     13          507         1            
ref/create_foreign_table.sgml      sql-createforeigntable        14          455         1            
ref/insert.sgml                    sql-insert                    8           792         1
ref/pg_ctl-ref.sgml                app-pg-ctl                    8           713         1            
ref/create_trigger.sgml            sql-createtrigger             7           777         1            
ref/set.sgml                       sql-set                       15          332         1            
ref/create_aggregate.sgml          sql-createaggregate           6           805         1            
ref/initdb.sgml                    app-initdb                    8           588         1            
ref/create_policy.sgml             sql-createpolicy              7           655         1            
dblink.sgml                        contrib-dblink-connect        1           2136        19            
ref/create_subscription.sgml       sql-createsubscription        9           472         1 
 

Some of these will clearly be false positives. For instance, dblink.sgml and ecpg.sgml have a lot of refentries, but they seem to lack a global "top" refentry which I assumed would be there.

On the other hand, I wonder whether many of the references to psql are really to specific features of the tool, in which case we could create refentries for those.

[1] The criteria are: the link target must be the first refentry in its file, the file must be at least 200 lines long, and candidates are ranked by lines*references, with the score doubled when the target is the top refentry in a file that contains other refentries.
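
For anyone who wants to play with the ranking, here is a minimal sketch of the scoring in [1]. It is not the attached script; the Candidate type and the score function are names invented for this example, and the three rows are copied from the output above.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    """One row of the report (hypothetical structure, mirroring the columns above)."""
    file_name: str
    link_name: str
    link_count: int
    line_count: int
    num_refentries: int


def score(c, min_lines=200):
    """Rank value: lines * references, doubled when the file has other refentries."""
    if c.line_count < min_lines:
        return 0  # files under the 200-line threshold are not candidates
    value = c.line_count * c.link_count
    if c.num_refentries > 1:
        value *= 2  # target is the top refentry while others exist
    return value


rows = [
    Candidate("ref/create_table.sgml", "sql-createtable", 23, 2437, 1),
    Candidate("ecpg.sgml", "ecpg-sql-allocate-descriptor", 4, 10101, 17),
    Candidate("ref/psql-ref.sgml", "app-psql", 20, 5215, 1),
]
ranked = sorted(rows, key=score, reverse=True)
```

Sorting by this value puts psql-ref.sgml first, then ecpg.sgml, then create_table.sgml, which matches the order in the excerpt.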
