once more: documentation search indexing - Mailing list pgsql-www

From Andres Freund
Subject once more: documentation search indexing
Msg-id 20210612202912.hw3ppjzxoto3ldtx@alap3.anarazel.de
Responses Re: once more: documentation search indexing  ("Jonathan S. Katz" <jkatz@postgresql.org>)
List pgsql-www
Hi,

In a recent Twitter discussion [1], $subject has been brought up yet
again. Unsurprisingly, it's still awful.

It's been brought up many times before:
- https://www.postgresql.org/message-id/CA%2BOCxoyVwmmZkWUJCez2hCqa89iGv%3Dvq58NF1yQkTg9gtpkn%3Dg%40mail.gmail.com
- https://www.postgresql.org/message-id/CAHyXU0wu7w%3DOpeHtvpei4J9SAr7TTmdRJOyCWF6MRXpQcFNHGw%40mail.gmail.com
- https://www.postgresql.org/message-id/CANNMO%2B%2BkxJmaaB7X6hq_8SqcEruySZrF%3DUkcPm-EG1JCKVascw%40mail.gmail.com
- https://www.postgresql.org/message-id/38c68b83-30ae-c039-acd0-9e853997edc4@2ndquadrant.com
- https://www.postgresql.org/message-id/560614CA.1080304@mail.com
- ...

One issue around the topic is that we seem to get bogged down in finding
a perfect solution for how to present versioned documentation to Google,
preventing us from making small incremental adjustments. Since it seems
unlikely that we'll get a perfect solution anytime soon (we'd have found
it already), I'd like to try to see if we can find a way to agree on
some incremental steps.

Suggested small steps:

- add a docs/current link to https://www.postgresql.org/docs/. Often
  enough that's what a user wants anyway, and it's not useful to add
  additional steps for users and search engines to navigate to
  docs/current/.

  I can see us either making it a separate row in the version table,
  or splitting the most recent released version's link into a /current/
  and a $major link.


- put version in page titles where it makes sense. E.g. change
  "PostgreSQL: Documentation: 10: 6.1. Inserting Data" to
  "PostgreSQL 10 Documentation: 6.1. Inserting Data"

  The current ordering doesn't seem like it has much going for it, and
  it can't help search engines to have the version number people might
  search for removed from the product name.

  Right now this seems to contribute to unhelpful titles in search
  engine results. Searching anonymously for "postgres alter table" on
  Google, I get the less than helpful "Documentation: 12: ALTER TABLE -
  PostgreSQL".

  It might also be worth going a bit further and putting the
  documentation version *after* the page title, given that it's most
  likely already clear to the reader that this is about postgres. I.e.
  something like "ALTER TABLE - Documentation for PostgreSQL 14"


- Consider removing chapter numbers from page titles. I'd argue that the
  particular chapter number of a page isn't interesting in its title. E.g.
  https://www.postgresql.org/docs/12/plpgsql-declarations.html#PLPGSQL-DECLARATION-PARAMETERS
  has a title of "PostgreSQL: Documentation: 12: 42.3. Declarations"

  (see also previous item). The 42.3 piece seems pointless in the title
  of a web page - although the actual chapter name could be helpful,
  because it's not immediately obvious that the page refers to plpgsql.


- Add a meta description - even just including what we have for the
  og:description thing seems like it would often be better than what
  google is kind of forced to make up?
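
To make the title and meta description items a bit more concrete, here is
a rough Python sketch. It is not taken from the website code; the
function names (rewrite_title, meta_description), the regular
expressions and the 160 character limit are all made up for
illustration. The input title layout is only inferred from the examples
quoted above, and the output uses the "title first, version last"
variant.

```python
import re
from html import escape

# Assumed current layout, inferred from the examples in this mail, e.g.
# "PostgreSQL: Documentation: 10: 6.1. Inserting Data"
CURRENT_TITLE = re.compile(
    r"^PostgreSQL: Documentation: (?P<version>[^:]+): (?P<heading>.+)$"
)

# Leading numeric section number such as "6.1. " or "42.3. "
SECTION_NUMBER = re.compile(r"^\d+(?:\.\d+)*\.\s+")


def rewrite_title(title, strip_number=True):
    """Return the title with the heading first and the version last.

    Titles that do not match the assumed layout are returned unchanged.
    """
    match = CURRENT_TITLE.match(title)
    if match is None:
        return title
    heading = match.group("heading")
    if strip_number:
        # Drop "42.3. " style prefixes; headings without one are untouched.
        heading = SECTION_NUMBER.sub("", heading)
    return f"{heading} - Documentation for PostgreSQL {match.group('version')}"


def meta_description(text, limit=160):
    """Build a meta description tag from existing summary text.

    The limit is an arbitrary illustrative value, not a search engine rule.
    """
    summary = " ".join(text.split())  # collapse newlines and repeated spaces
    if len(summary) > limit:
        # Cut at the last word boundary inside the limit.
        summary = summary[:limit].rsplit(" ", 1)[0] + "..."
    return f'<meta name="description" content="{escape(summary, quote=True)}">'


if __name__ == "__main__":
    print(rewrite_title("PostgreSQL: Documentation: 12: 42.3. Declarations"))
    print(rewrite_title("PostgreSQL: Documentation: 12: ALTER TABLE"))
    print(meta_description("ALTER TABLE changes the definition of a table."))
```

Passing strip_number=False keeps the section number, so the two title
changes can be evaluated separately.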


Greetings,

Andres Freund

[1] https://twitter.com/samokhvalov/status/1403410028334256128


