Re: an attempt to fix the Google search problem - Mailing list pgsql-www

From Daniel Gustafsson
Subject Re: an attempt to fix the Google search problem
Date
Msg-id F8F6AA24-7951-4938-99B9-933B4AC1A9C9@yesql.se
In response to Re: an attempt to fix the Google search problem  (Magnus Hagander <magnus@hagander.net>)
List pgsql-www
> On 09 Nov 2016, at 18:07, Magnus Hagander <magnus@hagander.net> wrote:
>
> On Wed, Nov 9, 2016 at 6:34 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
> It is a well-known problem that a Google search for something in the
> PostgreSQL documentation will usually return hits in old documentation
> versions first, because those pages have been around for the longest.
>
> I believe I have a promising fix for that.  By adding a <link
> rel="canonical"> to the documentation pages that point to the "current"
> version, search engines will be encouraged to return the current version
> search results.
>
> I had heard that the Django project had the same problem and got this
> solution from there.  See for example the source of this page:
> <https://docs.djangoproject.com/en/1.10/topics/db/models/>. Here is
> also some information from Google about this:
> <https://webmasters.googleblog.com/2013/04/5-common-mistakes-with-relcanonical.html>
>
> I think this is worth trying.  A one-line patch is attached.
>
> By that article you linked, it's important not to link to pages that don't
> exist. So we should at least verify that the page does exist in the current
> version (the same way that we do for the links at the top of the pages for
> old versions). IIRC someone (sorry, this is a long time ago, can't remember
> who or why) mentioned that the pages can get severely punished if the
> canonical link goes to a 404.

While I can’t cite a source confirming that Google punishes 4XX responses, I
have first-hand experience that they in fact do (or at least did).
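
To make the "verify that the page exists" step concrete, here is a minimal
sketch of how the canonical tag could be emitted only when the target is known
to exist. The base URL, the function name, and the idea of passing in a set of
filenames present in the current version are assumptions for illustration, not
how the website code actually does it.

```python
from html import escape
from typing import Optional, Set

# Hypothetical base URL for the "current" docs; the real URL layout may differ.
CURRENT_BASE = "https://www.postgresql.org/docs/current/"


def canonical_link(page: str, current_pages: Set[str]) -> Optional[str]:
    """Return a <link rel="canonical"> tag for `page`, or None.

    `current_pages` is assumed to hold the filenames that exist in the
    current docs version. Returning None means "emit no canonical link",
    so the tag never points at a 404.
    """
    if page not in current_pages:
        return None
    href = escape(CURRENT_BASE + page, quote=True)
    return f'<link rel="canonical" href="{href}">'
```

A template would call this for each old-version page and simply omit the tag
when it gets None back.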

> We did try this at some point ages and ages ago and it didn't help, but I
> agree it's probably worth another try. But we definitely need to be careful
> not to destroy existing Google ranking.

The backing RFC states that the target document must be a duplicate or superset
of the context document, and Google says similar.  The current version of a doc
page fits that, but we should be careful when doc pages have been substantially
rewritten: targeting a completely different page could lead to punishment.
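
One rough way to catch substantially rewritten pages would be to compare the
text of the old page with the current one and skip the canonical link when they
differ too much. The sketch below is only a heuristic under assumptions: the
function name and the 0.5 threshold are made up for illustration, and neither
the RFC nor Google defines "duplicate or superset" in terms of such a ratio.

```python
from difflib import SequenceMatcher

# Hypothetical cut-off; would need tuning against real doc pages.
SIMILARITY_THRESHOLD = 0.5


def is_safe_canonical_target(old_text: str, current_text: str,
                             threshold: float = SIMILARITY_THRESHOLD) -> bool:
    """Return True if the current page still resembles the old one.

    Compares the pages word by word and treats a low similarity ratio
    as a sign that the page was substantially rewritten.
    """
    old_words = old_text.split()
    current_words = current_text.split()
    matcher = SequenceMatcher(None, old_words, current_words, autojunk=False)
    return matcher.ratio() >= threshold
```

Pages that fail the check would just keep going without a canonical link, which
is the situation we have today.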

cheers ./daniel

