Re: Fixing Google Search on the docs (redux) - Mailing list pgsql-www

From Greg Stark
Subject Re: Fixing Google Search on the docs (redux)
Date
Msg-id CAM-w4HOheMMcDOJUCZn32YwKEux_VJYjPKjVXufLWnGkrWon_g@mail.gmail.com
In response to Re: Fixing Google Search on the docs (redux)  (Dave Page <dpage@pgadmin.org>)
Responses Re: Fixing Google Search on the docs (redux)  (Magnus Hagander <magnus@hagander.net>)
List pgsql-www
> all other URLs will be considered duplicate URLs and crawled less often

What Google crawls and what Google considers a valid search result to
serve users are two independent questions. Google may well crawl the
non-canonical results but never serve them. The crawl would still, for
example, add weight to pages linked from it. It's always really hard
to tell when reading Google docs whether they're talking about crawl
behaviour or search results behaviour.

> - Where a page has been removed entirely, mark the most recent version of it as the canonical one (instead of the
> /current/ version).
 

This seems like a significant advance on previous ideas. If we have
enough metadata available to do this, that would be a big win. I think
it's rare that we remove information from a page but keep the page
itself. Generally, changes like the removal of recovery.conf mean
removing whole pages and replacing them with new pages that document
the new functionality.
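
To make the quoted proposal concrete, here is a minimal sketch of how the
canonical URL for a docs page might be chosen. It assumes we already know
which releases contain each page; the function names, the /docs/<version>/
URL layout, and the example page names are illustrative assumptions, not the
actual pgweb code.

```python
def version_key(version):
    """Turn a release string such as "9.6" or "11" into a sortable tuple."""
    return tuple(int(part) for part in version.split("."))


def canonical_url(page, versions_with_page, current_version,
                  base="https://www.postgresql.org/docs"):
    """Pick the URL to advertise in <link rel="canonical"> for a docs page.

    versions_with_page: releases whose docs contain this page (assumed input).
    current_version:    the release that /current/ points at.
    """
    if not versions_with_page:
        raise ValueError(f"no known version contains {page!r}")

    # Page still exists in the current release: /current/ is canonical.
    if current_version in versions_with_page:
        return f"{base}/current/{page}"

    # Page was removed entirely: use the most recent release that had it.
    newest = max(versions_with_page, key=version_key)
    return f"{base}/{newest}/{page}"
```

For a page that went away along with recovery.conf, something like
canonical_url("recovery-config.html", ["9.6", "10", "11"], "12") would point
at the version 11 copy, while a page that still exists resolves to /current/.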


