Re: no mailing list hits in google - Mailing list pgsql-www

From Tom Lane
Subject Re: no mailing list hits in google
Msg-id 23549.1567015180@sss.pgh.pa.us
In response to Re: no mailing list hits in google  (Magnus Hagander <magnus@hagander.net>)
List pgsql-www
Magnus Hagander <magnus@hagander.net> writes:
> It blocks /list/ which has the subjects only. The actual emails in
> /message-id/ are not blocked by robots.txt.  I don't know why they stopped
> appearing in the searches... Nothing has been changed around that for many
> years from *our* side.

If I go to

https://www.postgresql.org/message-id/

I get a page saying "Not Found".  So I'm not clear on how a web crawler
would descend through that to individual messages.

Even if it looks different to a robot, what would it look like exactly?
A flat space of umpteen zillion immediate-child pages?  It seems not
improbable that Google's search engine would intentionally decide not to
index that, or unintentionally just fail due to some internal resource
limit.  (This theory can explain why it used to work and no longer does:
we got past whatever the limit is.)

Andres' idea of allowing access to /list/ would allow the archives to be
traversed in more bite-size pieces, which might fix the issue.
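For anyone who wants to check the robots.txt side of this, here is a minimal sketch using Python's standard urllib.robotparser. The rules in it are an assumption that only mirrors what Magnus describes (a Disallow on /list/); they are not copied from the live robots.txt. ASSUMED_ROBOTS_TXT and is_allowed are hypothetical names made up for the illustration.

```python
from urllib.robotparser import RobotFileParser

# ASSUMPTION: rules written to match the description in this thread
# (/list/ blocked, /message-id/ not mentioned). Not the real file.
ASSUMED_ROBOTS_TXT = """\
User-agent: *
Disallow: /list/
"""

BASE = "https://www.postgresql.org"


def is_allowed(path, user_agent="*"):
    """Return True if the assumed rules let user_agent fetch BASE + path."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(ASSUMED_ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, BASE + path)


if __name__ == "__main__":
    for path in ("/list/pgsql-www/",
                 "/message-id/23549.1567015180@sss.pgh.pa.us"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Under those assumed rules the /list/ pages are blocked and the individual message URL is allowed. That only says a crawler may fetch a message once it already has the URL; it says nothing about how the crawler would discover the URL when the index pages that link to it are off limits.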

            regards, tom lane


