Re: Yet another infrastructure problem - Mailing list pgsql-www

From Magnus Hagander
Subject Re: Yet another infrastructure problem
Date
Msg-id 8555D6DC-5EA9-4F35-9AEA-97B09CF929CD@hagander.net
In response to Re: Yet another infrastructure problem  (Russell Smith <mr-russ@pws.com.au>)
List pgsql-www
On 26 Oct 2008, at 02:03, Russell Smith <mr-russ@pws.com.au> wrote:

> Magnus Hagander wrote:
>> Greg Sabino Mullane wrote:
>>
>>> People have been complaining on IRC that nothing can be
>>> downloaded from our site, as the mirror-picking script throws
>>> an internal error.
>>>
>>> When are we going to fix our infrastructure properly?
>>>
>>
>> As Stefan has already posted on this very list, he is performing
>> maintenance on that machine in order to move it to new hardware.
>>
>> //Magnus
>>
>>
> We are still missing the one important thing: notification.  Lots and
> lots of people use the website who will never go near the lists, IRC or
> anything else.  Notifying the email lists of downtime will stop the
> heavily involved community from complaining, but it does absolutely
> nothing for a general user trying to download something from the
> internet.

That is a very good point. And it actually applies to many other parts
of the project, not just the infrastructure. Basically, the
authoritative version of *all* important information is the lists.

>
> You can argue about replication, downtime and the like until you are
> blue in the face.  There will always be some downtime.  The question is
> how do people know about it, when is it and what do they do about it?

Agreed.


> Until reading this thread I had never even thought about how PostgreSQL
> does or doesn't notify people about downtime or potential downtime.
> Reading down thread this notification issue appears to have been
> ignored.  To me it seems like relatively low hanging fruit to allow
> messages to be posted on the website about planned outages, and
> notifications of recent unplanned

So how do you deal with a case like the one discussed here, where the
web is what didn't work? The static frontends were up, but not the
master which is used to update them...


> outages.  Complaining on IRC is one of
> the only ways to find out what's going on at the moment for a casual
> user.

The casual user would be using the lists, certainly not IRC. People who
aren't deep in the project will hit the lists first, because that's
what we say on our website.

Now what they really do is email webmaster, which a lot of people did.

That said, I agree a better way would be good to have.

> When Marc's hosting had trouble a couple of years back, the only
> way to find out anything was on irc.

That outlines one of the major problems. Posting a notification must
not be too hard for the guy trying to fix the actual problem. Sending
an email is *easy*, and Stefan did so in this case. But as you also
note, even this is too much for some people.

We could publish a snapshot of our Nagios data, but I doubt that would
actually be helpful to these people.


> I'd look into this, but I'd need a lot more knowledge about how the web
> stuff is set up, and I'm probably not going to be able to glean that from
> people in a couple of weeks.  But if I can, great!
>

Hey, give it a shot. Just remember that the technical part is the easy
part. Creating a process and getting buy-in for it is going to be the
hard part.

/Magnus


