Re: Scheduled maintenance affecting gitmaster - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: Scheduled maintenance affecting gitmaster
Date
Msg-id AANLkTikcWBvvJxCVfX_rtaG7vH+qw6Rw_=bzLuBp09Do@mail.gmail.com
In response to Re: Scheduled maintenance affecting gitmaster  (Magnus Hagander <magnus@hagander.net>)
List pgsql-hackers
On Mon, Feb 14, 2011 at 16:15, Magnus Hagander <magnus@hagander.net> wrote:
> On Mon, Feb 14, 2011 at 10:39, Stefan Kaltenbrunner
> <Stefan@kaltenbrunner.cc> wrote:
>> On 02/14/2011 10:09 AM, Magnus Hagander wrote:
>>> On Mon, Feb 14, 2011 at 07:13, Stefan Kaltenbrunner
>>> <stefan@kaltenbrunner.cc> wrote:
>>>> On 02/14/2011 01:27 AM, Tom Lane wrote:
>>>>>
>>>>> Magnus Hagander<magnus@hagander.net>  writes:
>>>>>>
>>>>>> Unfortunately, one of the worst-case scenarios appears to have
>>>>>> happened - a machine did not come back up after a reboot.
>>>>>> ...
>>>>>> We'll get back to you with more information as soon as we have it.
>>>>>
>>>>> I didn't see any followup to this?
>>>>
>>>> yeah - the hosting company managed to reboot the box for us which brought it
>>>> back to life in the middle of the night (with both magnus and me asleep).
>>>
>>> Indeed. But the good news is that once it came back up, the VM with
>>> the git server started ok :-)
>>>
>>>
>>>>> gitmaster seems to be responding as of now, is it safe to push?
>>>>
>>>> yes it is - however we will need to schedule another maintenance window soon
>>>> to finish the stuff we actually wanted to do.
>>>
>>> So, after some discussion with Stefan, we (well, I guess I) decided we
>>> should just go ahead and declare the maintenance window not closed
>>> yet, and finish off the upgrade right now :-) Given that the majority
>>> of our commits don't happen now, we'll hopefully have it done by the
>>> time the US folks wake up again.
>>>
>>> So, maintenance window again, starting now, and we'll let you know as
>>> soon as we're done. And we're definitely hoping for the machine to
>>> come back up properly this time :-)
>>
>> and it did not... We are trying to figure out what the actual problem
>> here really is because it seems to boot just fine when powercycled just
>> not with a software initiated reboot.
>> We will notify once we have more information...
>
> Status update on this - Stefan is currently working with the
> datacenter people on getting this fixed (they are now available
> on-site), since we are now having an actual issue with the machine
> (GRUB failure on boot) rather than just a failure to shut down.

We are still having issues with this box. For that reason, we have
moved the gitmaster server over to another box, where it's now up and
running. DNS has been updated, but it will take some time for the
change to propagate. For those of you who want access now, you can
reach the new master server at 98.129.198.116.

Note, however, that mail relaying from this machine does not currently
work, so commit message emails will remain unavailable for a bit longer
while we work on clearing things up.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

