On 27 April 2018 at 03:05, Wei Shan <weishan.ang@gmail.com> wrote:
> GitLab had some issues recently with PostgreSQL.
>
> http://www.theregister.co.uk/2018/04/27/gitlab_outage_april_2018/
> https://docs.google.com/document/d/1_IzyO-jwqb7UFl0A28D1gR4EaU99cEnoUSD9o8q4eZw/edit#
>
> From their timeline report, it looks like they were using repmgr. I don't
> understand how they managed to get split-brain from a primary failure
> with a 5-node cluster.
>
> Does anyone have more insight into this?

repmgr can be used successfully to avoid this, so I doubt it is at
fault, but bug reports are welcome.

AFAIK they employ a PostgreSQL service provider, so presumably that
provider can/will comment if there are any concerns the community
should be aware of.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services