Re: Postgres 9.01, Amazon EC2/EBS, XFS, JDBC and lost connections - Mailing list pgsql-general

From Sean Laurent
Subject Re: Postgres 9.01, Amazon EC2/EBS, XFS, JDBC and lost connections
Msg-id CAK=aZ=kBGOYkxYjNppec4dTg6SocyD0NW6y-t8H57PaQrL+90Q@mail.gmail.com
In response to Re: Postgres 9.01, Amazon EC2/EBS, XFS, JDBC and lost connections  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
On Fri, Oct 7, 2011 at 12:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Sean Laurent <sean@studyblue.com> writes:
> > We've been running into a particularly strange problem that I'm trying to
> > better understand. The super short version is that our application servers
> > lose their connection to the database when I run a backup during periods of
> > higher load and fail to reconnect.
>
> That's just weird.  It sounds like the "xfs_freeze" operation, or the
> snapshotting operation, is somehow interrupting network traffic.  I'd
> not expect such a thing on a normal server, but who knows what's
> connected to what in an Amazon EC2 instance?
>
> Anyway, I'd suggest trying to instrument something to prove or disprove
> that there's a networking failure involved.  It might be as simple as
> watching "ping" behavior ...

Agreed that it's very weird. EBS volumes are effectively
network-attached storage, so blaming network connectivity was my first
inclination as well. Unfortunately, it's definitely not a network
failure:

- The AWS support team has not detected any network outages affecting the
EC2 instance or the EBS volumes at any time remotely near when our
outages occurred.
- I can consistently ping the database instance from the application
servers while the problem is occurring.
- I can SSH into the database instance and access Postgres while the
problem is occurring.
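
Since ping only exercises ICMP, a TCP-level check against the Postgres
port is one way to extend the instrumentation suggested above. The
following is a minimal sketch, not something taken from our setup: the
function names (probe_tcp, format_result, run_probe) are hypothetical,
and it only tests whether a TCP connection can be opened, not whether
Postgres answers queries.

```python
import socket
import time
from datetime import datetime, timezone


def probe_tcp(host, port, timeout=2.0):
    """Attempt one TCP connection; return (ok, elapsed_seconds, error_text)."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and connects, honoring the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        # Covers refused connections, timeouts, and name-resolution errors.
        return False, time.monotonic() - start, str(exc)


def format_result(host, port, ok, elapsed, error):
    """Build one timestamped log line for a probe result."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    status = "ok" if ok else "FAIL"
    line = f"{stamp} {host}:{port} {status} {elapsed * 1000:.1f}ms"
    return f"{line} {error}" if error else line


def run_probe(host, port, count, interval=1.0, timeout=2.0):
    """Probe `count` times, printing each result; return the number of failures."""
    failures = 0
    for i in range(count):
        ok, elapsed, error = probe_tcp(host, port, timeout)
        failures += 0 if ok else 1
        print(format_result(host, port, ok, elapsed, error))
        if i < count - 1:
            time.sleep(interval)
    return failures
```

Run from an application server for the duration of a backup, for
example run_probe("db.example.internal", 5432, count=600) with a
placeholder host name and the default Postgres port, the timestamped
lines can be compared against the time of the freeze and snapshot. If
every probe succeeds while the application still loses its
connections, that would be further evidence against a plain network
failure.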

--
Sean Laurent
Director of Operations
StudyBlue, Inc.
