Re: SMP-PPC spinlocks in 7.2.4? - Mailing list pgsql-general

From Tom Lane
Subject Re: SMP-PPC spinlocks in 7.2.4?
Date
Msg-id 13048.1047419183@sss.pgh.pa.us
In response to SMP-PPC spinlocks in 7.2.4?  (eric soroos <eric-psql@soroos.net>)
Responses Re: SMP-PPC spinlocks in 7.2.4?  (eric soroos <eric-psql@soroos.net>)
List pgsql-general
eric soroos <eric-psql@soroos.net> writes:
> The last few days, I've been dealing with a client who has drastically upped their usage of the database and in doing
> so is causing deadlocks. I was running 7.2 or 7.2.1, I upgraded to a locally compiled 7.2.4.  I've run a vacuum full on
> the databases.

> Sometimes the clients have a ps ax status of async_notify, sometimes there's just a stack of selects and updates that
> get hung. (I'd estimate 6 deadlocks since Saturday).  It seems to coincide with times of extra activity, such as when
> the databases are being backed up with pg_dump.

Hm.  Do they use query-cancels at all?  The reference to async_notify
makes me wonder if this is related to the recently-discovered
async_notify bug that could prevent fast-mode shutdowns.  I'm not
certain how that might lead to an apparent deadlock, but a query cancel
arriving during async_notify would surely improve the odds of trouble.

If you don't mind running a slightly customized version, you might try
back-patching this fix:
http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/backend/commands/async.c.diff?r1=1.91&r2=1.91.2.1
into 7.2.4 and see if that improves matters.

If it doesn't, I'd be interested to look into the matter, but I'd
probably need access to the machine to see what is going on.
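
A record of what each backend was doing at the moment of a hang may be useful
either way. The following is only a minimal sketch of one way to capture it,
by tallying the activity word that ps ax shows for each backend. The function
names are hypothetical, and the assumed process-title layout
("postgres: user db host activity") may not match every platform or build.

```python
import subprocess
from collections import Counter


def tally_backend_states(ps_output):
    """Count postgres backends by the last word of their process title."""
    states = Counter()
    for line in ps_output.splitlines():
        # Assumption: backend titles look like
        # "postgres: user db host activity" (layout may differ per platform).
        _, sep, title = line.partition("postgres:")
        if not sep:
            continue  # not a postgres backend line
        fields = title.split()
        if fields:
            # The last word is taken as the activity, e.g. "async_notify"
            # or "waiting"; multi-word activities are reduced to one word.
            states[fields[-1]] += 1
    return states


def snapshot():
    """Run `ps ax` and tally backend states (Unix-like systems only)."""
    result = subprocess.run(
        ["ps", "ax"], capture_output=True, text=True, check=True
    )
    return tally_backend_states(result.stdout)
```

Calling snapshot() from a cron job or by hand during a hang would give a rough
count of how many backends are in async_notify versus waiting on something else.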

> I've also noticed the following in cron logs from nightly vacuums

> NOTICE: Rel pg_attribute: Uninitialized page 59 - fixing
> NOTICE: Rel pg_attribute: Uninitialized page 60 - fixing

These are harmless.

> Is there anything I can do to debug this?  I'm willing to give it a
> shot, but I'm also rapidly preparing a single proc linux/intel machine
> to take over db duties.

I think you're mistaken to be blaming the hardware...

            regards, tom lane
