Re: Table locking problems? - Mailing list pgsql-performance

From John A Meinel
Subject Re: Table locking problems?
Date
Msg-id 42F9255A.4080604@arbash-meinel.com
Whole thread Raw
In response to Re: Table locking problems?  (Dan Harris <fbsd@drivefaster.net>)
Responses Re: Table locking problems?
List pgsql-performance
Dan Harris wrote:
>
> On Aug 10, 2005, at 12:49 AM, Steve Poe wrote:
>
>> Dan,
>>
>> Do you mean you did RAID 1 + 0 (RAID 10) or RAID 0 + 1? Just a
>> clarification, since RAID 0 is still a single-point of failure even if
>> RAID1 is on top of RAID0.
>
>
> Well, you tell me if I stated incorrectly.  There are two raid
> enclosures with 7 drives in each.  Each is on its own bus on a dual-
> channel controller.  Each box has a stripe across its drives and the
> enclosures are mirrors of each other.  I understand the controller
> could be a single point of failure, but I'm not sure I understand  your
> concern about the RAID structure itself.

In this configuration, if you have a drive fail on both controllers, the
entire RAID dies. Lets label them A1-7, B1-7, because you stripe within
a set, if a single one of A dies, and a single one of B dies, you have
lost your entire mirror.

The correct way of doing it, is to have A1 be a mirror of B1, and then
stripe above that. Since you are using 2 7-disk enclosures, I'm not sure
how you can do it well, since it is not an even number of disks. Though
if you are using software RAID, there should be no problem.

The difference is that in this scenario, *all* of the A drives can die,
and you haven't lost any data. The only thing you can't lose is a
matched pair (eg losing both A1 and B1 will cause complete data loss)

I believe the correct notation for this last form is RAID 1 + 0 (RAID10)
since you have a set of RAID1 drives, with a RAID0 on-top of them.

>
>>
>> How many users are connected when your update / delete queries are
>> hanging? Have you done an analyze verbose on those queries?
>
>
> Most of the traffic is from programs we run to do analysis of the  data
> and managing changes.  At the time I noticed it this morning,  there
> were 10 connections open to the database.  That rarely goes  above 20
> concurrent.  As I said in my other response, I believe that  the log
> will only contain the query at the point the query finishes,  so if it
> never finishes...
>
>>
>> Have you made changes to the postgresql.conf? kernel.vm settings? IO
>> scheduler?
>
>
> I set shmmax appropriately for my shared_buffers setting, but that's
> the only kernel tweak.
>
>>
>> If you're not doing so already, you may consider running sar  (iostat) to
>> monitor when the hanging occurs if their is a memory / IO bottleneck
>> somewhere.
>>
>
> I will try that.  Thanks
>

When you discover that an update is hanging, can you get into the
database, and see what locks currently exist? (SELECT * FROM pg_locks)

That might help you figure out what is being locked and possibly
preventing your updates.

It is also possible that your UPDATE query is trying to do something
funny (someone just recently was talking about an UPDATE that wanted to
do a hash join against 12M rows). Which probably meant that it had to
spill to disk, where a merge join would have worked better.

John
=:->

Attachment

pgsql-performance by date:

Previous
From: Dan Harris
Date:
Subject: Re: Table locking problems?
Next
From: Dan Harris
Date:
Subject: Re: Table locking problems?