Re: INSERT/UPDATE waiting (another example) - Mailing list pgsql-sql
From | Wayne Piekarski |
---|---|
Subject | Re: INSERT/UPDATE waiting (another example) |
Date | |
Msg-id | 199905060331.NAA26678@helpdesk.senet.com.au Whole thread Raw |
Responses |
Re: [HACKERS] Re: INSERT/UPDATE waiting (another example)
|
List | pgsql-sql |
> Date: Tue, 4 May 1999 11:33:34 +0200 (CEST) > From: Dirk Lutzebaeck <lutzeb@aeccom.com> > Subject: INSERT/UPDATE waiting > > Hello, > > somehow the backend is hanging on my system (6.5beta1, Linux 2.2.6): > > postgres 29957 0.0 1.5 4124 2048 ? S 18:23 0:00 /usr/local/pgsql/bin/postgres localhost lutzeb cs UPDATE waiting > postgres 29980 0.0 1.6 4124 2064 ? S 18:25 0:00 /usr/local/pgsql/bin/postgres localhost lutzeb cs UPDATE waiting > postgres 30005 0.0 1.6 4124 2088 ? S 18:27 0:00 /usr/local/pgsql/bin/postgres localhost lutzeb cs UPDATE waiting > postgres 30012 0.0 2.1 4532 2696 ? S 18:28 0:00 /usr/local/pgsql/bin/postgres localhost lutzeb cs INSERT waiting > postgres 30030 0.0 3.0 5780 3916 ? S 18:28 0:00 /usr/local/pgsql/bin/postgres localhost lutzeb cs idle > > [about 34 processes] > > What is happening here? Can't find anything in the documentation. Hi everyone, We just deployed a large system a few weeks ago which involves using PostgreSQL 6.4.2 and CGI based interfaces, and we usually have around 10-20 connections always running, the system is very busy during 9-5 hours. The machine is running FreeBSD 2.2.7 and has 256 mb of RAM, lots of file descriptors. We are experiencing exactly the same problem as above - during the day, all of a sudden Postgres will completely jam up, with all processing in one of the following states: (from ps -axwwwwww) SELECT waiting DELETE waiting INSERT waiting idle waiting idle Sometimes the ps output will also return postgres backends with garbage high-ascii characters in their name, got no idea why here either.... Originally, we thought it was a bad query somewhere with lock statements in the wrong order, causing the deadlock, and so we reduced one of the constants compiled into the backend which controls how often deadlocks were checked for and set it to something like ten seconds. I then forced the database to go into a real deadlock by doing: 1: BEGIN 2: BEGIN 1: LOCK table a 2: LOCK table b 1: LOCK table b 2: LOCK table a Ten seconds later, one is aborted due to the deadlock checking - this is great, as everything clears up and continues no problems. However, during operation, every so often for no apparent reason, all the backends doing work all jam up with "waiting" in their name, otherwise they are idle - and nothing happens. I let it sit for about 5 minutes one time, and nothing happened. I had to kill everything so the staff could actually use the system. For a while, we went hunting through our code looking for improperly ordered lock statements and things like that. We found one which was fixed but the problem still happens - there may be more of these in the code, there probably are in fact, but I'm under the impression the deadlock checking is supposed to get around that and kill backends nicely to resolve the conflict? Also, one other thing we discovered was making the following deadlock: S1: BEGIN S1: LOCK table a S2: SELECT * FROM a, b where a.id = b.id S1: LOCK table b If we did explain on the SELECT, and it chose to scan A first, it would work but if we used an index or rearranged the select statement, and B was scanned first we would get a deadlock, since the select couldn't complete. Now I would have hoped that locks for the join would be acquired either all at once, or none at all. I don't want to have to wrap lock statements around the select because all I want to do is read some data, I don't want to make updates! Anyways, back to the real problem, I enabled a constant in the code and whenever the deadlock checking occurs during this lockup problem, and it prints the following out to the log file: DEBUG: DumpAllLocks: xidLook->tag.lock = NULL DEBUG: DumpAllLocks: xidLook->tag.lock = NULL DEBUG: DumpAllLocks: xidLook->tag.lock = NULL Not sure what this means - I'd really like it to show what kind of locks the postgres processes are waiting for, as it could help me resolve the problem. Is there anything else I can compile in to do this? I notice there are quite a few symbols you can define for locks. So right now I am at a loss on what to do - whenever everything jams up, I do a ps -axwwww and kill off all postgres processes which are idle waiting, then SELECT waiting, and sometimes it clears up. Failing that I then kill the ones making changes like delete, insert, update - although I hate doing that because I found that when I went around ruthlessly killing backends to resolve conflicts, tables got corrupted and we experienced problems like BTP_CHAIN, and vacuums would start failing, requiring us to dump the tables and reload, which was pretty bad :( I've had other problems as well, but I'll save them for another email. So if anyone can offer any advice, I'd be eternally grateful! Right now I'm getting lots of flak from people, saying I should have used Oracle, or mSQL, etc, but Postgres works great and allows me to embed code into the backend, and all kinds of other cool features. I really am looking forward to 6.5 as the MVCC stuff sounds great, but right now I need to get this working reliably until then. thanks, Wayne ------------------------------------------------------------------------------ Wayne Piekarski Tel: (08) 8221 5221 Research & Development Manager Fax: (08) 8221 5220 SE Network Access Pty Ltd Mob: 0407 395 889 222 Grote Street Email: wayne@senet.com.au Adelaide SA 5000 WWW: http://www.senet.com.au