7.0RC1: possible query and backend problem - Mailing list pgsql-general
From | Michael Blakeley |
---|---|
Subject | 7.0RC1: possible query and backend problem |
Date | |
Msg-id | p04310103b523f49d6641@blakeley.com Whole thread Raw |
Responses |
Re: 7.0RC1: possible query and backend problem
Re: 7.0RC1: possible query and backend problem |
List | pgsql-general |
I've just upgraded to 7.0RC1 on a Solaris 2.6+patches system (U5 with 256MB). I'm having some trouble with a brand-new query, so I don't know if this is a new bug or an old one. It has two parts: query execution and backend robustness. First, I'm getting a crash with this query (via Apache mod_perl and DBI, or via psql): Table U: id varchar(128) primary key, stop datetime, start datetime Table S1: id varchar(128) primary key, url varchar(512); Table S2: id varchar(128) primary key, url varchar(512); SELECT id,date_part('epoch',sum(stop-start)),count(*),S1.url,S2.url FROM U WHERE 'now'::datetime-start<'1 month'::interval AND (id=S1.id OR id=S2.id) GROUP BY id,S1.url,S2.url; The idea is to select whichever of S1.url and S2.url is non-null. From psql, I get: pqReadData() -- backend closed the channel unexpectedly. This probably means the backend terminated abnormally before or while processing the request. connection to server was lost And now there are no postgres processes running. I suspect that the problem is that, in my data set, either S1.url or S2.url will be null for any given row, and this is causing problems for GROUP BY (I can also believe that my SQL is bad - but that shouldn't crash postgres, should it?). If I stick to just S1, the query runs ok. But... now there are no postgres processes running at all. # ps -fu postgres UID PID PPID C STIME TTY TIME CMD In the postgres log: >Server process (pid 11189) exited with status 139 at Wed Apr 19 16:37:25 2000 >Terminating any active server processes... >NOTICE: Message from PostgreSQL backend: > The Postmaster has informed me that some other backend died >abnormally and possibly corrupted shared memory. > I have rolled back the current transaction and am going to >terminate your database system connection and exit. > Please reconnect to the database system and repeat your query. >Server processes were terminated at Wed Apr 19 16:37:25 2000 >Reinitializing shared memory and semaphores >IpcMemoryCreate: shmget failed (Invalid argument) key=5432110, >size=144, permission=700 >This type of error is usually caused by an improper >shared memory or System V IPC semaphore configuration. >For more information, see the FAQ and platform-specific >FAQ's in the source directory pgsql/doc or on our >web site at http://www.postgresql.org. >IpcMemoryIdGet: shmget failed (Invalid argument) key=5432110, >size=144, permission=0 >IpcMemoryAttach: shmat failed (Invalid argument) id=-2 >FATAL 1: AttachSLockMemory: could not attach segment Now, the real problem here is that postgres seems to be trying to restart, and fails. It complains about shmget - my /etc/system says: set shmsys:shminfo_shmmax = 0x40000000 set shmsys:shminfo_shmmni = 100 set shmsys:shminfo_shmseg = 32 set semsys:seminfo_semmns = 1024 set semsys:seminfo_semmni = 1024 set semsys:seminfo_semmnu = 1024 set semsys:seminfo_semmsl = 1024 set semsys:seminfo_semmap = 1024 So I don't think this is really a config problem (if it were, the db wouldn't start up to begin with). Rather, I'd guess that postgres isn't freeing a memory block during the "Terminating any active server processes..." phase. Running ipcs -a immediately after the crash shows several segments (although oddly, I have no trouble starting the db again as root - maybe there's a permissions issue at work here?): # ipcs -a IPC status from <running system> as of Wed Apr 19 16:45:42 2000 T ID KEY MODE OWNER GROUP CREATOR CGROUP NATTCH SEGSZ CPID LPID ATIME DTIME CTIME Shared Memory: m 800 0x0052e32e --rw------- postgres postgres postgres postgres 0 120 12737 12737 13:01:36 13:01:36 13:01:36 m 801 0x0052e325 --rw------- postgres postgres postgres postgres 0 15373312 12737 12737 13:01:36 13:01:36 13:01:36 m 802 0x0052e32b --rw------- postgres postgres postgres postgres 0 102184 12737 12737 13:01:36 13:01:36 13:01:36 m 3 0x500005a3 --rw-r--r-- root root root root 1 68 232 232 15:29:35 15:29:35 15:29:35 Anyway, I hope this report is helpful. Thanks to the developers for their hard work on PG7. -- Mike
pgsql-general by date: