Thread: postgres idle process and other problems

postgres idle process and other problems

From
Peter Peltonen
Date:
We've had PostgreSQL 7.1.2 and Orion application server 1.4.7 serving our
app successfully for a few months now without any problems on a Red Hat 7.1
system.

Recently we found that our app was not responding. We stopped postgres
and orion and started them again, but the app still wouldn't work. Each time
the site was loaded we found these strange idle postgres processes:

--<snip>--
tcp_da S    pts/1      0:00 postgres: postgres ourapp <our-public-ip> idle
tcp_da S    pts/1      0:00 postgres: postgres ourapp <our-public-ip> idle in transaction
tcp_da S    pts/1      0:00 postgres: postgres designforum <our-public-ip> idle in transaction
--</snip>--

After a bunch of restarts the app suddenly started working.

In postgres log we see this:

--<snip>--
DEBUG:  database system was shut down at 2001-11-30 21:06:03 EET
DEBUG:  CheckPoint record at (0, 144404276)
DEBUG:  Redo record at (0, 144404276); Undo record at (0, 0); Shutdown TRUE
DEBUG:  NextTransactionId: 1290155; NextOid: 44983
DEBUG:  database system is in production state
Server process (pid 14764) exited with status 9 at Fri Nov 30 21:12:18 2001
Terminating any active server processes...
Server processes were terminated at Fri Nov 30 21:12:18 2001
Reinitializing shared memory and semaphores
DEBUG:  database system was interrupted at 2001-11-30 21:11:58 EET
DEBUG:  CheckPoint record at (0, 144437508)
DEBUG:  Redo record at (0, 144437508); Undo record at (0, 0); Shutdown FALSE
DEBUG:  NextTransactionId: 1290586; NextOid: 53175
DEBUG:  database system was not properly shut down; automatic recovery in progress...
DEBUG:  ReadRecord: record with zero len at (0, 144437572)
DEBUG:  redo is not required
The Data Base System is starting up
The Data Base System is starting up
The Data Base System is starting up
The Data Base System is starting up
DEBUG:  database system is in production state
pq_recvbuf: unexpected EOF on client connection
pq_recvbuf: unexpected EOF on client connection
ERROR:  Bad date external representation 'date(2001-1111-9)'
pq_recvbuf: recv() failed: Connection reset by peer
pq_recvbuf: unexpected EOF on client connection
pq_recvbuf: unexpected EOF on client connection
ERROR:  parser: parse error at or near "k"
ERROR:  parser: parse error at or near "k"
ERROR:  parser: parse error at or near "k"
ERROR:  parser: parse error at or near "k"
pq_recvbuf: unexpected EOF on client connection
pq_recvbuf: unexpected EOF on client connection
--</snip>--

There are many more of those pq_recvbuf lines in the log.

What are those ERRORs? Any ideas what has happened and is going on?

Regards,
Peter


Re: postgres idle process and other problems

From
Tom Lane
Date:
Peter Peltonen <peter.peltonen@fivetec.com> writes:
> Server process (pid 14764) exited with status 9 at Fri Nov 30 21:12:18
> 2001

Is there a core dump file left over from that crash?  (Look in the
$PGDATA/base/nnn subdirectories.)
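
As a side note on the "exited with status 9" line, here is a minimal Python sketch for decoding such a number. It assumes the logged value is the raw wait() status of the backend rather than an exit code; that is an assumption about the log format, not something stated in this thread. The function name describe_wait_status is made up for illustration, and the os.W* helpers are Unix-only.

```python
import os
import signal


def describe_wait_status(status: int) -> str:
    """Describe a raw wait() status, e.g. the 9 in 'exited with status 9'.

    Assumption: the number is a raw wait() status, not an exit code.
    """
    if os.WIFSIGNALED(status):
        # The process was terminated by a signal; look up its name.
        name = signal.Signals(os.WTERMSIG(status)).name
        core = "core dumped" if os.WCOREDUMP(status) else "no core dump"
        return f"killed by {name} ({core})"
    if os.WIFEXITED(status):
        return f"exited normally with code {os.WEXITSTATUS(status)}"
    return f"unrecognized status {status}"


if __name__ == "__main__":
    print(describe_wait_status(9))
```

If the assumption holds, status 9 would mean the backend was killed by SIGKILL, a signal that does not leave a core file behind, so an absent core file would not be surprising in that case.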

            regards, tom lane

Re: postgres idle process and other problems

From
Peter Peltonen
Date:
On Mon, Dec 03, 2001 at 09:59:27AM -0500, Tom Lane wrote:
> Is there a core dump file left over from that crash?  (Look in the
> $PGDATA/base/nnn subdirectories.)
>
>             regards, tom lane

No, there isn't. All the directories look like this:

--<snip>--
[root@nizza base]# ls
1  18719  18720  25493
[root@nizza base]# cd 1
[root@nizza 1]# ls
1215  16567  16867  17074  17112  17139  17160  17181  17213  17261
1216  16579  16934  17086  17115  17142  17163  17184  17216  17273
1219  16600  16948  17097  17118  17145  17166  17187  17228  17276
1247  16617  16960  17100  17121  17148  17169  17190  17231  17288
1249  16642  17033  17103  17124  17151  17172  17193  17243  pg_internal.init
1255  16653  17045  17106  17133  17154  17175  17196  17246  PG_VERSION
1259  16685  17058  17109  17136  17157  17178  17201  17258
--</snip>--

Any ideas what might have happened?

Should we be worried about the idle process thing?

Regards,
Peter


ERROR: MemoryContextAlloc: invalid request size

From
"Rich Ryan"
Date:
I'm getting errors like the following when I do queries/copies/dumps on
certain tables.
ERROR:  MemoryContextAlloc: invalid request size 4294967293
The number will sometimes differ. I'm running 7.1.3 on RedHat Linux 6.2.
Dual-proc with 2GB RAM. I upgraded to 7.1.3 a few weeks ago. This is the
first time this has happened. I can't think of any configuration changes I
made since installing it. I run the same import and export script every
night. Any clues would be appreciated.
Thanks,
Rich




Re: ERROR: MemoryContextAlloc: invalid request size

From
Tom Lane
Date:
"Rich Ryan" <rich@usedcars.com> writes:
> I'm getting errors like the following when I do queries/copies/dumps on
> certain tables.
> ERROR:  MemoryContextAlloc: invalid request size 4294967293

A first guess is corrupted data: the length word of some variable-length
field contains garbage.  You could probably track down the affected
row(s) by seeing how much you can retrieve without error.  See past
discussions of similar problems.
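
One way to do the "how much can you retrieve" search is by bisection. The Python sketch below is only an illustration under stated assumptions: reads_ok is a hypothetical callback that reports whether the first n rows of the table can be fetched without error (for example by running a query limited to n rows), and the rows are assumed to come back in the same order on every attempt. Neither the function names nor the callback come from this thread.

```python
from typing import Callable, Optional


def first_bad_row(total_rows: int,
                  reads_ok: Callable[[int], bool]) -> Optional[int]:
    """Return the 1-based position of the first unreadable row, or None.

    reads_ok(n) is a hypothetical callback: True if the first n rows can
    be retrieved without error. Reading zero rows is assumed to succeed.
    """
    if reads_ok(total_rows):
        return None  # the whole table is readable
    lo, hi = 0, total_rows  # invariant: first lo rows read fine, first hi do not
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if reads_ok(mid):
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    # Simulated table of 100 rows in which row 42 is damaged.
    print(first_bad_row(100, lambda n: n < 42))
```

This finds only the first damaged row; if more than one row is affected, the search has to be repeated on the rows after it.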

            regards, tom lane

Re: postgres idle process and other problems

From
Tom Lane
Date:
Peter Peltonen <peter.peltonen@fivetec.com> writes:
> On Mon, Dec 03, 2001 at 09:59:27AM -0500, Tom Lane wrote:
>> Is there a core dump file left over from that crash?  (Look in the
>> $PGDATA/base/nnn subdirectories.)

> No there isn't. All the directories look like this:

You probably are starting the postmaster with "ulimit -c 0" or local
spelling thereof, which prevents core dumps.  Might want to change
that, so that you have some hope of debugging crashes.  The crash
shown in your log needs to be looked into, if it's reproducible.
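
To check the limit in question, a small Python sketch can read the core-file size limit of the current process, which is what a postmaster started from the same environment would inherit. The helper name core_dumps_allowed is made up for illustration; resource.getrlimit and resource.RLIMIT_CORE are part of the Python standard library on Unix.

```python
import resource


def core_dumps_allowed(soft_limit: int) -> bool:
    """A soft core-size limit of 0 (as set by 'ulimit -c 0') disables core files."""
    return soft_limit != 0


if __name__ == "__main__":
    # Limits of this process; child processes inherit them.
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    print("soft limit:", soft, "hard limit:", hard)
    print("core dumps allowed:", core_dumps_allowed(soft))
```

Run it from the same account and startup environment that launches the postmaster; a soft limit of 0 would match the "ulimit -c 0" situation described above.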

> Should we be worried about the idle process thing?

Looked to me like a bunch of clients that hadn't disconnected.

            regards, tom lane

Re: postgres idle process and other problems

From
"Steve Brett"
Date:
What's Orion Application Server?

I've had this kind of error before and traced it to a bug in some Zeos
Delphi Postgres components.

In my case it was caused by a transaction not committing: postgres waited
for that transaction to commit, and everything else in the transaction
queue piled up behind it.

I noticed it when postgresql locked up as I tried to vacuum.

It sounds like a buggy bit of transaction code in Orion app server ....
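
A quick way to spot such uncommitted transactions is to scan the ps listing for backends whose status is "idle in transaction". The Python sketch below assumes the process title has the layout seen in the original post ("postgres: user database client status"); the function name and that layout assumption are for illustration only.

```python
from typing import Iterable, List, Tuple


def idle_in_transaction(ps_lines: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Return (user, database, client) for backends idle inside a transaction.

    Assumes process titles of the form
    'postgres: <user> <database> <client> <status...>'.
    """
    stuck = []
    for line in ps_lines:
        _, sep, title = line.partition("postgres: ")
        if not sep:
            continue  # not a backend process line
        fields = title.split()
        if len(fields) < 4:
            continue  # title too short to carry a status
        user, database, client = fields[:3]
        status = " ".join(fields[3:])
        if status == "idle in transaction":
            stuck.append((user, database, client))
    return stuck


if __name__ == "__main__":
    sample = [
        "tcp_da S    pts/1      0:00 postgres: postgres ourapp <our-public-ip> idle",
        "tcp_da S    pts/1      0:00 postgres: postgres ourapp <our-public-ip> idle in transaction",
        "tcp_da S    pts/1      0:00 postgres: postgres designforum <our-public-ip> idle in transaction",
    ]
    for backend in idle_in_transaction(sample):
        print(backend)
```

A backend that is merely "idle" is just an open connection; it is the ones that stay "idle in transaction" for a long time that point at a client that began a transaction and never finished it.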

Steve