
Re: URGENT: Database keeps crashing - suspect damaged RAM - Mailing list pgsql-general

From	Markus Wollny
Subject	Re: URGENT: Database keeps crashing - suspect damaged RAM
Date	August 7, 2002 11:29:15
Msg-id	2266D0630E43BB4290742247C891057501B13229@dozer.computec.de
In response to	URGENT: Database keeps crashing - suspect damaged RAM ("Markus Wollny" <Markus.Wollny@computec.de>)
List	pgsql-general


Hi!

I think I'll have to bow down to you gurus - again :) I upgraded the kernel
to 2.4.16 (there are no RPMs for 2.4.19 and I didn't want to compile from
source - yet), and the symptoms have disappeared altogether. Which is
strange because, as I already mentioned, the very same config isn't giving
me any trouble on a different machine... Anyway: I'll shun 2.4.10 from now
on.

Regards,

    Markus

> -----Original Message-----
> From: Jeff Davis [mailto:list-pgsql-general@empires.org]
> Sent: Tuesday, 6 August 2002 20:29
> To: Markus Wollny; Tom Lane
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] URGENT: Database keeps crashing - suspect damaged RAM
> 
> 
> Virtual memory problems on Linux have certainly happened before; perhaps
> you're running a kernel that had some major ones. Maybe if you upgraded
> to 2.4.19?
> 
> Regards,
>     Jeff Davis
> 
> On Tuesday 06 August 2002 11:02 am, Markus Wollny wrote:
> > Hi!
> > 
> >     -----Original Message-----
> >     From: Tom Lane
> >     Sent: Tue 06.08.2002 18:59
> >     To: Markus Wollny
> >     Cc: pgsql-general@postgresql.org
> >     Subject: Re: [GENERAL] URGENT: Database keeps crashing - suspect damaged RAM
> >     
> >     
> > 
> >     "Markus Wollny" <Markus.Wollny@computec.de> writes:
> >
> >     > So: Is it bad RAM? How can I make sure? What else could it be?
> >
> >     
> >     Have you tried running memtest86?  I've never used that myself but
> >     some folks on the list say it works well.
> > 
> >     
> > 
> > No, I haven't tried that yet, but I'm surely going to do so tomorrow.
> > 
> >
> >     > Here's a small excerpt from the logfile:
> >     >
> >     > 2002-08-06 17:36:23 [17296]  DEBUG:  _mdfd_blind_getseg: couldn't open
> >     > /var/lib/pgsql/data/base/base/16596/16671: Cannot allocate memory
> >     
> >     Is it possible that you are running with inadequate swap space, a small
> >     data segment limit (ulimit -d), or something else that would make the
> >     kernel refuse to give memory to a backend process?
> > 
> > I shouldn't think so; the machine has 2 GB RAM (that was more than
> > sufficient for the same DB, applications and load on a different
> > machine) and 4 GB swap:
> > Disk geometry for /dev/sda: 0.000-51834.000 megabytes
> > Disk label type: msdos
> > Minor    Start       End     Type      Filesystem  Flags
> > 1          0.031     15.688  primary   ext3        boot
> > 2         15.688   4118.225  primary   linux-swap
> > 3       4118.225  24599.531  primary   ext3
> > 4      24599.531  51826.882  primary   ext3
> > 
> > Taking a closer look I am a bit confused: I allocated 4 GB to the swap
> > partition, as you can see above, but free only reports 2 GB? That's
> > strange, but cannot be the cause, I think, as the working machine has
> > got just 2 GB swap, too. ulimit is set to "unlimited" and there was RAM
> > available during load. As a matter of fact, right now free reports:
> > 
> >              total       used       free     shared    buffers     cached
> > Mem:       2061536    2053816       7720          0       4496    1825620
> > -/+ buffers/cache:     223700    1837836
> > Swap:      2097136     124800    1972336
> > 
> > on our fallback machine, which is running the very same database and
> > the very same application. When taking a look at the total disk usage
> > of the database, I get 1.8 GB. When I switched to the new machine,
> > there were about 30-50 open connections; max. connections is set to 512
> > on both machines. The crashes occurred immediately after making the DB
> > accessible to our application, so most of the DB was definitely not yet
> > in memory. And again - our fallback machine, which has no RAID and
> > slower processors, can handle the very same DB under the very same load
> > with no such problems - I never ever encountered this "cannot allocate
> > memory" error before.
> > 
> >
> >     > 2002-08-06 17:40:53 [16530]  DEBUG:  connection startup failed (fork
> >     > failure): Cannot allocate memory
> >     > 2002-08-06 17:52:50 [16530]  DEBUG:  connection startup failed (fork
> >     > failure): Cannot allocate memory
> >
> >     
> >     Still looks like inadequate memory --- but now I'm thinking that it's a
> >     system-wide condition, ie, you just plain haven't got enough RAM for the
> >     number of processes you're trying to start.
> >     
> >
> >     > 2002-08-06 17:52:54 [16530]  DEBUG:  server process (pid 18237) was
> >     > terminated by signal 9
> >
> >     
> >     Postgres never issues any kill -9 on itself, but I've heard that the
> >     Linux kernel may start killing processes when it's desperately low on
> >     memory.
> >
> >     Other than the signal 9, everything I see in this trace is either a
> >     cannot-allocate-memory failure or followup effects from one.  How many
> >     backends are you trying to start up, anyway?  Might you have a runaway
> >     client that keeps opening new backend connections?
> >     
> > 
> > Must be something else - the number of connections was not at all high
> > (<100), the server load wasn't more than 3.5 (on a 4-processor machine),
> > there was RAM available at the time, both physical and swap, and I
> > haven't got any surplus daemons running... I think I'll be able to
> > confirm or rule out the bad-RAM suspicion tomorrow using memtest86.
> > 
> > Thank you!
> > 
> > Regards,
> > 
> >      Markus
> > 
> >
> 
>
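
The "-/+ buffers/cache" row in the free output quoted above can be re-derived from the "Mem:" row. A minimal Python sketch of that arithmetic, using the figures from the thread, might look like the following. The function names are made up for illustration, and the formulas assume the classic free layout shown in the message (used minus buffers and cache, free plus buffers and cache).

```python
# Figures (in kB) copied from the "Mem:" row of the free output quoted above.
MEM_TOTAL = 2061536
MEM_USED = 2053816
MEM_FREE = 7720
BUFFERS = 4496
CACHED = 1825620


def used_excluding_cache(used_kb, buffers_kb, cached_kb):
    """Memory held by processes once buffers and page cache are subtracted."""
    return used_kb - buffers_kb - cached_kb


def free_including_cache(free_kb, buffers_kb, cached_kb):
    """Memory the kernel could hand out if it dropped buffers and page cache."""
    return free_kb + buffers_kb + cached_kb


if __name__ == "__main__":
    used = used_excluding_cache(MEM_USED, BUFFERS, CACHED)
    avail = free_including_cache(MEM_FREE, BUFFERS, CACHED)
    print(f"used by processes:  {used} kB")
    print(f"free + reclaimable: {avail} kB")
    print(f"share of total RAM: {avail / MEM_TOTAL:.1%}")
```

With the numbers from the message this gives 223700 kB and 1837836 kB, matching the "-/+ buffers/cache" row: on the fallback machine most of the 2 GB was page cache rather than process memory.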
