Thread: VACUUM ANALYZE failed on linux

VACUUM ANALYZE failed on linux

From
Oleg Broytmann
Date:
Hello!
  I have a production server with Postgres 6.4.2. Every night cron runs a
maintenance script that contains VACUUM ANALYZE (I use psql). A few days ago
the script failed with the usual "... backend closed connection". I changed
it to just VACUUM (this worked) and started investigating. Please note, the
system is RedHat 5.1 on a Pentium.
  I dumped the database and reloaded it, then ran VACUUM ANALYZE. It failed
(I removed pg_vlock first, of course). I loaded the dump into 6.4.2 on my
debugging server (an Ultra-1 running Solaris 2.5.1) and ran VACUUM ANALYZE.
It worked. I loaded the dump into 6.4.2 on my debugging Pentium with Debian
2.0 and ran VACUUM ANALYZE. It failed.
  Seems 6.4.2 has problems on Linux. The dump file is small (30K in bzip2) -
I can send it if someone wants to try to reproduce it.
  BTW, while reloading, I noticed postgres eats virtual memory like a
hungry beast. My RedHat box rebooted while loading (but after the reboot
the db loaded OK), and I had to free a lot of memory on Solaris to load the
dump. Does "COPY FROM stdin" really require so much memory? And what
happens when my database grows even bigger? Sooner or later I won't be able
to load my own dump. Will I need to split the dump into chunks?

Oleg.
----
Oleg Broytmann     http://members.xoom.com/phd2/     phd2@earthling.net
Programmers don't die, they just GOSUB without RETURN.
 



Re: [HACKERS] VACUUM ANALYZE failed on linux

From
"Thomas G. Lockhart"
Date:
>    I have a production server with Postgres 6.4.2.
>    Seems 6.4.2 has problems on linux. Dump file is small (30K in 
>    bzip2) - I can send it if someone wants to try to reproduce it.

Yes, send me the file. Unless gzip is *much* larger, please send in that
format.
                     - Tom


Re: [HACKERS] VACUUM ANALYZE failed on linux

From
jwieck@debis.com (Jan Wieck)
Date:
>
> >    I have a production server with Postgres 6.4.2.
> >    Seems 6.4.2 has problems on linux. Dump file is small (30K in
> >    bzip2) - I can send it if someone wants to try to reproduce it.
>
> Yes, send me the file. Unless gzip is *much* larger, please send in that
> format.

    I'm already on it, and it seems I've found the problem.

    Oleg is using a database schema with check constraints (which
    are executed during COPY FROM). The function ExecRelCheck()
    parses each constraint for each tuple, every time, with
    stringToNode().

    First, this is wasted effort; second, only the outermost node
    of the qualification tree built with stringToNode() is
    pfree()'d in the loop. Without debugging it I can tell that a
    simple constraint like 'attr != 0' will produce an Expr
    pointing to an Oper and a List with one Var and one Const. So
    only one of the 4 palloc()'d nodes is pfree()'d; the other 3
    hang around until transaction end.

    But it's a little weird here, and we cannot keep the
    constraint qual-trees in the Relation structure for long.
    That would later cause these nodes to hang around in the
    Cache context, where they shouldn't. I don't know how to
    optimize this yet.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #