Thread: Database is slow, vacuum hangs
Hi- This morning, my database suddenly got very slow. I had been testing an import process, so I had loaded, deleted, and reloaded about 140MB of data several times. After the last reload, I noticed that the database was very slow. Queries that normally executed in seconds didn't return after several minutes. I don't know if they ever would have returned results, as I cancelled after about 5 minutes, so the database might have been completely locked. I decided to delete all of the data, do a vacuum, and then reload one more time. The deletes were slow, but worked, so the tables are all empty now. I started the vacuum and let it run for 4 hours while I did other work & it never came back. During this time, I noticed that two postmaster processes were constantly running- each taking up almost exactly half of the CPU time. There was no disk activity. After I canceled the vacuum, the two postmaster processes remained and continued to use all of the CPU time. (they're still there.) postgres.log doesn't contain any new messages. I guess the obvious next step is to restart postgres, but I don't want to blow away any evidence. Can anyone suggest tests I should do to learn what happened before trying a restart? I'm running PGSQL 7.1.3 on Debian Linux 2.4.14 Thanks- -Nick -------------------------------------------------------------------------- Nick Fankhauser nickf@ontko.com Phone 1.765.935.4283 Fax 1.765.962.9788 Ray Ontko & Co. Software Consulting Services http://www.ontko.com/
"Nick Fankhauser" <nickf@ontko.com> writes: > The deletes were slow, but worked, so the tables are all empty now. I > started the vacuum and let it run for 4 hours while I did other work & it > never came back. Is the vacuum actually running (accumulating any CPU time)? Or is it just waiting on a lock held by one of those other guys? > During this time, I noticed that two postmaster processes > were constantly running- each taking up almost exactly half of the CPU time. > There was no disk activity. After I canceled the vacuum, the two postmaster > processes remained and continued to use all of the CPU time. (they're still > there.) And what are they doing --- who started them, what query was sent to 'em, etc? If you haven't a clue, try attaching to each process with a debugger and printing out the global variable debug_query_string. regards, tom lane
> Is the vacuum actually running (accumulating any CPU time)? Or is it > just waiting on a lock held by one of those other guys? After a little blip at startup, it stops using CPU. Here's a snippet of the "ps": 21966 pts/0 S 0:00 /bin/sh /usr/lib/postgresql/bin/vacuumdb temp 21971 pts/0 S 0:00 /usr/lib/postgresql/bin/psql -c VACUUM -d temp 21972 ? S 0:00 postgres: postgres temp [local] VACUUM waiting > And what are they doing --- who started them, what query was sent to > 'em, etc? Here they are- they appear to be selects via JDBC initiated by Tomcat, but I can't think of a way to find out what the actual query was (the browser sessions are gone): 18980 ? R 193:28 postgres: www-data temp 127.0.0.1 SELECT 16093 ? R 197:05 postgres: www-data temp 127.0.0.1 SELECT > If you haven't a clue, try attaching to each process with a debugger and > printing out the global variable debug_query_string. How do I do that? Thanks, NickF
"Nick Fankhauser" <nickf@ontko.com> writes: >> If you haven't a clue, try attaching to each process with a debugger and >> printing out the global variable debug_query_string. > How do I do that? Having identified the process PIDs, you can do gdb /path/to/postgres-executable gdb> attach PID gdb> p debug_query_string gdb> bt -- might as well do this while you're here gdb> quit This assumes you have gdb handy. There is a way to do this with adb too, but I don't use it enough to remember. regards, tom lane
On Fri, 2002-03-01 at 22:16, Tom Lane wrote: > Is the vacuum actually running (accumulating any CPU time)? Or is it > just waiting on a lock held by one of those other guys? > And what are they doing --- who started them, what query was sent to > 'em, etc? > If you haven't a clue, try attaching to each process with a debugger and > printing out the global variable debug_query_string. I've got a problem like that, but I only make a database backup and vacuum analize. Then, every morning, the postgresql.log is stopped at 7:35 AM. But well, in my case, the backup & vacuum run fine. I can't know why happen that. In a past log (before happed that), don't display anything "wrong". Anybody can help me, please? Thank you. Have a nice day ;-) TooManySecrets