Thread: Database is slow, vacuum hangs

Database is slow, vacuum hangs

From
"Nick Fankhauser"
Date:
Hi-

This morning, my database suddenly got very slow. I had been testing an
import process, so I had loaded, deleted, and reloaded about 140MB of data
several times. After the last reload, I noticed that the database was very
slow. Queries that normally executed in seconds didn't return after several
minutes. I don't know if they ever would have returned results, as I
cancelled after about 5 minutes, so the database might have been completely
locked.

I decided to delete all of the data, do a vacuum, and then reload one more
time.

The deletes were slow, but worked, so the tables are all empty now. I
started the vacuum and let it run for 4 hours while I did other work & it
never came back. During this time, I noticed that two postmaster processes
were constantly running- each taking up almost exactly half of the CPU time.
There was no disk activity. After I canceled the vacuum, the two postmaster
processes remained and continued to use all of the CPU time. (they're still
there.)

postgres.log doesn't contain any new messages.

I guess the obvious next step is to restart postgres, but I don't want to
blow away any evidence.

Can anyone suggest tests I should do to learn what happened before trying a
restart?

I'm running PGSQL 7.1.3 on Debian Linux 2.4.14

Thanks-

-Nick

--------------------------------------------------------------------------
Nick Fankhauser  nickf@ontko.com  Phone 1.765.935.4283  Fax 1.765.962.9788
Ray Ontko & Co.     Software Consulting Services     http://www.ontko.com/


Re: Database is slow, vacuum hangs

From
Tom Lane
Date:
"Nick Fankhauser" <nickf@ontko.com> writes:
> The deletes were slow, but worked, so the tables are all empty now. I
> started the vacuum and let it run for 4 hours while I did other work & it
> never came back.

Is the vacuum actually running (accumulating any CPU time)?  Or is it
just waiting on a lock held by one of those other guys?

> During this time, I noticed that two postmaster processes
> were constantly running- each taking up almost exactly half of the CPU time.
> There was no disk activity. After I canceled the vacuum, the two postmaster
> processes remained and continued to use all of the CPU time. (they're still
> there.)

And what are they doing --- who started them, what query was sent to
'em, etc?

If you haven't a clue, try attaching to each process with a debugger and
printing out the global variable debug_query_string.

            regards, tom lane

Re: Database is slow, vacuum hangs

From
"Nick Fankhauser"
Date:
> Is the vacuum actually running (accumulating any CPU time)?  Or is it
> just waiting on a lock held by one of those other guys?

After a little blip at startup, it stops using CPU. Here's a snippet of the
"ps":

21966 pts/0    S      0:00 /bin/sh /usr/lib/postgresql/bin/vacuumdb temp
21971 pts/0    S      0:00 /usr/lib/postgresql/bin/psql -c VACUUM    -d temp
21972 ?        S      0:00 postgres: postgres temp [local] VACUUM waiting



> And what are they doing --- who started them, what query was sent to
> 'em, etc?

Here they are- they appear to be selects via JDBC initiated by Tomcat, but I
can't think of a way to find out what the actual query was (the browser
sessions are gone):

18980 ?        R    193:28 postgres: www-data temp 127.0.0.1 SELECT
16093 ?        R    197:05 postgres: www-data temp 127.0.0.1 SELECT



> If you haven't a clue, try attaching to each process with a debugger and
> printing out the global variable debug_query_string.

How do I do that?


Thanks,

NickF



Re: Database is slow, vacuum hangs

From
Tom Lane
Date:
"Nick Fankhauser" <nickf@ontko.com> writes:
>> If you haven't a clue, try attaching to each process with a debugger and
>> printing out the global variable debug_query_string.

> How do I do that?

Having identified the process PIDs, you can do

    gdb /path/to/postgres-executable

    gdb> attach PID
    gdb> p debug_query_string
    gdb> bt            -- might as well do this while you're here
    gdb> quit

This assumes you have gdb handy.  There is a way to do this with adb
too, but I don't use it enough to remember.

            regards, tom lane

Log's stopped.

From
Manuel Trujillo
Date:
On Fri, 2002-03-01 at 22:16, Tom Lane wrote:
> Is the vacuum actually running (accumulating any CPU time)?  Or is it
> just waiting on a lock held by one of those other guys?
> And what are they doing --- who started them, what query was sent to
> 'em, etc?
> If you haven't a clue, try attaching to each process with a debugger and
> printing out the global variable debug_query_string.

I've got a problem like that, but I only make a database backup and
vacuum analize. Then, every morning, the postgresql.log is stopped at
7:35 AM. But well, in my case, the backup & vacuum run fine.

I can't know why happen that. In a past log (before happed that), don't
display anything "wrong".

Anybody can help me, please?

Thank you.

Have a nice day ;-)
TooManySecrets