Thread: VACUUM is hanging

VACUUM is hanging

From
Ashley Clark
Date:
Maybe someone will know what's going on here.

I'm running 7.1 and vacuuming the db regularly at 4am each day. But
these last few days as it vacuums it hangs reaping dead processes.

2001-05-04 04:00:09 [27285]  DEBUG:  --Relation personnel--
2001-05-04 04:00:09 [27285]  DEBUG:  Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0, Keep/VTL 0/0, Crash 0,
UnUsed0, MinLen 0, MaxLen 0; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec. 
2001-05-04 04:00:09 [27285]  DEBUG:  Index personnel_pkey: Pages 1; Tuples 0. CPU 0.00s/0.00u sec.
2001-05-04 04:00:09 [27285]  DEBUG:  Index personnel_name: Pages 1; Tuples 0. CPU 0.00s/0.00u sec.
2001-05-04 04:00:09 [27285]  DEBUG:  Index personnel_truck: Pages 1; Tuples 0. CPU 0.00s/0.00u sec.
2001-05-04 04:00:09 [27285]  DEBUG:  Index personnel_ssn: Pages 1; Tuples 0. CPU 0.00s/0.00u sec.
2001-05-04 04:00:09 [27285]  DEBUG:  Analyzing...
/usr/lib/postgresql/bin/postmaster: reaping dead processes...
/usr/lib/postgresql/bin/postmaster: CleanupProc: pid 27299 exited with status 0
/usr/lib/postgresql/bin/postmaster: reaping dead processes...

Then there are hundreds of the last three lines until I get here the
next morning and try to access the database, that then seems to give it
a kick in the pants, for a while at least, then it returns to its
inane suicide ritual. Any ideas?

Restarting the DB and running a vacuum by hand in psql works, but I
don't know if it will clog up tonight or not...

--
really hacks

Attachment

Re: VACUUM is hanging

From
Tom Lane
Date:
Ashley Clark <aclark@ghoti.org> writes:
> 2001-05-04 04:00:09 [27285]  DEBUG:  Analyzing...
> /usr/lib/postgresql/bin/postmaster: reaping dead processes...
> /usr/lib/postgresql/bin/postmaster: CleanupProc: pid 27299 exited with stat=
> us 0
> /usr/lib/postgresql/bin/postmaster: reaping dead processes...

Since the PID mentioned by the postmaster is NOT the same one running
the vacuum, presumably this is just an indication that some other
backend finished its work and exited normally.  It's unclear what the
process doing the vacuum could be waiting for; can you attach to it
with gdb and get a backtrace?

            regards, tom lane

Re: VACUUM is hanging

From
Ashley Clark
Date:
* Tom Lane in "Re: [GENERAL] VACUUM is hanging" dated 2001/05/04 11:09
* wrote:

> Since the PID mentioned by the postmaster is NOT the same one running
> the vacuum, presumably this is just an indication that some other
> backend finished its work and exited normally.

I would agree except that this is a product in testing and there should
be NOTHING accessing it at 4am. Currently, I'm the only one with access
to the DB.

> It's unclear what the process doing the vacuum could be waiting for;
> can you attach to it with gdb and get a backtrace?

I'll try to recreate the situation and get a backtrace, if that's
successful I guess I'll have to recompile with -g to get really useful
information.

--
really shack

Attachment

Re: VACUUM is hanging

From
Tom Lane
Date:
Ashley Clark <aclark@ghoti.org> writes:
> * Tom Lane in "Re: [GENERAL] VACUUM is hanging" dated 2001/05/04 11:09
> * wrote:
>> Since the PID mentioned by the postmaster is NOT the same one running
>> the vacuum, presumably this is just an indication that some other
>> backend finished its work and exited normally.

> I would agree except that this is a product in testing and there should
> be NOTHING accessing it at 4am.

The postmaster will be firing off periodic checkpoint subprocesses even
if there's no user activity.  That's probably what those are.

>> It's unclear what the process doing the vacuum could be waiting for;
>> can you attach to it with gdb and get a backtrace?

> I'll try to recreate the situation and get a backtrace, if that's
> successful I guess I'll have to recompile with -g to get really useful
> information.

The backtrace might be useful even if it just shows routine names...

            regards, tom lane

Re: VACUUM is hanging

From
Ashley Clark
Date:
* Tom Lane in "Re: [GENERAL] VACUUM is hanging" dated 2001/05/04 11:09
* wrote:

> It's unclear what the process doing the vacuum could be waiting for;
> can you attach to it with gdb and get a backtrace?

OK, here's the backtrace of the process listed in the log following:

2001-05-04 19:18:15 [6273]   DEBUG:  --Relation personnel--
2001-05-04 19:18:15 [6273]   DEBUG:  Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0, Keep/VTL 0/0, Crash 0,
UnUsed0, MinLen 0, MaxLen 0; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec. 
2001-05-04 19:18:15 [6273]   DEBUG:  Index personnel_pkey: Pages 1; Tuples 0. CPU 0.00s/0.00u sec.
2001-05-04 19:18:15 [6273]   DEBUG:  Index personnel_name: Pages 1; Tuples 0. CPU 0.00s/0.00u sec.
2001-05-04 19:18:15 [6273]   DEBUG:  Index personnel_truck: Pages 1; Tuples 0. CPU 0.00s/0.00u sec.
2001-05-04 19:18:15 [6273]   DEBUG:  Index personnel_ssn: Pages 1; Tuples 0. CPU 0.00s/0.00u sec.
2001-05-04 19:18:15 [6273]   DEBUG:  Analyzing...

Backtrace:

#0  0x402323db in semop () from /lib/libc.so.6
#1  0x80fe0c2 in IpcSemaphoreLock (semId=7425, sem=9, interruptOK=1 '\001')
    at ipc.c:426
#2  0x8102059 in ProcSleep (lockMethodTable=0x82ebf10, lockmode=7,
    lock=0x403d8860, holder=0x403d9768) at proc.c:666
#3  0x8101377 in WaitOnLock (lockmethod=1, lockmode=7, lock=0x403d8860,
    holder=0x403d9768) at lock.c:955
#4  0x8101139 in LockAcquire (lockmethod=1, locktag=0xbfffefb0, xid=16748,
    lockmode=7) at lock.c:739
#5  0x81005f4 in LockRelation (relation=0x4054a828, lockmode=7) at lmgr.c:141
#6  0x806fb4c in heap_open (relationId=19226, lockmode=7) at heapam.c:596
#7  0x80b1d6d in vacuum_rel (relid=19226) at vacuum.c:400
#8  0x80b1aba in vac_vacuum (VacRelP=0x0, analyze=1 '\001', anal_cols2=0x0)
    at vacuum.c:245
#9  0x80b1a2e in vacuum (vacrel=0x0, verbose=0, analyze=1 '\001',
    anal_cols=0x0) at vacuum.c:163
#10 0x8107ddc in ProcessUtility (parsetree=0x8329980, dest=Remote)
    at utility.c:708
#11 0x8105b5a in pg_exec_query_string (
    query_string=0x83297f0 "vacuum  analyze;", dest=Remote,
    parse_context=0x8303f68) at postgres.c:777
#12 0x8106c74 in PostgresMain (argc=4, argv=0xbffff380, real_argc=3,
    real_argv=0xbffffd44, username=0x82f97c9 "postgres") at postgres.c:1908
#13 0x80f1234 in DoBackend (port=0x82f9560) at postmaster.c:2111
#14 0x80f0df4 in BackendStartup (port=0x82f9560) at postmaster.c:1894
#15 0x80eff10 in ServerLoop () at postmaster.c:992
#16 0x80ef889 in PostmasterMain (argc=3, argv=0xbffffd44) at postmaster.c:682
#17 0x80d00f2 in main (argc=3, argv=0xbffffd44) at main.c:147

I also seem to have found the culprit. I have Apache::DBI set up and
it's holding something open. Restarting Apache let the vacuum finish
normally. Any suggestions as to how I can remedy this situation. I'd
like to avoid having to stop/start Apache before/after the vacuum.

--
creaky halls

Attachment

Re: VACUUM is hanging

From
Tom Lane
Date:
Ashley Clark <aclark@ghoti.org> writes:
> #5  0x81005f4 in LockRelation (relation=3D0x4054a828, lockmode=3D7) at lmgr=
> .c:141
> #6  0x806fb4c in heap_open (relationId=3D19226, lockmode=3D7) at heapam.c:5=
> 96
> #7  0x80b1d6d in vacuum_rel (relid=3D19226) at vacuum.c:400
> #8  0x80b1aba in vac_vacuum (VacRelP=3D0x0, analyze=3D1 '\001', anal_cols2=
> =3D0x0)
>     at vacuum.c:245
> #9  0x80b1a2e in vacuum (vacrel=3D0x0, verbose=3D0, analyze=3D1 '\001',
>     anal_cols=3D0x0) at vacuum.c:163

Looks like it's waiting to get an exclusive lock on the table it plans
to vacuum next (which is the one with OID 19226 in pg_class, if you want
to find out).

> I also seem to have found the culprit. I have Apache::DBI set up and
> it's holding something open. Restarting Apache let the vacuum finish
> normally.

Try not to leave open transactions hanging around in your Apache code.
An idle database connection couldn't block VACUUM like that; it'd have
to have been in the middle of a BEGIN block, or maybe even an unfinished
query.

            regards, tom lane

Re: VACUUM is hanging

From
Ashley Clark
Date:
* Tom Lane in "Re: VACUUM is hanging" dated 2001/05/04 22:05 wrote:

> > I also seem to have found the culprit. I have Apache::DBI set up
> > and it's holding something open. Restarting Apache let the vacuum
> > finish normally.
>
> Try not to leave open transactions hanging around in your Apache
> code. An idle database connection couldn't block VACUUM like that;
> it'd have to have been in the middle of a BEGIN block, or maybe even
> an unfinished query.

That appears to be the culprit. I don't really understand why an
uncommitted SELECT statement could/should lock the table though. How
does that differ from just an open transaction where no query has
happened yet?

--
chalk slayer

Attachment

Re: VACUUM is hanging

From
"David Wall"
Date:
> Try not to leave open transactions hanging around in your Apache code.
> An idle database connection couldn't block VACUUM like that; it'd have
> to have been in the middle of a BEGIN block, or maybe even an unfinished
> query.

It's possible that the JDBC library always creates a new transaction after a
commit/rollback. It wouldn't have any locks or the like, though.

David



Re: Re: VACUUM is hanging

From
Tom Lane
Date:
Ashley Clark <aclark@ghoti.org> writes:
>> Try not to leave open transactions hanging around in your Apache
>> code. An idle database connection couldn't block VACUUM like that;
>> it'd have to have been in the middle of a BEGIN block, or maybe even
>> an unfinished query.

> That appears to be the culprit. I don't really understand why an
> uncommitted SELECT statement could/should lock the table though. How
> does that differ from just an open transaction where no query has
> happened yet?

If you've just done BEGIN then you aren't holding any locks.  But having
done a SELECT, you have a read lock on that table, which will persist
until you commit or abort the transaction.  A read lock doesn't normally
prevent other transactions from doing things with the table --- but it
does block "exclusive" locks, which are obtained for actions like DROP
TABLE, ALTER TABLE, and VACUUM.  Those guys want to know that there
ain't no one messin' with that table.

            regards, tom lane

Re: Re: VACUUM is hanging

From
Neil Conway
Date:
On Fri, May 04, 2001 at 09:44:10PM -0500, Ashley Clark wrote:
> * Tom Lane in "Re: VACUUM is hanging" dated 2001/05/04 22:05 wrote:
>
> > > I also seem to have found the culprit. I have Apache::DBI set up
> > > and it's holding something open. Restarting Apache let the vacuum
> > > finish normally.
> >
> > Try not to leave open transactions hanging around in your Apache
> > code. An idle database connection couldn't block VACUUM like that;
> > it'd have to have been in the middle of a BEGIN block, or maybe even
> > an unfinished query.
>
> That appears to be the culprit. I don't really understand why an
> uncommitted SELECT statement could/should lock the table though. How
> does that differ from just an open transaction where no query has
> happened yet?

If you are using Apache::DBI (as you mentioned previously), I
don't think this can be the case. Apache::DBI registers a
PerlCleanupHandler with Apache which will rollback any
uncommitted transactions after every request.

Cheers,

Neil

Re: VACUUM is hanging - jdbc?

From
"David Wall"
Date:
> Try not to leave open transactions hanging around in your Apache code.
> An idle database connection couldn't block VACUUM like that; it'd have
> to have been in the middle of a BEGIN block, or maybe even an unfinished
> query.

It's possible that the JDBC library always creates a new transaction after a
commit/rollback. It wouldn't have any locks or the like, though.

David