Thread: BUG #3942: Related to BUG #2568 and BUG #2859: VACUUM process hangs and does not respond to the kill signals

The following bug has been logged online:

Bug reference:      3942
Logged by:          Valentine Gogichashvili
Email address:      valgog@gmail.com
PostgreSQL version: 8.2.3
Operating system:   Linux 2.6.8 SMP
Description:        Related to BUG #2568 and BUG #2859: VACUUM process hangs
and does not respond to the kill signals
Details:

We are running several servers with the 'same' configuration.

Recently two of many servers started to experience similar problems. It is
quite difficult to reproduce. So I am providing the information that I have
managed to get  by now. The problem on the second server was to be seen only
once. The problem on the first server is happening once a day or so. It does
not seem to actually depend on the server load.

The problem is normally noticed when the update sessions on one of the
biggest table are hanging. The server has a 'statement_timeout' parameter
set globally to 20 seconds and cronjob vacuum jobs are being run every 2
hours like:
"set statement_timeout to 0; VACUUM VERBOSE ANALYZE;", but never the less
the update sessions are not killed for hours.

I cannot kill VACUUM session explicitly as well.

More info:

rumata@dbserver-04:~$ ps -ef | grep VACUUM | grep -v grep
postgres 24359 21321  0 13:47 ?        00:00:12 postgres: postgres
rumataindex [local] VACUUM
rumata@dbserver-04:~$ sudo su - postgres
postgres@dbserver-04:~$ psql rumataindex
Welcome to psql 8.2.3, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help with psql commands
       \g or terminate with semicolon to execute query
       \q to quit

rumataindex=# select * from pg_locks where pid = 24359;
   locktype    | database | relation | page | tuple | transactionid |
classid | objid | objsubid | transaction |  pid  |           mode
| granted
---------------+----------+----------+------+-------+---------------+-------
--+-------+----------+-------------+-------+--------------------------+-----
----
 relation      |    17717 |    18917 |      |       |               |
 |       |          |  1045306585 | 24359 | ShareUpdateExclusiveLock | t
 relation      |    17717 |  1892961 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |    94216 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |    94213 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |    94214 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |    94206 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |   185865 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |   952119 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |    94212 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |    94215 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |    94207 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |   185867 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |    94217 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 transactionid |          |          |      |       |    1045306585 |
 |       |          |  1045306585 | 24359 | ExclusiveLock            | t
 relation      |    17717 |    94208 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |    94211 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
(16 rows)

rumataindex=# \q
postgres@dbserver-04:~$ kill 24359
postgres@dbserver-04:~$ psql rumataindex
Welcome to psql 8.2.3, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help with psql commands
       \g or terminate with semicolon to execute query
       \q to quit

rumataindex=# select * from pg_locks where pid = 24359;
   locktype    | database | relation | page | tuple | transactionid |
classid | objid | objsubid | transaction |  pid  |           mode
| granted
---------------+----------+----------+------+-------+---------------+-------
--+-------+----------+-------------+-------+--------------------------+-----
----
 relation      |    17717 |    18917 |      |       |               |
 |       |          |  1045306585 | 24359 | ShareUpdateExclusiveLock | t
 relation      |    17717 |  1892961 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |    94216 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |    94213 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |    94214 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |    94206 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |   185865 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |   952119 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |    94212 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |    94215 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |    94207 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |   185867 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |    94217 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 transactionid |          |          |      |       |    1045306585 |
 |       |          |  1045306585 | 24359 | ExclusiveLock            | t
 relation      |    17717 |    94208 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
 relation      |    17717 |    94211 |      |       |               |
 |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
(16 rows)

rumataindex=# \q

In addition, I want to mention, that "VACUUM VERBOSE ANALYZE;" session owned
locks are the oldest ones, when  the issue is occurring.

The only way to restart the server is to force immediate server shutdown
like:

postgres@dbserver-04:~/pgsql/data$ pg_ctl -D . stop -m f
waiting for server to shut
down............................................................... failed
postgres@dbserver-04:~/pgsql/data$ pg_ctl -D . stop -m f
waiting for server to shut
down............................................................... failed
pg_ctl: server does not shut down
postgres@dbserver-04:~/pgsql/data$ pg_ctl -D . stop -m i
waiting for server to shut down.... done
server stopped
postgres@dbserver-04:~/pgsql/data$ pg_ctl -D . start
server starting

There are no errors or warning related to VACUUM to be noticed in the log
files. Or any logging related to the sessions, that are still hanging.

With best regards,

-- Valentine Gogichashvili
Valentine Gogichashvili wrote:
> The following bug has been logged online:
>
> Bug reference:      3942
> Logged by:          Valentine Gogichashvili
> Email address:      valgog@gmail.com
> PostgreSQL version: 8.2.3

You should upgrade to the latest 8.2.X minor release, regardless of this
issue...

> Operating system:   Linux 2.6.8 SMP
> Description:        Related to BUG #2568 and BUG #2859: VACUUM process hangs
> and does not respond to the kill signals
> Details:
>
> We are running several servers with the 'same' configuration.
>
> Recently two of many servers started to experience similar problems. It is
> quite difficult to reproduce. So I am providing the information that I have
> managed to get  by now. The problem on the second server was to be seen only
> once. The problem on the first server is happening once a day or so. It does
> not seem to actually depend on the server load.
>
> The problem is normally noticed when the update sessions on one of the
> biggest table are hanging. The server has a 'statement_timeout' parameter
> set globally to 20 seconds and cronjob vacuum jobs are being run every 2
> hours like:
> "set statement_timeout to 0; VACUUM VERBOSE ANALYZE;", but never the less
> the update sessions are not killed for hours.

Sounds like you have a client connected to the database, but doing
nothing. statement_timeout won't do anything to a backend that isn't
actively running a query, even if it has a transaction open. You could
do "SELECT * FROM pg_stat_activity" and look for any backends in "idle
in transaction" state.

> I cannot kill VACUUM session explicitly as well.

How are you trying to kill it?

>
> More info:
>
> rumata@dbserver-04:~$ ps -ef | grep VACUUM | grep -v grep
> postgres 24359 21321  0 13:47 ?        00:00:12 postgres: postgres
> rumataindex [local] VACUUM
> rumata@dbserver-04:~$ sudo su - postgres
> postgres@dbserver-04:~$ psql rumataindex
> Welcome to psql 8.2.3, the PostgreSQL interactive terminal.
>
> Type:  \copyright for distribution terms
>        \h for help with SQL commands
>        \? for help with psql commands
>        \g or terminate with semicolon to execute query
>        \q to quit
>
> rumataindex=# select * from pg_locks where pid = 24359;
>    locktype    | database | relation | page | tuple | transactionid |
> classid | objid | objsubid | transaction |  pid  |           mode
> | granted
> ---------------+----------+----------+------+-------+---------------+-------
> --+-------+----------+-------------+-------+--------------------------+-----
> ----
>  relation      |    17717 |    18917 |      |       |               |
>  |       |          |  1045306585 | 24359 | ShareUpdateExclusiveLock | t
>  relation      |    17717 |  1892961 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94216 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94213 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94214 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94206 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |   185865 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |   952119 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94212 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94215 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94207 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |   185867 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94217 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  transactionid |          |          |      |       |    1045306585 |
>  |       |          |  1045306585 | 24359 | ExclusiveLock            | t
>  relation      |    17717 |    94208 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94211 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
> (16 rows)
>
> rumataindex=# \q
> postgres@dbserver-04:~$ kill 24359
> postgres@dbserver-04:~$ psql rumataindex
> Welcome to psql 8.2.3, the PostgreSQL interactive terminal.
>
> Type:  \copyright for distribution terms
>        \h for help with SQL commands
>        \? for help with psql commands
>        \g or terminate with semicolon to execute query
>        \q to quit
>
> rumataindex=# select * from pg_locks where pid = 24359;
>    locktype    | database | relation | page | tuple | transactionid |
> classid | objid | objsubid | transaction |  pid  |           mode
> | granted
> ---------------+----------+----------+------+-------+---------------+-------
> --+-------+----------+-------------+-------+--------------------------+-----
> ----
>  relation      |    17717 |    18917 |      |       |               |
>  |       |          |  1045306585 | 24359 | ShareUpdateExclusiveLock | t
>  relation      |    17717 |  1892961 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94216 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94213 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94214 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94206 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |   185865 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |   952119 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94212 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94215 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94207 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |   185867 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94217 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  transactionid |          |          |      |       |    1045306585 |
>  |       |          |  1045306585 | 24359 | ExclusiveLock            | t
>  relation      |    17717 |    94208 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94211 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
> (16 rows)
>
> rumataindex=# \q
>
> In addition, I want to mention, that "VACUUM VERBOSE ANALYZE;" session owned
> locks are the oldest ones, when  the issue is occurring.
>
> The only way to restart the server is to force immediate server shutdown
> like:
>
> postgres@dbserver-04:~/pgsql/data$ pg_ctl -D . stop -m f
> waiting for server to shut
> down............................................................... failed
> postgres@dbserver-04:~/pgsql/data$ pg_ctl -D . stop -m f
> waiting for server to shut
> down............................................................... failed
> pg_ctl: server does not shut down
> postgres@dbserver-04:~/pgsql/data$ pg_ctl -D . stop -m i
> waiting for server to shut down.... done
> server stopped
> postgres@dbserver-04:~/pgsql/data$ pg_ctl -D . start
> server starting
>
> There are no errors or warning related to VACUUM to be noticed in the log
> files. Or any logging related to the sessions, that are still hanging.

Is there anything in the log indicating why it refuses to shut down?

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com