Re: BUG #3942: Related to BUG #2568 and BUG #2859: VACUUM process hangs and does not respond to the kill signals - Mailing list pgsql-bugs

From Heikki Linnakangas
Subject Re: BUG #3942: Related to BUG #2568 and BUG #2859: VACUUM process hangs and does not respond to the kill signals
Date
Msg-id 47AC9B5B.4090701@enterprisedb.com
Whole thread Raw
In response to BUG #3942: Related to BUG #2568 and BUG #2859: VACUUM process hangs and does not respond to the kill signals  ("Valentine Gogichashvili" <valgog@gmail.com>)
List pgsql-bugs
Valentine Gogichashvili wrote:
> The following bug has been logged online:
>
> Bug reference:      3942
> Logged by:          Valentine Gogichashvili
> Email address:      valgog@gmail.com
> PostgreSQL version: 8.2.3

You should upgrade to the latest 8.2.X minor release, regardless of this
issue...

> Operating system:   Linux 2.6.8 SMP
> Description:        Related to BUG #2568 and BUG #2859: VACUUM process hangs
> and does not respond to the kill signals
> Details:
>
> We are running several servers with the 'same' configuration.
>
> Recently two of many servers started to experience similar problems. It is
> quite difficult to reproduce. So I am providing the information that I have
> managed to get  by now. The problem on the second server was to be seen only
> once. The problem on the first server is happening once a day or so. It does
> not seem to actually depend on the server load.
>
> The problem is normally noticed when the update sessions on one of the
> biggest table are hanging. The server has a 'statement_timeout' parameter
> set globally to 20 seconds and cronjob vacuum jobs are being run every 2
> hours like:
> "set statement_timeout to 0; VACUUM VERBOSE ANALYZE;", but never the less
> the update sessions are not killed for hours.

Sounds like you have a client connected to the database, but doing
nothing. statement_timeout won't do anything to a backend that isn't
actively running a query, even if it has a transaction open. You could
do "SELECT * FROM pg_stat_activity" and look for any backends in "idle
in transaction" state.

> I cannot kill VACUUM session explicitly as well.

How are you trying to kill it?

>
> More info:
>
> rumata@dbserver-04:~$ ps -ef | grep VACUUM | grep -v grep
> postgres 24359 21321  0 13:47 ?        00:00:12 postgres: postgres
> rumataindex [local] VACUUM
> rumata@dbserver-04:~$ sudo su - postgres
> postgres@dbserver-04:~$ psql rumataindex
> Welcome to psql 8.2.3, the PostgreSQL interactive terminal.
>
> Type:  \copyright for distribution terms
>        \h for help with SQL commands
>        \? for help with psql commands
>        \g or terminate with semicolon to execute query
>        \q to quit
>
> rumataindex=# select * from pg_locks where pid = 24359;
>    locktype    | database | relation | page | tuple | transactionid |
> classid | objid | objsubid | transaction |  pid  |           mode
> | granted
> ---------------+----------+----------+------+-------+---------------+-------
> --+-------+----------+-------------+-------+--------------------------+-----
> ----
>  relation      |    17717 |    18917 |      |       |               |
>  |       |          |  1045306585 | 24359 | ShareUpdateExclusiveLock | t
>  relation      |    17717 |  1892961 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94216 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94213 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94214 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94206 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |   185865 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |   952119 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94212 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94215 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94207 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |   185867 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94217 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  transactionid |          |          |      |       |    1045306585 |
>  |       |          |  1045306585 | 24359 | ExclusiveLock            | t
>  relation      |    17717 |    94208 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94211 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
> (16 rows)
>
> rumataindex=# \q
> postgres@dbserver-04:~$ kill 24359
> postgres@dbserver-04:~$ psql rumataindex
> Welcome to psql 8.2.3, the PostgreSQL interactive terminal.
>
> Type:  \copyright for distribution terms
>        \h for help with SQL commands
>        \? for help with psql commands
>        \g or terminate with semicolon to execute query
>        \q to quit
>
> rumataindex=# select * from pg_locks where pid = 24359;
>    locktype    | database | relation | page | tuple | transactionid |
> classid | objid | objsubid | transaction |  pid  |           mode
> | granted
> ---------------+----------+----------+------+-------+---------------+-------
> --+-------+----------+-------------+-------+--------------------------+-----
> ----
>  relation      |    17717 |    18917 |      |       |               |
>  |       |          |  1045306585 | 24359 | ShareUpdateExclusiveLock | t
>  relation      |    17717 |  1892961 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94216 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94213 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94214 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94206 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |   185865 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |   952119 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94212 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94215 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94207 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |   185867 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94217 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  transactionid |          |          |      |       |    1045306585 |
>  |       |          |  1045306585 | 24359 | ExclusiveLock            | t
>  relation      |    17717 |    94208 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
>  relation      |    17717 |    94211 |      |       |               |
>  |       |          |  1045306585 | 24359 | RowExclusiveLock         | t
> (16 rows)
>
> rumataindex=# \q
>
> In addition, I want to mention, that "VACUUM VERBOSE ANALYZE;" session owned
> locks are the oldest ones, when  the issue is occurring.
>
> The only way to restart the server is to force immediate server shutdown
> like:
>
> postgres@dbserver-04:~/pgsql/data$ pg_ctl -D . stop -m f
> waiting for server to shut
> down............................................................... failed
> postgres@dbserver-04:~/pgsql/data$ pg_ctl -D . stop -m f
> waiting for server to shut
> down............................................................... failed
> pg_ctl: server does not shut down
> postgres@dbserver-04:~/pgsql/data$ pg_ctl -D . stop -m i
> waiting for server to shut down.... done
> server stopped
> postgres@dbserver-04:~/pgsql/data$ pg_ctl -D . start
> server starting
>
> There are no errors or warning related to VACUUM to be noticed in the log
> files. Or any logging related to the sessions, that are still hanging.

Is there anything in the log indicating why it refuses to shut down?

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

pgsql-bugs by date:

Previous
From: "Valentine Gogichashvili"
Date:
Subject: BUG #3942: Related to BUG #2568 and BUG #2859: VACUUM process hangs and does not respond to the kill signals
Next
From: "Arnaud"
Date:
Subject: BUG #3943: ecpg doesn't like "inet" operator