Thread: automatic vacuum on pg_statistic pg_toast area blocks all queries in hot standby
automatic vacuum on pg_statistic pg_toast area blocks all queries in hot standby
From
piuschan
Date:
Hi, We have a master/hot standy databases set up with streaming replication. The pair are running: PostgreSQL 9.1.11 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-3), 64-bit Linux 2.6.32-431.11.2.el6.x86_64 #1 SMP Tue Mar 25 19:59:55 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux We just find the following situation: on the master db server: (1) at 16:58:22 automatic vacuum of table ""pg_toast.pg_toast_2619"": index scans: 1 almost at the same time, queries at hot standby are blocked, all waiting for a share lock on pg_toast_2619, for example: process 26822 still waiting for AccessShareLock on relation 2840 of database 16414 after 1000.089 ms process 26820 still waiting for AccessShareLock on relation 2840 of database 16414 after 1000.085 ms process 26819 still waiting for AccessShareLock on relation 2840 of database 16414 after 1000.079 ms and etc... relation 2840 is actually pg_toast_2619 of pg_statistic: DB=# select relname from pg_class where reltoastrelid=2840; relname -------------- pg_statistic (1 row) Any idea why? Thanks a lot, Pius -- View this message in context: http://postgresql.1045698.n5.nabble.com/automatic-vacuum-on-pg-statistic-pg-toast-area-blocks-all-queries-in-hot-standby-tp5807143.html Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.
Hello. What happens on the master: 1. A typical request puts a AccessShareLock on the table. 2. Autovacuum puts cleans dead entries from the table. This is possible because AccessShareLock does not conflict with ExclusiveLock. http://www.postgresql.org/docs/9.1/static/explicit-locking.html 3. Autovacuum finish successful. 4. Unlock AccessShareLock. What happens on the slave: 1. From WAL puts a AccessShareLock on table. 2. From WAL recovery Autovacuum result, but there is a conflict. WAL don't recovery, waiting time seconds from parameter max_standby_streaming_delay before canceling queries. 3. Queries waiting unlock AccessShareLock. Unlock does not occur because of paragraph 2. 4. Queries canceling by statement_timeout or conflict recovery. 5. WAL recovery and unlock AccessShareLock. If enable hot_standby_feedback then master also waiting slave time seconds from max_standby_streaming_delay parameter. Solution: 1. Disable hot_standby_feedback. 2. select txid_current() into res1 3. Waiting seconds. 4. select txid_current() into res2 5. Change postgresql.conf on master. vacuum_defer_cleanup_age = res2 - res1 + reserve. Good luck. -- View this message in context: http://postgresql.1045698.n5.nabble.com/automatic-vacuum-on-pg-statistic-pg-toast-area-blocks-all-queries-in-hot-standby-tp5807143p5820168.html Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.
Re: automatic vacuum on pg_statistic pg_toast area blocks all queries in hot standby
From
piuschan
Date:
Hi, Thanks for your feedback. Someone finally replies to this thread. I am not quite clear of your solution, so I need more info from you. First of all, we do not enable hot_standby_feedback on SLAVE DB. postgres=# show hot_standby_feedback; hot_standby_feedback ---------------------- off (1 row) Secondly, where do you run "select txid_current() into res1" and "select txid_current() into res2" ? Even if you set MASTER vacuum_defer_cleanup_age to certain value, autovacuum still can kick in anytime and may conflict query on the target table. So setting vacuum_defer_cleanup_age to a value can really reduce the chance of such situation? Thanks a lot, Pius -- View this message in context: http://postgresql.1045698.n5.nabble.com/automatic-vacuum-on-pg-statistic-pg-toast-area-blocks-all-queries-in-hot-standby-tp5807143p5820171.html Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.
> Secondly, where do you run "select txid_current() into res1" and "select txid_current() into res2" ? It does not matter where. Important to count the number of transactions that takes place during max_standby_streaming_delay. Example res2-res1 = 25000, my value vacuum_defer_cleanup_age = 40000 > Even if you set MASTER vacuum_defer_cleanup_age to certain value, > autovacuum still can kick in anytime and may conflict query on the target > table. So setting vacuum_defer_cleanup_age to a value can really reduce > the chance of such situation? Due to this parameter, at the time when Autovacuum clean the dead rows they will be dead on the slave, and thus no queries will not use them. Therefore, the conflict will not be. > Moreover, in my post, the vacuum is on pg_statistic toast area, why did it > block queries on SLAVE? Autovacuum serves all tables, including a pg_catalog. This problem occurs on any table. For example pg_class or my_table. Autovacuum remove rows in moment AccessShareLock, this is row and unlock AccessShare receive to slave with WAL. WAL not recorded on slave due to a conflict recovery. At the same time queries are waiting for a lock on the slave. -- View this message in context: http://postgresql.1045698.n5.nabble.com/automatic-vacuum-on-pg-statistic-pg-toast-area-blocks-all-queries-in-hot-standby-tp5807143p5820183.html Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.
Re: automatic vacuum on pg_statistic pg_toast area blocks all queries in hot standby
From
piuschan
Date:
Good point, thanks a lot for your help. Pius -- View this message in context: http://postgresql.1045698.n5.nabble.com/automatic-vacuum-on-pg-statistic-pg-toast-area-blocks-all-queries-in-hot-standby-tp5807143p5820192.html Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.