Thread: BUG #6296: High level backend-write activity on db
The following bug has been logged online:

Bug reference:      6296
Logged by:          Maksym Boguk
Email address:      maxim.boguk@gmail.com
PostgreSQL version: 9.0.5
Operating system:   Linux RedHat
Description:        High level backend-write activity on db
Details:

Again, I am not sure whether this is an actual bug or something else gone
wrong with my database.

One of my databases just started producing 256 WAL segments in 5 minutes,
instead of the average 100-200 WAL segments per hour.

At the same time, write and read activity on the DB stayed the same
(according to pg_stat_database data and historical graphs).
No anti-wraparound vacuum or similar activity is going on.
The situation has stayed the same for the last few days.

I found some other anomalies:

1) 5 minutes after SELECT pg_stat_reset_shared('bgwriter');

mirtesen=# SELECT * from pg_stat_bgwriter ;
 checkpoints_timed | checkpoints_req | buffers_checkpoint | buffers_clean | maxwritten_clean | buffers_backend | buffers_alloc
-------------------+-----------------+--------------------+---------------+------------------+-----------------+---------------
                 0 |               2 |             171675 |           804 |                0 |         1010429 |       1131106
(1 row)

I.e. almost every new buffer allocated by a backend leads to the backend
writing out a dirty page (and that situation does not change over time).
That is very strange, because the DB has a lot of non-dirty pages with
usagecount=0:

mirtesen=# SELECT usagecount,count(*),isdirty FROM public.pg_buffercache GROUP BY isdirty,usagecount ORDER BY isdirty,usagecount ;
 usagecount |  count  | isdirty
------------+---------+---------
          0 |  902086 | f
          1 | 1541321 | f
          2 | 1045230 | f
          3 |  500173 | f
          4 | 1183640 | f
          5 | 3926461 | f
          1 |   10866 | t
          2 |    5127 | t
          3 |    6987 | t
          4 |   17445 | t
          5 |   35687 | t
            |      17 |
(12 rows)

So a 1:1 ratio of backend buffer allocations to backend dirty-page writes
seems exceptionally strange. And that situation has stayed the same for
the last few days. I am out of ideas now.

PS: I am going to build xlogdump and try to examine what is filling all my
WAL files.
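[Editor's note: the "almost 1:1" observation above can be computed directly
from the same view. A minimal sketch, using the pg_stat_bgwriter columns as
they exist on 9.0 (recent PostgreSQL releases have since moved the
buffers_backend counter into pg_stat_io):]

```sql
-- What fraction of buffer allocations ended up as backend-side writes?
-- With the numbers above this comes out near 89%, i.e. close to 1:1.
SELECT buffers_backend,
       buffers_alloc,
       round(100.0 * buffers_backend / nullif(buffers_alloc, 0), 1)
           AS backend_write_pct
FROM pg_stat_bgwriter;
```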
On Wed, Nov 16, 2011 at 4:07 AM, Maksym Boguk <maxim.boguk@gmail.com> wrote:
> Again, I am not sure whether this is an actual bug or something else gone
> wrong with my database.
>
> One of my databases just started producing 256 WAL segments in 5 minutes,
> instead of the average 100-200 WAL segments per hour.
>
> At the same time, write and read activity on the DB stayed the same
> (according to pg_stat_database data and historical graphs).
> No anti-wraparound vacuum or similar activity is going on.
> The situation has stayed the same for the last few days.
>
> I found some other anomalies:
> 1) 5 minutes after
> SELECT pg_stat_reset_shared('bgwriter');
>
> mirtesen=# SELECT * from pg_stat_bgwriter ;
>  checkpoints_timed | checkpoints_req | buffers_checkpoint | buffers_clean | maxwritten_clean | buffers_backend | buffers_alloc
> -------------------+-----------------+--------------------+---------------+------------------+-----------------+---------------
>                  0 |               2 |             171675 |           804 |                0 |         1010429 |       1131106
> (1 row)
>
> I.e. almost every new buffer allocated by a backend leads to the backend
> writing out a dirty page (and that situation does not change over time).

This sure sounds like a ring buffer is being used, suggesting VACUUM
or COPY IN activity.

What does pg_stat_activity say?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
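[Editor's note: a hedged sketch of the pg_stat_activity check being asked
for here, looking for long-running VACUUM or COPY statements that would be
using a ring buffer. Column names are as on 9.0 (procpid and current_query
were renamed to pid and query in 9.2):]

```sql
-- List sessions whose current statement looks like COPY or VACUUM,
-- oldest first.
SELECT procpid, usename, current_query,
       now() - query_start AS running_for
FROM pg_stat_activity
WHERE current_query ILIKE 'copy %'
   OR current_query ILIKE '%vacuum%'
ORDER BY query_start;
```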
> On Wed, Nov 16, 2011 at 4:07 AM, Maksym Boguk <maxim.boguk@gmail.com> wrote:
>> Again, I am not sure whether this is an actual bug or something else gone
>> wrong with my database.
>>
>> One of my databases just started producing 256 WAL segments in 5 minutes,
>> instead of the average 100-200 WAL segments per hour.
>>
>> At the same time, write and read activity on the DB stayed the same
>> (according to pg_stat_database data and historical graphs).
>> No anti-wraparound vacuum or similar activity is going on.
>> The situation has stayed the same for the last few days.
>>
>> I found some other anomalies:
>> 1) 5 minutes after
>> SELECT pg_stat_reset_shared('bgwriter');
>>
>> mirtesen=# SELECT * from pg_stat_bgwriter ;
>>  checkpoints_timed | checkpoints_req | buffers_checkpoint | buffers_clean | maxwritten_clean | buffers_backend | buffers_alloc
>> -------------------+-----------------+--------------------+---------------+------------------+-----------------+---------------
>>                  0 |               2 |             171675 |           804 |                0 |         1010429 |       1131106
>> (1 row)
>>
>> I.e. almost every new buffer allocated by a backend leads to the backend
>> writing out a dirty page (and that situation does not change over time).
>
> This sure sounds like a ring buffer is being used, suggesting VACUUM
> or COPY IN activity.
>
> What does pg_stat_activity say?
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

I am very sorry, it was a false alarm. The monitoring was based on sum()
over pg_stat_all_tables.
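[Editor's note: the actual monitoring query is not shown in the thread; a
hedged sketch of the kind of sum() over pg_stat_all_tables it describes
might be:]

```sql
-- Aggregate row-change counters across all tables; counters for a table
-- disappear the moment the table is dropped, which is what hid the
-- activity here.
SELECT sum(n_tup_ins) AS inserts,
       sum(n_tup_upd) AS updates,
       sum(n_tup_del) AS deletes
FROM pg_stat_all_tables;
```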
But the problem was a buggy cron script which, every minute, looped over:

create table something_tmp as query;  (around 20M entries)
and, immediately after:
drop table something_tmp;

As a result of that immediate drop, the insert volumes were missed by the
monitoring.

PS: xlogdump is a really great tool and it was a good help in locating and
fixing the problem.

--
Maxim Boguk
Senior Postgresql DBA.
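[Editor's note: a hedged sketch of the pattern described above; the table
name is from the thread, but some_big_source and the exact query are
illustrative, not the actual cron script. On 9.0, a TEMPORARY table would
also avoid the WAL traffic entirely, since temporary tables are not
WAL-logged:]

```sql
-- The problematic pattern: ~20M rows rewritten and WAL-logged every
-- minute, then dropped before monitoring could see the table's stats.
CREATE TABLE something_tmp AS
    SELECT * FROM some_big_source;
-- ... use the result ...
DROP TABLE something_tmp;

-- A lower-WAL alternative under the same assumptions:
CREATE TEMPORARY TABLE something_tmp AS
    SELECT * FROM some_big_source;
```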