Re: xid wrap / optimize frozen tables? - Mailing list pgsql-hackers

From Nils Goroll
Subject Re: xid wrap / optimize frozen tables?
Date
Msg-id 5561B941.3080102@schokola.de
In response to Re: xid wrap / optimize frozen tables?  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: xid wrap / optimize frozen tables?  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Re: xid wrap / optimize frozen tables?  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
Hi Jeff and all,

On 23/05/15 22:13, Jeff Janes wrote:
> Are you sure it is the read IO that causes the problem?

Yes. The trouble here is that we are talking about a 361 GB table:
                              List of relations
 Schema |        Name         | Type  |  Owner   |  Size  | Description
--------+---------------------+-------+----------+--------+-------------
 public | *redacted*_y2015m04 | table | postgres | 361 GB |

and while we have
shared_buffers = 325GB
huge_pages = on

this is not the only table of this size (total db size is 1.8 TB) and more
current data got written to *redacted*_y2015m05 (the manually-partitioned table
for May), so most of the m04 data would have been evicted from the cache by the
time this issue first surfaced.
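
(For what it's worth, a rough way to verify the eviction would be a
pg_buffercache query along these lines -- just a sketch, assuming the extension
is installed; the table name here stands in for the redacted one:

-- count shared_buffers pages currently holding blocks of the m04 table
SELECT count(*) * current_setting('block_size')::bigint AS bytes_cached
  FROM pg_buffercache b
  JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
 WHERE c.relname = 'redacted_y2015m04'   -- placeholder name
   AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                             WHERE datname = current_database()));
)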

There is one application pushing data (bulk inserts), and we have transaction
rates for this app in a log. The moment the vacuum started, these rates dropped.
Unfortunately I cannot present helpful log excerpts here, as the autovacuum has
never finished so far (because the admin killed the db), so we have zero logging
about past autovac events.
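
(For future runs we could at least make completed autovacuums leave a trace --
a sketch, assuming 9.4+ for ALTER SYSTEM:

-- log every autovacuum action once it completes
ALTER SYSTEM SET log_autovacuum_min_duration = 0;
SELECT pg_reload_conf();
)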

At the moment, the application is shut down and the machine is only running the
vacs:

query_start      | 2015-05-22 19:33:52.44334+02
waiting          | f
query            | autovacuum: VACUUM public.*redacted*_y2015m04 (to prevent wraparound)

query_start      | 2015-05-22 19:34:02.46004+02
waiting          | f
query            | autovacuum: VACUUM ANALYZE public.*redacted*_y2015m05 (to prevent wraparound)
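
(The fields above are simply pulled from pg_stat_activity in expanded display,
using the pre-9.6 "waiting" column; roughly:

-- sketch of how the output above was obtained (\x in psql)
SELECT query_start, waiting, query
  FROM pg_stat_activity
 WHERE query LIKE 'autovacuum:%';
)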

so we know that any io must be caused by the vacs:

shell# uptime
 13:33:33 up 1 day, 18:01,  2 users,  load average: 5.75, 12.71, 8.43

shell# zpool iostat
                   capacity     operations    bandwidth
pool             alloc   free   read  write   read  write
---------------  -----  -----  -----  -----  -----  -----
tank1             358G  6.90T    872     55  15.1M  3.08M


Again, we know the IO capacity is insufficient; the pool is on only 2 magnetic
disks at the moment, so an average read rate of 872 IOPS sustained over 42 hours
is not even bad.
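
(Since the vacuums are the only IO load, the other relevant knobs would be the
vacuum cost settings; a sketch for checking what is currently in effect, using
the standard GUC names:

-- show the vacuum throttling parameters currently in effect
SELECT name, setting, unit
  FROM pg_settings
 WHERE name IN ('vacuum_cost_delay', 'vacuum_cost_limit',
                'autovacuum_vacuum_cost_delay', 'autovacuum_vacuum_cost_limit');
)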

> I don't know what happened to that, but there is another patch waiting for review and
> testing:
> 
> https://commitfest.postgresql.org/5/221/

This is really interesting, thank you very much for the pointer.

Cheers, Nils


