Re: xid wrap / optimize frozen tables? - Mailing list pgsql-hackers
From: Jeff Janes
Subject: Re: xid wrap / optimize frozen tables?
Msg-id: CAMkU=1yZu1hsxNNO3Xzuom5h+tXps5eHrDX081wiqeRuf+kMuA@mail.gmail.com
In response to: Re: xid wrap / optimize frozen tables? (Nils Goroll <slink@schokola.de>)
Responses: Re: xid wrap / optimize frozen tables? (Nils Goroll <slink@schokola.de>)
           Re: xid wrap / optimize frozen tables? (Jeff Janes <jeff.janes@gmail.com>)
List: pgsql-hackers
On May 24, 2015 6:42 AM, "Nils Goroll" <slink@schokola.de> wrote:
>
> Hi Jeff and all,
>
> On 23/05/15 22:13, Jeff Janes wrote:
> > Are you sure it is the read IO that causes the problem?
>
> Yes. Trouble is here that we are talking about a 361 GB table
>
>                       List of relations
>  Schema |        Name         | Type  |  Owner   |  Size  | Description
> --------+---------------------+-------+----------+--------+-------------
>  public | *redacted*_y2015m04 | table | postgres | 361 GB |
>
> and while we have
>
> shared_buffers = 325GB
> huge_pages = on

As mentioned, that is a very large setting for shared_buffers.

> this is not the only table of this size (total db size is 1.8 TB) and more
> current data got written to *redacted*_y2015m05 (the manually-partitioned table
> for May), so most of the m04 data would have been evicted from the cache when
> this issue surfaced initially.
>
> There is one application pushing data (bulk inserts) and we have transaction
> rates for this app in a log. The moment the vacuum started, these rates dropped.
> Unfortunately I cannot present helpful log excerpts here, as the autovacuum has
> never finished so far (because the admin killed the db), so we have zero logging
> about past autovac events.

Could you do an experiment in which you do a large sequential read on the database files and measure the impact on the queries that way?

Like:

tar -cf - data_dir | wc -c

Or better, use some fancy version that throttles to the read rate observed below.

> At the moment, the application is shut down and the machine is only running the
> vacs:
>
> query_start | 2015-05-22 19:33:52.44334+02
> waiting     | f
> query       | autovacuum: VACUUM public.*redacted*_y2015m04 (to prevent
>               wraparound)
>
> query_start | 2015-05-22 19:34:02.46004+02
> waiting     | f
> query       | autovacuum: VACUUM ANALYZE public.*redacted*_y2015m05 (to
>               prevent wraparound)
>
> so we know that any io must be caused by the vacs:
>
> shell# uptime
>  13:33:33 up 1 day, 18:01, 2 users, load average: 5.75, 12.71, 8.43

What OS is this? This load average looks very high. Does the OS charge processes that are blocked on IO against uptime?

> shell# zpool iostat
>                 capacity     operations    bandwidth
> pool         alloc   free   read  write   read  write
> -----------  -----  -----  -----  -----  -----  -----
> tank1         358G  6.90T    872     55  15.1M  3.08M

I'm not familiar with zpool, but this shows a lot of writing going on. If the table was already frozen and just needed to be observed as being all frozen, then it should not be dirtying one block for every 5 blocks read.

I would not be surprised if it were the reading, not the writing, which caused the performance problem.

Cheers,

Jeff
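[Editor's note: the throttled variant of the sequential-read experiment suggested above could be sketched as below. This is a sketch only — the `read_throttled` function, the default data-directory path, and the 15 MiB/s default rate (picked to match the zpool read bandwidth quoted in the mail) are illustrative assumptions, not part of the original message.]

```shell
# read_throttled DIR [RATE_MIB]: sequentially read DIR through tar at
# roughly RATE_MIB MiB per second and print the total byte count --
# i.e. "tar -cf - data_dir | wc -c", but throttled.
read_throttled() {
    dir=$1
    rate=${2:-15}                 # MiB per second; 15 is an assumption
    tar -cf - "$dir" 2>/dev/null | {
        total=0
        while :; do
            # Read up to one second's budget. On a pipe, dd may take
            # short reads and consume less than the budget, which only
            # makes the throttle more conservative.
            n=$(dd bs=1048576 count="$rate" 2>/dev/null | wc -c)
            n=$((n + 0))          # normalize wc output (leading spaces)
            [ "$n" -eq 0 ] && break   # EOF: tar stream exhausted
            total=$((total + n))
            sleep 1
        done
        echo "$total"
    }
}
```

While something like `read_throttled /path/to/data_dir 15` runs, watch the application's transaction rates: if they drop the same way they did when the autovacuum started, the slowdown is attributable to read IO alone rather than to vacuum's writes.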