On Thu, Jun 28, 2012 at 6:57 PM, Josh Berkus <josh@agliodbs.com> wrote:
>
> A second obstacle to "opportunistic wraparound vacuum" is that
> wraparound vacuum is not interruptable. If you have to kill it off and
> do something else for a couple hours, it can't pick up where it left
> off; it needs to scan the whole table from the beginning again.

Would recording a different relfrozenxid for each 1GB chunk of the
relation solve that?

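
To make the question concrete, here is a minimal sketch of the idea, not
anything that exists in PostgreSQL: the per-segment horizon list
(seg_frozenxid) and the freeze_pass helper are hypothetical names, and
XIDs are compared as plain integers, ignoring modular wraparound
arithmetic. The point is only that an interrupted pass keeps the
progress it made on the segments it finished.

```python
from dataclasses import dataclass


@dataclass
class Relation:
    # Hypothetical: one frozen-XID horizon per 1GB segment,
    # instead of a single relfrozenxid for the whole relation.
    seg_frozenxid: list

    def relfrozenxid(self) -> int:
        # The relation-wide horizon is the oldest per-segment horizon.
        return min(self.seg_frozenxid)


def freeze_pass(rel: Relation, cutoff_xid: int, max_segments: int) -> list:
    """Freeze at most max_segments segments, oldest horizon first.

    Returns the indexes of the segments processed. Stopping early
    (a small max_segments) models an interrupted vacuum.
    """
    pending = [i for i, xid in enumerate(rel.seg_frozenxid) if xid < cutoff_xid]
    pending.sort(key=lambda i: rel.seg_frozenxid[i])
    done = pending[:max_segments]
    for i in done:
        rel.seg_frozenxid[i] = cutoff_xid  # segment is now frozen up to cutoff
    return done


rel = Relation(seg_frozenxid=[100, 100, 100, 100])
first = freeze_pass(rel, cutoff_xid=500, max_segments=2)    # "killed" after 2 segments
second = freeze_pass(rel, cutoff_xid=500, max_segments=10)  # resumes with the rest
```

In this model the second pass only visits the two segments the first
pass did not reach, and the relation-wide horizon advances once every
segment has been frozen.
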
>> Since your users weren't complaining about performance with one or two
>> autovac workers running (were they?),
>
> No, it's when we hit 3 that it fell over. Thresholds vary with memory
> and table size, of course.

Does that mean it worked fine with 2 workers running simultaneously on
large tables, or did that situation never occur, so it is not known
whether it would have worked?

> BTW, the primary reason I think (based on a glance at system stats) this
> drove the system to its knees was that the simultaneous wraparound
> vacuum of 3 old-cold tables evicted all of the "current" data out of the
> FS cache, forcing user queries which would normally hit the FS cache
> onto disk. I/O throughput was NOT at 100% capacity.

Do you know whether it was the reads or the writes that caused that?
I would think the kernel has logic similar to PostgreSQL's buffer
access strategy (BAS) to keep a huge sequential read from evicting all
the other cached data. But that logic might be defeated if all that
data is dirtied right after being read.

If the partitions had not been touched since the last freeze, then the
vacuum should generate no dirty blocks (right?), but if they were
touched since then, you could basically be writing out the entire table.
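
A toy model of that reasoning, under stated assumptions: each page is
summarized by the oldest unfrozen XID it holds (None if everything on
it is already frozen), a freeze pass dirties a page only when it
actually has to freeze a tuple, and hint-bit updates and pruning (which
can also dirty pages) are ignored. The function name and the page
summary are hypothetical, not PostgreSQL structures.

```python
from typing import Optional, Sequence


def pages_dirtied_by_freeze(
    page_oldest_unfrozen_xid: Sequence[Optional[int]], cutoff_xid: int
) -> int:
    """Count pages a freeze pass would have to rewrite.

    A page counts only if it holds an unfrozen tuple older than the cutoff.
    XIDs are compared as plain integers (no wraparound arithmetic).
    """
    return sum(
        1
        for xid in page_oldest_unfrozen_xid
        if xid is not None and xid < cutoff_xid
    )


untouched = [None] * 1000        # nothing modified since the last freeze
fully_touched = [50] * 1000      # every page has an old unfrozen tuple

clean_writes = pages_dirtied_by_freeze(untouched, cutoff_xid=100)
full_writes = pages_dirtied_by_freeze(fully_touched, cutoff_xid=100)
```

In this model the untouched partition is read but never written, while
the fully touched one has every page rewritten.
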

Cheers,

Jeff