From: Gregory Smith
Subject: Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Msg-id: 5425F895.8060704@gmail.com
In response to: Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]  (Gavin Flower <GavinFlower@archidevsys.co.nz>)
List: pgsql-hackers

On 9/26/14, 2:38 PM, Gavin Flower wrote:
> Curious: would it be both feasible and useful to have multiple workers
> process a 'large' table, without complicating things too much?  They
> could each start at a different position in the file.

Not really feasible without a major overhaul.  It might be mildly useful
in one rare case.  Occasionally I'll find a single very hot table that
vacuum is constantly processing, one that lives mostly in RAM because
the server has a lot of memory.  You can set vacuum_cost_page_hit=0 to
get vacuum to chug through such a table as fast as possible.
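
As a rough sketch of that setting in action, assuming a hypothetical
table named hot_table: note that manual VACUUM is unthrottled by
default (vacuum_cost_delay = 0), so zeroing the page-hit cost mainly
matters when a cost delay is in effect, as it is for autovacuum
workers.

    -- Minimal sketch; "hot_table" is a hypothetical table name.
    -- With the page-hit cost zeroed, buffers already in shared_buffers
    -- add nothing to the cost accounting, so any throttling (when
    -- vacuum_cost_delay > 0) reflects only real page reads and writes.
    SET vacuum_cost_page_hit = 0;
    VACUUM (VERBOSE) hot_table;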

However, how fast that happens will often be limited by how quickly a
single core can read from memory, for the parts of the table sitting in
shared_buffers.  That is limited by the transfer speed of a single NUMA
memory bank.  Which bank you get varies with the core that owns that
part of shared_buffers' memory, but it's only one bank at a time.

On large servers, that can be only a small fraction of the total memory
bandwidth the server can reach.  I've attached a graph showing how this
works on a system with many NUMA banks of RAM, and that's only a
medium-sized system.  The server can hit 40GB/s of memory transfers in
total, but no single process will ever see more than 8GB/s.

If we had more vacuum processes running against the same table, they
would more often be doing work against different NUMA memory banks at
the same time, making faster progress through the shared_buffers hits
possible.  In the real world, this situation is rare enough compared to
disk-bound vacuum work that I doubt it's worth getting excited over.
Systems with lots of RAM, where performance is regularly dominated by
one big ugly table, are common though, so I wouldn't rule the idea out
as useless either.

--
Greg Smith greg.smith@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/
