Thread: High QPS, random index writes and vacuum
Hi all
The company I work for has a large (50+ instances, 2-4 TB each) Postgres install. One of the key problems we are facing in vanilla Postgres is vacuum behavior under high QPS (20K writes/s) with random index access on UUIDs.
In one case the table is 50GB and has 3 indexes which are also 50GB each. It takes 20 hours to vacuum the entire thing, where the bulk of the time is spent doing 'index vacuuming'. The table then immediately qualifies for vacuum again.
I increased work_mem to 2Gb, decreased the sleep threshold to 2ms, and increased the IO limit to 2000. I also changed the autovacuum thresholds for this table.
I understand that doing random index writes is not a good strategy, but 20 hours to vacuum 200GB is excessive.
My question is: what is the recommended strategy to deal with such cases in Postgres?
Thanks very much!!
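[Editorial note: the per-table autovacuum threshold changes mentioned above are usually done with storage parameters along these lines; the table name and numbers here are illustrative, not the poster's actual settings:]

```
-- Hypothetical per-table overrides; table name and values are made up.
ALTER TABLE big_uuid_table SET (
    autovacuum_vacuum_scale_factor = 0.01,  -- trigger at ~1% dead rows
    autovacuum_vacuum_cost_delay   = 2,     -- ms sleep per cost round
    autovacuum_vacuum_cost_limit   = 2000   -- IO budget per round
);
```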
On Mon, Apr 17, 2023 at 5:35 PM peter plachta <pplachta@gmail.com> wrote:
> My question is: what is the recommended strategy to deal with such cases in Postgres?

You didn't say what version of Postgres you're using...

--
Peter Geoghegan
peter plachta <pplachta@gmail.com> writes:
> The company I work for has a large (50+ instances, 2-4 TB each) Postgres
> install. One of the key problems we are facing in vanilla Postgres is
> vacuum behavior on high QPS (20K writes/s), random index access on UUIDs.

Indexing on a UUID column is an antipattern, because you're pretty much
guaranteed the worst-case random access patterns for both lookups and
insert/delete/maintenance cases. Can you switch to timestamps or
the like?

There are proposals out there for more database-friendly ways of
generating UUIDs than the traditional ones, but nobody's gotten
around to implementing that in Postgres AFAIK.

			regards, tom lane
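[Editorial note: a minimal sketch of the kind of "database-friendly" UUID Tom mentions -- a time-ordered, UUIDv7-style value generated client-side. The bit layout follows the draft UUIDv7 spec and is an assumption; Postgres itself does not provide this on the versions discussed here:]

```python
import os
import time
import uuid

def uuid7_like() -> uuid.UUID:
    """Time-ordered UUID sketch: 48-bit Unix-ms timestamp, then random bits.

    Newer values sort after older ones, so b-tree inserts land on the
    rightmost leaf pages instead of random pages across the whole index.
    """
    ms = int(time.time() * 1000) & ((1 << 48) - 1)
    b = bytearray(ms.to_bytes(6, "big") + os.urandom(10))
    b[6] = (b[6] & 0x0F) | 0x70  # version 7 nibble
    b[8] = (b[8] & 0x3F) | 0x80  # RFC 4122 variant bits
    return uuid.UUID(bytes=bytes(b))
```

Because new keys are monotonically increasing at millisecond granularity, index maintenance (and the vacuum that follows it) touches a small, hot set of leaf pages rather than scattering writes across a 50GB index.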
On Tue, 18 Apr 2023 at 12:35, peter plachta <pplachta@gmail.com> wrote:
> I increased work_mem to 2Gb

maintenance_work_mem is the configuration option that vacuum uses to
control how much memory it'll make available for storage of dead
tuples. I believe 1GB would allow 178,956,970 tuples to be stored
before multiple passes would be required. The chunk of memory for dead
tuple storage is capped at 1GB.

David
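[Editorial note: David's figure follows from vacuum on these versions storing one 6-byte ItemPointerData (block number + offset) per dead tuple, capped at 1GB:]

```python
# Back-of-the-envelope check of the dead-tuple capacity of
# maintenance_work_mem = 1GB during vacuum.
bytes_per_dead_tid = 6        # one ItemPointerData per dead tuple
one_gb = 1024 ** 3            # cap on the dead-TID array

max_dead_tuples = one_gb // bytes_per_dead_tid
print(max_dead_tuples)  # 178956970, matching David's figure
```

If a vacuum finds more dead tuples than that, it must pause heap scanning and make an extra pass over every index, which is exactly the expensive part here.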
Thank you Tom.
Version: I sheepishly admit it's 9.6, 10 and 11 (it's Azure Single Server, that's another story).
I am definitely looking at redoing the way we do UUIDs... but that's not a trivial change given the volume of data we have plus a 24/7 workload.
I was trying to understand whether there are any known workarounds for random access + index vacuums. Are my vacuum times 'normal'?
On Mon, Apr 17, 2023 at 7:01 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
peter plachta <pplachta@gmail.com> writes:
> The company I work for has a large (50+ instances, 2-4 TB each) Postgres
> install. One of the key problems we are facing in vanilla Postgres is
> vacuum behavior on high QPS (20K writes/s), random index access on UUIDs.
Indexing on a UUID column is an antipattern, because you're pretty much
guaranteed the worst-case random access patterns for both lookups and
insert/delete/maintenance cases. Can you switch to timestamps or
the like?
There are proposals out there for more database-friendly ways of
generating UUIDs than the traditional ones, but nobody's gotten
around to implementing that in Postgres AFAIK.
regards, tom lane
Thank you David -- I increased this to 1GB as well (seeing as that was the max). We are doing mostly single passes now.
On Mon, Apr 17, 2023 at 7:40 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Tue, 18 Apr 2023 at 12:35, peter plachta <pplachta@gmail.com> wrote:
> I increased work_mem to 2Gb
maintenance_work_mem is the configuration option that vacuum uses to
control how much memory it'll make available for storage of dead
tuples. I believe 1GB would allow 178,956,970 tuples to be stored
before multiple passes would be required. The chunk of memory for dead
tuple storage is capped at 1GB.
David
On Tue, Apr 18, 2023 at 2:43 PM peter plachta <pplachta@gmail.com> wrote:
> I was trying to understand whether there are any known workarounds for random access + index vacuums. Are my vacuum times 'normal'?

Ah, it's not going to help on the old versions you mentioned, but for
what it's worth: I remember noticing that I could speed up vacuum of
uncorrelated indexes using parallel vacuum (v13), huge_pages=on,
maintenance_work_mem=BIG, min_dynamic_shared_memory=BIG (v14), because
then the memory that is binary-searched in random order avoids
thrashing the TLB.
On Mon, Apr 17, 2023 at 7:43 PM peter plachta <pplachta@gmail.com> wrote:
> Version: I sheepishly admit it's 9.6, 10 and 11 (it's Azure Single Server, that's another story).

If you can upgrade to 14, you'll find that there is much improved
management of index updates on that version:

https://www.postgresql.org/docs/current/btree-implementation.html#BTREE-DELETION

But it's not clear what the problem really is here. If the problem is
that you're dependent on vacuum to get acceptable response times by
holding back index bloat, then an upgrade could easily help a lot. But
an upgrade might not make VACUUM take less time, given that you've
already tuned it fairly aggressively. It depends.

An upgrade might make VACUUM go faster if you set
vacuum_cost_page_miss to 2, which is the default on later versions
anyway -- looks like you didn't touch that. And, as Thomas said, later
versions do have parallel VACUUM, though that cannot be used by
autovacuum workers.

--
Peter Geoghegan