Re: Remove size limitations of vacuums dead_tuples array - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Remove size limitations of vacuums dead_tuples array
Date
Msg-id 20191010140508.zb22rihya6nmchwd@development
Whole thread Raw
In response to Remove size limitations of vacuums dead_tuples array  (Ants Aasma <ants@cybertec.at>)
Responses Re: Remove size limitations of vacuums dead_tuples array
List pgsql-hackers
On Wed, Oct 09, 2019 at 03:58:11PM +0300, Ants Aasma wrote:
>When dealing with a case where a 2TB table had 3 billion dead tuples I
>discovered that vacuum currently can't make use of more than 1GB of
>maintenance_work_mem - 179M tuples. This caused excessive amounts of index
>scanning even though there was plenty of memory available.
>
>I didn't see any good reason for having this limit, so here is a patch that
>makes use of MemoryContextAllocHuge, and converts the array indexing to use
>size_t to lift a second limit at 12GB.
>
>One potential problem with allowing larger arrays is that bsearch might no
>longer be the best way of determining if a ctid was marked dead. It might
>pay off to convert the dead tuples array to a hash table to avoid O(n log
>n) runtime when scanning indexes. I haven't done any profiling yet to see
>how big of a problem this is.
>
>Second issue I noticed is that the dead_tuples array is always allocated
>max allowed size, unless the table can't possibly have that many tuples. It
>may make sense to allocate it based on estimated number of dead tuples and
>resize if needed.
>

There already was a attempt to make this improvement, see [1]. There was
a fairly long discussion about how to best do that (using other data
structure, not just a simple array). It kinda died about a year ago, but
I suppose there's a lot of relevant info in that thread.

[1] https://www.postgresql.org/message-id/CAGTBQpbDCaR6vv9%3DscXzuT8fSbckf%3Da3NgZdWFWZbdVugVht6Q%40mail.gmail.com


-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



pgsql-hackers by date:

Previous
From: Ildar Musin
Date:
Subject: Compressed pluggable storage experiments
Next
From: Stephen Frost
Date:
Subject: Re: Transparent Data Encryption (TDE) and encrypted files