Re: Vacuum: allow usage of more than 1GB of work mem - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: Vacuum: allow usage of more than 1GB of work mem
Date
Msg-id CABOikdPmjbNu9t4B12a+Eu5QW+uZD_2fDmkKesaRHQykfvUgoQ@mail.gmail.com
In response to Re: Vacuum: allow usage of more than 1GB of work mem  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: Vacuum: allow usage of more than 1GB of work mem  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers


On Wed, Sep 14, 2016 at 10:53 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:


One thing not quite clear to me is how we create the bitmap
representation starting from the array representation in mid-flight
without transiently using twice as much memory.  Are we going to write
the array to a temp file, free the array memory, then fill the bitmap by
reading the array back from disk?

We could do that. Or maybe we compress the TID array once half of maintenance_work_mem is consumed, and repeat that with the remaining memory. For example, if we start with 1GB of memory, we decide to compress at 512MB. Say that results in a 300MB bitmap; we then continue to accumulate TIDs and do another round of fold-up when another 350MB is consumed.
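The repeated fold-up could look roughly like the sketch below. This is hypothetical illustration code, not PostgreSQL internals: `DeadTid`, `DeadBitmap`, and the 256-offsets-per-page cap are all simplifying assumptions (a real heap page allows somewhat more line pointers). The key point is that once the array is folded into the bitmap, its memory can be reused for the next round of accumulation.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sketch, not PostgreSQL code.  Offsets per heap page are
 * capped at 256 here purely for simplicity. */
#define MAX_OFFSETS 256

typedef struct { uint32_t block; uint16_t offset; } DeadTid;

/* One bit per possible (block, offset) pair: MAX_OFFSETS/8 bytes per block. */
typedef struct {
    uint32_t nblocks;
    uint8_t *bits;
} DeadBitmap;

static DeadBitmap *bitmap_create(uint32_t nblocks) {
    DeadBitmap *bm = malloc(sizeof(DeadBitmap));
    bm->nblocks = nblocks;
    bm->bits = calloc(nblocks, MAX_OFFSETS / 8);
    return bm;
}

static void bitmap_set(DeadBitmap *bm, uint32_t block, uint16_t off) {
    bm->bits[(size_t) block * (MAX_OFFSETS / 8) + off / 8] |= (uint8_t) (1 << (off % 8));
}

static int bitmap_test(const DeadBitmap *bm, uint32_t block, uint16_t off) {
    return (bm->bits[(size_t) block * (MAX_OFFSETS / 8) + off / 8] >> (off % 8)) & 1;
}

/* Fold the accumulated TID array into the bitmap and reset the array's
 * element count, so its memory can be reused for the next round. */
static void fold_up(DeadBitmap *bm, const DeadTid *tids, size_t *ntids) {
    for (size_t i = 0; i < *ntids; i++)
        bitmap_set(bm, tids[i].block, tids[i].offset);
    *ntids = 0;
}
```

A caller would invoke `fold_up` whenever the array reaches the chosen fraction of the memory budget, then keep appending TIDs to the now-empty array.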

I think we should maintain a per-offset count of dead tuples, to choose the optimal bitmap size (TIDs beyond it go to an overflow region). We can also track how many blocks or block ranges have at least one dead tuple, to know whether some sort of indirection is worthwhile. Together, these statistics tell us how much compression is achievable and let us choose the most suitable representation.
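The statistics above can be sketched as follows. Again, this is an illustrative assumption, not an actual patch: `DeadStats`, the 6-bytes-per-TID figure, and the block-ordered-input assumption are all simplifications. The idea is just that cheap running counts are enough to compare the estimated footprint of a plain TID array against a bitmap sized to the highest offset actually used, over only the blocks that contain dead tuples.

```c
#include <stdint.h>
#include <stddef.h>

#define MAX_OFFSETS 256     /* simplified cap on line pointers per page */

/* Hypothetical running statistics gathered while dead TIDs accumulate. */
typedef struct {
    uint64_t ntids;                      /* total dead tuples seen */
    uint64_t nblocks_with_dead;          /* blocks with >= 1 dead tuple */
    uint64_t offset_counts[MAX_OFFSETS]; /* dead tuples per offset number */
    uint32_t last_block;
    int      have_block;
} DeadStats;

static void stats_add(DeadStats *s, uint32_t block, uint16_t off) {
    /* Assumes TIDs arrive in block order, as a heap scan produces them. */
    if (!s->have_block || s->last_block != block) {
        s->nblocks_with_dead++;
        s->last_block = block;
        s->have_block = 1;
    }
    s->ntids++;
    s->offset_counts[off]++;
}

/* Highest offset actually used: a bitmap only needs this many bits/page. */
static uint16_t stats_max_offset(const DeadStats *s) {
    for (int off = MAX_OFFSETS - 1; off >= 0; off--)
        if (s->offset_counts[off] > 0)
            return (uint16_t) off;
    return 0;
}

/* Compare estimated sizes: 6 bytes per array TID versus a bitmap sized
 * to the highest used offset, over only the blocks with dead tuples. */
static int bitmap_is_smaller(const DeadStats *s) {
    uint64_t array_bytes  = s->ntids * 6;
    uint64_t bitmap_bytes = s->nblocks_with_dead
                            * (((uint64_t) stats_max_offset(s) + 8) / 8);
    return bitmap_bytes < array_bytes;
}
```

With many dead tuples clustered on few pages the bitmap wins easily; with one dead tuple per block at a high offset, the array stays cheaper, which is why tracking both counts matters.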

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
