Re: Adding REPACK [concurrently] - Mailing list pgsql-hackers

From David Klika
Subject Re: Adding REPACK [concurrently]
Msg-id 84a6d065-1dc3-4b37-af7b-75904d967ab4@atlas.cz
In response to Adding REPACK [concurrently]  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers
Hello

Great to hear about this feature.

You speak about a table rewrite (I suppose a whole-table rewrite). I would
like to share an idea for an alternative approach that also takes into
account the amount of WAL generated during the operation. It is applicable
to the non-clustered case only.

Let's consider a large table where 80% of the blocks are fine (filled enough
with live tuples). The table could be scanned from the beginning (left side)
to identify "not filled enough" blocks and also from the end (right side)
to move live tuples into the blocks identified by the left-side scan. The
work is done when both scans reach the same position.

Example:

_ stands for filled enough blocks

D stands for blocks with (many) dead tuples

123456789
___DD____

The left scan identifies page #4, and tuples from the right scan (page #9)
are moved there. The same happens with tuples from #8 to #5. Two pages are
trimmed from the end of the data file and only pages #4 and #5 are written
to WAL; the others are untouched.
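
To make the idea concrete, here is a minimal toy sketch of the two-ended scan
in Python. It is only an illustration under stated assumptions, not a proposal
for the actual implementation: a page is modelled as a plain list of its live
tuples, and the names compact_two_ended, capacity, min_fill and example_table
are hypothetical. It ignores indexes, tuple visibility and concurrent access.
In the example data, pages #4 and #5 are assumed to hold no live tuples at all.

```python
from typing import List, Tuple

Page = List[str]  # a page is modelled as the list of its live tuples


def compact_two_ended(pages: List[Page], capacity: int,
                      min_fill: int) -> Tuple[List[Page], List[int], int]:
    """Toy model of the two-ended scan (hypothetical helper).

    Returns (pages after truncation, 0-based indexes of surviving pages
    that were modified, number of pages trimmed from the end).
    """
    if not 0 < min_fill <= capacity:
        raise ValueError("need 0 < min_fill <= capacity")
    pages = [list(p) for p in pages]      # work on a copy
    touched = set()
    left, right = 0, len(pages) - 1
    while left < right:
        if len(pages[left]) >= min_fill:
            left += 1                     # filled enough: leave untouched
        elif not pages[right]:
            right -= 1                    # no live tuples to take from here
        else:
            # Move live tuples from the right page into the free space of
            # the left page until one of the two is exhausted.
            while pages[right] and len(pages[left]) < capacity:
                pages[left].append(pages[right].pop())
            touched.update((left, right))
    # Trailing empty pages can be cut off the end of the file.
    original_len = len(pages)
    while pages and not pages[-1]:
        pages.pop()
    dirtied = sorted(i for i in touched if i < len(pages))
    return pages, dirtied, original_len - len(pages)


def example_table() -> List[Page]:
    # Pages #1..#9 from the example above, 4 tuples per full page.
    # Assumption: #4 and #5 contain only dead tuples.
    def full(n: int) -> Page:
        return [f"p{n}t{i}" for i in range(4)]
    return [full(1), full(2), full(3), [], [],
            full(6), full(7), full(8), full(9)]


new_pages, dirtied, trimmed = compact_two_ended(
    example_table(), capacity=4, min_fill=3)
print([i + 1 for i in dirtied], trimmed)  # [4, 5] 2
```

In this model only the surviving pages that received or lost tuples count as
modified. If a source page is only partially emptied, it stays in the file and
is reported as modified too, so the WAL saving depends on how cleanly the
right-side pages fit into the free space on the left.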

Regards
David



