Re: Compress prune/freeze records with Delta Frame of Reference algorithm - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Compress prune/freeze records with Delta Frame of Reference algorithm
Date
Msg-id b301de3c-0945-4401-995a-18d5939d0290@iki.fi
Whole thread
In response to Re: Compress prune/freeze records with Delta Frame of Reference algorithm  (Evgeny Voropaev <evgeny.voropaev@tantorlabs.com>)
List pgsql-hackers
On 21/04/2026 08:41, Evgeny Voropaev wrote:
> Hello hackers,
> 
>> Can this DFoR code replace integerset.c easily? Can we use it for
>> the vacuum dead TID list? For GIN posting lists? Where else?
> 
> Heikki, thank you for your attention and proposals. I'm learning areas
> you proposed to be developed. This took time, since I am not adept at
> them. Last week I also have been developing the DFoR patch to support
> unsorted sequences. That's why there was the delay in answering.
> 
> About GIN.
> Since GIN exploits TIDs sequences and saves it on the disk, it can be
> the most appropriate candidate to be developed with DFoR.

+1. And maybe the tid lists in to B-tree tuples too while we're at it.

For GIN posting lists, one important property of the current compression 
scheme is that removing TIDs never makes the list larger than the 
original. That's important for VACUUM, see 

https://github.com/postgres/postgres/blob/d3bba041543593eb5341683107d899734dc8e73e/src/backend/access/gin/ginpostinglist.c#L55

> About the dead TID list.
> If I'm not mistaken, the dead TID list exists only in RAM and never on
> the disk or in the network. So, what is the advantage supposed to be
> achieved due to using compression in the dead TID list?

Reduces memory usage. And if it's faster to lookup than the current data 
structure, that too. I don't know if that works out.

> About the GiST vacuuming and the use of integerset in it.
> The integerset implements a tree in addition to compression.
> DFoR now performs only compression. Moreover the size of a pack is
> flexible (varying), which must become an issue for its usage in the
> tree. It needs more thorough further elaboration to be developed.

Hmm. The integerset is a sparse list of integers, just like Frame of 
Reference. The tree inside it is just an implementation detail. I was 
thinking that you could replace the whole tree with DFoR, but I suppose 
you cannot do random lookups in a DFoR compressed list, so you'd still 
need the tree.

- Heikki




pgsql-hackers by date:

Previous
From: Chao Li
Date:
Subject: Re: logical: fix recomputation required LSN on restart_lsn-only advancement
Next
From: Chao Li
Date:
Subject: Re: Adding REPACK [concurrently]