Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. - Mailing list pgsql-hackers

From Tels
Subject Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Date
Msg-id 48b7809d50c27cd65cc949bf4c01ff1b@bloodgate.com
In response to Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
Moin,

On 2019-11-16 01:04, Peter Geoghegan wrote:
> On Fri, Nov 15, 2019 at 5:16 AM Robert Haas <robertmhaas@gmail.com> 
> wrote:
>> Hmm. Well, maybe I'm just behind the times. But that same wikipedia
>> article also says that deduplication works on large chunks "such as
>> entire files or large sections of files" thus differentiating it from
>> compression algorithms which work on the byte level, so it seems to me
>> that what you are doing still sounds more like ad-hoc compression.
> 
> I see your point.
> 
> One reason for my avoiding the word "compression" is that other DB
> systems that have something similar don't use the word compression
> either. Actually, they don't really call it *anything*. Posting lists
> are simply the way that secondary indexes work. The "Modern B-Tree
> techniques" book/survey paper mentions the idea of using a TID list in
> its "3.7 Duplicate Key Values" section, not in the two related
> sections that follow ("Bitmap Indexes", and "Data Compression").
> 
> That doesn't seem like a very good argument, now that I've typed it
> out. The patch applies deduplication/compression/whatever at the point
> where we'd otherwise have to split the page, unlike GIN. GIN eagerly
> maintains posting lists (doing in-place updates for most insertions
> seems pretty bad to me). My argument could reasonably be made about
> GIN, which really does consider posting lists the natural way to store
> duplicate tuples. I cannot really make that argument about nbtree with
> this patch, though -- delaying a page split by re-encoding tuples
> (changing their physical representation without changing their logical
> contents) justifies using the word "compression" in the name.
> 
>> > Can you suggest an alternative?
>> 
>> My instinct is to pick a name that somehow involves compression and
>> just put enough other words in there to make it clear e.g. duplicate
>> value compression, or something of that sort.
> 
> Does anyone else want to weigh in on this? Anastasia?
> 
> I will go along with whatever the consensus is. I'm very close to the
> problem we're trying to solve, which probably isn't helping me here.

I'm in favor of "deduplication" rather than "compression". Compression is 
the more generic term: it can involve deduplication, but it doesn't have 
to (it could, for instance, just encode things in a more compact form). 
Deduplication, on the other hand, does not involve compression; it simply 
means storing each distinct thing only once, which happens to save space 
much like compression does.
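
To make the distinction concrete, here is a toy C sketch (a hypothetical 
example, not the patch's code): deduplication merges identical keys into 
one entry with a list of references, whereas compression would re-encode 
the same bytes more compactly without merging anything.

#include <stdio.h>

/* Deduplicate a sorted run of (key, tid) index entries into one
 * entry per distinct key, holding a list of references. */
typedef struct { int key; int tid; } Entry;

int main(void)
{
    Entry entries[] = { {7, 101}, {7, 102}, {7, 103}, {9, 104} };
    int   n = sizeof(entries) / sizeof(entries[0]);

    for (int i = 0; i < n; )
    {
        int j = i;
        printf("key=%d tids=[", entries[i].key);
        while (j < n && entries[j].key == entries[i].key)
        {
            printf(j == i ? "%d" : ",%d", entries[j].tid);
            j++;
        }
        printf("]\n");          /* each distinct key stored once */
        i = j;
    }
    return 0;
}

/* Prints:
 *   key=7 tids=[101,102,103]
 *   key=9 tids=[104]
 */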

ZFS follows this distinction, too, by offering both deduplication (store 
identical blocks only once, with references) and compression (compress 
block contents, regardless of whether they are stored once or many 
times).

So my vote is for deduplication (if I understand the thread correctly, 
that is what the code now does: instead of storing the exact same key 
many times, it stores it only once, with references or a count?).
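
Conceptually something like this sketch, perhaps (my reading of the 
thread only; the names are made up and this is not the patch's actual 
on-disk format):

#include <stdint.h>

/* The key appears once, followed by a sorted array of heap TIDs
 * (a "posting list"). */
typedef struct HeapTid          /* simplified stand-in for ItemPointerData */
{
    uint32_t block;             /* heap block number */
    uint16_t offset;            /* line pointer within that block */
} HeapTid;

typedef struct PostingTuple
{
    int64_t  key;               /* placeholder; real keys are datums */
    uint16_t ntids;             /* number of heap tuples with this key */
    HeapTid  tids[];            /* flexible array member: the posting list */
} PostingTuple;

/* Space for N duplicates drops from roughly N * (key + TID) to
 * key + N * TID, which is what allows a page split to be delayed. */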

best regards,

Tels



