Re: [PATCH] Compression dictionaries for JSONB - Mailing list pgsql-hackers

From Aleksander Alekseev
Subject Re: [PATCH] Compression dictionaries for JSONB
Date
Msg-id CAJ7c6TO8XuQTGmH8o8h5_vQrHjnHHiM5+wTgG6ZACm3Wo3YP9A@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Compression dictionaries for JSONB  (Andres Freund <andres@anarazel.de>)
Responses Re: [PATCH] Compression dictionaries for JSONB
List pgsql-hackers
Hi Andres,

> As I said, I don't think we should extend dictionaries. For this to work we'll
> likely need a new / extended compressed toast datum header of some form, with
> a reference to the dictionary. That'd likely be needed even with updatable
> dictionaries, as we IIRC don't know which column a toasted datum is for, but
> we need to know, to identify the dictionary. As we need that field anyway, we
> can easily have multiple dictionaries.

So I summarized the requirements we agreed on so far and ended up with
the following list:

* This is going to be a PostgreSQL feature, not an extension, not a
bunch of hooks, etc;
* We are not going to support lazy/partial decompression since this is
too complicated in a general case and Postgres is not a specialized
document-oriented DBMS (there is a normalization after all);
* This should be a relation-level optimization option, not something
visible to every user of the table (not a custom type, etc);
* This is going to be an attribute-level compression;
* The dictionaries should be created automatically (maybe not in a PoC
but in the final implementation) since people are not good at it;
* We are going to be using the existing compression algorithms like
LZ4/ZSTD, not to invent new ones;
* When created, a dictionary version is immutable, i.e. no new entries
can be added. New version of a dictionary can be created when the data
evolves. The compressed data stores the dictionary version used for
compression. A dictionary version can't be deleted while data exists
that uses this version of a dictionary;
* Dictionaries are created automatically from sampled data during
ANALIZE. We compare the efficiency of a new dictionary vs the
efficiency of the old one (or the lack of such) on sampled data and
depending on the results decide whether it's worth creating a new
version of a dictionary;
* This is going to work for multiple types: TEXT, JSON, JSONB, XML,
BYTEA etc. Ideally for user-defined types too;

Hopefully I didn't miss anything.

While thinking about how a user interface could look like it occured
to me that what we are discussing could be merely a new STORAGE
strategy. Currently there are PLAIN, MAIN, EXTERNAL and EXTENDED.
Let's call a new strategy DICTIONARY, with typstorage = d.

When user wants a given attribute to be compressed, he/she says:

ALTER TABLE foo ALTER COLUMN bar SET STORAGE DICTIONARY;

And the compression algorithms is chosen as usual:

ALTER TABLE foo ALTER COLUMN bar SET COMPRESSION lz4;

When there are no dictionaries yet, DICTIONARY works the same as
EXTENDED. When a dictionary is trained the data is compressed using
the latest version of this dictionary. For visibility we are going to
need some sort of pg_stat_dictionaries view that shows the existing
dictionaries, how much space they consume, etc.

If we choose this approach there are a couple of questions/notes that
come to mind:

* The default compression algorithm is PGLZ and unlike LZ4 it doesn't
support training dictionaries yet. This should be straightforward to
implement though, or alternatively shared dictionaries could work only
for LZ4;
* Currently users have control of toast_tuple_target but not
TOAST_TUPLE_THRESHOLD. Which means for tuples smaller than 1/4 of the
page size shared dictionaries are not going to be triggered. Which is
not necessarily a bad thing. Alternatively we could give the users
toast_tuple_threshold setting. This shouldn't necessarily be part of
this CF entry discussion however, we can always discuss it separately;
* Should we allow setting DICTIONARY storage strategy for a given
type, i.e. CREATE TYPE baz STORAGE = DICTIONARY? I suggest we forbid
it in the first implementation, just for the sake of simplicity.
* It looks like we won't be able to share a dictionary between
multiple columns. Which again is not necessarily a bad thing: data in
these columns can be completely different (e.g. BYTEA and XML),
columns can be dropped independently, etc. If a user is interested in
sharing a dictionary between several columns he/she can join these
columns in a single JSONB column.
* TOAST currently doesn't support ZSTD. IMO this is not a big deal and
adding the corresponding support can be discussed separately.
* If memory serves, there were not so many free bits left in TOAST
pointers. The pointers don't store a storage strategy though so
hopefully this will not be a problem. We'll see.

Please let me know what you think about all this. I'm going to prepare
an updated patch for the next CF so I could use early feedback.

-- 
Best regards,
Aleksander Alekseev



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Allowing parallel-safe initplans
Next
From: Greg Stark
Date:
Subject: Re: Temporary tables versus wraparound... again