Re: Zstandard support for toast compression - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: Zstandard support for toast compression
Date
Msg-id 20220520201742.GM9030@tamriel.snowman.net
Whole thread Raw
In response to Re: Zstandard support for toast compression  (Nikolay Shaplov <dhyan@nataraj.su>)
Responses Re: Zstandard support for toast compression
Re: Zstandard support for toast compression
List pgsql-hackers
Greetings,

* Nikolay Shaplov (dhyan@nataraj.su) wrote:
> В письме от вторник, 17 мая 2022 г. 23:01:07 MSK пользователь Tom Lane
> написал:
>
> Hi! I came to this branch looking for a patch to review, but I guess I would
> join the discussion instead of reading the code.

Seems that's what would be helpful now thanks for joining the
discussion.

> > >> Yeah - I think we had better reserve the fourth bit pattern for
> > >> something extensible e.g. another byte or several to specify the
> > >> actual method, so that we don't have a hard limit of 4 methods. But
> > >> even with such a system, the first 3 methods will always and forever
> > >> be privileged over all others, so we'd better not make the mistake of
> > >> adding something silly as our third algorithm.
> > >
> > > In such a situation, would they really end up being properly distinct
> > > when it comes to what our users see..?  I wouldn't really think so.
> >
> > It should be transparent to users, sure, but the point is that the
> > first three methods will have a storage space advantage over others.
> > Plus we'd have to do some actual work to create that extension mechanism.
>
> Postgres is well known for extensiblility. One can write your own
> implementation of almost everything and make it an extension.
> Though one would hardly need more than one (or two) additional compression
> methods, but which method one will really need is hard to say.

A thought I've had before is that it'd be nice to specify a particular
compression method on a data type basis.  Wasn't the direction that this
was taken, for reasons, but I wonder about perhaps still having a data
type compression method and perhaps one of these bits might be "the data
type's (default?) compression method".  Even so though, having an
extensible way to add new compression methods would be a good thing.

For compression methods that we already support in other parts of the
system, seems clear that we should allow those to be used for column
compression too.  We should certainly also support a way to specify on a
compression-type specific level what the compression level should be
though.

> So I guess it would be much better to create and API for creating and
> registering own compression method and create build in Zstd compression method
> that can be used (or optionally not used) via that API.

While I generally agree that we want to provide extensibility in this
area, given that we already have zstd as a compile-time option and it
exists in other parts of the system, I don't think it makes sense to
require users to install an extension to use it.

> Moreover I guess this API (may be with some modification) can be used for
> seamless data encryption, for example.

Perhaps.. but this kind of encryption wouldn't allow indexing and
certainly lots of other metadata would still be unencrypted (the entire
system catalog being a good example..).

Thanks,

Stephen

Attachment

pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: Inquiring about my GSoC Proposal.
Next
From: Tom Lane
Date:
Subject: Re: check for null value before looking up the hash function