Re: [HACKERS] Custom compression methods - Mailing list pgsql-hackers
From: Chris Travers
Subject: Re: [HACKERS] Custom compression methods
Msg-id: CAN-RpxBuBvFFp-Ynq6y6ehwo1VKt-VtHgM0NTmxhQCXUiaqdjQ@mail.gmail.com
In response to: Re: [HACKERS] Custom compression methods (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List: pgsql-hackers
On Tue, Mar 19, 2019 at 12:19 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On 3/19/19 10:59 AM, Chris Travers wrote:
>
>
> Not discussing whether any particular committer should pick this up but
> I want to discuss an important use case we have at Adjust for this sort
> of patch.
>
> PostgreSQL's built-in compression strategy is something we find
> inadequate for at least one of our large deployments (a debug log
> spanning 10PB+). Our current workaround is to set column storage so
> that PostgreSQL does not compress, and then run on ZFS to get
> compression speedups on spinning disks.
>
> But running PostgreSQL on ZFS has some annoying costs: we end up with
> copy-on-write on top of copy-on-write, plus file fragmentation. I
> would really like to get away from having ZFS as the underlying
> filesystem. While we have good write throughput, read throughput is
> not as good as I would like.
>
> An approach that gave us better row-level compression would allow us
> to ditch the COW-filesystem-under-PostgreSQL approach entirely.
>
> So I think the benefits are actually quite high, particularly for
> those dealing with volume/variety problems where things like JSONB
> are a go-to solution. Similarly, I could see systems that handle
> large amounts of specialized text using extensions built for that
> purpose.
>
Sure, I don't disagree - the proposed compression approach may be a big
win for some deployments further down the road, no doubt about it. But
as I said, it's unclear when we get there (or if the interesting stuff
will be in some sort of extension, which I don't oppose in principle).
I would assume that if particular extensions prove stable and useful, they could be moved into core.
But I would also assume that at first this area would be sufficiently experimental that folks (like us) would write our own extensions for it.
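For context, the current workaround described above (disabling PostgreSQL's built-in TOAST compression so that ZFS handles compression instead) amounts to roughly the following; the table and column names here are hypothetical:

```sql
-- Hypothetical sketch: store large values out of line without
-- PostgreSQL's built-in (pglz) TOAST compression, leaving compression
-- to the underlying filesystem (e.g. ZFS).
ALTER TABLE debug_log
    ALTER COLUMN payload SET STORAGE EXTERNAL;
```

EXTERNAL storage moves oversized values out of line but never compresses them, which is what makes the filesystem-level compression effective.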
>
> But hey, I think there are committers working for postgrespro, who might
> have the motivation to get this over the line. Of course, assuming that
> there are no serious objections to having this functionality or how it's
> implemented ... But I don't think that was the case.
>
>
> While I am not currently able to speak to questions of how it is
> implemented, I can say with very little doubt that we would almost
> certainly use this functionality if it were there, and I can see
> plenty of other projects for which this would be a very appropriate
> direction as well.
>
Well, I guess the best thing you can do to move this patch forward is to
actually try that on your real-world use case, and report your results
and possibly do a review of the patch.
Yeah, I expect to do this within the next month or two.
IIRC there was an extension [1] leveraging this custom compression
interface for better jsonb compression, so perhaps that would work for
you (not sure if it's up to date with the current patch, though).
[1]
https://www.postgresql.org/message-id/20171130182009.1b492eb2%40wp.localdomain
Yeah, I will be looking at a couple of different approaches here and reporting back. I don't expect it will be a full production workload, but I do expect to be able to report benchmarks for both storage and performance.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Best Regards,
Chris Travers
Head of Database
Saarbrücker Straße 37a, 10405 Berlin