Re: jsonb format is pessimal for toast compression - Mailing list pgsql-hackers

From Josh Berkus
Subject Re: jsonb format is pessimal for toast compression
Date
Msg-id 5424EE33.5010009@agliodbs.com
Whole thread Raw
In response to Re: jsonb format is pessimal for toast compression  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
On 09/25/2014 08:10 PM, Tom Lane wrote:
> I wrote:
>> The "offsets-and-lengths" patch seems like the approach we ought to
>> compare to my patch, but it looks pretty unfinished to me: AFAICS it
>> includes logic to understand offsets sprinkled into a mostly-lengths
>> array, but no logic that would actually *store* any such offsets,
>> which means it's going to act just like my patch for performance
>> purposes.
> 
>> In the interests of pushing this forward, I will work today on
>> trying to finish and review Heikki's offsets-and-lengths patch
>> so that we have something we can do performance testing on.
>> I doubt that the performance testing will tell us anything we
>> don't expect, but we should do it anyway.
> 
> I've now done that, and attached is what I think would be a committable
> version.  Having done this work, I no longer think that this approach
> is significantly messier code-wise than the all-lengths version, and
> it does have the merit of not degrading on very large objects/arrays.
> So at the moment I'm leaning to this solution not the all-lengths one.
> 
> To get a sense of the compression effects of varying the stride distance,
> I repeated the compression measurements I'd done on 14 August with Pavel's
> geometry data (<24077.1408052877@sss.pgh.pa.us>).  The upshot of that was
> 
>                     min    max    avg
> 
> external text representation        220    172685    880.3
> JSON representation (compressed text)    224    78565    541.3
> pg_column_size, JSONB HEAD repr.    225    82540    639.0
> pg_column_size, all-lengths repr.    225    66794    531.1
> 
> Here's what I get with this patch and different stride distances:
> 
> JB_OFFSET_STRIDE = 8            225    68551    559.7
> JB_OFFSET_STRIDE = 16            225    67601    552.3
> JB_OFFSET_STRIDE = 32            225    67120    547.4
> JB_OFFSET_STRIDE = 64            225    66886    546.9
> JB_OFFSET_STRIDE = 128            225    66879    546.9
> JB_OFFSET_STRIDE = 256            225    66846    546.8
> 
> So at least for that test data, 32 seems like the sweet spot.
> We are giving up a couple percent of space in comparison to the
> all-lengths version, but this is probably an acceptable tradeoff
> for not degrading on very large arrays.
> 
> I've not done any speed testing.

I'll do some tommorrow.  I should have some different DBs to test on, too.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: jsonb format is pessimal for toast compression
Next
From: Alexander Korotkov
Date:
Subject: Re: KNN-GiST with recheck