Re: BUG #11109: No Toast compression on JSON, JSONB columns - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #11109: No Toast compression on JSON, JSONB columns
Date
Msg-id 21178.1407447735@sss.pgh.pa.us
Whole thread Raw
In response to BUG #11109: No Toast compression on JSON, JSONB columns  (ljw1001@gmail.com)
List pgsql-bugs
ljw1001@gmail.com writes:
> if PG can, after compression, put an entire JSON or JSONB 'document' into
> one row/page in the toast table it does. However, if the document is too big
> to fit in one row after compression, it does no compression at all.

I can't reproduce this.

It does seem that JSONB is noticeably less compressible than the
equivalent plain-text representation, which is disappointing,
but I don't see it failing to compress at all.

I experimented with this SQL script:


create table bigjson(f1 jsonb);

-- alter table bigjson alter column f1 set storage external;

insert into bigjson values(
'{"junk":["124245etweetwet345gwtretwt43 qwrqwq qwre qwrsdflkas",
"q4535 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrssdfsd",
"124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas",
... 50000 repetitions of above two lines ...
"q4535wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrssdfsd",
"124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas",
"foo"]}'
);

select pg_size_pretty(pg_table_size('bigjson'));

drop table bigjson;


I get about 568kB table size, versus 5640kB with the "external" option
uncommented, so it is compressing about 10-to-1.  If I change the column
type to text (or json) then I get a table size of 120kB, which is better.
But on the other hand we should not put a lot of stock in the compression
rate achievable on such bogus data, so I'd not panic until I saw numbers
for more realistic test data.

What might be happening for you is that your actual test case triggers the
"first_success_by" filter in pglz_compress() because the first kilobyte of
input data isn't compressible.  But since you didn't supply a fully
reproducible test case, that's just speculation.

> This is not the behavior observed with Text compression in Toast.  For very
> large files Text compression works correctly.  I got these results running
> similar tests on larger json files:

> Column Type  - Storage       - TOAST table size
> JSONB           - EXTERNAL -  2448 MB
> JSONB           - EXTENDED - 2448 MB
> JSON             - EXTENDED - 2504 MB
> TEXT             - EXTERNAL  - 2409 MB
> TEXT             - EXTENDED -      40 MB

json and text should (and do, for me) store exactly the same.  So there's
something fishy going on here.  Maybe you've changed the typstorage
values for json and jsonb in pg_type?

            regards, tom lane

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: Re: BUG #11103: to_json() does not convert correctly DOMAINs type since 9.3.5 (int expected instead of string) ?
Next
From: Tom Lane
Date:
Subject: Re: BUG #11120: Decrease in no. of rows while sorting