Re: jsonb format is pessimal for toast compression - Mailing list pgsql-hackers

From Arthur Silva
Subject Re: jsonb format is pessimal for toast compression
Date
Msg-id CAO_YK0U1FxvhhLQGCk_WzueBduzCQU_UfzngKDNZ3k6_Wue1fQ@mail.gmail.com
In response to Re: jsonb format is pessimal for toast compression  (Arthur Silva <arthurprs@gmail.com>)
List pgsql-hackers
I couldn't get my hands on the Twitter data, so I'm generating my own. The JSON template is http://paste2.org/wJ1dfcjw and the data was generated with http://www.json-generator.com/. It has 35 top-level keys, in case anyone is wondering.
I generated 10000 random objects and inserted them repeatedly until I had 320k rows.
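For anyone wanting to reproduce the data set, a minimal Python sketch of the "insert repeatedly" step could look like the following. The placeholder objects, the helper name `repeat_pool`, and the exact target of 320,000 rows are assumptions made for illustration; the real objects came from the JSON template linked above.

```python
from itertools import cycle, islice

def repeat_pool(pool, target_rows):
    """Return an iterator over pool, in order and wrapping around,
    that stops after target_rows items."""
    return islice(cycle(pool), target_rows)

# Placeholder objects standing in for the 10000 generated documents.
pool = [{"id": i} for i in range(10_000)]

# Assumption: "320k" means exactly 320,000 rows, i.e. 32 passes over the pool.
rows = list(repeat_pool(pool, 320_000))
print(len(rows))
```

Each row would then be serialized and inserted into the jsonb column of the test table.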

Test query: SELECT data->>'name', data->>'email' FROM t_json
Test storage: EXTERNAL
Test jsonb lengths quartiles: {1278,1587,1731,1871,2231}
Tom's lengths+cache aware: 455ms
HEAD: 440ms

This is a realistic-ish workload in my opinion and Tom's patch performs within 4% of HEAD.
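The "within 4%" figure follows directly from the two timings above; a quick check in Python, using only the numbers quoted in this message:

```python
# Timings quoted above, in milliseconds.
head_ms = 440
patched_ms = 455

# Relative slowdown of the patched build compared with HEAD.
slowdown = (patched_ms - head_ms) / head_ms
print(f"slowdown: {slowdown:.1%}")  # prints "slowdown: 3.4%"
```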

Due to the overall lengths I couldn't really test compressibility, so I re-ran the test. This time I inserted an array of 2 objects in each row, as in: [obj, obj].
The objects were taken in sequence from the 10000 pool, so the contents match in both tests.
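A minimal sketch of how such two-object rows might be built from the pool. The pairing scheme (objects 0 and 1, then 2 and 3, and so on), the helper name `pair_rows`, and the stand-in objects are assumptions; the message only says the objects were taken in sequence.

```python
import json

def pair_rows(pool):
    """Yield JSON arrays [obj, obj] built from consecutive pool entries."""
    # Assumption: objects are paired as (0, 1), (2, 3), ...; a trailing
    # unpaired object is dropped.
    for i in range(0, len(pool) - 1, 2):
        yield json.dumps([pool[i], pool[i + 1]])

# Hypothetical stand-in objects carrying only the two keys the test query reads.
pool = [{"name": f"user{i}", "email": f"user{i}@example.com"} for i in range(10)]
rows = list(pair_rows(pool))

# Rough Python analogue of data #> '{1, name}' for the first row.
first = json.loads(rows[0])
print(first[1]["name"])  # prints "user1"
```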

Test query: SELECT data #> '{0, name}', data #> '{0, email}', data #> '{1, name}', data #> '{1, email}' FROM t_json
Test storage: EXTENDED
HEAD: 17 MB table + 878 MB TOAST
HEAD size quartiles: {2015,2500,2591,2711,3483}
HEAD query runtime: 15s
Tom's: 220 MB table + 580 MB TOAST
Tom's size quartiles: {1665,1984,2061,2142.25,2384}
Tom's query runtime: 13s

This is an intriguing edge case in which Tom's patch actually outperforms the base implementation for 3~4kb jsons.
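To put the second test's numbers side by side, here is a small Python calculation of total on-disk size and runtime, using only the figures reported above. The dictionary layout and the helper name `total_mb` are just for illustration.

```python
# Figures reported above for the EXTENDED storage test.
head = {"table_mb": 17, "toast_mb": 878, "runtime_s": 15}
patched = {"table_mb": 220, "toast_mb": 580, "runtime_s": 13}

def total_mb(result):
    """Table plus TOAST size in MB."""
    return result["table_mb"] + result["toast_mb"]

head_total = total_mb(head)        # 895
patched_total = total_mb(patched)  # 800

size_saving = 1 - patched_total / head_total
runtime_saving = 1 - patched["runtime_s"] / head["runtime_s"]

print(f"total size: {head_total} MB -> {patched_total} MB ({size_saving:.1%} smaller)")
print(f"runtime: {head['runtime_s']}s -> {patched['runtime_s']}s ({runtime_saving:.1%} faster)")
```

So the patched build is roughly 10.6% smaller on disk overall and about 13.3% faster on this query, even though more of its data sits in the main table.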
