On Thu, Aug 14, 2014 at 3:49 PM, Larry White <ljw1001@gmail.com> wrote:
> I attached a json file of approximately 513K. It contains two repetitions of
> a single json structure. The values are quasi-random. It might make a decent
> test case of meaningfully sized data.
I have a collection of real-world JSON data: 59M in plain SQL (10M
compressed, 51M on-disk table size).
The data is mostly counters and ancillary info, stored as json mainly
for flexibility, since it's otherwise quite structured: values share a
lot with each other across rows (in key names), but there's not much
redundancy within a single row.
Value length stats (in text format):
min: 14
avg: 427
max: 23239
If anyone's interested, contact me personally (I gotta anonymize the
info a bit first, since it's production info, and it's too big to
attach on the ML).