Re: jsonb format is pessimal for toast compression - Mailing list pgsql-hackers

From: Robert Haas
Subject: Re: jsonb format is pessimal for toast compression
Msg-id: CA+TgmoZsnnGF9cjw=8u_6b9Of5mqnAK=WAsRmuA1d+LyNNE+Tg@mail.gmail.com
In response to: Re: jsonb format is pessimal for toast compression  (Peter Geoghegan <pg@heroku.com>)
List: pgsql-hackers
On Mon, Sep 15, 2014 at 7:44 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Mon, Sep 15, 2014 at 4:05 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> Actually, having the keys all at the same level *is* relevant for the
>> issue we're discussing.  If those 270 keys are organized in a tree, it's
>> not the same as having them all on one level (and not as problematic).
>
> I believe Robert meant that the 270 keys are not at the top level, but
> are at some level (in other words, some object has 270 pairs). That is
> equivalent to having them at the top level for the purposes of this
> discussion.

Yes, that's exactly what I meant.

> FWIW, I am slightly concerned about weighing use cases around very
> large JSON documents too heavily. Having enormous jsonb documents just
> isn't going to work out that well, but neither will equivalent designs
> in popular document database systems for similar reasons. For example,
> the maximum BSON document size supported by MongoDB is 16 megabytes,
> and that seems to be something that their users don't care too much
> about. Having 270 pairs in an object isn't unreasonable, but it isn't
> going to be all that common either.

The JSON documents in this case were not particularly large.  These
objects were < 100kB; they just had a lot of keys.   I'm a little
baffled by the apparent theme that people think that (object size) /
(# of keys) will tend to be large.  Maybe there will be some instances
where that's the case, but it's not what I'd expect.  I would expect
people to use JSON to serialize structured data in situations where
normalizing would be unwieldy.

For example, pick your favorite Facebook or Smartphone game - Plants
vs. Zombies, Farmville, Candy Crush Saga, whatever.  Or even a
traditional board game like chess.  Think about what the game state
looks like as an abstract object.  Almost without exception, you've
got some kind of game board with a bunch of squares and then you have
a bunch of pieces (plants, crops, candies, pawns) that are positioned
on those squares.  Now you want to store this in a database.  You're
certainly not going to have a table column per square, and EAV would
be stupid, so what's left?  You could use an array, but an array of
strings might not be descriptive enough; for a square in Farmville,
for example, you might need to know the type of crop, and whether it
was fertilized with special magic fertilizer, and when it's going to
be ready to harvest, and when it'll wither if not harvested.  So a
JSON document is a pretty natural structure: an array of arrays of
objects.  If
you have a 30x30 farm, you'll have 900 keys.  If you have a 50x50
farm, which probably means you're spending real money to buy imaginary
plants, you'll have 2500 keys.

(For the record, I have no actual knowledge of how any of these games
are implemented under the hood.  I'm just speculating on how I would
have done it.)
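
To make that concrete, here's a rough sketch of the kind of structure
I'm imagining (the table and key names are invented for illustration,
not taken from any real game):

CREATE TABLE game_state (
    player_id bigint PRIMARY KEY,
    farm      jsonb NOT NULL
);

-- A 2x2 farm: an array of rows, each row an array of per-square objects.
INSERT INTO game_state VALUES (1, '[
  [{"crop": "corn",     "fertilized": true,  "ready_at": "2014-09-16T12:00:00Z"},
   {"crop": "pumpkins", "fertilized": false, "ready_at": "2014-09-17T08:00:00Z"}],
  [{"crop": null,       "fertilized": false, "ready_at": null},
   {"crop": "soybeans", "fertilized": true,  "ready_at": "2014-09-16T20:00:00Z"}]
]');

-- Fetch one attribute of one square: row 1, column 0.
SELECT farm -> 1 -> 0 ->> 'crop' FROM game_state WHERE player_id = 1;

Scale that up to a 30x30 or 50x50 board and you get the hundreds or
thousands of small objects I'm talking about, without any individual
object being very wide.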

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


