Thread: JSONB performance enhancement for 9.6
Hello:
Is there a plan for 9.6 to resolve the issue of very slow query/retrieval of jsonb field values when there are a large number (maybe several thousands) of top level keys? Currently, if I save a large json document with thousands of top level keys and query/retrieve field values, the whole document has to be first decompressed and loaded into memory before searching for the specific field key/value.
Thanks in Advance
On 11/28/2015 6:27 PM, Tom Smith wrote:
> Is there a plan for 9.6 to resolve the issue of very slow query/retrieval
> of jsonb fields when there are a large number (maybe several thousands)
> of top level keys? [...]

If it was my data, I'd be decomposing that large JSON thing into multiple SQL records, and storing as much stuff as possible in named SQL fields, using JSON in the database only for things that are too ambiguous for SQL.

--
john r pierce, recycling bits in santa cruz
On Sat, 28 Nov 2015 21:27:51 -0500 Tom Smith <tomsmith1989sk@gmail.com> wrote:
> Is there a plan for 9.6 to resolve the issue of very slow query/retrieval
> of jsonb fields when there are a large number (maybe several thousands)
> of top level keys? [...]

I could be off-base here, but have you tried:

ALTER TABLE $table ALTER COLUMN $json_column SET STORAGE EXTERNAL;

? The default storage for a JSONB field is EXTENDED; switching it to EXTERNAL will disable compression. You'll have to insert your data over again, since this change doesn't alter any existing data, but see if that change improves performance.

--
Bill Moran
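[A minimal sketch of trying this end to end, on a hypothetical table "docs" with a jsonb column "doc" — the table and column names are assumptions:

    -- jsonb defaults to EXTENDED (allows compression and out-of-line
    -- storage); EXTERNAL keeps values out of line but disables compression.
    ALTER TABLE docs ALTER COLUMN doc SET STORAGE EXTERNAL;

    -- The new mode applies only to newly written values, so rewrite the
    -- existing ones, e.g. by forcing each value through a round trip:
    UPDATE docs SET doc = doc::text::jsonb;

    -- Verify the column's storage mode ('e' = external):
    SELECT attname, attstorage FROM pg_attribute
    WHERE attrelid = 'docs'::regclass AND attname = 'doc';
]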
Hi:

On Sun, Nov 29, 2015 at 1:09 PM, Bill Moran <wmoran@potentialtech.com> wrote:
> On Sat, 28 Nov 2015 21:27:51 -0500
>> Currently, if I save a large json document with top level keys of thousands

** LARGE **

> The default storage for a JSONB field is EXTENDED. Switching it to
> EXTERNAL will disable compression. You'll have to insert your data over
> again, since this change doesn't alter any existing data, but see
> if that change improves performance.

Good recommendation, but see if it improves performance AND whether it affects other queries in the system. Turning off compression CAN decrease the efficiency (hit ratio) of the shared buffers and the cache, IIRC (but worth testing anyway).

Francisco Olarte.
Tom Smith wrote on 29.11.2015 at 03:27:
> Hello:
>
> Is there a plan for 9.6 to resolve the issue of very slow query/retrieval
> of jsonb fields when there are a large number (maybe several thousands)
> of top level keys? [...]
>
> Thanks in Advance

If you are concerned about the compression overhead, then why don't you use (or try) JSON instead?
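[For what it's worth, a quick way to compare the two representations side by side — a sketch; the table and key names are made up:

    -- json stores the original text and reparses it on every key extraction;
    -- jsonb stores a parsed binary form whose keys can be binary-searched,
    -- though the whole (possibly compressed) datum is still loaded first.
    CREATE TABLE fmt_test (id serial PRIMARY KEY, j json, jb jsonb);

    EXPLAIN ANALYZE SELECT j  ->> 'some_key' FROM fmt_test;
    EXPLAIN ANALYZE SELECT jb ->> 'some_key' FROM fmt_test;
]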
Hi, Thanks for everyone's response.
The issue is not just compression, but lack of "indexing" or "segmentation" when a single doc has, say, 2000 top level keys (with multiple levels of subkeys). Right now, if I query for one key, the whole doc has to be first uncompressed and loaded, and then searched for the single key.
Compared to the traditional way of storing each top level key in a separate column, this is huge overhead when a table scan is required. Some kind of "keyed/slotted" storage for the doc could help. For illustration, all keys starting with 'A' would have their own storage unit, and so on, so when I search for key "A1" only that unit would be unpacked and traversed to get the "A1" value. It is as if postgresql predefined 26 columns/slots for the whole doc: an internal index within each doc for fast retrieval of individual field values.
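[For illustration, a rough sketch of approximating that slotted layout manually with today's features, one row per (document, first-letter bucket) — the table and column names are made up:

    -- Each document is split across up to 26 rows, one per leading letter;
    -- a key lookup then detoasts only the bucket that can hold the key.
    CREATE TABLE doc_slots (
        doc_id  int     NOT NULL,
        slot    char(1) NOT NULL,   -- first character of the top level key
        data    jsonb   NOT NULL,   -- only the keys falling in this bucket
        PRIMARY KEY (doc_id, slot)
    );

    -- Fetch key "A1" of document 42: only the 'A' bucket is loaded.
    SELECT data -> 'A1' FROM doc_slots WHERE doc_id = 42 AND slot = 'A';
]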
Someone mentioned a plan in the roadmap for this route, but I'd like to know if it is in the 9.6 plan.
The below url mentions a similar issue; I am not sure if it has been completely resolved:
http://stackoverflow.com/questions/26259374/jsonb-performance-degrades-as-number-of-keys-increase

The below url mentions the potential issue:
https://www.reddit.com/r/PostgreSQL/comments/36rdlr/improving_select_performance_with_jsonb_vs_hstore/
Thanks
On Sun, Nov 29, 2015 at 7:35 AM, Thomas Kellerer <spam_eater@gmx.net> wrote:
> Tom Smith wrote on 29.11.2015 at 03:27:
>> Is there a plan for 9.6 to resolve the issue of very slow query/retrieval
>> of jsonb fields when there are a large number (maybe several thousands)
>> of top level keys? [...]
>
> If you are concerned about the compression overhead, then why don't you
> use (or try) JSON instead?
Unfortunately, the keys can not be predefined or fixed; it is a doc, which is the reason jsonb is used. It works well for small docs with a small number of keys. If this issue were resolved, postgresql would be an absolutely superior choice over MongoDB for document data.
On Sun, Nov 29, 2015 at 12:37 AM, John R Pierce <pierce@hogranch.com> wrote:
> On 11/28/2015 6:27 PM, Tom Smith wrote:
>> Is there a plan for 9.6 to resolve the issue of very slow query/retrieval
>> of jsonb fields when there are a large number (maybe several thousands)
>> of top level keys? [...]
>
> If it was my data, I'd be decomposing that large JSON thing into multiple
> SQL records, and storing as much stuff as possible in named SQL fields,
> using JSON in the database only for things that are too ambiguous for SQL.
>
> --
> john r pierce, recycling bits in santa cruz
On Sun, 29 Nov 2015 08:24:12 -0500 Tom Smith <tomsmith1989sk@gmail.com> wrote:
> Hi, Thanks for everyone's response.
>
> The issue is not just compression, but lack of "indexing" or
> "segmentation" when a single doc has, say 2000 top level keys (with
> multiple levels of subkeys). [...]

Sounds like you're pushing the limits of what JSONB is designed to do (at this stage, at least). I'm not aware of any improvements in recent versions (or head) that would do much to improve the situation, but I don't track every commit either. If you really need this improvement and you're willing to wait for 9.6, then I suggest you check out the latest git version and test on that to see if anything has been done. I doubt you'll see much, though.

As a thought experiment, the only way I can think to improve this use case is to ditch the current TOAST system and replace it with something that stores large JSON values in a form optimized for indexed access. That's a pretty massive change to some fairly core stuff just to optimize a single use-case of a single data type. Not saying it won't happen ... in fact, all things considered, it's pretty likely to happen at some point.

As far as a current solution: my solution would be to decompose the JSON into an optimized table. I.e.:

    CREATE TABLE store1 (
        id SERIAL PRIMARY KEY,
        data JSONB
    );

    CREATE TABLE store2 (
        id INT NOT NULL REFERENCES store1(id),
        top_level_key VARCHAR(1024),
        data JSONB,
        PRIMARY KEY(top_level_key, id)
    );

You can then use a trigger to ensure that store2 is always in sync with store1. Lookups can then use store2 and will be quite fast because of the index. A lot of the design is conjectural: do you even still need the data column on store1? Are there other useful indexes? etc. But hopefully the general idea is clear.

These probably aren't the answers you want, but (to the best of my knowledge) they're the best answers available at this time. I'd really like to build the alternate TOAST storage, but I'm not in a position to start on a project that ambitious right now ... I'm not even really keeping up with the project I'm currently supposed to be doing.

--
Bill Moran
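[A minimal sketch of the sync trigger Bill describes, under the store1/store2 schema above — the function and trigger names, and the one-row-per-top-level-key decomposition, are assumptions:

    CREATE OR REPLACE FUNCTION sync_store2() RETURNS trigger AS $$
    BEGIN
        -- Drop any previously decomposed rows for this document.
        DELETE FROM store2 WHERE id = NEW.id;
        -- Re-insert one row per top-level key of the new document.
        INSERT INTO store2 (id, top_level_key, data)
        SELECT NEW.id, e.key, e.value
        FROM jsonb_each(NEW.data) AS e(key, value);
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER store1_sync
        AFTER INSERT OR UPDATE OF data ON store1
        FOR EACH ROW EXECUTE PROCEDURE sync_store2();

A lookup then probes the (top_level_key, id) primary key instead of detoasting the whole document:

    SELECT data FROM store2 WHERE top_level_key = 'A1' AND id = 42;
]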
Is this correct? I'm fairly sure jsonb supports lazily parsing objects and each object level is actually searched using binary search.
On 29/11/2015 11:25 AM, "Tom Smith" <tomsmith1989sk@gmail.com> wrote:
> Hi, Thanks for everyone's response.
>
> The issue is not just compression, but lack of "indexing" or
> "segmentation" when a single doc has, say 2000 top level keys (with
> multiple levels of subkeys). Right now, if I query for one key, the whole
> doc has to be first uncompressed and loaded, and then searched for the
> single key. [...]
Bill Moran <wmoran@potentialtech.com> writes:
> Tom Smith <tomsmith1989sk@gmail.com> wrote:
>> Is there a plan for 9.6 to resolve the issue of very slow
>> query/retrieval of jsonb fields when there are large number (maybe
>> several thousands) of top level keys. Currently, if I save a large
>> json document with top level keys of thousands and query/retrieve field
>> values, the whole document has to be first decompressed and load to
>> memory before searching for the specific field key/value.

> I could be off-base here, but have you tried:
> ALTER TABLE $table ALTER COLUMN $json_column SET STORAGE EXTERNAL;

There is just about zero chance we'll ever worry about this for compressed columns. However, even the uncompressed case does currently involve loading the whole column value, as Tom says. We did consider the possibility of such an optimization when designing the JSONB storage format, but I don't know of anyone actively working on it.

In any case, it's unlikely that it'd ever be super fast, since it's certainly going to involve at least a couple of TOAST fetches.

Personally I'd be looking for a different representation. If there are specific fields that are known to be needed a lot, maybe functional indexes would help?

			regards, tom lane
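[A sketch of the functional-index idea, on a hypothetical docs table with a jsonb column "data" and a frequently fetched key 'status' — all names here are made up:

    -- Index the extracted text of one hot key. Equality lookups on that
    -- expression can then use the index rather than detoasting and
    -- searching every document.
    CREATE INDEX docs_status_idx ON docs ((data ->> 'status'));

    SELECT id FROM docs WHERE data ->> 'status' = 'active';
]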
Hi:
The goal is fast retrieval of a field value within a row when the row is already located.

On Sun, Nov 29, 2015 at 12:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Bill Moran <wmoran@potentialtech.com> writes:
>> I could be off-base here, but have you tried:
>> ALTER TABLE $table ALTER COLUMN $json_column SET STORAGE EXTERNAL;
>
> There is just about zero chance we'll ever worry about this for
> compressed columns. However, even the uncompressed case does currently
> involve loading the whole column value, as Tom says. [...]
>
> Personally I'd be looking for a different representation. If there are
> specific fields that are known to be needed a lot, maybe functional
> indexes would help?
>
>			regards, tom lane
On 11/29/15 9:30 AM, Arthur Silva wrote:
> Is this correct? I'm fairly sure jsonb supports lazily parsing objects
> and each object level is actually searched using binary search.

The problem is there's no support for loading just part of a TOASTed field. Even if that existed, we'd still need a way to know what byte position in the TOASTed field a key lived at. It's possible to add all that, but I think it'd be a serious amount of work.

Since someone else was just wondering about storing more specific types in JSON, it might be more useful/interesting to devise a Postgres-specific way to store variable schema documents. That would give us a lot more flexibility over implementation than the JSON type ever will.

Or think about it this way: there's really no great reason why everyone chose JSON. There's tons of other serialization storage formats for other languages out there, so why not one specialized to Postgres? (And of course we'd want to be able to cast from that to JSON and back...)

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
Hi,
Congrats on the official release of 9.5.

And I'd like to bring up the issue again about whether 9.6 will address the jsonb performance issue with a large number of top level keys. It is true that it does not have to use the JSON format; it is about serialization and fast retrieval of dynamic tree structure objects (at the top level, they might be called dynamic columns). So if postgresql can have its own way, that would work out too, as long as it can have intuitive querying (like what is implemented for json and jsonb) and fast retrieval of a tree-like object; it could be called a no-sql data type. After all, most of the motivation for using no-sql dbs like MongoDB is about working with dynamic tree objects.

If postgresql can have high performance on this, then many no-sql dbs would become history.
Thanks
On Thu, Dec 3, 2015 at 5:31 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 11/29/15 9:30 AM, Arthur Silva wrote:
>> Is this correct? I'm fairly sure jsonb supports lazily parsing objects
>> and each object level is actually searched using binary search.
>
> The problem is there's no support for loading just part of a TOASTed
> field. Even if that existed, we'd still need a way to know what byte
> position in the TOASTed field a key lived at. It's possible to add all
> that, but I think it'd be a serious amount of work. [...]
On Mon, Jan 11, 2016 at 09:01:03PM -0500, Tom Smith wrote:
> Congrats on the official release of 9.5
>
> And I'd like bring up the issue again about if 9.6 would address the
> jsonb performance issue with large number of top level keys. [...]

I can give you some backstory on this. TOAST was designed in 2001 as a way to store, in a data-type-agnostic way, long strings compressed and any other long data type, e.g. long arrays.

In all previous cases, _part_ of the value wasn't useful. JSONB is a unique case because it is one of the few types that can be processed without reading the entire value, e.g. it has an index.

We are going to be hesitant to do something data-type-specific for JSONB. It would be good if we could develop a data-type-agnostic approach to how TOAST can be improved. I know of no such work for 9.6, and it is unlikely it will be done in time for 9.6.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + As you are, so once was I. As I am, so you will be. +
  +                     Roman grave inscription         +
Bill Moran wrote:
> As far as a current solution: my solution would be to decompose the
> JSON into an optimized table. I.e.:
>
> CREATE TABLE store1 (
>     id SERIAL PRIMARY KEY,
>     data JSONB
> );
>
> CREATE TABLE store2 (
>     id INT NOT NULL REFERENCES store1(id),
>     top_level_key VARCHAR(1024),
>     data JSONB,
>     PRIMARY KEY(top_level_key, id)
> );

Isn't this what ToroDB already does?
https://www.8kdata.com/torodb/

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Jan 20, 2016 at 4:51 AM, Bruce Momjian <bruce@momjian.us> wrote:
> On Mon, Jan 11, 2016 at 09:01:03PM -0500, Tom Smith wrote:
>> And I'd like bring up the issue again about if 9.6 would address the
>> jsonb performance issue with large number of top level keys. [...]
>
> I can give you some backstory on this. TOAST was designed in 2001 as a
> way to store, in a data-type-agnostic way, long strings compressed and
> any other long data type, e.g. long arrays. [...]
>
> We are going to be hesitant to do something data-type-specific for
> JSONB. It would be good if we could develop a data-type-agnostic
> approach to how TOAST can be improved. I know of no such work for 9.6,
> and it is unlikely it will be done in time for 9.6.

I'm looking at this from time to time.
Is there any database that actually supports what the original poster wanted?

The only thing I know of that's similar is the bigtable/hbase/hypertable wide column store. They break the lexicographically sorted rows into blocks of compressed data (XXKB each), and then keep an index on the start_key+end_key of each block. If the index is too big, you can store the index itself in a toast value.

Note: I have no idea how to (or if it can be) actually code this.
On Wed, Jan 20, 2016 at 9:32 AM, Oleg Bartunov <obartunov@gmail.com> wrote:
> On Wed, Jan 20, 2016 at 4:51 AM, Bruce Momjian <bruce@momjian.us> wrote:
>> We are going to be hesitant to do something data-type-specific for
>> JSONB. It would be good if we could develop a data-type-agnostic
>> approach to how TOAST can be improved. I know of no such work for 9.6,
>> and it is unlikely it will be done in time for 9.6.
>
> I'm looking at this from time to time.
On Tue, 19 Jan 2016 23:53:19 -0300 Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Bill Moran wrote:
>> As far as a current solution: my solution would be to decompose the
>> JSON into an optimized table. [...]
>
> Isn't this what ToroDB already does?
> https://www.8kdata.com/torodb/

Looks like. I wasn't aware of ToroDB, thanks for the link.

--
Bill Moran
Using the JSON/JSONB type in postgresql is usually due to the use case where the keys (top level included) can not be predefined. This is the major difference between NoSQL/Document and RDBMS.
Why would TOAST have to be used? Could some specially structured "raw" files be used instead?

On Wed, Jan 20, 2016 at 7:32 AM, Bill Moran <wmoran@potentialtech.com> wrote:
> On Tue, 19 Jan 2016 23:53:19 -0300 Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>> Isn't this what ToroDB already does?
>> https://www.8kdata.com/torodb/
>
> Looks like. I wasn't aware of ToroDB, thanks for the link.
>
> --
> Bill Moran
On 20/01/16 13:32, Bill Moran wrote:
> On Tue, 19 Jan 2016 23:53:19 -0300 Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>> Isn't this what ToroDB already does?
>> https://www.8kdata.com/torodb/
>
> Looks like. I wasn't aware of ToroDB, thanks for the link.

Hi Bill. Effectively, that's what ToroDB does. You will have a dynamic schema, but automatically created for you. It will be a relational schema; no json/jsonb needed for your data. Please read the FAQ (https://github.com/torodb/torodb/wiki/FAQ) and let us know (https://groups.google.com/forum/#!forum/torodb-dev) if you have any additional questions.

Thanks, Álvaro, for the reference :)

Cheers,
Álvaro
Hello:
I'd like to bring up this JSONB performance issue again. Below is a link to MySQL's way of storing/retrieving JSON key/values. Instead of providing column indexing (like GIN for JSONB in Postgresql), it provides only internal data-structure-level indexing within each individual json object, for fast retrieval; compression is not used.

Perhaps without implementing complicated column-level GIN indexing, implementing a new variant JSON type that only handles individual json object indexing would be feasible? Combined with the current JSONB implementation, both common use cases (one is global doc indexing, the other is fast retrieval of individual values) would work out and make postgresql unbeatable.
On Tue, Jan 19, 2016 at 8:51 PM, Bruce Momjian <bruce@momjian.us> wrote:
> On Mon, Jan 11, 2016 at 09:01:03PM -0500, Tom Smith wrote:
>> If postgresql can have high performance on this, then many no-sql dbs
>> would become history. [...]
>
> I can give you some backstory on this. TOAST was designed in 2001 as a
> way to store, in a data-type-agnostic way, long strings compressed and
> any other long data type, e.g. long arrays.
>
> In all previous cases, _part_ of the value wasn't useful. JSONB is a
> unique case because it is one of the few types that can be processed
> without reading the entire value, e.g. it has an index.
>
> We are going to be hesitant to do something data-type-specific for
> JSONB. It would be good if we could develop a data-type-agnostic
> approach to how TOAST can be improved. I know of no such work for 9.6,
> and it is unlikely it will be done in time for 9.6.
On Sun, May 1, 2016 at 6:46 AM, Tom Smith <tomsmith1989sk@gmail.com> wrote:
> Hello:
>
> I'd like to bring up this JSONB performance issue again. Below is a link
> to MySQL's way of storing/retrieving JSON key/values. Instead of
> providing column indexing (like GIN for JSONB in Postgresql), it provides
> only internal data-structure-level indexing within each individual json
> object, for fast retrieval; compression is not used. [...]

It's called an expression index?
No, it is within the individual json object storage. In a way, it would be part of the query plan, but strictly for the individual json object storage structure; it is not necessarily an "index".

On Sun, May 1, 2016 at 4:14 PM, Oleg Bartunov <obartunov@gmail.com> wrote:
> On Sun, May 1, 2016 at 6:46 AM, Tom Smith <tomsmith1989sk@gmail.com> wrote:
>> Perhaps without implementing complicated column-level GIN indexing,
>> implementing a new variant JSON type that only handles individual json
>> object indexing would be feasible? [...]
>
> It's called an expression index?