Thread: Wanted: jsonb on-disk representation documentation
I'm reading the new jsonb code, trying to understand the on-disk representation. And I cannot make heads or tails of it. My first entry point was jsonb.h. Jsonb struct is the on-disk representation, so I looked at the comments above that. No help, the comments are useless for getting an overview picture. Help, anyone? - Heikki
On May 6, 2014 9:30:15 PM CEST, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: >I'm reading the new jsonb code, trying to understand the on-disk >representation. And I cannot make heads or tails of it. > >My first entry point was jsonb.h. Jsonb struct is the on-disk >representation, so I looked at the comments above that. No help, the >comments are useless for getting an overview picture. > >Help, anyone? Enthusiatically seconded. I've asked for that about three times without much success. If it had been my decision the patchwouldn't have been merged without that and other adjustments. Andres --- Please excuse brevity and formatting - I am writing this on my mobile phone.
On Tue, May 6, 2014 at 09:48:04PM +0200, Andres Freund wrote: > On May 6, 2014 9:30:15 PM CEST, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > >I'm reading the new jsonb code, trying to understand the on-disk > >representation. And I cannot make heads or tails of it. > > > >My first entry point was jsonb.h. Jsonb struct is the on-disk > >representation, so I looked at the comments above that. No help, the > >comments are useless for getting an overview picture. > > > >Help, anyone? > > Enthusiatically seconded. I've asked for that about three times without much success. If it had been my decision the patchwouldn't have been merged without that and other adjustments. I also would like to know what the index-everything hash ops does? Does it index the keys, values, or both? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Tue, May 6, 2014 at 1:06 PM, Bruce Momjian <bruce@momjian.us> wrote: > I also would like to know what the index-everything hash ops does? Does > it index the keys, values, or both? It indexes both, but it isn't possible to test existence (of a key) with the hash GIN opclass. -- Peter Geoghegan
On Tue, May 6, 2014 at 12:48 PM, Andres Freund <andres@anarazel.de> wrote: > Enthusiatically seconded. I've asked for that about three times without much success. If it had been my decision the patchwouldn't have been merged without that and other adjustments. I'm almost certain that the only feedback of yours that I didn't incorporate was that I didn't change the name of JsonbValue, a decision I stand by, and also that I didn't add ascii art to illustrate the on-disk format. I can write a patch that adds the latter soon. -- Peter Geoghegan
FYI, http://obartunov.livejournal.com/178495.html This is hash based gin opclass for hstore with all operators support. It's pity we had no time to do the same for jsonb, but we may include it and couple of other opclasses to contrib/jsonx. Oleg On Wed, May 7, 2014 at 12:18 AM, Peter Geoghegan <pg@heroku.com> wrote: > On Tue, May 6, 2014 at 1:06 PM, Bruce Momjian <bruce@momjian.us> wrote: >> I also would like to know what the index-everything hash ops does? Does >> it index the keys, values, or both? > > It indexes both, but it isn't possible to test existence (of a key) > with the hash GIN opclass. > > -- > Peter Geoghegan > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers
On 2014-05-06 13:30:26 -0700, Peter Geoghegan wrote: > On Tue, May 6, 2014 at 12:48 PM, Andres Freund <andres@anarazel.de> wrote: > > Enthusiatically seconded. I've asked for that about three times without much success. If it had been my decision thepatch wouldn't have been merged without that and other adjustments. > > I'm almost certain that the only feedback of yours that I didn't > incorporate was that I didn't change the name of JsonbValue, a > decision I stand by, and also that I didn't add ascii art to > illustrate the on-disk format. I can write a patch that adds the > latter soon. That might or might not be true. I don't really remember. Documentation about the on-disk format is the one thing I am sure about that's not done. The reviews I did were really cursory reviews, nothing thorough. There's large parts of the code (e.g. jsonb_gin.c) I didn't even look at. And others I don't really understand. I also didn't have time to look at the later versions. The code did improve, don't get me wrong. Otherwise I'd have been very vocal about this when committed. But it's still pretty hard to read/understand code. Which imo is problematic for a feature touted being absolutely critical for postgres' success. If other's want a taste, take a peek at findJsonbValueFromSuperHeader()'s code. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Tue, May 6, 2014 at 3:37 PM, Andres Freund <andres@anarazel.de> wrote: > That might or might not be true. I don't really remember. Documentation > about the on-disk format is the one thing I am sure about that's not > done. I think it would be best to do that with reference to a concrete example. As I said, I'll work on a patch. > The reviews I did were really cursory reviews, nothing thorough. There's > large parts of the code (e.g. jsonb_gin.c) I didn't even look at. And > others I don't really understand. I also didn't have time to look at the > later versions. The code did improve, don't get me wrong. Otherwise I'd > have been very vocal about this when committed. > But it's still pretty hard to read/understand code. Which imo is > problematic for a feature touted being absolutely critical for postgres' > success. If other's want a taste, take a peek at > findJsonbValueFromSuperHeader()'s code. I don't really know what to say to that. Lots of code in Postgres is complicated, especially if you look at one particular function without some wider context. Is your objection that the complexity is incidental rather than essential? If so, how? -- Peter Geoghegan
On 2014-05-06 15:45:39 -0700, Peter Geoghegan wrote: > I don't really know what to say to that. Lots of code in Postgres is > complicated, especially if you look at one particular function without > some wider context. > Is your objection that the complexity is incidental rather than > essential? Yes. > If so, how? If you think the following is a solution of essential complexity in *new* code for navigating one level down a relatively simple *new* datastructure - then we have a disconnect that's larger than I am willing to argue about. I can live with the argument that this code is what we have; but calling this only having the "essential complexity" is absurd. JsonbValue * findJsonbValueFromSuperHeader(JsonbSuperHeader sheader, uint32 flags, uint32 *lowbound, JsonbValue*key) { uint32 superheader = *(uint32 *) sheader; JEntry *array = (JEntry *) (sheader + sizeof(uint32)); int count = (superheader & JB_CMASK); JsonbValue *result = palloc(sizeof(JsonbValue)); Assert((flags & ~(JB_FARRAY | JB_FOBJECT)) == 0); if (flags & JB_FARRAY & superheader) { char *data = (char *) (array + (superheader & JB_CMASK)); int i; for (i = 0; i < count; i++) { JEntry *e = array + i; if (JBE_ISNULL(*e) && key->type == jbvNull) { result->type = jbvNull; result->estSize= sizeof(JEntry); } else if (JBE_ISSTRING(*e) && key->type == jbvString) { result->type = jbvString; result->val.string.val = data + JBE_OFF(*e); result->val.string.len= JBE_LEN(*e); result->estSize = sizeof(JEntry) + result->val.string.len; } else if (JBE_ISNUMERIC(*e) && key->type == jbvNumeric) { result->type = jbvNumeric; result->val.numeric = (Numeric) (data + INTALIGN(JBE_OFF(*e))); result->estSize = 2 * sizeof(JEntry) + VARSIZE_ANY(result->val.numeric); } else if (JBE_ISBOOL(*e) && key->type == jbvBool) { result->type = jbvBool; result->val.boolean= JBE_ISBOOL_TRUE(*e) != 0; result->estSize = sizeof(JEntry); } else continue; if (compareJsonbScalarValue(key, result) == 0) return result; } }else if (flags & JB_FOBJECT& superheader){ /* Since this is an object, account for *Pairs* of Jentrys */ char *data = (char *)(array + (superheader & JB_CMASK) * 2); uint32 stopLow = lowbound ? *lowbound : 0, stopMiddle; /* Object key past by caller must be a string */ Assert(key->type == jbvString);... I am not calling for a revert. I am just saying that it's imo below project standards. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Tue, May 6, 2014 at 5:13 PM, Andres Freund <andres@2ndquadrant.com> wrote: > If you think the following is a solution of essential complexity in > *new* code for navigating one level down a relatively simple *new* > datastructure - then we have a disconnect that's larger than I am > willing to argue about You omitted the 40 lines of comments above the function. > I can live with the argument that this code is what we have; but calling > this only having the "essential complexity" is absurd. I did not say that it only had essential complexity; just that your criticism was vague. What's wrong with this particular code, precisely? What complexity is incidental? Why? I think what you're missing here is that findJsonbValueFromSuperHeader() is useful for testing "existence" - that's mostly what it does (serve as a worker function for the 3 existence-type operators). It's also used once to do a binary search for a key when testing containment, ahead of testing a corresponding value in a pair (a pair within a rhs that we're testing for containment within an lhs "this" value). Finally, it's also used once with arrays when testing containment. Why do I just match the key within findJsonbValueFromSuperHeader() for objects, and not the key/value pair you ask? Because that's not what existence is, and it's easier and clearer to have containment of nested objects and arrays handling by the higher level containment function (once we find the value from the pair, to pass back to it). There is a number of things in tension here. A further factor is the desire to avoid redundant code. Now, I guess you could make the case that the handling of the JB_ARRAY and JB_OBJECT cases could be broken out, but that isn't obviously true, since that creates redundancy for the majority of callers that only care about existence. If you're suggesting that the JB_ARRAY and JB_OBJECT cases within that function are redundant, well, they're not; I'm iterating element-wise for the former and pairwise for the latter. I'm also returning the value for the former, and the element (which in a certain sense is equivalent -- the equivalent of an object/pair "value") for the latter. Note the user-visible definition of existence if you don't know what I mean. -- Peter Geoghegan
On 05/06/2014 11:30 PM, Peter Geoghegan wrote: > On Tue, May 6, 2014 at 12:48 PM, Andres Freund <andres@anarazel.de> wrote: >> Enthusiatically seconded. I've asked for that about three times without much success. If it had been my decision the patchwouldn't have been merged without that and other adjustments. > > I'm almost certain that the only feedback of yours that I didn't > incorporate was that I didn't change the name of JsonbValue, a > decision I stand by, and also that I didn't add ascii art to > illustrate the on-disk format. I can write a patch that adds the > latter soon. That would be great. I found the serialization routine, convertJsonb() to be a bit of a mess. It's maintaining a custom stack of levels, which can be handy if you need to avoid recursion, but it's also relying on the native stack. And I didn't understand the point of splitting it into the "walk" and "put" functions; the division of labor between the two was far from clear IMHO. I started refactoring that, and ended up with the attached. One detail that I found scary is that the estSize field in JsonbValue is not just any rough estimate. It's used ín the allocation of the output buffer for convertJsonb(), so it has to be large enough or you hit an assertion or buffer overflow. I believe it was correct as it was, but that kind of programming is always scary. I refactored the convertJsonb() function to use a StringInfo buffer instead, and removed estSize altogether. This is still work-in-progress, but I thought I'd post this now to let people know I'm working on it. For example, the StringInfo isn't actually very well suited for this purpose, it might be better to have a custom buffer that's enlarged when needed. For my own sanity, I started writing some docs on the on-disk format. See the comments in jsonb.h for my understanding of it. I moved around the structs a bit in jsonb.h, to make the format clearer, but the actual on-disk format is unchanged. - Heikki
Attachment
Continuing the review, I don't like the "superheader" terminology. To me, "super" implies that there's some other kind of header involved, and the superheader somehow includes or the parent of that. But it actually seems to refer to the header field in the beginning of an array or object value. In essence, "superheader" is used as the common term to refer to an object that can be an array or an object. I propose that we change that to "container". Noticed something funny while looking at the convertJsonb function: postgres=# select substr(((('["' ||repeat('x', 268435455) || '", "' || repeat('y', 2) || '"]')::jsonb)::text), 268435455); substr ------------------------------------------------------------------------------ ----------------------------------------------------------- xxx", 0.000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000] (1 row) Somehow the second string element in the array, "yy", gets turned into a numeric. The reason is that although we check that the length of a single string doesn't exceed the maximum of 2^28 that can be stored in the space reserved for the length in a Jentry, there are no length checks for the end offset stored there in an array. So if the total size of elements in an array exceed 2^28, funny things like above happen. Attached is a WIP patch that fixes the above, renames "superheader" to "container", and includes the refactorings and cleanup that I posted earlier today. - Heikki
Attachment
So, apart from cleaning up the code, we really need to take a close look at the on-disk format now. The code can be cleaned up later, too, but we're going to be stuck with the on-disk format forever, so it's critical to get that right. First, a few observations: * JENTRY_ISFIRST is redundant. Whenever you deal with the Jentry struct, you know from the context which element in the array it is. * JENTRY_ISNEST is set but never used. * JENTRY_ISBOOL is defined as (JENTRY_ISNUMERIC | JENTRY_ISNEST), which seems confusing. I'm going to proceed refactoring those things, which will change the on-disk format. It's late in the release cycle - these things really should've been cleaned up earlier - but it's important to get the on-disk format right. Shout if you have any objections. - Heikki
On Wed, May 7, 2014 at 8:20 PM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > So, apart from cleaning up the code, we really need to take a close look at > the on-disk format now. The code can be cleaned up later, too, but we're > going to be stuck with the on-disk format forever, so it's critical to get > that right. > > First, a few observations: > > * JENTRY_ISFIRST is redundant. Whenever you deal with the Jentry struct, you > know from the context which element in the array it is. > > * JENTRY_ISNEST is set but never used. > > * JENTRY_ISBOOL is defined as (JENTRY_ISNUMERIC | JENTRY_ISNEST), which > seems confusing. > > I'm going to proceed refactoring those things, which will change the on-disk > format. It's late in the release cycle - these things really should've been > cleaned up earlier - but it's important to get the on-disk format right. > Shout if you have any objections. +1. It is saner to do that now than never. -- Michael
Hi, On 2014-05-07 14:20:19 +0300, Heikki Linnakangas wrote: > So, apart from cleaning up the code, we really need to take a close look at > the on-disk format now. The code can be cleaned up later, too, but we're > going to be stuck with the on-disk format forever, so it's critical to get > that right. +1 > First, a few observations: Agreed. I'd like to add that: * Imo we need space in jsonb ondisk values to indicate a format version. We won't fully get this right. * The jentry representation should be changed so it's possible to get the type of a entry without checking individual types.That'll make code like findJsonbValueFromSuperHeader() (well, whatever you've renamed it to) much easier to read. Preferrablyso it an have the same values (after shifting/masking) ask the JBE variants. And it needs space for futher types(or representations thereof). * I wonder if the hash/object pair representation is wise and if it shouldn't be keys combined with offsets first, and thenthe values. That will make access much faster. And that's what jsonb essentially is about. * I think both arrays and hashes should grow individual structs. With space for additional flags. * I have doubts of the wisdom of allowing to embed jbvBinary values in JsonbValues. Although that can be changed later sinceit's not on disk. > I'm going to proceed refactoring those things, which will change the on-disk > format. It's late in the release cycle - these things really should've been > cleaned up earlier - but it's important to get the on-disk format right. > Shout if you have any objections. I don't think it's likely that beta1 will be binary compatible with the final version at this point. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Wed, May 7, 2014 at 7:40 AM, Andres Freund <andres@anarazel.de> wrote: >> I'm going to proceed refactoring those things, which will change the on-disk >> format. It's late in the release cycle - these things really should've been >> cleaned up earlier - but it's important to get the on-disk format right. >> Shout if you have any objections. +1. > I don't think it's likely that beta1 will be binary compatible with the > final version at this point. I rather think we're not ready for beta1 at this point (but I expect to lose that argument). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, May 7, 2014 at 1:20 PM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
So, apart from cleaning up the code, we really need to take a close look at the on-disk format now. The code can be cleaned up later, too, but we're going to be stuck with the on-disk format forever, so it's critical to get that right.
First, a few observations:
* JENTRY_ISFIRST is redundant. Whenever you deal with the Jentry struct, you know from the context which element in the array it is.
* JENTRY_ISNEST is set but never used.
* JENTRY_ISBOOL is defined as (JENTRY_ISNUMERIC | JENTRY_ISNEST), which seems confusing.
I'm going to proceed refactoring those things, which will change the on-disk format. It's late in the release cycle - these things really should've been cleaned up earlier - but it's important to get the on-disk format right. Shout if you have any objections.
+1. It's now or never. If the on-disk format needs changing, not doing it now is going to leave us with a "jsonc" in the future... Better bite the bullet now.
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On 2014-05-07 08:50:33 -0400, Robert Haas wrote: > On Wed, May 7, 2014 at 7:40 AM, Andres Freund <andres@anarazel.de> wrote: > > I don't think it's likely that beta1 will be binary compatible with the > > final version at this point. > > I rather think we're not ready for beta1 at this point (but I expect > to lose that argument). Well, I guess it depends on what we define 'beta1' to be. Imo evaluating problematic pieces of new code, locating unfinished pieces is part of that. I don't see much point in forbidding incompatible changes in beta1 personally. That robs th the development cycle of the only period where users can actually test the new version in a halfway sane manner and report back with things that apparently broken. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Wed, May 7, 2014 at 2:56 PM, Andres Freund <andres@2ndquadrant.com> wrote:
On 2014-05-07 08:50:33 -0400, Robert Haas wrote:
> On Wed, May 7, 2014 at 7:40 AM, Andres Freund <andres@anarazel.de> wrote:> > I don't think it's likely that beta1 will be binary compatible with theWell, I guess it depends on what we define 'beta1' to be. Imo evaluating
> > final version at this point.
>
> I rather think we're not ready for beta1 at this point (but I expect
> to lose that argument).
problematic pieces of new code, locating unfinished pieces is part of
that. I don't see much point in forbidding incompatible changes in beta1
personally. That robs th the development cycle of the only period where
users can actually test the new version in a halfway sane manner and
report back with things that apparently broken.
We need to be very careful to tell people about it though. Preferrably if we *know* a dump/reload will be needed to go beta1->beta2, we should actually document that in the releasenotes of beta1 already. So people can make proper plans..
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On 2014-05-07 15:00:01 +0200, Magnus Hagander wrote: > On Wed, May 7, 2014 at 2:56 PM, Andres Freund <andres@2ndquadrant.com>wrote: > > > On 2014-05-07 08:50:33 -0400, Robert Haas wrote: > > > On Wed, May 7, 2014 at 7:40 AM, Andres Freund <andres@anarazel.de> > > wrote: > > > > I don't think it's likely that beta1 will be binary compatible with the > > > > final version at this point. > > > > > > I rather think we're not ready for beta1 at this point (but I expect > > > to lose that argument). > > > > Well, I guess it depends on what we define 'beta1' to be. Imo evaluating > > problematic pieces of new code, locating unfinished pieces is part of > > that. I don't see much point in forbidding incompatible changes in beta1 > > personally. That robs th the development cycle of the only period where > > users can actually test the new version in a halfway sane manner and > > report back with things that apparently broken. > > > > > We need to be very careful to tell people about it though. Preferrably if > we *know* a dump/reload will be needed to go beta1->beta2, we should > actually document that in the releasenotes of beta1 already. So people can > make proper plans.. Yes, I think it actually makes sense to add that to *all* beta release notes. Even in beta2, although slightly weakened. That's not a new thing btw. E.g. 9.3 has had a catversion bump between beta1/2: git diff 09bd2acbe5ac866ce9..817a89423f429a6a8b -- src/include/catalog/catversion.h The more interesting note probably is that there quite possibly won't be pg_upgrade'ability... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Wed, May 7, 2014 at 3:04 PM, Andres Freund <andres@2ndquadrant.com> wrote:
Requiring pg_upgrade between betas might even be "good" in the sense that then we get more testing of pg_upgrade :) But requiring a dump/reload is going to hurt people more.
Yes, I think it actually makes sense to add that to *all* beta releaseOn 2014-05-07 15:00:01 +0200, Magnus Hagander wrote:
> On Wed, May 7, 2014 at 2:56 PM, Andres Freund <andres@2ndquadrant.com>wrote:
>
> > On 2014-05-07 08:50:33 -0400, Robert Haas wrote:
> > > On Wed, May 7, 2014 at 7:40 AM, Andres Freund <andres@anarazel.de>
> > wrote:
> > > > I don't think it's likely that beta1 will be binary compatible with the
> > > > final version at this point.
> > >
> > > I rather think we're not ready for beta1 at this point (but I expect
> > > to lose that argument).
> >
> > Well, I guess it depends on what we define 'beta1' to be. Imo evaluating
> > problematic pieces of new code, locating unfinished pieces is part of
> > that. I don't see much point in forbidding incompatible changes in beta1
> > personally. That robs th the development cycle of the only period where
> > users can actually test the new version in a halfway sane manner and
> > report back with things that apparently broken.
> >
> >
> We need to be very careful to tell people about it though. Preferrably if
> we *know* a dump/reload will be needed to go beta1->beta2, we should
> actually document that in the releasenotes of beta1 already. So people can
> make proper plans..
notes. Even in beta2, although slightly weakened.
That's not a new thing btw. E.g. 9.3 has had a catversion bump between
beta1/2:
git diff 09bd2acbe5ac866ce9..817a89423f429a6a8b -- src/include/catalog/catversion.h
The more interesting note probably is that there quite possibly won't be
pg_upgrade'ability...
Yeah, that's the big thing really.
Requiring pg_upgrade between betas might even be "good" in the sense that then we get more testing of pg_upgrade :) But requiring a dump/reload is going to hurt people more.
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes: > On Wed, May 7, 2014 at 2:56 PM, Andres Freund <andres@2ndquadrant.com>wrote: >> Well, I guess it depends on what we define 'beta1' to be. Imo evaluating >> problematic pieces of new code, locating unfinished pieces is part of >> that. I don't see much point in forbidding incompatible changes in beta1 >> personally. That robs th the development cycle of the only period where >> users can actually test the new version in a halfway sane manner and >> report back with things that apparently broken. > We need to be very careful to tell people about it though. Preferrably if > we *know* a dump/reload will be needed to go beta1->beta2, we should > actually document that in the releasenotes of beta1 already. So people can > make proper plans.. This seems like much ado about very little. The policy will be the same as it ever was: once beta1 is out, we prefer to avoid forcing an initdb, but we'll do it if we have to. In any case, +1 for fixing whatever needs to be fixed now; I expect to have a fix for the limited-GIN-index-length issue later today, and that really is also an on-disk format change, though it won't affect short index entries. ("Short" is TBD; I was thinking of hashing keys longer than say 128 bytes.) regards, tom lane
On 2014-05-07 09:44:36 -0400, Tom Lane wrote: > Magnus Hagander <magnus@hagander.net> writes: > > On Wed, May 7, 2014 at 2:56 PM, Andres Freund <andres@2ndquadrant.com>wrote: > >> Well, I guess it depends on what we define 'beta1' to be. Imo evaluating > >> problematic pieces of new code, locating unfinished pieces is part of > >> that. I don't see much point in forbidding incompatible changes in beta1 > >> personally. That robs th the development cycle of the only period where > >> users can actually test the new version in a halfway sane manner and > >> report back with things that apparently broken. > > > We need to be very careful to tell people about it though. Preferrably if > > we *know* a dump/reload will be needed to go beta1->beta2, we should > > actually document that in the releasenotes of beta1 already. So people can > > make proper plans.. > > This seems like much ado about very little. The policy will be the same > as it ever was: once beta1 is out, we prefer to avoid forcing an initdb, > but we'll do it if we have to. I think Magnus' point is that we should tell users that we'll try but won't guarantee anything. -hackers knowing about it doesn't mean users will know. Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Wed, May 7, 2014 at 4:40 AM, Andres Freund <andres@anarazel.de> wrote: > * Imo we need space in jsonb ondisk values to indicate a format > version. We won't fully get this right. That would be nice. > * The jentry representation should be changed so it's possible to get the type > of a entry without checking individual types. That'll make code like > findJsonbValueFromSuperHeader() (well, whatever you've renamed it to) > much easier to read. Preferrably so it an have the same values (after > shifting/masking) ask the JBE variants. And it needs space for futher > types (or representations thereof). I don't know what you mean. Every place that macros like JBE_ISNUMERIC() are used subsequently involves access to (say) numeric union fields. At best, you could have those functions check that the types match, and then handle each case in a switch that only looked at (say) the "candidate", but that doesn't really save much at all. It wouldn't take much to have the macros give enum constant values back as you suggest, though. > * I wonder if the hash/object pair representation is wise and if it > shouldn't be keys combined with offsets first, and then the > values. That will make access much faster. And that's what jsonb > essentially is about. I like that the physical layout reflects the layout of the original JSON document. Besides, I don't think this is obviously the case. Are you sure that it won't be more useful to make children as close to their parents as possible? Particularly given idiomatic usage. Jsonb, in point of fact, *is* fast. > * I think both arrays and hashes should grow individual structs. With > space for additional flags. Why? -- Peter Geoghegan
On 2014-05-07 10:55:28 -0700, Peter Geoghegan wrote: > On Wed, May 7, 2014 at 4:40 AM, Andres Freund <andres@anarazel.de> wrote: > > * The jentry representation should be changed so it's possible to get the type > > of a entry without checking individual types. That'll make code like > > findJsonbValueFromSuperHeader() (well, whatever you've renamed it to) > > much easier to read. Preferrably so it an have the same values (after > > shifting/masking) ask the JBE variants. And it needs space for futher > > types (or representations thereof). > > I don't know what you mean. Every place that macros like > JBE_ISNUMERIC() are used subsequently involves access to (say) numeric > union fields. At best, you could have those functions check that the > types match, and then handle each case in a switch that only looked at > (say) the "candidate", but that doesn't really save much at all. It > wouldn't take much to have the macros give enum constant values back > as you suggest, though. Compare for (i = 0; i < count; i++) { JEntry *e = array + i; if (JBE_ISNULL(*e) && key->type == jbvNull) { result->type = jbvNull; result->estSize= sizeof(JEntry); } else if (JBE_ISSTRING(*e) && key->type == jbvString) { result->type = jbvString; result->val.string.val = data + JBE_OFF(*e); result->val.string.len = JBE_LEN(*e); result->estSize = sizeof(JEntry) + result->val.string.len; } else if (JBE_ISNUMERIC(*e)&& key->type == jbvNumeric) { result->type = jbvNumeric; result->val.numeric= (Numeric) (data + INTALIGN(JBE_OFF(*e))); result->estSize = 2 * sizeof(JEntry) + VARSIZE_ANY(result->val.numeric); } elseif (JBE_ISBOOL(*e) && key->type == jbvBool) { result->type = jbvBool; result->val.boolean= JBE_ISBOOL_TRUE(*e) != 0; result->estSize = sizeof(JEntry); } else continue; if (compareJsonbScalarValue(key, result) == 0) return result; } with for (i = 0; i < count; i++) { JEntry *e = array + i; if (!JBE_TYPE_IS_SCALAR(*e)) continue; if (JBE_TYPE(*e) != key->type) continue; result = getJsonbValue(e); if (compareJsonbScalarValue(key, result) == 0) return result; } Yes, it's not a fair comparison because I made up getJsonbValue(). But it's *much* more readable regardless. And there's more places that could use it. Like the second half of findJsonbValueFromSuperHeader(). FWIW, that's one of the requests I definitely made before. > > * I wonder if the hash/object pair representation is wise and if it > > shouldn't be keys combined with offsets first, and then the > > values. That will make access much faster. And that's what jsonb > > essentially is about. > > I like that the physical layout reflects the layout of the original > JSON document. Don't see much value in that. This is a binary representation and it'd be bijective. > Besides, I don't think this is obviously the case. Are > you sure that it won't be more useful to make children as close to > their parents as possible? Particularly given idiomatic usage. Because - if done right - it would allow elementwise access without scanning previous values. > > * I think both arrays and hashes should grow individual structs. With > > space for additional flags. > Why? Because a) it will make the code more readable b) it'll allow for adding different representations of hashes/arrays. Greetings, Andres Freund --Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@anarazel.de> writes: > * The jentry representation should be changed so it's possible to get the type > of a entry without checking individual types. That'll make code like > findJsonbValueFromSuperHeader() (well, whatever you've renamed it to) > much easier to read. Preferrably so it an have the same values (after > shifting/masking) ask the JBE variants. And it needs space for futher > types (or representations thereof). Further types? What further types? JSON seems unlikely to grow any other value types than what it's got. regards, tom lane
On 2014-05-07 14:48:51 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > * The jentry representation should be changed so it's possible to get the type > > of a entry without checking individual types. That'll make code like > > findJsonbValueFromSuperHeader() (well, whatever you've renamed it to) > > much easier to read. Preferrably so it an have the same values (after > > shifting/masking) ask the JBE variants. And it needs space for futher > > types (or representations thereof). > > Further types? What further types? JSON seems unlikely to grow any > other value types than what it's got. I am not thinking about user level exposed types, but e.g. a hash/object representation that allows duplicated keys and keeps the original order. Because I am pretty sure that'll come. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 05/07/2014 02:52 PM, Andres Freund wrote: > On 2014-05-07 14:48:51 -0400, Tom Lane wrote: >> Andres Freund <andres@anarazel.de> writes: >>> * The jentry representation should be changed so it's possible to get the type >>> of a entry without checking individual types. That'll make code like >>> findJsonbValueFromSuperHeader() (well, whatever you've renamed it to) >>> much easier to read. Preferrably so it an have the same values (after >>> shifting/masking) ask the JBE variants. And it needs space for futher >>> types (or representations thereof). >> Further types? What further types? JSON seems unlikely to grow any >> other value types than what it's got. > I am not thinking about user level exposed types, but e.g. a hash/object > representation that allows duplicated keys and keeps the original > order. Because I am pretty sure that'll come. > That makes one of you. We've had hstore for years and nobody that I recall has asked for preservation of key order or duplicates. And any app that relies on either in JSON is still, IMNSHO, broken by design. OTOH, there are suggestions of "superjson" types that support other scalar types such as timestamps. cheers andrew
btw ... in jsonb.h there is this comment: * Jsonb Keys and string array elements are treated equivalently when* serialized to text index storage. One day we may wishto create an opclass* that only indexes values, but for now keys and values are stored in GIN* indexes in a way thatdoesn't really consider their relationship to each* other. Is this an obsolete speculation about writing jsonb_hash_ops, or is there still something worth preserving there? If so, what? regards, tom lane
On Wed, May 7, 2014 at 12:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > * Jsonb Keys and string array elements are treated equivalently when > * serialized to text index storage. One day we may wish to create an opclass > * that only indexes values, but for now keys and values are stored in GIN > * indexes in a way that doesn't really consider their relationship to each > * other. > > Is this an obsolete speculation about writing jsonb_hash_ops, or is there > still something worth preserving there? If so, what? This is not obsolete. It would equally apply to a GiST opclass that wanted to support our current definition of existence. Array elements are keys simply by virtue of being strings, but otherwise are treated as values. See the large comment block within gin_extract_jsonb(). -- Peter Geoghegan
And while I'm looking at it ... The jsonb_ops storage format for values is inherently lossy, because it cannot distinguish the string values "n", "t", "f" from JSON null or boolean true, false respectively; nor does it know the difference between a number and a string containing digits. This appears to not quite be a bug because the consistent functions force recheck for all queries that care about values (as opposed to keys). But if it's documented anywhere I don't see where. And in any case, is it a good idea? We could fairly easily change things so that these cases are guaranteed distinguishable. We're using an entire byte to convey one bit of information (key or value); I'm inclined to redefine the flag byte so that it tells not just that but which JSON datatype is involved. regards, tom lane
On Wed, May 7, 2014 at 12:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > The jsonb_ops storage format for values is inherently lossy, because it > cannot distinguish the string values "n", "t", "f" from JSON null or > boolean true, false respectively; nor does it know the difference between > a number and a string containing digits. This appears to not quite be a > bug because the consistent functions force recheck for all queries that > care about values (as opposed to keys). But if it's documented anywhere > I don't see where. The fact that we *don't* set the reset flag for JsonbExistsStrategyNumber, and why that's okay is prominently documented. So I'd say that it is. > And in any case, is it a good idea? We could fairly > easily change things so that these cases are guaranteed distinguishable. > We're using an entire byte to convey one bit of information (key or > value); I'm inclined to redefine the flag byte so that it tells not just > that but which JSON datatype is involved. It seemed simpler to do it that way. As I've said before, jsonb_ops is mostly concerned with hstore-style indexing. It could also be particularly useful for expressional indexes on "tags" arrays of strings, which is a common use-case. jsonb_hash_ops on the other hand is for testing containment, which is useful for querying heavily nested documents, where typically there is a very low selectivity for keys. It's not the default largely because I was concerned about not supporting all indexable operators by default, and because I suspected that it would be preferred to have a default closer to hstore. Anyway, doing things that way for values won't obviate the need to set the reset flag, unless you come up with a much more sophisticated scheme. Existence (of keys) is only tested in respect of the top-level. Containment (where values are tested) is more complicated. -- Peter Geoghegan
Peter Geoghegan <pg@heroku.com> writes: > On Wed, May 7, 2014 at 12:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Is this an obsolete speculation about writing jsonb_hash_ops, or is there >> still something worth preserving there? If so, what? > This is not obsolete. It would equally apply to a GiST opclass that > wanted to support our current definition of existence. Array elements > are keys simply by virtue of being strings, but otherwise are treated > as values. See the large comment block within gin_extract_jsonb(). It's not that aspect I'm complaining about, it's the bit about "one day we may wish to write". This comment should restrict itself to describing what jsonb_ops does, not make already-or-soon-to-be-obsolete statements about what other opclasses might or might not do. regards, tom lane
Peter Geoghegan <pg@heroku.com> writes: > On Wed, May 7, 2014 at 12:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> The jsonb_ops storage format for values is inherently lossy, because it >> cannot distinguish the string values "n", "t", "f" from JSON null or >> boolean true, false respectively; nor does it know the difference between >> a number and a string containing digits. This appears to not quite be a >> bug because the consistent functions force recheck for all queries that >> care about values (as opposed to keys). But if it's documented anywhere >> I don't see where. > The fact that we *don't* set the reset flag for > JsonbExistsStrategyNumber, and why that's okay is prominently > documented. So I'd say that it is. Meh. Are you talking about the large comment block in gin_extract_jsonb? The readability of that comment starts to go downhill with its use of "reset" to refer to what everything else calls a "recheck" flag, and in any case it's claiming that we *don't* need a recheck for exists (a statement I suspect to be false, but more later). It entirely fails to explain that other query types *do* need a recheck because of arbitrary decisions about not representing JSON datatype information fully. There's another comment in gin_consistent_jsonb that's just as misleading, because it mentions (vaguely) some reasons why recheck is necessary without mentioning this one. A larger issue here is that, to the extent that this comment even has the information I'm after, the comment is in the wrong place. It is not attached to the code that's actually making the lossy representational choices (ie, make_scalar_key), nor to the code that decides whether or not a recheck is needed (ie, gin_consistent_jsonb). I don't think that basic architectural choices like these should be relegated to comment blocks inside specific functions to begin with. A README file would be better, perhaps, but there's not a specific directory associated with the jsonb code; so I think this sort of info belongs either in jsonb.h or in the file header comment for jsonb_gin.c. > Anyway, doing things that way for values won't obviate the need to set > the reset flag, unless you come up with a much more sophisticated > scheme. Existence (of keys) is only tested in respect of the > top-level. Containment (where values are tested) is more complicated. I'm not expecting that it will make things better for the current query operators. I am thinking that exact rather than lossy storage might help for future operators wanting to use this same index representation. Once we ship 9.4, it's going to be very hard to change the index representation, especially for the default opclass (cf the business about which opclass is default for type inet). So it behooves us to not throw information away needlessly. regards, tom lane
On 05/07/2014 04:13 PM, Tom Lane wrote: > A README file would be better, > perhaps, but there's not a specific directory associated with the jsonb > code; so I think this sort of info belongs either in jsonb.h or in the > file header comment for jsonb_gin.c. > > Is there any reason we couldn't have a README.jsonb? cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > On 05/07/2014 04:13 PM, Tom Lane wrote: >> A README file would be better, >> perhaps, but there's not a specific directory associated with the jsonb >> code; so I think this sort of info belongs either in jsonb.h or in the >> file header comment for jsonb_gin.c. > Is there any reason we couldn't have a README.jsonb? We could, but the only place I can see to put it would be in backend/utils/adt/, which seems like a poor precedent; I don't want to end up with forty-two README.foo files in there. The header comments for the jsonbxxx.c files are probably better candidates for collecting this sort of info. (The larger problem here is that utils/adt/ has become a catchbasin for all kinds of stuff that can barely squeeze under the rubric of "abstract data type". But fixing that is something I don't care to tackle now.) regards, tom lane