Thread: PATCH: Add hstore_to_json()
I just realized that this was easy to do, and despite my complete lack of C skillz was able to throw this together in a couple of hours. It might be handy to some, though the possible downsides are:

* No json_to_hstore().
* Leads to requests for hstore_to_yaml(), hstore_to_xml(), etc.
* Andrew Gierth said “no” when I suggested it.

But it's kind of handy, too. Thoughts?

Best,

David
On Wed, Dec 16, 2009 at 2:28 PM, David E. Wheeler <david@kineticode.com> wrote:
> I just realized that this was easy to do, and despite my complete lack of C skillz was able to throw this together in a couple of hours. It might be handy to some, though the possible downsides are:
>
> * No json_to_hstore().
> * Leads to requests for hstore_to_yaml(), hstore_to_xml(), etc.
> * Andrew Gierth said “no” when I suggested it.
>
> But it's kind of handy, too. Thoughts?

I like it. The regression tests you've added seem to cover a lot of cases that aren't really different, without covering some that are probably worth trying, like multiple key/value pairs. Also, the comment in the function you've added looks like a cut-and-paste from somewhere else, which might not be the best way to document. With regard to the underlying issue, why can't we just use a StringInfo and forget about it?

Also, your indentation is not entirely consistent. If this gets consensus, that will have to be fixed before it can be committed, so it would be nice if you could do that rather than leaving it for the eventual committer.

...Robert
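[Editorial note: for readers unfamiliar with what such a function would produce, here is a sketch of the conversion being discussed — hstore's flat text=>text pairs rendered as a JSON object, appended piece by piece the way PostgreSQL's StringInfo buffer would. This is illustrative Python, not the patch's C code; the function name and exact output spacing are assumptions, and since hstore values are always text or NULL, every JSON value is a string or null.]

```python
import json

def hstore_to_json_text(pairs):
    """Render a sequence of (key, value) text pairs -- hstore's data
    model -- as a JSON object string.  Mirroring the StringInfo
    approach, the result is built by appending parts rather than
    sizing the whole buffer up front."""
    parts = []
    for key, value in pairs:
        # hstore values are text or NULL; NULL maps to JSON null.
        rendered = "null" if value is None else json.dumps(value)
        parts.append(f"{json.dumps(key)}: {rendered}")
    return "{" + ", ".join(parts) + "}"

print(hstore_to_json_text([("a", "1"), ("b", None)]))
# {"a": "1", "b": null}
```

Note that this also shows why Robert asks for multiple-pair tests: the separator logic (the `", "` between pairs) is exactly the part a single-pair test never exercises.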
On Dec 16, 2009, at 2:45 PM, Robert Haas wrote:

> I like it. The regression tests you've added seem to cover a lot of cases that aren't really different without covering some that are probably worth trying, like multiple key/value pairs. Also, the comment in the function you've added looks like a cut-and-paste from somewhere else, which might not be the best way to document. With regard to the underlying issue, why can't we just use a StringInfo and forget about it?

Dunno. I just duped hstore_out(). I agree there should be more edge cases.

> Also, your indentation is not entirely consistent. If this gets consensus, that will have to be fixed before it can be committed, so it would be nice if you could do that rather than leaving it for the eventual committer.

The indentation is also largely copied; wouldn't pg_indent fix it?

Best,

David
On Wed, Dec 16, 2009 at 5:58 PM, David E. Wheeler <david@kineticode.com> wrote:
> On Dec 16, 2009, at 2:45 PM, Robert Haas wrote:
>
>> I like it. The regression tests you've added seem to cover a lot of cases that aren't really different without covering some that are probably worth trying, like multiple key/value pairs. Also, the comment in the function you've added looks like a cut-and-paste from somewhere else, which might not be the best way to document. With regard to the underlying issue, why can't we just use a StringInfo and forget about it?
>
> Dunno. I just duped hstore_out(). I agree there should be more edge cases.
>
>> Also, your indentation is not entirely consistent. If this gets consensus, that will have to be fixed before it can be committed, so it would be nice if you could do that rather than leaving it for the eventual committer.
>
> The indentation is also largely copied; wouldn't pg_indent fix it?

Yeah, eventually, but that's not really a great way of dealing with it.

http://archives.postgresql.org/pgsql-hackers/2009-12/msg01208.php

...Robert
On Wed, 2009-12-16 at 11:28 -0800, David E. Wheeler wrote:

> I just realized that this was easy to do, and despite my complete lack of C skillz was able to throw this together in a couple of hours. It might be handy to some, though the possible downsides are:
>
> * No json_to_hstore().
> * Leads to requests for hstore_to_yaml(), hstore_to_xml(), etc.
> * Andrew Gierth said “no” when I suggested it.
>
> But it's kind of handy, too. Thoughts?

Should we create a json type before adding all kinds of json formatted data? Or are we content with json as text?
On Dec 18, 2009, at 4:49 AM, Peter Eisentraut wrote:

> Should we create a json type before adding all kinds of json formatted data? Or are we content with json as text?

json_data_type++

D
On Fri, Dec 18, 2009 at 11:32 AM, David E. Wheeler <david@kineticode.com> wrote:
> On Dec 18, 2009, at 4:49 AM, Peter Eisentraut wrote:
>
>> Should we create a json type before adding all kinds of json formatted data? Or are we content with json as text?
>
> json_data_type++

What would that do for us?

I'm not opposed to it, but it seems like the more important thing would be to provide functions or operators that can do things like extract an array, extract a hash key, identify whether something is a hash, list, or scalar, etc.

...Robert
On Dec 18, 2009, at 8:51 AM, Robert Haas wrote:

> What would that do for us?
>
> I'm not opposed to it, but it seems like the more important thing would be to provide functions or operators that can do things like extract an array, extract a hash key, identify whether something is a hash, list, or scalar, etc.

Such things would be included with such a data type, no?

Best,

David
Robert Haas wrote:
> On Fri, Dec 18, 2009 at 11:32 AM, David E. Wheeler <david@kineticode.com> wrote:
>> On Dec 18, 2009, at 4:49 AM, Peter Eisentraut wrote:
>>
>>> Should we create a json type before adding all kinds of json formatted data? Or are we content with json as text?
>>
>> json_data_type++
>
> What would that do for us?
>
> I'm not opposed to it, but it seems like the more important thing would be to provide functions or operators that can do things like extract an array, extract a hash key, identify whether something is a hash, list, or scalar, etc.

In principle it's not a bad idea to have a JSON type, for several reasons. First, it's a better match than hstore for serializing an arbitrary tuple, because unlike hstore it can have nested arrays and composites, just as tuples can. Second, it might well be very useful if we could easily return results as JSON to AJAX applications, which are increasingly becoming the norm. And similarly we might be able to reduce application load if Postgres could perform operations on JSON, rather than having to return it all to the client to process.

I think it would be useful if someone produced a JSON module as, say, a pgFoundry project, to start with, and we would then be better able to assess its usefulness. An interesting question would be how one might sanely index such things.

You're correct that we don't necessarily need a new type; we could just make it text and have a bunch of operations, but that seems to violate the principle of data type abstraction a bit. If the operations can be sure that the object is valid JSON, they could skip a bunch of sanity checks that they would otherwise need to do if just handed an arbitrary piece of text.

cheers

andrew
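[Editorial note: Andrew's first point — that JSON, unlike hstore, nests — can be seen in a few lines. The row below is an invented example (not from the thread), sketched in Python rather than SQL, but it shows why a tuple with array or composite columns maps onto JSON losslessly while hstore would have to squash the nested parts into opaque strings.]

```python
import json

# A row with an array column and a composite column.  hstore is a flat
# text => text map, so "tags" and "addr" would have to be flattened;
# JSON represents the nesting directly.
row = {
    "id": 7,
    "tags": ["pg", "json"],                       # array-valued column
    "addr": {"city": "Portland", "zip": "97209"}, # composite column
}

serialized = json.dumps(row, sort_keys=True)
assert json.loads(serialized) == row  # round-trips with structure intact
print(serialized)
```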
Andrew Dunstan <andrew@dunslane.net> writes:
> You're correct that we don't necessarily need a new type, we could just make it text and have a bunch of operations, but that seems to violate the principle of data type abstraction a bit.

I think the relevant precedent is that we have an xml type. While I surely don't want to follow the SQL committee's precedent of inventing a ton of special syntax for xml support, it might be useful to look at that for suggestions of what functionality would be useful for a json type.

[ I can already hear somebody insisting on a yaml type :-( ]

regards, tom lane
Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> You're correct that we don't necessarily need a new type, we could just make it text and have a bunch of operations, but that seems to violate the principle of data type abstraction a bit.
>
> I think the relevant precedent is that we have an xml type. While I surely don't want to follow the SQL committee's precedent of inventing a ton of special syntax for xml support, it might be useful to look at that for suggestions of what functionality would be useful for a json type.
>
> [ I can already hear somebody insisting on a yaml type :-( ]

Now that's a case where I think a couple of converter functions at most should meet the need.

cheers

andrew
Andrew Dunstan wrote:
> Tom Lane wrote:
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>> You're correct that we don't necessarily need a new type, we could just make it text and have a bunch of operations, but that seems to violate the principle of data type abstraction a bit.
>>
>> I think the relevant precedent is that we have an xml type. While I surely don't want to follow the SQL committee's precedent of inventing a ton of special syntax for xml support, it might be useful to look at that for suggestions of what functionality would be useful for a json type.
>>
>> [ I can already hear somebody insisting on a yaml type :-( ]
>
> Now that's a case where I think a couple of converter functions at most should meet the need.

I can see this feature getting web developers more excited about Postgres.

-- 
Bruce Momjian <bruce@momjian.us>  http://momjian.us
EnterpriseDB                      http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> [ I can already hear somebody insisting on a yaml type :-( ]
>
> Now that's a case where I think a couple of converter functions at most should meet the need.

Well, actually, now that you mention it: how much of a json type would be duplicative of the xml stuff? Would it be sufficient to provide json <-> xml converters and let the latter type do all the heavy lifting? (If so, this patch ought to be hstore_to_xml instead.)

regards, tom lane
Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Tom Lane wrote:
>>> [ I can already hear somebody insisting on a yaml type :-( ]
>>
>> Now that's a case where I think a couple of converter functions at most should meet the need.
>
> Well, actually, now that you mention it: how much of a json type would be duplicative of the xml stuff? Would it be sufficient to provide json <-> xml converters and let the latter type do all the heavy lifting? (If so, this patch ought to be hstore_to_xml instead.)

But then there's the matter of overhead: how much would be wasted by transforming to XML, and then parsing the XML back to transform to JSON?

-- 
Alvaro Herrera                  http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> Well, actually, now that you mention it: how much of a json type would be duplicative of the xml stuff? Would it be sufficient to provide json <-> xml converters and let the latter type do all the heavy lifting? (If so, this patch ought to be hstore_to_xml instead.)
>
> But then there's the matter of overhead: how much would be wasted by transforming to XML, and then parsing the XML back to transform to JSON?

Well, that would presumably happen only when sending data to or from the client. It's not obvious that it would be much more expensive than the syntax checking you'd have to do anyway.

If there's some reason to think that operating on json data would be much less expensive than operating on xml, there might be a case for having two distinct sets of operations internally, but I haven't heard anybody make that argument.

regards, tom lane
On Fri, Dec 18, 2009 at 3:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Tom Lane wrote:
>>> Well, actually, now that you mention it: how much of a json type would be duplicative of the xml stuff? Would it be sufficient to provide json <-> xml converters and let the latter type do all the heavy lifting? (If so, this patch ought to be hstore_to_xml instead.)
>
>> But then there's the matter of overhead: how much would be wasted by transforming to XML, and then parsing the XML back to transform to JSON?
>
> Well, that would presumably happen only when sending data to or from the client. It's not obvious that it would be much more expensive than the syntax checking you'd have to do anyway.
>
> If there's some reason to think that operating on json data would be much less expensive than operating on xml, there might be a case for having two distinct sets of operations internally, but I haven't heard anybody make that argument.

One problem is that there is not a single well-defined mapping between these types. I would say generally that XML and YAML both have more types of constructs than JSON. The obvious ways of translating an arbitrary XML document to JSON are likely not to be what people want in particular cases.

I think the performance argument is compelling, too, but we can't even try benchmarking it unless we can define what we're even talking about.

...Robert
+1 for such a feature, simply to avoid the need of writing an hstore parser (which wasn't too bad to write, but it felt unnecessary). Doesn't matter to me if it's hstore-to-json or hstore-to-xml or hstore-to-yaml. Just something that parsers are readily available for. Heck, I wouldn't mind if hstore moved to using any one of those for its external representation by default.

Tom Lane wrote:
> a ton of special syntax for xml support, ...a json type...
> [ I can already hear somebody insisting on a yaml type :-( ]

If these were CPAN-like installable modules, I'd hope there eventually would be one. Don't most languages and platforms have both YAML and JSON libraries? YAML's user-defined types are an example of where this might be useful eventually.

Tom Lane wrote:
> Well, actually, now that you mention it: how much of a json type would be duplicative of the xml stuff? Would it be sufficient to provide json <-> xml converters and let the latter type do all the heavy lifting?

I imagine eventually a JSON type could validate fields using JSON Schema. But that's drifting away from hstore.

> (If so, this patch ought to be hstore_to_xml instead.)

Doesn't matter to me so long as it's any format with readily available parsers.
On Fri, 2009-12-18 at 11:51 -0500, Robert Haas wrote:
> On Fri, Dec 18, 2009 at 11:32 AM, David E. Wheeler <david@kineticode.com> wrote:
>> On Dec 18, 2009, at 4:49 AM, Peter Eisentraut wrote:
>>
>>> Should we create a json type before adding all kinds of json formatted data? Or are we content with json as text?
>>
>> json_data_type++
>
> What would that do for us?

At the moment it would be more of a placeholder, because if we later decide to add full-blown JSON-constructing and -destructing functionality, it would be difficult to change the signatures of all the existing functionality.
On Fri, Dec 18, 2009 at 4:39 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> On Fri, 2009-12-18 at 11:51 -0500, Robert Haas wrote:
>> On Fri, Dec 18, 2009 at 11:32 AM, David E. Wheeler <david@kineticode.com> wrote:
>>> On Dec 18, 2009, at 4:49 AM, Peter Eisentraut wrote:
>>>
>>>> Should we create a json type before adding all kinds of json formatted data? Or are we content with json as text?
>>>
>>> json_data_type++
>>
>> What would that do for us?
>
> At the moment it would be more of a placeholder, because if we later decide to add full-blown JSON-constructing and -destructing functionality, it would be difficult to change the signatures of all the existing functionality.

Good thought.

...Robert
Robert Haas wrote:
> On Fri, Dec 18, 2009 at 3:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Alvaro Herrera <alvherre@commandprompt.com> writes:
>>> Tom Lane wrote:
>>>> Well, actually, now that you mention it: how much of a json type would be duplicative of the xml stuff? Would it be sufficient to provide json <-> xml converters and let the latter type do all the heavy lifting? (If so, this patch ought to be hstore_to_xml instead.)
>>>
>>> But then there's the matter of overhead: how much would be wasted by transforming to XML, and then parsing the XML back to transform to JSON?
>>
>> Well, that would presumably happen only when sending data to or from the client. It's not obvious that it would be much more expensive than the syntax checking you'd have to do anyway.
>>
>> If there's some reason to think that operating on json data would be much less expensive than operating on xml, there might be a case for having two distinct sets of operations internally, but I haven't heard anybody make that argument.
>
> One problem is that there is not a single well-defined mapping between these types. I would say generally that XML and YAML both have more types of constructs than JSON. The obvious ways of translating an arbitrary XML document to JSON are likely not to be what people want in particular cases.

Right. XML semantics are richer, as I pointed out when we were discussing the various EXPLAIN formats.

> I think the performance argument is compelling, too, but we can't even try benchmarking it unless we can define what we're even talking about.

Yes, there is indeed reason to think that JSON processing, especially parsing, will be more efficient, and I suspect we can provide ways of accessing the data that are lots faster than XPath. JSON is designed to be lightweight; XML is not.

Mind you, the XML processing is not too bad. I have been working much of the last few months on a large custom billing system which produces XML output to create paper/online invoices from, and the XML construction is one of the fastest parts of the whole system.

cheers

andrew
On Fri, Dec 18, 2009 at 7:05 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> One problem is that there is not a single well-defined mapping between these types. I would say generally that XML and YAML both have more types of constructs than JSON. The obvious ways of translating an arbitrary XML document to JSON are likely not to be what people want in particular cases.
>
> Right. XML semantics are richer, as I pointed out when we were discussing the various EXPLAIN formats.

You say "richer"; I say "harder to map onto data structures". But we can agree to disagree on this one... I'm sure there are good tools out there. :-)

>> I think the performance argument is compelling, too, but we can't even try benchmarking it unless we can define what we're even talking about.
>
> Yes, there is indeed reason to think that JSON processing, especially parsing, will be more efficient, and I suspect we can provide ways of accessing the data that are lots faster than XPath. JSON is designed to be lightweight, XML is not.
>
> Mind you, the XML processing is not too bad - I have been working much of the last few months on a large custom billing system which produces XML output to create paper/online invoices from, and the XML construction is one of the fastest parts of the whole system.

That doesn't surprise me very much. If there's a problem with operations on XML, I think it tends to be more on the parsing side than the generation side. But even there I agree it's not terrible. The main reason I like JSON is for the simpler semantics: there's exactly one way to deserialize a document into a data structure, and everyone agrees on what it is, so the error cases are all handled by the parser itself, rather than left to the application programmer.

...Robert
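[Editorial note: Robert's point that "the error cases are all handled by the parser itself" can be demonstrated with any conforming JSON parser. The sketch below uses Python's stdlib parser purely as an illustration; the document strings are invented examples.]

```python
import json

# A well-formed document maps to exactly one data structure.
good = '{"pid": 42, "tags": ["a", "b"]}'
data = json.loads(good)
assert data["tags"] == ["a", "b"]

# A malformed document (trailing comma) is rejected outright by the
# parser, so the application never has to guess at its meaning.
bad = '{"pid": 42,}'
try:
    json.loads(bad)
    parsed = True
except ValueError:
    parsed = False
assert parsed is False
```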
On Fri, Dec 18, 2009 at 4:39 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> On Fri, 2009-12-18 at 11:51 -0500, Robert Haas wrote:
>> On Fri, Dec 18, 2009 at 11:32 AM, David E. Wheeler <david@kineticode.com> wrote:
>>> On Dec 18, 2009, at 4:49 AM, Peter Eisentraut wrote:
>>>
>>>> Should we create a json type before adding all kinds of json formatted data? Or are we content with json as text?
>>>
>>> json_data_type++
>>
>> What would that do for us?
>
> At the moment it would be more of a placeholder, because if we later decide to add full-blown JSON-constructing and -destructing functionality, it would be difficult to change the signatures of all the existing functionality.

I've been mulling this over and I think this is a pretty good idea. If we could get it done in time for 8.5, we could actually change the output type of EXPLAIN (FORMAT JSON) to the new type. If not, I'm inclined to say that we should postpone adding any more functions that generate json output until such time as we have a real type for it. I wouldn't feel too bad about changing the output type of EXPLAIN (FORMAT JSON) from text to json in 8.6, because it's relatively difficult to be depending on that for anything very important. It's much easier to be depending on something like this, and changing it later could easily break working applications.

Anyone have an interest in taking a crack at this?

...Robert
On Dec 29, 2009, at 6:14 PM, Robert Haas wrote:

> I've been mulling this over and I think this is a pretty good idea. If we could get it done in time for 8.5, we could actually change the output type of EXPLAIN (FORMAT JSON) to the new type. If not, I'm inclined to say that we should postpone adding any more functions that generate json output until such time as we have a real type for it. I wouldn't feel too bad about changing the output type of EXPLAIN (FORMAT JSON) from text to json in 8.6, because it's relatively difficult to be depending on that for anything very important. It's much easier to be depending on something like this, and changing it later could easily break working applications.

+1

> Anyone have an interest in taking a crack at this?

There are a bunch of C libraries listed on http://www.json.org/. Perhaps one has a suitable license and a clean enough implementation to be used?

Best,

David
On Wed, Dec 30, 2009 at 12:38 PM, David E. Wheeler <david@kineticode.com> wrote:
> On Dec 29, 2009, at 6:14 PM, Robert Haas wrote:
>
>> I've been mulling this over and I think this is a pretty good idea. If we could get it done in time for 8.5, we could actually change the output type of EXPLAIN (FORMAT JSON) to the new type. If not, I'm inclined to say that we should postpone adding any more functions that generate json output until such time as we have a real type for it. I wouldn't feel too bad about changing the output type of EXPLAIN (FORMAT JSON) from text to json in 8.6, because it's relatively difficult to be depending on that for anything very important. It's much easier to be depending on something like this, and changing it later could easily break working applications.
>
> +1
>
>> Anyone have an interest in taking a crack at this?
>
> There are a bunch of C libraries listed on http://www.json.org/. Perhaps one has a suitable license and a clean enough implementation to be used?

It looks like they are all very permissive, though I wonder what the legal effect of a license clause that the software be used for Good and not Evil might be.

I guess the question is whether we would slurp one of these into our code base, or whether we would add an analog of --with-libxml and provide only a stub implementation when the library is not present. Any opinions? Does anyone know whether any of these implementations are commonly packaged already?

...Robert
On Dec 30, 2009, at 9:53 AM, Robert Haas wrote:

> It looks like they are all very permissive, though I wonder what the legal effect of a license clause that the software be used for Good and not Evil might be.

Yeah, that might be too restrictive, given that PostgreSQL is used by government agencies and porn sites. Not that a given gov or porn site is inherently evil, mind, but some are. ;-P

> I guess the question is whether we would slurp one of these into our code base, or whether we would add an analog of --with-libxml and provide only a stub implementation when the library is not present. Any opinions? Does anyone know whether any of these implementations are commonly packaged already?

I doubt that they have similar interfaces, so we'd probably have to rely on one. I'd probably favor embedding, personally; it's less work for admins.

Best,

David
David E. Wheeler wrote:
>> I guess the question is whether we would slurp one of these into our code base, or whether we would add an analog of --with-libxml and provide only a stub implementation when the library is not present. Any opinions? Does anyone know whether any of these implementations are commonly packaged already?
>
> I doubt that they have similar interfaces, so we'd probably have to rely on one. I'd probably favor embedding, personally; it's less work for admins.

I think we are getting the cart way before the horse. I'd like to see at least the outline of an API before we go any further. JSON is, shall we say, lightly specified, and doesn't appear to have any equivalent to XPath and friends, for example. How will we extract values from a JSON object? How will we be able to set values inside them? In ECMAScript it's not a problem, because the objects returned are just like any other objects, but that's not the case here. These are the sorts of questions we need to answer before we look at any implementation details, I think.

cheers

andrew
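[Editorial note: Andrew's "How will we extract values from a JSON object?" is the question the thread never resolves. As a sketch of the kind of accessor being asked about, here is a hypothetical path-walking extractor in Python. Nothing like it existed in PostgreSQL at the time; the function name and path semantics are invented for illustration.]

```python
import json

def json_extract(doc, *path):
    """Parse a JSON document and walk down through object keys and
    array indexes -- the minimal 'get a value out' operation Andrew
    is asking the API to define."""
    node = json.loads(doc)
    for step in path:
        node = node[step]  # str steps index objects, int steps index arrays
    return node

doc = '{"user": {"name": "david", "langs": ["perl", "c"]}}'
print(json_extract(doc, "user", "langs", 1))
# c
```

The hard API questions Andrew raises start exactly here: what should happen on a missing key, what the return type of a scalar extraction should be, and how "set a value inside" would be spelled.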
On Wed, Dec 30, 2009 at 1:23 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> I think we are getting the cart way before the horse. I'd like to see at least the outline of an API before we go any further. JSON is, shall we say, lightly specified, and doesn't appear to have any equivalent to XPath and friends, for example. How will we extract values from a JSON object? How will we be able to set values inside them? In ECMAScript it's not a problem, because the objects returned are just like any other objects, but that's not the case here. These are the sorts of questions we need to answer before we look at any implementation details, I think.

I think the idea that Peter was proposing was to start by creating a type that doesn't necessarily have a lot of operators or functions associated with it, with the thought of adding those later. It would still need to validate the input, of course.

Anyhow, that might be a bad way to approach the problem, but I think that's how we got here.

...Robert
Robert Haas wrote:
> On Wed, Dec 30, 2009 at 1:23 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> I think we are getting the cart way before the horse. I'd like to see at least the outline of an API before we go any further. JSON is, shall we say, lightly specified, and doesn't appear to have any equivalent to XPath and friends, for example. How will we extract values from a JSON object? How will we be able to set values inside them? In ECMAScript it's not a problem, because the objects returned are just like any other objects, but that's not the case here. These are the sorts of questions we need to answer before we look at any implementation details, I think.
>
> I think the idea that Peter was proposing was to start by creating a type that doesn't necessarily have a lot of operators or functions associated with it, with the thought of adding those later. It would still need to validate the input, of course.
>
> Anyhow, that might be a bad way to approach the problem, but I think that's how we got here.

That does not at all seem like a good way to go. Until we know what operations we want to support, we have no idea which library to use. We cannot assume that they will all support what we want to do.

cheers

andrew
On Wed, Dec 30, 2009 at 2:26 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> Robert Haas wrote:
>> On Wed, Dec 30, 2009 at 1:23 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>>> I think we are getting the cart way before the horse. I'd like to see at least the outline of an API before we go any further. JSON is, shall we say, lightly specified, and doesn't appear to have any equivalent to XPath and friends, for example. How will we extract values from a JSON object? How will we be able to set values inside them? In ECMAScript it's not a problem, because the objects returned are just like any other objects, but that's not the case here. These are the sorts of questions we need to answer before we look at any implementation details, I think.
>>
>> I think the idea that Peter was proposing was to start by creating a type that doesn't necessarily have a lot of operators or functions associated with it, with the thought of adding those later. It would still need to validate the input, of course.
>>
>> Anyhow, that might be a bad way to approach the problem, but I think that's how we got here.
>
> That does not at all seem like a good way to go. Until we know what operations we want to support we have no idea which library to use. We can not assume that they will all support what we want to do.

Well, that is a bit of a problem, yes... Doesn't seem insurmountable, though, just one more thing to think about as we're having this conversation. Someone else will need to weigh in on this point though, as I don't use JSON in a way that would make anything beyond validation particularly relevant.

...Robert
On Wed, 2009-12-30 at 12:53 -0500, Robert Haas wrote:

> It looks like they are all very permissive, though I wonder what the legal effect of a license clause that the software be used for Good and not Evil might be.

It's not without issues, apparently:

http://grep.be/blog/en/computer/legal/good_not_evil
On Wed, 2009-12-30 at 13:23 -0500, Andrew Dunstan wrote:

> I'd like to see at least the outline of an API before we go any further. JSON is, shall we say, lightly specified, and doesn't appear to have any equivalent to XPath and friends, for example. How will we extract values from a JSON object? How will we be able to set values inside them?

I think the primary use will be to load a JSON value into Perl or Python and process it there. So a json type that doesn't have any interesting operators doesn't sound useless to me. The features I would like to get out of it are input validation and encoding handling and smooth integration with said languages.
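[Editorial note: the "input validation only" type Peter describes amounts to an input function that accepts a value iff it parses. A minimal sketch, with an invented function name and Python's stdlib parser standing in for whatever C parser the server-side input function would use:]

```python
import json

def is_valid_json(text):
    """Accept the value iff it parses as JSON -- the one behavior a
    placeholder json type must have even with no operators at all."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False

print(is_valid_json('{"a": 1}'), is_valid_json("{oops"))
# True False
```

Everything else Peter wants (encoding handling, smooth hand-off to PL/Perl and PL/Python) layers on top of this check without changing the type's on-disk or textual form.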
> Doesn't seem insurmountable, though, just one more thing to think about as we're having this conversation. Someone else will need to weigh in on this point though, as I don't use JSON in a way that would make anything beyond validation particularly relevant.

I don't use JSON, but I do use YAML. Attached, please find a patch that implements hstore_to_yaml().

....just kidding. :)

> I think we are getting the cart way before the horse.

+1. Smells like a solution in search of a problem, as they say.

-- 
Greg Sabino Mullane greg@turnstep.com
On Dec 31, 2009, at 1:04 AM, Peter Eisentraut wrote:

> I think the primary use will be to load a JSON value into Perl or Python and process it there. So a json type that doesn't have any interesting operators doesn't sound useless to me. The features I would like to get out of it are input validation and encoding handling and smooth integration with said languages.

What about access to various parts of a JSON data structure? Or is that just asking for too much trouble up-front?

Best,

David
David E. Wheeler wrote:
> On Dec 31, 2009, at 1:04 AM, Peter Eisentraut wrote:
>
>> I think the primary use will be to load a JSON value into Perl or Python and process it there. So a json type that doesn't have any interesting operators doesn't sound useless to me. The features I would like to get out of it are input validation and encoding handling and smooth integration with said languages.
>
> What about access to various parts of a JSON data structure? Or is that just asking for too much trouble up-front?

IMNSHO it's essential. I think Peter's approach of ignoring this requirement is extremely shortsighted.

cheers

andrew
On Thu, Dec 31, 2009 at 11:12 AM, Andrew Dunstan <andrew@dunslane.net> wrote: > David E. Wheeler wrote: >> On Dec 31, 2009, at 1:04 AM, Peter Eisentraut wrote: >>> >>> I think the primary use will be to load a JSON value into Perl or Python >>> and process it there. So a json type that doesn't have any interesting >>> operators doesn't sound useless to me. The features I would like to get >>> out of it are input validation and encoding handling and smooth >>> integration with said languages. >>> >> >> What about access to various parts of a JSON data structure? Or is that >> just asking for too much trouble up-front? > > IMNSHO it's essential. I think Peter's approach of ignoring this requirement > is extremely shortsighted. I could go either way on this. As a practical matter, we probably shouldn't pick a library that is only a validator without any ability to manipulate the data structure. And as a further practical matter, that done, it's probably not that much work to expose whatever other functionality that library provides. But I would not go to the extent of saying that we should try to figure out from first principles what functionality we want to include and then make it a requirement that the chosen library must support all of those things. That seems like a recipe for failure... Anyhow, that brings me back to the question I asked upthread, which is "Can/should we suck one of these libraries into our code base (and if so, which?) or do we need to add an analogue of --with-libxml so that we can link against an external library if present and omit the feature otherwise?". Does anyone have any real-world experience with any of the JSON C libraries? ...Robert
Robert Haas wrote: > Anyhow, that brings me back to the question I asked upthread, which is > "Can/should we suck one of these libraries into our code base (and if > so, which?) or do we need to add an analogue of --with-libxml so that > we can link against an external library if present and omit the > feature otherwise?". > > Does anyone have any real-world experience with any of the JSON C libraries? > > > I do not, but I see that YAJL <http://lloyd.github.com/yajl/> is now in Fedora, and has a BSDish license, so maybe that's a good place to start. Maybe someone would like to try designing an API which could sit atop that. Then we would not need to speculate based on principle. I'd rather we use a library we can pull in like libxml than import the source and have to keep it in sync with upstream. cheers andrew
Robert Haas <robertmhaas@gmail.com> writes: > Anyhow, that brings me back to the question I asked upthread, which is > "Can/should we suck one of these libraries into our code base (and if > so, which?) or do we need to add an analogue of --with-libxml so that > we can link against an external library if present and omit the > feature otherwise?". Count me as -1 for "sucking in" any sizable amount of code for this. I do not wish to be on the hook to maintain something like that. regards, tom lane
On Thu, Dec 31, 2009 at 5:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> Anyhow, that brings me back to the question I asked upthread, which is >> "Can/should we suck one of these libraries into our code base (and if >> so, which?) or do we need to add an analogue of --with-libxml so that >> we can link against an external library if present and omit the >> feature otherwise?". > > Count me as -1 for "sucking in" any sizable amount of code for this. > I do not wish to be on the hook to maintain something like that. OK, that's why I ask these questions. :-) How much would be "siz(e)able"? ...Robert
Andrew Dunstan <andrew@dunslane.net> writes: >> Does anyone have any real-world experience with any of the JSON C libraries? > > I do not, but I see that YAJL <http://lloyd.github.com/yajl/> is now in > Fedora, and has a BSDish license It's there in debian too, unstable and testing, and should be there on the next stable (squeeze): http://packages.debian.org/source/sid/yajl Regards, -- dim
On tor, 2009-12-31 at 11:12 -0500, Andrew Dunstan wrote: > > David E. Wheeler wrote: > > On Dec 31, 2009, at 1:04 AM, Peter Eisentraut wrote: > > > > > >> I think the primary use will be to load a JSON value into Perl or Python > >> and process it there. So a json type that doesn't have any interesting > >> operators doesn't sound useless to me. The features I would like to get > >> out of it are input validation and encoding handling and smooth > >> integration with said languages. > >> > > > > What about access to various parts of a JSON data structure? Or is that just asking for too much trouble up-front? > IMNSHO it's essential. I think Peter's approach of ignoring this > requirement is extremely shortsighted. Whose requirement is it? I'm not ignoring it, but so far no one has actually said that it is a requirement and why.
Peter Eisentraut wrote: >> IMNSHO it's essential. I think Peter's approach of ignoring this >> requirement is extremely shortsighted. >> > > Whose requirement is it? I'm not ignoring it, but so far no one has > actually said that it is a requirement and why. > > Mine for one :-). Quite apart from any other reason, I would expect it to make indexing parts of the JSON more tractable. Say we use it to store a web session object, which is a natural enough use. I might well want to find or modify sessions with certain characteristics. I'm sure I wouldn't be the only possible user who would want substantially more from such a type than just being able to validate it. We have XPath for XML, and a substantial accessor API for hstore, so why would we want anything less for JSON? In general we have adopted an approach that allows for a very rich type system, with a substantial set of manipulator functions for almost all types. That's one of the things I find attractive about Postgres, so I think we should stick to it in this instance. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > Peter Eisentraut wrote: >> Whose requirement is it? I'm not ignoring it, but so far no one has >> actually said that it is a requirement and why. > Mine for one :-). I think there are a couple of interacting factors here. We are not likely to want to go far out of our way to support JSON operations that aren't implemented by the library we pick (which I think is Peter's underlying point) but at the same time the set of supported operations ought to be a factor in which library we pick (which I think is Andrew's point). So it would be a good idea to try to make a list of desirable operations before we go looking at individual libraries. Whether any particular missing features are showstoppers for the use of a given library is something that we can't reasonably determine if we don't have a pre-existing notion of what features we want. Note that it's perfectly reasonable to change our list of desired features based on what we find out about what's actually available --- but we need something to start out with. regards, tom lane
On fre, 2010-01-01 at 17:19 -0500, Andrew Dunstan wrote: > Mine for one :-). Quite apart from any other reason, I would expect it to > make indexing parts of the JSON more tractable. Say we use it to store a > web session object, which is a natural enough use. I might well want to > find or modify sessions with certain characteristics. I'm sure I > wouldn't be the only possible user who would want substantially more > from such a type than just being able to validate it. > We have XPath for XML, and a substantial accessor API for hstore, so why > would we want anything less for JSON? Well, because they are not the same. XML is a tree structure (and the XPath-SQL integration is already pretty weird), hstore is a set of key/value pairs, and JSON is, supposedly, an object, which doesn't map very well to SQL. Of course you could invent an API for JSON, but that doesn't mean it is necessary for a JSON type to exist, if you have PL/Perl and PL/Python as much better object-oriented APIs already available.
2010/1/3 Peter Eisentraut <peter_e@gmx.net>: > On fre, 2010-01-01 at 17:19 -0500, Andrew Dunstan wrote: >> Mine for one :-). Quite apart from any other reason, I would expect it to >> make indexing parts of the JSON more tractable. Say we use it to store a >> web session object, which is a natural enough use. I might well want to >> find or modify sessions with certain characteristics. I'm sure I >> wouldn't be the only possible user who would want substantially more >> from such a type than just being able to validate it. >> We have XPath for XML, and a substantial accessor API for hstore, so why >> would we want anything less for JSON? > > Well, because they are not the same. XML is a tree structure (and the > XPath-SQL integration is already pretty weird), hstore is a set of > key/value pairs, JSON is, supposedly, an object, which doesn't map very > well to SQL. JSON covers all of trees, objects (key/value pairs), and arrays, which helps with denormalizing tables. Moreover, I think it's complementary to SQL precisely because it doesn't map to SQL. I don't think there are many operations we need inside the DB for JSON, but indexing with GIN is at least one typical case, and that means we need an arbitrary "fetch value" operation on an object. And now that there are many server-side JavaScript environments like Node.js (http://nodejs.org/), storing, validating, and outputting JSON directly, without converting from another type, is a feature web developers very much want from an RDBMS. A question: is there any possibility of us having our own implementation to handle JSON (i.e. not using an external library)? Regards, -- Hitoshi Harada
Hitoshi Harada wrote: > A question: is there any possibility of us having our own > implementation to handle JSON (i.e. not using an external library)? > > > Why should we reinvent a wheel someone else has already invented? This is what shared libraries are all about. cheers andrew
2010/1/3 Andrew Dunstan <andrew@dunslane.net>: > > > Hitoshi Harada wrote: >> >> A question: is there any possibility of us having our own >> implementation to handle JSON (i.e. not using an external library)? >> >> >> > > Why should we reinvent a wheel someone else has already invented? This is > what shared libraries are all about. Because what we need may be a wheel nobody has invented yet. I don't object to using one of the external libraries, but I don't want our specification decided by theirs. Regards, -- Hitoshi Harada
Hitoshi Harada wrote: > 2010/1/3 Andrew Dunstan <andrew@dunslane.net>: > >> Hitoshi Harada wrote: >> >>> A question: is there any possibility of us having our own >>> implementation to handle JSON (i.e. not using an external library)? >>> >> Why should we reinvent a wheel someone else has already invented? This is >> what shared libraries are all about. >> > Because what we need may be a wheel nobody has invented yet. > I don't object to using one of the external libraries, but I don't > want our specification decided by theirs. > > OK, we really need to stop being abstract and say what operations we want. I think the minimal functionality I'd want is: convert record to JSON convert JSON to record extract a value, or set of values, from JSON composition of JSON Now all the libraries I have looked at (briefly) would require some code to provide for those, possibly quite a bit of code, but that doesn't mean we should just start from scratch and write our own JSON parser too. cheers andrew
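Andrew's first two items, record-to-JSON and JSON-to-record, can be illustrated outside the database. A rough sketch of the round-trip semantics in Python (the helper names are hypothetical; a real implementation would be C code working against a registered row type):

```python
import json

def record_to_json(record):
    """Serialize a record (modeled here as a dict) to JSON text."""
    return json.dumps(record)

def json_to_record(text, fields):
    """Parse JSON text back into a record, checking it has the right shape."""
    value = json.loads(text)
    if not isinstance(value, dict) or set(value) != set(fields):
        raise ValueError("JSON value does not match the record's shape")
    return value

session = {"user_id": 42, "logged_in": True, "cart": ["a", "b"]}
text = record_to_json(session)
assert json_to_record(text, ["user_id", "logged_in", "cart"]) == session
```

The shape check is the interesting part: JSON-to-record only makes sense against an existing record type, which is why the conversion takes the expected field list as an argument.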
On Jan 3, 2010, at 8:00 AM, Andrew Dunstan wrote: > I think the minimal functionality I'd want is: > > convert record to JSON > convert JSON to record With caveats as to dealing with nested structures (can a record be an attribute of a record?). > extract a value, or set of values, from JSON > composition of JSON There's a lot of functionality in hstore that I'd like to see. It'd make sense to use the same operators for the same operations. I think I'd start with hstore as a basic spec. Best, David
David E. Wheeler wrote: > On Jan 3, 2010, at 8:00 AM, Andrew Dunstan wrote: > > >> I think the minimal functionality I'd want is: >> >> convert record to JSON >> convert JSON to record >> > > With caveats as to dealing with nested structures (can a record be an attribute of a record?). > We allow composites as fields. The biggest mismatch in the type model is probably w.r.t. arrays. JSON arrays can be heterogeneous and non-rectangular, AIUI. > >> extract a value, or set of values, from JSON >> composition of JSON >> > > There's a lot of functionality in hstore that I'd like to see. It'd make sense to use the same operators for the same operations. I think I'd start with hstore as a basic spec. > > > OK, but hstores are flat, unlike JSON. We need some way to do the equivalent of xpath along the child axis and without predicate tests. hstore has no real equivalent because it has no nesting. cheers andrew
On Jan 3, 2010, at 11:40 AM, Andrew Dunstan wrote: > We allow composites as fields. The biggest mismatch in the type model is probably w.r.t. arrays. JSON arrays can be heterogeneous and non-rectangular, AIUI. Cool, that sounds right. > OK, but hstores are flat, unlike JSON. We need some way to do the equivalent of xpath along the child axis and without predicate tests. hstore has no real equivalent because it has no nesting. You mean so that you can fetch a nested value? Hrm. I agree that it'd have to be XPath-like. But perhaps we can use a JavaScript-y syntax for it? There could be an operator that returns records: % SELECT '{"foo":{"bar":["a","b","c"]}}' -> '["foo"]'; bar ------------- ("{a,b,c}") % SELECT '{"foo":{"bar":["a","b","c"]}}' -> '["foo"][1]'; 1 ----- (b) And another that returns values where possible and JSON where there are data structures. % SELECT '{"foo":{"bar":["a","b","c"]}}' => '["foo"]'; ?column? ------------------ {"bar":{a,b,c}"} % SELECT '{"foo":{"bar":["a","b","c"]}}' => '["foo"][1]'; ?column? ---------- b Not sure if the same function can return different values, or if it's even appropriate. In addition to returning JSON and TEXT as above, we'd also need to be able to return numbers: % SELECT '{"foo":{"bar":[22,42]}}' => '["foo"][1]'; ?column? ---------- 42 Thoughts? David
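The extraction operator sketched above amounts to walking a path of object keys and array indexes through the parsed document. A toy illustration in Python (the path is given here as a plain list rather than the bracketed-string syntax proposed above, which is hypothetical in any case):

```python
import json

def extract(text, path):
    """Walk a parsed JSON value along a path of object keys / array indexes."""
    value = json.loads(text)
    for step in path:
        value = value[step]  # dict lookup or list indexing, depending on the step
    return value

doc = '{"foo":{"bar":["a","b","c"]}}'
print(extract(doc, ["foo"]))            # the nested object {'bar': ['a', 'b', 'c']}
print(extract(doc, ["foo", "bar", 1]))  # the scalar 'b'
```

Whether the result comes back as a bare Python value or re-serialized JSON text is exactly the design question being debated in the messages that follow.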
2010/1/4 David E. Wheeler <david@kineticode.com>: > On Jan 3, 2010, at 11:40 AM, Andrew Dunstan wrote: > >> We allow composites as fields. The biggest mismatch in the type model is probably w.r.t. arrays. JSON arrays can be heterogeneous and non-rectangular, AIUI. > > Cool, that sounds right. Does that mean you'd have to create a composite type to create anonymous JSON? >> OK, but hstores are flat, unlike JSON. We need some way to do the equivalent of xpath along the child axis and without predicate tests. hstore has no real equivalent because it has no nesting. > > You mean so that you can fetch a nested value? Hrm. I agree that it'd have to be XPath-like. But perhaps we can use a JavaScript-y syntax for it? There could be an operator that returns records: > > % SELECT '{"foo":{"bar":["a","b","c"]}}' -> '["foo"]'; > bar > ------------- > ("{a,b,c}") > > % SELECT '{"foo":{"bar":["a","b","c"]}}' -> '["foo"][1]'; > 1 > ----- > (b) That sounds good and seems possible, as long as the operator always returns JSON. Perhaps every JSON fetch returns JSON, even if the result would be a number. You can cast it. % SELECT ('{"foo":{"bar":["a","b","c"]}}' -> '["foo"][1]')::text; 1 ----- b Regards, -- Hitoshi Harada
Hitoshi Harada wrote: > 2010/1/4 David E. Wheeler <david@kineticode.com>: > >> On Jan 3, 2010, at 11:40 AM, Andrew Dunstan wrote: >> >> >>> We allow composites as fields. The biggest mismatch in the type model is probably w.r.t. arrays. JSON arrays can be heterogeneous and non-rectangular, AIUI. >>> >> Cool, that sounds right. >> > > Does that mean you'd have to create a composite type to create anonymous JSON? > > No, not in the least. We should still store JSON as text. We should simply be able to convert a JSON value to a record of an existing type (providing it has the right shape) and a record (of any shape) to JSON. cheers andrew
On Jan 3, 2010, at 4:18 PM, Hitoshi Harada wrote: > That sounds good and seems possible, as long as the operator always > returns JSON. Perhaps every JSON fetch returns JSON, even if the result > would be a number. You can cast it. > > % SELECT ('{"foo":{"bar":["a","b","c"]}}' -> '["foo"][1]')::text; > 1 > ----- > b No, because 'b' isn't valid JSON. So if we want an interface that returns scalars, they can't be JSON. Best, David
On Sun, Jan 3, 2010 at 11:00 AM, Andrew Dunstan <andrew@dunslane.net> wrote: > Hitoshi Harada wrote: >> 2010/1/3 Andrew Dunstan <andrew@dunslane.net>: >>> Hitoshi Harada wrote: >>>> A question: is there any possibility of us having our own >>>> implementation to handle JSON (i.e. not using an external library)? >>> Why should we reinvent a wheel someone else has already invented? This is >>> what shared libraries are all about. >> Because what we need may be a wheel nobody has invented yet. >> I don't object to using one of the external libraries, but I don't >> want our specification decided by theirs. > OK, we really need to stop being abstract and say what operations we want. > I think the minimal functionality I'd want is: > > convert record to JSON > convert JSON to record > extract a value, or set of values, from JSON > composition of JSON > > Now all the libraries I have looked at (briefly) would require some code to > provide for those, possibly quite a bit of code, but that doesn't mean we > should just start from scratch and write our own JSON parser too. I think this is really vastly overkill. The set of operations I think we need is more like: - given a JSON value, tell me if it's a string, number, object, array, true, false, or null - given a JSON object, give me the list of member names (error if it's not a hash) - given a JSON object, give me the member named x (error if it's not a hash) - given a JSON array, give me the upper bound (error if it's not an array) - given a JSON array, give me the element at offset x (error if it's not an array) What you're talking about may or may not be useful and someone may or may not want to implement it, but insisting that we have to have it for the first version of a json type seems to me to be setting the bar quite a bit higher than necessary. ...Robert
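Robert's minimal operation set maps onto a handful of accessors over a parsed value. A sketch in Python, purely to pin down the intended semantics (the function names are made up; a real json type would implement these in C over its internal representation):

```python
import json

def json_type(text):
    """Classify a JSON value: string, number, object, array, true, false, or null."""
    v = json.loads(text)
    if v is None:
        return "null"
    if isinstance(v, bool):  # check bool before number: bool is an int subclass in Python
        return "true" if v else "false"
    if isinstance(v, (int, float)):
        return "number"
    if isinstance(v, str):
        return "string"
    return "array" if isinstance(v, list) else "object"

def object_keys(text):
    """Member names of a JSON object; error if it's not an object."""
    v = json.loads(text)
    if not isinstance(v, dict):
        raise TypeError("not a JSON object")
    return list(v)

def object_member(text, name):
    """The member named `name`; error if it's not an object."""
    v = json.loads(text)
    if not isinstance(v, dict):
        raise TypeError("not a JSON object")
    return v[name]

def array_upper(text):
    """Upper bound (last valid index) of a JSON array; error otherwise."""
    v = json.loads(text)
    if not isinstance(v, list):
        raise TypeError("not a JSON array")
    return len(v) - 1

def array_element(text, i):
    """Element at offset i; error if it's not an array."""
    v = json.loads(text)
    if not isinstance(v, list):
        raise TypeError("not a JSON array")
    return v[i]
```

Everything richer discussed upthread (path extraction, record conversion, composition) can be layered on top of these five primitives, which is part of Robert's argument for keeping the first version small.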
2010/1/4 David E. Wheeler <david@kineticode.com>: > On Jan 3, 2010, at 4:18 PM, Hitoshi Harada wrote: > >> That sounds good and seems possible, as long as the operator always >> returns JSON. Perhaps every JSON fetch returns JSON, even if the result >> would be a number. You can cast it. >> >> % SELECT ('{"foo":{"bar":["a","b","c"]}}' -> '["foo"][1]')::text; >> 1 >> ----- >> b > > No, because 'b' isn't valid JSON. So if we want an interface that returns scalars, they can't be JSON. AFAIK a string value can be parsed as JSON. At least my local v8 shell answers: > JSON.stringify({"foo": {"bar": ["a", "b", "c"]}}) {"foo":{"bar":["a","b","c"]}} > JSON.stringify("b") "b" Regards, -- Hitoshi Harada
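The disagreement here comes down to serialization versus the bare scalar: a string can be rendered as JSON text, but the quotes are then part of that text. Python's json module shows the same distinction as the v8 shell above:

```python
import json

# Serializing the scalar string b yields valid JSON text -- quotes included.
encoded = json.dumps("b")
print(encoded)              # "b"  (three characters: quote, b, quote)

# Parsing that JSON text recovers the bare scalar again.
print(json.loads(encoded))  # b

# The unquoted text b on its own is not valid JSON.
try:
    json.loads("b")
except json.JSONDecodeError:
    print("bare b is not valid JSON")
```

Whether a lone scalar even counts as a complete JSON text depends on the definition in use: RFC 4627 required a top-level object or array, while later definitions (and most parsers, including v8's JSON.parse and Python's json) accept any value. Either way, an interface that hands back unquoted scalars is returning text, not JSON.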
On Sun, Jan 3, 2010 at 1:51 PM, David E. Wheeler <david@kineticode.com> wrote: > On Jan 3, 2010, at 8:00 AM, Andrew Dunstan wrote: > >> I think the minimal functionality I'd want is: >> >> convert record to JSON >> convert JSON to record > > With caveats as to dealing with nested structures (can a record be an attribute of a record?). > >> extract a value, or set of values, from JSON >> composition of JSON > > There's a lot of functionality in hstore that I'd like to see. It'd make sense to use the same operators for the same operations. I think I'd start with hstore as a basic spec. David, Is this something you are planning to work on for the 2010-01-15 CommitFest? If not, I think we should go ahead and mark the patch which was the original subject of this thread "Returned with Feedback", as it does not seem to make sense to add it unless we add a json type first. Thoughts? ...Robert
On Jan 4, 2010, at 8:18 PM, Robert Haas wrote: > Is this something you are planning to work on for the 2010-01-15 > CommitFest? If not, I think we should go ahead and mark the patch > which was the original subject of this thread "Returned with > Feedback", as it does not seem to make sense to add it unless we add a > json type first. Not me, too much on my plate, and not enough C knowledge to be efficient. Agreed on "Returned with Feedback." Best, David