Thread: jsonb_set() strictness considered harmful to data
Hello, I am one of the primary maintainers of Pleroma, a federated social networking application written in Elixir, which uses PostgreSQL in ways that may be considered outside the typical usage scenarios for PostgreSQL. Namely, we leverage JSONB heavily as a backing store for JSON-LD documents[1]. We also use JSONB in combination with Ecto's "embedded structs" to store things like user preferences. The fact that we can use JSONB to achieve our design goals is a testament to the flexibility PostgreSQL has. However, in the process of doing so, we have discovered a serious flaw in the way jsonb_set() functions, but upon reading through this mailing list, we have discovered that this flaw appears to be an intentional design.[2] A few times now, we have written migrations that do things like copy keys in a JSONB object to a new key, to rename them. These migrations look like so: update users set info=jsonb_set(info, '{bar}', info->'foo'); Typically, this works nicely, except for cases where evaluating info->'foo' results in an SQL null being returned. When that happens, jsonb_set() returns an SQL null, which then results in data loss.[3] This is not acceptable. PostgreSQL is a database that is renowned for data integrity, but here it is wiping out data when it encounters a failure case. The way jsonb_set() should fail in this case is to simply return the original input: it should NEVER return SQL null. But hey, we've been burned by this so many times now that we'd like to donate a useful function to the commons, consider it a mollyguard for the real jsonb_set() function. create or replace function safe_jsonb_set(target jsonb, path text[], new_value jsonb, create_missing boolean default true) returns jsonb as $$ declare result jsonb; begin result := jsonb_set(target, path, coalesce(new_value, 'null'::jsonb), create_missing); if result is NULL then return target; else return result; end if; end; $$ language plpgsql; This safe_jsonb_set() wrapper should not be necessary. PostgreSQL's own jsonb_set() should have this safety feature built in. Without it, using jsonb_set() is like playing russian roulette with your data, which is not a reasonable expectation for a database renowned for its commitment to data integrity. Please fix this bug so that we do not have to hack around this bug. It has probably ruined countless people's days so far. I don't want to hear about how the function is strict, I'm aware it is strict, and that strictness is harmful. Please fix the function so that it is actually safe to use. [1]: JSON-LD stands for JSON Linked Data. Pleroma has an "internal representation" that shares similar qualities to JSON-LD, so I use JSON-LD here as a simplification. [2]: https://www.postgresql.org/message-id/flat/qfkua9$2q0e$1@blaine.gmane.org [3]: https://git.pleroma.social/pleroma/pleroma/issues/1324 is an example of data loss induced by this issue. Ariadne
On Fri, Oct 18, 2019, at 12:37, Ariadne Conill wrote: > Hello, > > I am one of the primary maintainers of Pleroma, a federated social > networking application written in Elixir, which uses PostgreSQL in > ways that may be considered outside the typical usage scenarios for > PostgreSQL. > > Namely, we leverage JSONB heavily as a backing store for JSON-LD > documents[1]. We also use JSONB in combination with Ecto's "embedded > structs" to store things like user preferences. > > The fact that we can use JSONB to achieve our design goals is a > testament to the flexibility PostgreSQL has. > > However, in the process of doing so, we have discovered a serious flaw > in the way jsonb_set() functions, but upon reading through this > mailing list, we have discovered that this flaw appears to be an > intentional design.[2] > > A few times now, we have written migrations that do things like copy > keys in a JSONB object to a new key, to rename them. These migrations > look like so: > > update users set info=jsonb_set(info, '{bar}', info->'foo'); > > Typically, this works nicely, except for cases where evaluating > info->'foo' results in an SQL null being returned. When that happens, > jsonb_set() returns an SQL null, which then results in data loss.[3] > > This is not acceptable. PostgreSQL is a database that is renowned for > data integrity, but here it is wiping out data when it encounters a > failure case. The way jsonb_set() should fail in this case is to > simply return the original input: it should NEVER return SQL null. > > But hey, we've been burned by this so many times now that we'd like to > donate a useful function to the commons, consider it a mollyguard for > the real jsonb_set() function. > > create or replace function safe_jsonb_set(target jsonb, path > text[], new_value jsonb, create_missing boolean default true) returns > jsonb as $$ > declare > result jsonb; > begin > result := jsonb_set(target, path, coalesce(new_value, > 'null'::jsonb), create_missing); > if result is NULL then > return target; > else > return result; > end if; > end; > $$ language plpgsql; > > This safe_jsonb_set() wrapper should not be necessary. PostgreSQL's > own jsonb_set() should have this safety feature built in. Without it, > using jsonb_set() is like playing russian roulette with your data, > which is not a reasonable expectation for a database renowned for its > commitment to data integrity. > > Please fix this bug so that we do not have to hack around this bug. > It has probably ruined countless people's days so far. I don't want > to hear about how the function is strict, I'm aware it is strict, and > that strictness is harmful. Please fix the function so that it is > actually safe to use. > > [1]: JSON-LD stands for JSON Linked Data. Pleroma has an "internal > representation" that shares similar qualities to JSON-LD, so I use > JSON-LD here as a simplification. > > [2]: https://www.postgresql.org/message-id/flat/qfkua9$2q0e$1@blaine.gmane.org > > [3]: https://git.pleroma.social/pleroma/pleroma/issues/1324 is an > example of data loss induced by this issue. > > Ariadne > This should be directed towards the hackers list, too. What will it take to change the semantics of jsonb_set()? MySQL implements safe behavior here. It's a real shame Postgresdoes not. I'll offer a $200 bounty to whoever fixes it. I'm sure it's destroyed more than $200 worth of data andpeople's time by now, but it's something. Kind regards, -- Mark Felder ports-secteam & portmgr alumni feld@FreeBSD.org
On Fri, Oct 18, 2019, at 12:37, Ariadne Conill wrote: > Hello, > > I am one of the primary maintainers of Pleroma, a federated social > networking application written in Elixir, which uses PostgreSQL in > ways that may be considered outside the typical usage scenarios for > PostgreSQL. > > Namely, we leverage JSONB heavily as a backing store for JSON-LD > documents[1]. We also use JSONB in combination with Ecto's "embedded > structs" to store things like user preferences. > > The fact that we can use JSONB to achieve our design goals is a > testament to the flexibility PostgreSQL has. > > However, in the process of doing so, we have discovered a serious flaw > in the way jsonb_set() functions, but upon reading through this > mailing list, we have discovered that this flaw appears to be an > intentional design.[2] > > A few times now, we have written migrations that do things like copy > keys in a JSONB object to a new key, to rename them. These migrations > look like so: > > update users set info=jsonb_set(info, '{bar}', info->'foo'); > > Typically, this works nicely, except for cases where evaluating > info->'foo' results in an SQL null being returned. When that happens, > jsonb_set() returns an SQL null, which then results in data loss.[3] > > This is not acceptable. PostgreSQL is a database that is renowned for > data integrity, but here it is wiping out data when it encounters a > failure case. The way jsonb_set() should fail in this case is to > simply return the original input: it should NEVER return SQL null. > > But hey, we've been burned by this so many times now that we'd like to > donate a useful function to the commons, consider it a mollyguard for > the real jsonb_set() function. > > create or replace function safe_jsonb_set(target jsonb, path > text[], new_value jsonb, create_missing boolean default true) returns > jsonb as $$ > declare > result jsonb; > begin > result := jsonb_set(target, path, coalesce(new_value, > 'null'::jsonb), create_missing); > if result is NULL then > return target; > else > return result; > end if; > end; > $$ language plpgsql; > > This safe_jsonb_set() wrapper should not be necessary. PostgreSQL's > own jsonb_set() should have this safety feature built in. Without it, > using jsonb_set() is like playing russian roulette with your data, > which is not a reasonable expectation for a database renowned for its > commitment to data integrity. > > Please fix this bug so that we do not have to hack around this bug. > It has probably ruined countless people's days so far. I don't want > to hear about how the function is strict, I'm aware it is strict, and > that strictness is harmful. Please fix the function so that it is > actually safe to use. > > [1]: JSON-LD stands for JSON Linked Data. Pleroma has an "internal > representation" that shares similar qualities to JSON-LD, so I use > JSON-LD here as a simplification. > > [2]: https://www.postgresql.org/message-id/flat/qfkua9$2q0e$1@blaine.gmane.org > > [3]: https://git.pleroma.social/pleroma/pleroma/issues/1324 is an > example of data loss induced by this issue. > > Ariadne > This should be directed towards the hackers list, too. What will it take to change the semantics of jsonb_set()? MySQL implements safe behavior here. It's a real shame Postgresdoes not. I'll offer a $200 bounty to whoever fixes it. I'm sure it's destroyed more than $200 worth of data andpeople's time by now, but it's something. Kind regards, -- Mark Felder ports-secteam & portmgr alumni feld@FreeBSD.org
## Ariadne Conill (ariadne@dereferenced.org): > update users set info=jsonb_set(info, '{bar}', info->'foo'); > > Typically, this works nicely, except for cases where evaluating > info->'foo' results in an SQL null being returned. When that happens, > jsonb_set() returns an SQL null, which then results in data loss.[3] So why don't you use the facilities of SQL to make sure to only touch the rows which match the prerequisites? UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo') WHERE info->'foo' IS NOT NULL; No special wrappers required. Regards, Christoph -- Spare Space
## Ariadne Conill (ariadne@dereferenced.org):
> update users set info=jsonb_set(info, '{bar}', info->'foo');
>
> Typically, this works nicely, except for cases where evaluating
> info->'foo' results in an SQL null being returned. When that happens,
> jsonb_set() returns an SQL null, which then results in data loss.[3]
So why don't you use the facilities of SQL to make sure to only
touch the rows which match the prerequisites?
UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo')
WHERE info->'foo' IS NOT NULL;
Hello, On Fri, Oct 18, 2019 at 4:50 PM Christoph Moench-Tegeder <cmt@burggraben.net> wrote: > > ## Ariadne Conill (ariadne@dereferenced.org): > > > update users set info=jsonb_set(info, '{bar}', info->'foo'); > > > > Typically, this works nicely, except for cases where evaluating > > info->'foo' results in an SQL null being returned. When that happens, > > jsonb_set() returns an SQL null, which then results in data loss.[3] > > So why don't you use the facilities of SQL to make sure to only > touch the rows which match the prerequisites? > > UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo') > WHERE info->'foo' IS NOT NULL; Why don't we fix the database engine to not eat data when the jsonb_set() operation fails? Telling people to work around design flaws in the software is what I would expect of MySQL, not a database known for its data integrity. Obviously, it is possible to adjust the UPDATE statement to only match certain pre-conditions, *if you know those pre-conditions may be a problem*. What happens with us, and with other people who have hit this bug with jsonb_set() is that they hit issues that were not previously known about, and that's when jsonb_set() eats your data. I would also like to point out that the MySQL equivalent, json_set() when presented with a similar failure simply returns the unmodified input. It is not unreasonable to do the same in PostgreSQL. Personally, as a developer, I expect PostgreSQL to be on their game better than MySQL. > No special wrappers required. A special wrapper is needed because jsonb_set() does broken things when invoked in situations that do not match the preconceptions of those situations. We will have to ship this wrapper for several years because of the current behaviour of the jsonb_set() function. Ariadne
Hello, On Fri, Oct 18, 2019 at 5:01 PM David G. Johnston <david.g.johnston@gmail.com> wrote: > > On Fri, Oct 18, 2019 at 2:50 PM Christoph Moench-Tegeder <cmt@burggraben.net> wrote: >> >> ## Ariadne Conill (ariadne@dereferenced.org): >> >> > update users set info=jsonb_set(info, '{bar}', info->'foo'); >> > >> > Typically, this works nicely, except for cases where evaluating >> > info->'foo' results in an SQL null being returned. When that happens, >> > jsonb_set() returns an SQL null, which then results in data loss.[3] >> >> So why don't you use the facilities of SQL to make sure to only >> touch the rows which match the prerequisites? >> >> UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo') >> WHERE info->'foo' IS NOT NULL; >> > > There are many ways to add code to queries to make working with this function safer - though using them presupposes oneremembers at the time of writing the query that there is danger and caveats in using this function. I agree that we shouldhave (and now) provided sane defined behavior when one of the inputs to the function is null instead blowing off theissue and defining the function as being strict. Whether that is "ignore and return the original object" or "add thekey with a json null scalar value" is debatable but either is considerably more useful than returning SQL NULL. A great example of how we got burned by this last year: Pleroma maintains pre-computed counters in JSONB for various types of activities (posts, followers, followings). Last year, another counter was added, with a migration. But some people did not run the migration, because they are users, and that's what users do. This resulted in Pleroma blanking out the `info` structure for users as they performed new activities that incremented that counter. At that time, Pleroma maintained various things like private keys used to sign things in that JSONB column (we no longer do this because of being burned by this several times now), which broke federation temporarily for the affected accounts with other servers for up to a week as those servers had to learn new public keys for those accounts (since the original private keys were lost). I believe that anything that can be catastrophically broken by users not following upgrade instructions precisely is a serious problem, and can lead to serious problems. I am sure that this is not the only project using JSONB which have had users destroy their own data in such a completely preventable fashion. Ariadne
## Ariadne Conill (ariadne@dereferenced.org): > Why don't we fix the database engine to not eat data when the > jsonb_set() operation fails? It didn't fail, it worked like SQL (you've been doing SQL for too long when you get used to the NULL propagation, but that's still what SQL does - check "+" for example). And changing a function will cause fun for everyone who relies on the current behaviour - so at least it shouldn't be done on a whim (some might argue that a whim was what got us into this situation in the first place). Continuing along that thought, I'd even argue that your are writing code which relies on properties of the data which you never guaranteed. There is a use case for data types and constraints. Not that I'm arguing for maximum surprise in programming, but I'm a little puzzled when people eschew thew built-in tools and start implmenting them again side-to-side with what's already there. Regards, Christoph -- Spare Space
On 10/18/19 3:11 PM, Ariadne Conill wrote: > Hello, > > On Fri, Oct 18, 2019 at 5:01 PM David G. Johnston > <david.g.johnston@gmail.com> wrote: >> >> On Fri, Oct 18, 2019 at 2:50 PM Christoph Moench-Tegeder <cmt@burggraben.net> wrote: >>> >>> ## Ariadne Conill (ariadne@dereferenced.org): >>> >>>> update users set info=jsonb_set(info, '{bar}', info->'foo'); >>>> >>>> Typically, this works nicely, except for cases where evaluating >>>> info->'foo' results in an SQL null being returned. When that happens, >>>> jsonb_set() returns an SQL null, which then results in data loss.[3] >>> >>> So why don't you use the facilities of SQL to make sure to only >>> touch the rows which match the prerequisites? >>> >>> UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo') >>> WHERE info->'foo' IS NOT NULL; >>> >> >> There are many ways to add code to queries to make working with this function safer - though using them presupposes oneremembers at the time of writing the query that there is danger and caveats in using this function. I agree that we shouldhave (and now) provided sane defined behavior when one of the inputs to the function is null instead blowing off theissue and defining the function as being strict. Whether that is "ignore and return the original object" or "add thekey with a json null scalar value" is debatable but either is considerably more useful than returning SQL NULL. > > A great example of how we got burned by this last year: Pleroma > maintains pre-computed counters in JSONB for various types of > activities (posts, followers, followings). Last year, another counter > was added, with a migration. But some people did not run the > migration, because they are users, and that's what users do. This So you are more forgiving of your misstep, allowing users to run outdated code, then of running afoul of Postgres documented behavior: https://www.postgresql.org/docs/11/functions-json.html " The field/element/path extraction operators return NULL, rather than failing, if the JSON input does not have the right structure to match the request; for example if no such element exists" Just trying to figure why one is worse then the other. > resulted in Pleroma blanking out the `info` structure for users as > they performed new activities that incremented that counter. At that > time, Pleroma maintained various things like private keys used to sign > things in that JSONB column (we no longer do this because of being > burned by this several times now), which broke federation temporarily > for the affected accounts with other servers for up to a week as those > servers had to learn new public keys for those accounts (since the > original private keys were lost). > > I believe that anything that can be catastrophically broken by users > not following upgrade instructions precisely is a serious problem, and > can lead to serious problems. I am sure that this is not the only > project using JSONB which have had users destroy their own data in > such a completely preventable fashion. > > Ariadne > > > -- Adrian Klaver adrian.klaver@aklaver.com
Hello, On Fri, Oct 18, 2019 at 6:01 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote: > > On 10/18/19 3:11 PM, Ariadne Conill wrote: > > Hello, > > > > On Fri, Oct 18, 2019 at 5:01 PM David G. Johnston > > <david.g.johnston@gmail.com> wrote: > >> > >> On Fri, Oct 18, 2019 at 2:50 PM Christoph Moench-Tegeder <cmt@burggraben.net> wrote: > >>> > >>> ## Ariadne Conill (ariadne@dereferenced.org): > >>> > >>>> update users set info=jsonb_set(info, '{bar}', info->'foo'); > >>>> > >>>> Typically, this works nicely, except for cases where evaluating > >>>> info->'foo' results in an SQL null being returned. When that happens, > >>>> jsonb_set() returns an SQL null, which then results in data loss.[3] > >>> > >>> So why don't you use the facilities of SQL to make sure to only > >>> touch the rows which match the prerequisites? > >>> > >>> UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo') > >>> WHERE info->'foo' IS NOT NULL; > >>> > >> > >> There are many ways to add code to queries to make working with this function safer - though using them presupposesone remembers at the time of writing the query that there is danger and caveats in using this function. I agreethat we should have (and now) provided sane defined behavior when one of the inputs to the function is null insteadblowing off the issue and defining the function as being strict. Whether that is "ignore and return the originalobject" or "add the key with a json null scalar value" is debatable but either is considerably more useful than returningSQL NULL. > > > > A great example of how we got burned by this last year: Pleroma > > maintains pre-computed counters in JSONB for various types of > > activities (posts, followers, followings). Last year, another counter > > was added, with a migration. But some people did not run the > > migration, because they are users, and that's what users do. This > > So you are more forgiving of your misstep, allowing users to run > outdated code, then of running afoul of Postgres documented behavior: I'm not forgiving of either. > https://www.postgresql.org/docs/11/functions-json.html > " The field/element/path extraction operators return NULL, rather than > failing, if the JSON input does not have the right structure to match > the request; for example if no such element exists" It is known that the extraction operators return NULL. The problem here is jsonb_set() returning NULL when it encounters SQL NULL. > Just trying to figure why one is worse then the other. Any time a user loses data, it is worse. The preference for not having data loss is why Pleroma uses PostgreSQL as it's database of choice, as PostgreSQL has traditionally valued durability. If we should not use PostgreSQL, just say so. Ariadne > > > resulted in Pleroma blanking out the `info` structure for users as > > they performed new activities that incremented that counter. At that > > time, Pleroma maintained various things like private keys used to sign > > things in that JSONB column (we no longer do this because of being > > burned by this several times now), which broke federation temporarily > > for the affected accounts with other servers for up to a week as those > > servers had to learn new public keys for those accounts (since the > > original private keys were lost). > > > > I believe that anything that can be catastrophically broken by users > > not following upgrade instructions precisely is a serious problem, and > > can lead to serious problems. I am sure that this is not the only > > project using JSONB which have had users destroy their own data in > > such a completely preventable fashion. > > > > Ariadne > > > > > > > > > -- > Adrian Klaver > adrian.klaver@aklaver.com
Hello, On Fri, Oct 18, 2019 at 5:57 PM Christoph Moench-Tegeder <cmt@burggraben.net> wrote: > > ## Ariadne Conill (ariadne@dereferenced.org): > > > Why don't we fix the database engine to not eat data when the > > jsonb_set() operation fails? > > It didn't fail, it worked like SQL (you've been doing SQL for too > long when you get used to the NULL propagation, but that's still > what SQL does - check "+" for example). > And changing a function will cause fun for everyone who relies on > the current behaviour - so at least it shouldn't be done on a whim > (some might argue that a whim was what got us into this situation > in the first place). NULL propagation makes sense in the context of traditional SQL. What users expect from the JSONB support is for it to behave as JSON manipulation behaves everywhere else. It makes sense that 2 + NULL returns NULL -- it's easily understood as a type mismatch. It does not make sense that jsonb_set('{}'::jsonb, '{foo}', NULL) returns NULL because a *value* was SQL NULL. In this case, it should, at the least, automatically coalesce to 'null'::jsonb. > Continuing along that thought, I'd even argue that your are > writing code which relies on properties of the data which you never > guaranteed. There is a use case for data types and constraints. There is a use case, but this frequently comes up as a question people ask. At some point, you have to start pondering whether the behaviour does not make logical sense in the context that people frame the JSONB type and it's associated manipulation functions. It is not *obvious* that jsonb_set() will trash your data, but that is what it is capable of doing. In a database that is advertised as being durable and not trashing data, even. > Not that I'm arguing for maximum surprise in programming, but > I'm a little puzzled when people eschew thew built-in tools and > start implmenting them again side-to-side with what's already > there. If you read the safe_jsonb_set() function, all we do is coalesce any SQL NULL to 'null'::jsonb, which is what it should be doing anyway, and then additionally handling any *unanticipated* failure case on top of that. While you are arguing that we should use tools to work around unanticipated effects (that are not even documented -- in no place in the jsonb_set() documentation does it say "if you pass SQL NULL to this function as a value, it will return SQL NULL"), I am arguing that jsonb_set() shouldn't set people up for their data to be trashed in the first place. Even MySQL gets this right. MySQL! The database that everyone knows takes your data out for a night it will never forget. This argument is miserable. It does not matter to me how jsonb_set() works as long as it does not return NULL when passed NULL as a value to set. JSONB columns should be treated as the complex types that they are: you don't null out an entire hash table because someone set a key to SQL NULL. So, please, let us fix this. Ariadne
Greetings, * Ariadne Conill (ariadne@dereferenced.org) wrote: > On Fri, Oct 18, 2019 at 6:01 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote: > > https://www.postgresql.org/docs/11/functions-json.html > > " The field/element/path extraction operators return NULL, rather than > > failing, if the JSON input does not have the right structure to match > > the request; for example if no such element exists" > > It is known that the extraction operators return NULL. The problem > here is jsonb_set() returning NULL when it encounters SQL NULL. > > > Just trying to figure why one is worse then the other. > > Any time a user loses data, it is worse. The preference for not > having data loss is why Pleroma uses PostgreSQL as it's database of > choice, as PostgreSQL has traditionally valued durability. If we > should not use PostgreSQL, just say so. Your contention that the documented, clear, and easily addressed behavior of a particular strict function equates to "the database system loses data and isn't durable" is really hurting your arguments here, not helping it. The argument about how it's unintuitive and can cause application developers to misuse the function (which is clearly an application bug, but perhaps an understandable one if the function interface isn't intuitive or is confusing) is a reasonable one and might be convincing enough to result in a change here. I'd suggest sticking to the latter argument when making this case. > > > I believe that anything that can be catastrophically broken by users > > > not following upgrade instructions precisely is a serious problem, and > > > can lead to serious problems. I am sure that this is not the only > > > project using JSONB which have had users destroy their own data in > > > such a completely preventable fashion. Let's be clear here that the issue with the upgrade instructions was that the user didn't follow your *application's* upgrade instructions, and your later code wasn't written to use the function, as documented, properly- this isn't a case of PG destroying your data. It's fine to contend that the interface sucks and that we should change it, but the argument that PG is eating data because the application sent a query to the database telling it, based on our documentation, to eat the data, isn't appropriate. Again, let's have a reasonable discussion here about if it makes sense to make a change here because the interface isn't intuitive and doesn't match what other systems do (I'm guessing it isn't in the SQL standard either, so we unfortunately can't look to that for help; though I'd hardly be surprised if they supported what PG does today anyway). As a practical response to the issue you've raised- have you considered using a trigger to check the validity of the new jsonb? Or, maybe, just made the jsonb column not nullable? With a trigger you could disallow non-null->null transistions, for example, or if it just shouldn't ever be null then making the column 'not null' would suffice. I'll echo Christoph's comments up thread too, though in my own language- these are risks you've explicitly accepted by using JSONB and writing your own validation and checks (or, not, apparently) rather than using what the database system provides. That doesn't mean I'm against making the change you suggest, but it certainly should become a lesson to anyone who is considering using primairly jsonb for their storage that it's risky to do so, because you're removing the database system's knowledge and understanding of the data, and further you tend to end up not having the necessary constraints in place to ensure that the data doesn't end up being garbage- thus letting your application destroy all the data easily due to an application bug. Thanks, Stephen
Attachment
Greetings, * Ariadne Conill (ariadne@dereferenced.org) wrote: > On Fri, Oct 18, 2019 at 5:57 PM Christoph Moench-Tegeder > <cmt@burggraben.net> wrote: > > ## Ariadne Conill (ariadne@dereferenced.org): > > > Why don't we fix the database engine to not eat data when the > > > jsonb_set() operation fails? > > > > It didn't fail, it worked like SQL (you've been doing SQL for too > > long when you get used to the NULL propagation, but that's still > > what SQL does - check "+" for example). > > And changing a function will cause fun for everyone who relies on > > the current behaviour - so at least it shouldn't be done on a whim > > (some might argue that a whim was what got us into this situation > > in the first place). > > NULL propagation makes sense in the context of traditional SQL. What > users expect from the JSONB support is for it to behave as JSON > manipulation behaves everywhere else. It makes sense that 2 + NULL > returns NULL -- it's easily understood as a type mismatch. It does > not make sense that jsonb_set('{}'::jsonb, '{foo}', NULL) returns NULL > because a *value* was SQL NULL. In this case, it should, at the > least, automatically coalesce to 'null'::jsonb. 2 + NULL isn't a type mismatch, just to be clear, it's "2 + unknown = unknown", which is pretty reasonable, if you accept the general notion of what NULL is to begin with. And as such, what follows with "set this blob of stuff to include this unknown thing ... implies ... we don't know what the result of the set is and therefore it's unknown" isn't entirely unreasonable, but I can agree that in this specific case, because what we're dealing with regarding JSONB is a complex data structure, not an individual value, that it's surprising to a developer and there can be an argument made there that we should consider changing it. > > Continuing along that thought, I'd even argue that your are > > writing code which relies on properties of the data which you never > > guaranteed. There is a use case for data types and constraints. > > There is a use case, but this frequently comes up as a question people > ask. At some point, you have to start pondering whether the behaviour > does not make logical sense in the context that people frame the JSONB > type and it's associated manipulation functions. It is not *obvious* > that jsonb_set() will trash your data, but that is what it is capable > of doing. In a database that is advertised as being durable and not > trashing data, even. Having the result of a call to a strict function be NULL isn't "trashing" your data. > > Not that I'm arguing for maximum surprise in programming, but > > I'm a little puzzled when people eschew thew built-in tools and > > start implmenting them again side-to-side with what's already > > there. > > If you read the safe_jsonb_set() function, all we do is coalesce any > SQL NULL to 'null'::jsonb, which is what it should be doing anyway, I'm not convinced that this is at all the right answer, particularly since we don't do that generally. We don't return the string 'null' when you do: NULL || 'abc', we return NULL. There might be something we can do here that doesn't result in the whole jsonb document becoming NULL though. > and then additionally handling any *unanticipated* failure case on top > of that. While you are arguing that we should use tools to work > around unanticipated effects (that are not even documented -- in no > place in the jsonb_set() documentation does it say "if you pass SQL > NULL to this function as a value, it will return SQL NULL"), I am > arguing that jsonb_set() shouldn't set people up for their data to be > trashed in the first place. The function is marked as strict, and the meaning of that is quite clearly defined in the documentation. I'm not against including a comment regarding this in the documentation, to be clear, though I seriously doubt it would actually have changed anything in this case. Thanks, Stephen
Attachment
On 10/18/19 4:31 PM, Ariadne Conill wrote: > Hello, > > On Fri, Oct 18, 2019 at 6:01 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote: >> >> On 10/18/19 3:11 PM, Ariadne Conill wrote: >>> Hello, >>> >>> On Fri, Oct 18, 2019 at 5:01 PM David G. Johnston >>> <david.g.johnston@gmail.com> wrote: >>>> >>>> On Fri, Oct 18, 2019 at 2:50 PM Christoph Moench-Tegeder <cmt@burggraben.net> wrote: >>>>> >>>>> ## Ariadne Conill (ariadne@dereferenced.org): >>>>> >>>>>> update users set info=jsonb_set(info, '{bar}', info->'foo'); >>>>>> >>>>>> Typically, this works nicely, except for cases where evaluating >>>>>> info->'foo' results in an SQL null being returned. When that happens, >>>>>> jsonb_set() returns an SQL null, which then results in data loss.[3] >>>>> >>>>> So why don't you use the facilities of SQL to make sure to only >>>>> touch the rows which match the prerequisites? >>>>> >>>>> UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo') >>>>> WHERE info->'foo' IS NOT NULL; >>>>> >>>> >>>> There are many ways to add code to queries to make working with this function safer - though using them presupposesone remembers at the time of writing the query that there is danger and caveats in using this function. I agreethat we should have (and now) provided sane defined behavior when one of the inputs to the function is null insteadblowing off the issue and defining the function as being strict. Whether that is "ignore and return the originalobject" or "add the key with a json null scalar value" is debatable but either is considerably more useful than returningSQL NULL. >>> >>> A great example of how we got burned by this last year: Pleroma >>> maintains pre-computed counters in JSONB for various types of >>> activities (posts, followers, followings). Last year, another counter >>> was added, with a migration. But some people did not run the >>> migration, because they are users, and that's what users do. This >> >> So you are more forgiving of your misstep, allowing users to run >> outdated code, then of running afoul of Postgres documented behavior: > > I'm not forgiving of either. > >> https://www.postgresql.org/docs/11/functions-json.html >> " The field/element/path extraction operators return NULL, rather than >> failing, if the JSON input does not have the right structure to match >> the request; for example if no such element exists" > > It is known that the extraction operators return NULL. The problem > here is jsonb_set() returning NULL when it encounters SQL NULL. I'm not following. Your original case was: jsonb_set(info, '{bar}', info->'foo'); where info->'foo' is equivalent to: test=# select '{"f1":1,"f2":null}'::jsonb ->'f3'; ?column? ---------- NULL So you know there is a possibility that a value extraction could return NULL and from your wrapper that COALESCE is the way to deal with this. > >> Just trying to figure why one is worse then the other. > > Any time a user loses data, it is worse. The preference for not > having data loss is why Pleroma uses PostgreSQL as it's database of > choice, as PostgreSQL has traditionally valued durability. If we > should not use PostgreSQL, just say so. There are any number of ways you can make Postgres lose data that are not related to durability e.g build the following in code: DELETE FROM some_table; and forget the WHERE. > > Ariadne > >> >>> resulted in Pleroma blanking out the `info` structure for users as >>> they performed new activities that incremented that counter. At that >>> time, Pleroma maintained various things like private keys used to sign >>> things in that JSONB column (we no longer do this because of being >>> burned by this several times now), which broke federation temporarily >>> for the affected accounts with other servers for up to a week as those >>> servers had to learn new public keys for those accounts (since the >>> original private keys were lost). >>> >>> I believe that anything that can be catastrophically broken by users >>> not following upgrade instructions precisely is a serious problem, and >>> can lead to serious problems. I am sure that this is not the only >>> project using JSONB which have had users destroy their own data in >>> such a completely preventable fashion. >>> >>> Ariadne >>> >>> >>> >> >> >> -- >> Adrian Klaver >> adrian.klaver@aklaver.com > -- Adrian Klaver adrian.klaver@aklaver.com
Hello, On Fri, Oct 18, 2019 at 6:52 PM Stephen Frost <sfrost@snowman.net> wrote: > > Greetings, > > * Ariadne Conill (ariadne@dereferenced.org) wrote: > > On Fri, Oct 18, 2019 at 6:01 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote: > > > https://www.postgresql.org/docs/11/functions-json.html > > > " The field/element/path extraction operators return NULL, rather than > > > failing, if the JSON input does not have the right structure to match > > > the request; for example if no such element exists" > > > > It is known that the extraction operators return NULL. The problem > > here is jsonb_set() returning NULL when it encounters SQL NULL. > > > > > Just trying to figure why one is worse then the other. > > > > Any time a user loses data, it is worse. The preference for not > > having data loss is why Pleroma uses PostgreSQL as it's database of > > choice, as PostgreSQL has traditionally valued durability. If we > > should not use PostgreSQL, just say so. > > Your contention that the documented, clear, and easily addressed > behavior of a particular strict function equates to "the database system > loses data and isn't durable" is really hurting your arguments here, not > helping it. > > The argument about how it's unintuitive and can cause application > developers to misuse the function (which is clearly an application bug, > but perhaps an understandable one if the function interface isn't > intuitive or is confusing) is a reasonable one and might be convincing > enough to result in a change here. > > I'd suggest sticking to the latter argument when making this case. > > > > > I believe that anything that can be catastrophically broken by users > > > > not following upgrade instructions precisely is a serious problem, and > > > > can lead to serious problems. I am sure that this is not the only > > > > project using JSONB which have had users destroy their own data in > > > > such a completely preventable fashion. > > Let's be clear here that the issue with the upgrade instructions was > that the user didn't follow your *application's* upgrade instructions, > and your later code wasn't written to use the function, as documented, > properly- this isn't a case of PG destroying your data. It's fine to > contend that the interface sucks and that we should change it, but the > argument that PG is eating data because the application sent a query to > the database telling it, based on our documentation, to eat the data, > isn't appropriate. Again, let's have a reasonable discussion here about > if it makes sense to make a change here because the interface isn't > intuitive and doesn't match what other systems do (I'm guessing it isn't > in the SQL standard either, so we unfortunately can't look to that for > help; though I'd hardly be surprised if they supported what PG does > today anyway). Okay, I will admit that saying PG is eating data is perhaps hyperbolic, but I will also say that the behaviour of jsonb_set() under this type of edge case is unintuitive and frequently results in unintended data loss. So, while PostgreSQL is not actually eating the data, it is putting the user in a position where they may suffer data loss if they are not extremely careful. Here is how other implementations handle this case: MySQL/MariaDB: select json_set('{"a":1,"b":2,"c":3}', '$.a', NULL) results in: {"a":null,"b":2,"c":3} Microsoft SQL Server: select json_modify('{"a":1,"b":2,"c":3}', '$.a', NULL) results in: {"b":2,"c":3} Both of these outcomes make sense, given the nature of JSON objects. I am actually more in favor of what MSSQL does however, I think that makes the most sense of all. I did not compare to other database systems, because using them I found that there is a JSON_TABLE type function and then you do stuff with that to rewrite the object and dump it back out as JSON, and it's quite a mess. But MySQL and MSSQL have an equivalent jsonb inline modification function, as seen above. > As a practical response to the issue you've raised- have you considered > using a trigger to check the validity of the new jsonb? Or, maybe, just > made the jsonb column not nullable? With a trigger you could disallow > non-null->null transistions, for example, or if it just shouldn't ever > be null then making the column 'not null' would suffice. We have already mitigated the issue in a way we find appropriate to do. The suggestion of having a non-null constraint does seem useful as well and I will look into that. > I'll echo Christoph's comments up thread too, though in my own language- > these are risks you've explicitly accepted by using JSONB and writing > your own validation and checks (or, not, apparently) rather than using > what the database system provides. That doesn't mean I'm against > making the change you suggest, but it certainly should become a lesson > to anyone who is considering using primairly jsonb for their storage > that it's risky to do so, because you're removing the database system's > knowledge and understanding of the data, and further you tend to end up > not having the necessary constraints in place to ensure that the data > doesn't end up being garbage- thus letting your application destroy all > the data easily due to an application bug. Ariadne
Hello, On Fri, Oct 18, 2019 at 7:04 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote: > > On 10/18/19 4:31 PM, Ariadne Conill wrote: > > Hello, > > > > On Fri, Oct 18, 2019 at 6:01 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote: > >> > >> On 10/18/19 3:11 PM, Ariadne Conill wrote: > >>> Hello, > >>> > >>> On Fri, Oct 18, 2019 at 5:01 PM David G. Johnston > >>> <david.g.johnston@gmail.com> wrote: > >>>> > >>>> On Fri, Oct 18, 2019 at 2:50 PM Christoph Moench-Tegeder <cmt@burggraben.net> wrote: > >>>>> > >>>>> ## Ariadne Conill (ariadne@dereferenced.org): > >>>>> > >>>>>> update users set info=jsonb_set(info, '{bar}', info->'foo'); > >>>>>> > >>>>>> Typically, this works nicely, except for cases where evaluating > >>>>>> info->'foo' results in an SQL null being returned. When that happens, > >>>>>> jsonb_set() returns an SQL null, which then results in data loss.[3] > >>>>> > >>>>> So why don't you use the facilities of SQL to make sure to only > >>>>> touch the rows which match the prerequisites? > >>>>> > >>>>> UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo') > >>>>> WHERE info->'foo' IS NOT NULL; > >>>>> > >>>> > >>>> There are many ways to add code to queries to make working with this function safer - though using them presupposesone remembers at the time of writing the query that there is danger and caveats in using this function. I agreethat we should have (and now) provided sane defined behavior when one of the inputs to the function is null insteadblowing off the issue and defining the function as being strict. Whether that is "ignore and return the originalobject" or "add the key with a json null scalar value" is debatable but either is considerably more useful than returningSQL NULL. > >>> > >>> A great example of how we got burned by this last year: Pleroma > >>> maintains pre-computed counters in JSONB for various types of > >>> activities (posts, followers, followings). Last year, another counter > >>> was added, with a migration. But some people did not run the > >>> migration, because they are users, and that's what users do. This > >> > >> So you are more forgiving of your misstep, allowing users to run > >> outdated code, then of running afoul of Postgres documented behavior: > > > > I'm not forgiving of either. > > > >> https://www.postgresql.org/docs/11/functions-json.html > >> " The field/element/path extraction operators return NULL, rather than > >> failing, if the JSON input does not have the right structure to match > >> the request; for example if no such element exists" > > > > It is known that the extraction operators return NULL. The problem > > here is jsonb_set() returning NULL when it encounters SQL NULL. > > I'm not following. Your original case was: > > jsonb_set(info, '{bar}', info->'foo'); > > where info->'foo' is equivalent to: > > test=# select '{"f1":1,"f2":null}'::jsonb ->'f3'; > ?column? > ---------- > NULL > > So you know there is a possibility that a value extraction could return > NULL and from your wrapper that COALESCE is the way to deal with this. You're not following because you don't want to follow. It does not matter that info->'foo' is in my example. That's not what I am talking about. What I am talking about is that jsonb_set(..., ..., NULL) returns SQL NULL. postgres=# \pset null '(null)' Null display is "(null)". postgres=# select jsonb_set('{"a":1,"b":2,"c":3}'::jsonb, '{a}', NULL); jsonb_set ----------- (null) (1 row) This behaviour is basically giving an application developer a loaded shotgun and pointing it at their feet. It is not a good design. It is a design which has likely lead to many users experiencing unintentional data loss. Ariadne
What I am talking about is that jsonb_set(..., ..., NULL) returns SQL NULL.
postgres=# \pset null '(null)'
Null display is "(null)".
postgres=# select jsonb_set('{"a":1,"b":2,"c":3}'::jsonb, '{a}', NULL);
jsonb_set
-----------
(null)
(1 row)
This behaviour is basically giving an application developer a loaded
shotgun and pointing it at their feet. It is not a good design. It
is a design which has likely lead to many users experiencing
unintentional data loss.
Ariadne
Probably there will be some applications that needs NULL result in situations when value was not changed or when input value has not expected format. Design using in Postgres allows later customization - you can implement with COALESCE very simply behave that you want (sure, you have to know what you do). If Postgres implement design used by MySQL, then there is not any possibility to react on situation when update is not processed.
On Friday, October 18, 2019, Pavel Stehule <pavel.stehule@gmail.com> wrote:Probably there will be some applications that needs NULL result in situations when value was not changed or when input value has not expected format. Design using in Postgres allows later customization - you can implement with COALESCE very simply behave that you want (sure, you have to know what you do). If Postgres implement design used by MySQL, then there is not any possibility to react on situation when update is not processed.A CASE expression seems like it would work well for such detection in the rare case it is needed. Current behavior is unsafe with minimal or no redeeming qualities. Change it so passing in null raises an exception and make the user decide their own behavior if we don’t want to choose one for them.
David J.
Hello, On Sat, Oct 19, 2019 at 12:52 AM Pavel Stehule <pavel.stehule@gmail.com> wrote: > > > > so 19. 10. 2019 v 7:41 odesílatel David G. Johnston <david.g.johnston@gmail.com> napsal: >> >> On Friday, October 18, 2019, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> >>> >>> Probably there will be some applications that needs NULL result in situations when value was not changed or when inputvalue has not expected format. Design using in Postgres allows later customization - you can implement with COALESCEvery simply behave that you want (sure, you have to know what you do). If Postgres implement design used by MySQL,then there is not any possibility to react on situation when update is not processed. >> >> >> A CASE expression seems like it would work well for such detection in the rare case it is needed. Current behavior isunsafe with minimal or no redeeming qualities. Change it so passing in null raises an exception and make the user decidetheir own behavior if we don’t want to choose one for them. > > > How you can do it? Buildn functions cannot to return more than one value. The NULL is one possible signal how to emit thisinformations. > > The NULL value can be problem everywhere - and is not consistent to raise exception somewhere and elsewhere not. > > I agree so the safe way is raising exception on NULL. Unfortunately, exception handling is pretty expensive in Postres(more in write transactions), so it should be used only when it is really necessary. I would say that any thing like update whatever set column=jsonb_set(column, '{foo}', NULL) should throw an exception. It should do, literally, *anything* else but blank that column. Ariadne
On Fri, Oct 18, 2019 at 09:14:09PM -0500, Ariadne Conill wrote: >Hello, > >On Fri, Oct 18, 2019 at 6:52 PM Stephen Frost <sfrost@snowman.net> wrote: >> >> Greetings, >> >> * Ariadne Conill (ariadne@dereferenced.org) wrote: >> > On Fri, Oct 18, 2019 at 6:01 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote: >> > > https://www.postgresql.org/docs/11/functions-json.html >> > > " The field/element/path extraction operators return NULL, rather than >> > > failing, if the JSON input does not have the right structure to match >> > > the request; for example if no such element exists" >> > >> > It is known that the extraction operators return NULL. The problem >> > here is jsonb_set() returning NULL when it encounters SQL NULL. >> > >> > > Just trying to figure why one is worse then the other. >> > >> > Any time a user loses data, it is worse. The preference for not >> > having data loss is why Pleroma uses PostgreSQL as it's database of >> > choice, as PostgreSQL has traditionally valued durability. If we >> > should not use PostgreSQL, just say so. >> >> Your contention that the documented, clear, and easily addressed >> behavior of a particular strict function equates to "the database system >> loses data and isn't durable" is really hurting your arguments here, not >> helping it. >> >> The argument about how it's unintuitive and can cause application >> developers to misuse the function (which is clearly an application bug, >> but perhaps an understandable one if the function interface isn't >> intuitive or is confusing) is a reasonable one and might be convincing >> enough to result in a change here. >> >> I'd suggest sticking to the latter argument when making this case. >> >> > > > I believe that anything that can be catastrophically broken by users >> > > > not following upgrade instructions precisely is a serious problem, and >> > > > can lead to serious problems. I am sure that this is not the only >> > > > project using JSONB which have had users destroy their own data in >> > > > such a completely preventable fashion. >> >> Let's be clear here that the issue with the upgrade instructions was >> that the user didn't follow your *application's* upgrade instructions, >> and your later code wasn't written to use the function, as documented, >> properly- this isn't a case of PG destroying your data. It's fine to >> contend that the interface sucks and that we should change it, but the >> argument that PG is eating data because the application sent a query to >> the database telling it, based on our documentation, to eat the data, >> isn't appropriate. Again, let's have a reasonable discussion here about >> if it makes sense to make a change here because the interface isn't >> intuitive and doesn't match what other systems do (I'm guessing it isn't >> in the SQL standard either, so we unfortunately can't look to that for >> help; though I'd hardly be surprised if they supported what PG does >> today anyway). > >Okay, I will admit that saying PG is eating data is perhaps >hyperbolic, My experience is that using such hyperbole is pretty detrimental, even when one is trying to make a pretty sensible case. The problem is that people often respond in a similarly hyperbolic claims, particularly when you hit a nerve. And that's exactly what happened here, becase we're *extremely* sensitive about data corruption issues, so when you claim PostgreSQL is "eating data" people are likely to jump on you, beating you with the documentation stick. It's unfortunate, but it's also entirely predictable. >but I will also say that the behaviour of jsonb_set() >under this type of edge case is unintuitive and frequently results in >unintended data loss. So, while PostgreSQL is not actually eating the >data, it is putting the user in a position where they may suffer data >loss if they are not extremely careful. > >Here is how other implementations handle this case: > >MySQL/MariaDB: > >select json_set('{"a":1,"b":2,"c":3}', '$.a', NULL) results in: > {"a":null,"b":2,"c":3} > >Microsoft SQL Server: > >select json_modify('{"a":1,"b":2,"c":3}', '$.a', NULL) results in: > {"b":2,"c":3} > >Both of these outcomes make sense, given the nature of JSON objects. >I am actually more in favor of what MSSQL does however, I think that >makes the most sense of all. > I do mostly agree with this. The json[b]_set behavior seems rather surprising, and I think I've seen a couple of cases running into exactly this issue. I've solved that with a simple CASE, but maybe changing the behavior would be better. That's unlikely to be back-patchable, though, so maybe a better option is to create a non-strict wrappers. But that does not work when the user is unaware of the behavior :-( regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> On Sat, Oct 19, 2019 at 1:08 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > > >Here is how other implementations handle this case: > > > >MySQL/MariaDB: > > > >select json_set('{"a":1,"b":2,"c":3}', '$.a', NULL) results in: > > {"a":null,"b":2,"c":3} > > > >Microsoft SQL Server: > > > >select json_modify('{"a":1,"b":2,"c":3}', '$.a', NULL) results in: > > {"b":2,"c":3} > > > >Both of these outcomes make sense, given the nature of JSON objects. > >I am actually more in favor of what MSSQL does however, I think that > >makes the most sense of all. > > > > I do mostly agree with this. The json[b]_set behavior seems rather > surprising, and I think I've seen a couple of cases running into exactly > this issue. I've solved that with a simple CASE, but maybe changing the > behavior would be better. That's unlikely to be back-patchable, though, > so maybe a better option is to create a non-strict wrappers. But that > does not work when the user is unaware of the behavior :-( Agree, that could be confusing. If I remember correctly, so far I've seen four or five such complains in mailing lists, but of course number of people who didn't reach out hackers is probably bigger. If we want to change it, the question is where to stop? Essentially we have: update table set data = some_func(data, some_args_with_null); where some_func happened to be jsonb_set, but could be any strict function. I wonder if in this case it makes sense to think about an alternative? For example, there is generic type subscripting patch, that allows to update a jsonb in the following way: update table set jsonb_data[key] = 'value'; It doesn't look like a function, so it's not a big deal if it will handle NULL values differently. And at the same time one can argue, that people, who are not aware about this caveat with jsonb_set and NULL values, will most likely use it due to a bit simpler syntax (more similar to some popular programming languages).
## Ariadne Conill (ariadne@dereferenced.org): > NULL propagation makes sense in the context of traditional SQL. What > users expect from the JSONB support is for it to behave as JSON > manipulation behaves everywhere else. Well, some users expect that. Others are using this interface as it is documented and implemented right now. And that's what makes this a somewhat difficult case: I wouldn't argue for one behaviour or the other if this was new functionality. But jsonb_set() was added in 9.5, and changing that behaviour now will make other people about as unhappy as you are right now. Further, "now" is a rather flexible term: the function cannot be changed "right now" with the next bugfix release (may break existing applications, deterring people from installing bugfixes: very bad) and there's about no way to get a new function into a bugfix release (catversion bump). The next chance to do anything here is version 13, to be expected around this time next year. This gives us ample time to think about a solution which is consistent and works for (almost) everyone - no need to force a behaviour change in that function right now (and in case it comes to that: which other json/jsonb-functions would be affected?). That creates a kind of bind for your case: you cannot rely on the new behaviour until the new version is in reasonably widespread use. Database servers are long-lived beasts - in the field, version 8.4 has finally mostly disappeared this year, but we still get some questions about that version here on the lists (8.4 went EOL over five years ago). At some point, you'll need to make a cut and require your users to upgrade the database. > At some point, you have to start pondering whether the behaviour > does not make logical sense in the context that people frame the JSONB > type and it's associated manipulation functions. But it does make sense from a SQL point of view - and this is a SQL database. JSON is not SQL (the sheer amount of "Note" in between the JSON functions and operators documentation is proof of that) and nots ASN.1, "people expect" depends a lot on what kind of people you ask. None of these expectations is "right" or "wrong" in an absolute manner. Code has to be "absolute" in order to be deterministic, and it should do so in a way that is unsurprising to the least amount of users: I'm willing to concede that jsonb_set() fails this test, but I'm still not convinced that your approach is much better just because it fits your specific use case. > It is not *obvious* > that jsonb_set() will trash your data, but that is what it is capable > of doing. It didn't. The data still fit the constraints you put on it: none, unfortunately. Which leads me to the advice for the time being (until we have this sorted out in one way or another, possibly the next major release): at least put a NOT NULL on columns which must be not NULL - that alone would have gone a long way to prevent the issues you've unfortunately had. You could even put CHECK constraints on your JSONB (like "CHECK (j->'info' IS NOT NULL)") to make sure it stays well-formed. As a SQL person, I'd even argue that you shouldn't use JSON columns for key data - there is a certain mismatch between SQL and JSON, which will get you now and then, and once you've implemented all the checks to be safe, you've build a type system when the database would have given you one for free. (And running UPDATEs inside your JSONB fields is not as efficient as on simple columns). And finally, you might put some version information in your database schema, so the application can check if all the neccessary data migrations have been run. Regards, Christoph -- Spare Space
Greetings, * Dmitry Dolgov (9erthalion6@gmail.com) wrote: > If we want to change it, the question is where to stop? Essentially we have: > > update table set data = some_func(data, some_args_with_null); > > where some_func happened to be jsonb_set, but could be any strict function. I don't think it makes any sense to try and extrapolate this out to other strict functions. Functions should be strict when it makes sense for them to be- in this case, it sounds like it doesn't really make sense for jsonb_set to be strict, and that's where we stop it. > I wonder if in this case it makes sense to think about an alternative? For > example, there is generic type subscripting patch, that allows to update a > jsonb in the following way: > > update table set jsonb_data[key] = 'value'; > > It doesn't look like a function, so it's not a big deal if it will handle NULL > values differently. And at the same time one can argue, that people, who are > not aware about this caveat with jsonb_set and NULL values, will most likely > use it due to a bit simpler syntax (more similar to some popular programming > languages). This seems like an entirely independent thing ... Thanks, Stephen
Attachment
On 10/18/19 3:10 PM, Mark Felder wrote: > > On Fri, Oct 18, 2019, at 12:37, Ariadne Conill wrote: >> Hello, >> >> I am one of the primary maintainers of Pleroma, a federated social >> networking application written in Elixir, which uses PostgreSQL in >> ways that may be considered outside the typical usage scenarios for >> PostgreSQL. >> >> Namely, we leverage JSONB heavily as a backing store for JSON-LD >> documents[1]. We also use JSONB in combination with Ecto's "embedded >> structs" to store things like user preferences. >> >> The fact that we can use JSONB to achieve our design goals is a >> testament to the flexibility PostgreSQL has. >> >> However, in the process of doing so, we have discovered a serious flaw >> in the way jsonb_set() functions, but upon reading through this >> mailing list, we have discovered that this flaw appears to be an >> intentional design.[2] >> >> A few times now, we have written migrations that do things like copy >> keys in a JSONB object to a new key, to rename them. These migrations >> look like so: >> >> update users set info=jsonb_set(info, '{bar}', info->'foo'); >> >> Typically, this works nicely, except for cases where evaluating >> info->'foo' results in an SQL null being returned. When that happens, >> jsonb_set() returns an SQL null, which then results in data loss.[3] >> >> This is not acceptable. PostgreSQL is a database that is renowned for >> data integrity, but here it is wiping out data when it encounters a >> failure case. The way jsonb_set() should fail in this case is to >> simply return the original input: it should NEVER return SQL null. >> >> But hey, we've been burned by this so many times now that we'd like to >> donate a useful function to the commons, consider it a mollyguard for >> the real jsonb_set() function. >> >> create or replace function safe_jsonb_set(target jsonb, path >> text[], new_value jsonb, create_missing boolean default true) returns >> jsonb as $$ >> declare >> result jsonb; >> begin >> result := jsonb_set(target, path, coalesce(new_value, >> 'null'::jsonb), create_missing); >> if result is NULL then >> return target; >> else >> return result; >> end if; >> end; >> $$ language plpgsql; >> >> This safe_jsonb_set() wrapper should not be necessary. PostgreSQL's >> own jsonb_set() should have this safety feature built in. Without it, >> using jsonb_set() is like playing russian roulette with your data, >> which is not a reasonable expectation for a database renowned for its >> commitment to data integrity. >> >> Please fix this bug so that we do not have to hack around this bug. >> It has probably ruined countless people's days so far. I don't want >> to hear about how the function is strict, I'm aware it is strict, and >> that strictness is harmful. Please fix the function so that it is >> actually safe to use. >> >> [1]: JSON-LD stands for JSON Linked Data. Pleroma has an "internal >> representation" that shares similar qualities to JSON-LD, so I use >> JSON-LD here as a simplification. >> >> [2]: https://www.postgresql.org/message-id/flat/qfkua9$2q0e$1@blaine.gmane.org >> >> [3]: https://git.pleroma.social/pleroma/pleroma/issues/1324 is an >> example of data loss induced by this issue. >> >> Ariadne >> > This should be directed towards the hackers list, too. > > What will it take to change the semantics of jsonb_set()? MySQL implements safe behavior here. It's a real shame Postgresdoes not. I'll offer a $200 bounty to whoever fixes it. I'm sure it's destroyed more than $200 worth of data andpeople's time by now, but it's something. > > The hyperbole here is misplaced. There is a difference between a bug and a POLA violation. This might be the latter, but it isn't the former. So please tone it down a bit. It's not the function that's unsafe, but the ill-informed use of it. We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it since 9.5. That's five releases ago. So it's a bit late to be coming to us telling us it's not safe (according to your preconceptions of what it should be doing). We could change it prospectively (i.e. from release 13 on) if we choose. But absent an actual bug (i.e. acting contrary to documented behaviour) we do not normally backpatch such changes, especially when there is a simple workaround for the perceived problem. And it's that policy that is in large measure responsible for Postgres' deserved reputation for stability. Incidentally, why is your function written in plpgsql? Wouldn't a simple SQL wrapper be better? create or replace function safe_jsonb_set (target jsonb, path text[], new_value jsonb, create_missing boolean default true) returns jsonb as $func$ select case when new_value is null then target else jsonb_set(target, path, new_value, create_missing) end $func$ language sql; And if we were to change it I'm not at all sure that we should do it the way that's suggested here, which strikes me as no more intuitive than the current behaviour. Rather I think we should possibly fill in a json null in the indicated place. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 10/18/19 3:10 PM, Mark Felder wrote: > > On Fri, Oct 18, 2019, at 12:37, Ariadne Conill wrote: >> Hello, >> >> I am one of the primary maintainers of Pleroma, a federated social >> networking application written in Elixir, which uses PostgreSQL in >> ways that may be considered outside the typical usage scenarios for >> PostgreSQL. >> >> Namely, we leverage JSONB heavily as a backing store for JSON-LD >> documents[1]. We also use JSONB in combination with Ecto's "embedded >> structs" to store things like user preferences. >> >> The fact that we can use JSONB to achieve our design goals is a >> testament to the flexibility PostgreSQL has. >> >> However, in the process of doing so, we have discovered a serious flaw >> in the way jsonb_set() functions, but upon reading through this >> mailing list, we have discovered that this flaw appears to be an >> intentional design.[2] >> >> A few times now, we have written migrations that do things like copy >> keys in a JSONB object to a new key, to rename them. These migrations >> look like so: >> >> update users set info=jsonb_set(info, '{bar}', info->'foo'); >> >> Typically, this works nicely, except for cases where evaluating >> info->'foo' results in an SQL null being returned. When that happens, >> jsonb_set() returns an SQL null, which then results in data loss.[3] >> >> This is not acceptable. PostgreSQL is a database that is renowned for >> data integrity, but here it is wiping out data when it encounters a >> failure case. The way jsonb_set() should fail in this case is to >> simply return the original input: it should NEVER return SQL null. >> >> But hey, we've been burned by this so many times now that we'd like to >> donate a useful function to the commons, consider it a mollyguard for >> the real jsonb_set() function. >> >> create or replace function safe_jsonb_set(target jsonb, path >> text[], new_value jsonb, create_missing boolean default true) returns >> jsonb as $$ >> declare >> result jsonb; >> begin >> result := jsonb_set(target, path, coalesce(new_value, >> 'null'::jsonb), create_missing); >> if result is NULL then >> return target; >> else >> return result; >> end if; >> end; >> $$ language plpgsql; >> >> This safe_jsonb_set() wrapper should not be necessary. PostgreSQL's >> own jsonb_set() should have this safety feature built in. Without it, >> using jsonb_set() is like playing russian roulette with your data, >> which is not a reasonable expectation for a database renowned for its >> commitment to data integrity. >> >> Please fix this bug so that we do not have to hack around this bug. >> It has probably ruined countless people's days so far. I don't want >> to hear about how the function is strict, I'm aware it is strict, and >> that strictness is harmful. Please fix the function so that it is >> actually safe to use. >> >> [1]: JSON-LD stands for JSON Linked Data. Pleroma has an "internal >> representation" that shares similar qualities to JSON-LD, so I use >> JSON-LD here as a simplification. >> >> [2]: https://www.postgresql.org/message-id/flat/qfkua9$2q0e$1@blaine.gmane.org >> >> [3]: https://git.pleroma.social/pleroma/pleroma/issues/1324 is an >> example of data loss induced by this issue. >> >> Ariadne >> > This should be directed towards the hackers list, too. > > What will it take to change the semantics of jsonb_set()? MySQL implements safe behavior here. It's a real shame Postgresdoes not. I'll offer a $200 bounty to whoever fixes it. I'm sure it's destroyed more than $200 worth of data andpeople's time by now, but it's something. > > The hyperbole here is misplaced. There is a difference between a bug and a POLA violation. This might be the latter, but it isn't the former. So please tone it down a bit. It's not the function that's unsafe, but the ill-informed use of it. We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it since 9.5. That's five releases ago. So it's a bit late to be coming to us telling us it's not safe (according to your preconceptions of what it should be doing). We could change it prospectively (i.e. from release 13 on) if we choose. But absent an actual bug (i.e. acting contrary to documented behaviour) we do not normally backpatch such changes, especially when there is a simple workaround for the perceived problem. And it's that policy that is in large measure responsible for Postgres' deserved reputation for stability. Incidentally, why is your function written in plpgsql? Wouldn't a simple SQL wrapper be better? create or replace function safe_jsonb_set (target jsonb, path text[], new_value jsonb, create_missing boolean default true) returns jsonb as $func$ select case when new_value is null then target else jsonb_set(target, path, new_value, create_missing) end $func$ language sql; And if we were to change it I'm not at all sure that we should do it the way that's suggested here, which strikes me as no more intuitive than the current behaviour. Rather I think we should possibly fill in a json null in the indicated place. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, Oct 19, 2019 at 11:26:50AM -0400, Andrew Dunstan wrote: > > ... > >The hyperbole here is misplaced. There is a difference between a bug and >a POLA violation. This might be the latter, but it isn't the former. So >please tone it down a bit. It's not the function that's unsafe, but the >ill-informed use of it. > > >We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it >since 9.5. That's five releases ago. So it's a bit late to be coming to >us telling us it's not safe (according to your preconceptions of what it >should be doing). > > >We could change it prospectively (i.e. from release 13 on) if we choose. >But absent an actual bug (i.e. acting contrary to documented behaviour) >we do not normally backpatch such changes, especially when there is a >simple workaround for the perceived problem. And it's that policy that >is in large measure responsible for Postgres' deserved reputation for >stability. > Yeah. > >Incidentally, why is your function written in plpgsql? Wouldn't a simple >SQL wrapper be better? > > > create or replace function safe_jsonb_set > (target jsonb, path text[], new_value jsonb, create_missing > boolean default true) > returns jsonb as > $func$ > select case when new_value is null then target else > jsonb_set(target, path, new_value, create_missing) end > $func$ language sql; > > >And if we were to change it I'm not at all sure that we should do it the >way that's suggested here, which strikes me as no more intuitive than >the current behaviour. Rather I think we should possibly fill in a json >null in the indicated place. > Not sure, but that seems rather confusing to me, because it's mixing SQL NULL and JSON null, i.e. it's not clear to me why jsonb_set(..., "...", NULL) should do the same thing as jsonb_set(..., "...", 'null':jsonb) I'm not entirely surprised it's what MySQL does ;-) but I'd say treating it as a deletion of the key (just like MSSQL) is somewhat more sensible. But I admit it's quite subjective. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, Oct 19, 2019 at 11:26:50AM -0400, Andrew Dunstan wrote: > > ... > >The hyperbole here is misplaced. There is a difference between a bug and >a POLA violation. This might be the latter, but it isn't the former. So >please tone it down a bit. It's not the function that's unsafe, but the >ill-informed use of it. > > >We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it >since 9.5. That's five releases ago. So it's a bit late to be coming to >us telling us it's not safe (according to your preconceptions of what it >should be doing). > > >We could change it prospectively (i.e. from release 13 on) if we choose. >But absent an actual bug (i.e. acting contrary to documented behaviour) >we do not normally backpatch such changes, especially when there is a >simple workaround for the perceived problem. And it's that policy that >is in large measure responsible for Postgres' deserved reputation for >stability. > Yeah. > >Incidentally, why is your function written in plpgsql? Wouldn't a simple >SQL wrapper be better? > > > create or replace function safe_jsonb_set > (target jsonb, path text[], new_value jsonb, create_missing > boolean default true) > returns jsonb as > $func$ > select case when new_value is null then target else > jsonb_set(target, path, new_value, create_missing) end > $func$ language sql; > > >And if we were to change it I'm not at all sure that we should do it the >way that's suggested here, which strikes me as no more intuitive than >the current behaviour. Rather I think we should possibly fill in a json >null in the indicated place. > Not sure, but that seems rather confusing to me, because it's mixing SQL NULL and JSON null, i.e. it's not clear to me why jsonb_set(..., "...", NULL) should do the same thing as jsonb_set(..., "...", 'null':jsonb) I'm not entirely surprised it's what MySQL does ;-) but I'd say treating it as a deletion of the key (just like MSSQL) is somewhat more sensible. But I admit it's quite subjective. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, Oct 19, 2019 at 11:21:26AM -0400, Stephen Frost wrote: >Greetings, > >* Dmitry Dolgov (9erthalion6@gmail.com) wrote: >> If we want to change it, the question is where to stop? Essentially we have: >> >> update table set data = some_func(data, some_args_with_null); >> >> where some_func happened to be jsonb_set, but could be any strict function. > >I don't think it makes any sense to try and extrapolate this out to >other strict functions. Functions should be strict when it makes sense >for them to be- in this case, it sounds like it doesn't really make >sense for jsonb_set to be strict, and that's where we stop it. > Yeah. I think the issue here is (partially) that other databases adopted similar functions after us, but decided to use a different behavior. It might be more natural for the users, but that does not mean we should change the other strict functions. Plus I'm not sure if SQL standard says anything about strict functions (I found nothing, but I looked only very quickly), but I'm pretty sure we can't change how basic operators change, and we translate them to function calls (e.g. 1+2 is int4pl(1,2)). >> I wonder if in this case it makes sense to think about an alternative? For >> example, there is generic type subscripting patch, that allows to update a >> jsonb in the following way: >> >> update table set jsonb_data[key] = 'value'; >> >> It doesn't look like a function, so it's not a big deal if it will handle NULL >> values differently. And at the same time one can argue, that people, who are >> not aware about this caveat with jsonb_set and NULL values, will most likely >> use it due to a bit simpler syntax (more similar to some popular programming >> languages). > >This seems like an entirely independent thing ... > Right. Useful, but entirely separate feature. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 10/19/19 12:18 PM, Tomas Vondra wrote: > On Sat, Oct 19, 2019 at 11:26:50AM -0400, Andrew Dunstan wrote: > > Not sure, but that seems rather confusing to me, because it's mixing SQL > NULL and JSON null, i.e. it's not clear to me why > > jsonb_set(..., "...", NULL) > > should do the same thing as > > jsonb_set(..., "...", 'null':jsonb) > > I'm not entirely surprised it's what MySQL does ;-) but I'd say treating > it as a deletion of the key (just like MSSQL) is somewhat more sensible. > But I admit it's quite subjective. > That's yet another variant, which just reinforces my view that there is no guaranteed-intuitive behaviour here. OTOH, to me, turning jsonb_set into jsonb_delete for some case seems ... odd. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 10/19/19 12:18 PM, Tomas Vondra wrote: > On Sat, Oct 19, 2019 at 11:26:50AM -0400, Andrew Dunstan wrote: > > Not sure, but that seems rather confusing to me, because it's mixing SQL > NULL and JSON null, i.e. it's not clear to me why > > jsonb_set(..., "...", NULL) > > should do the same thing as > > jsonb_set(..., "...", 'null':jsonb) > > I'm not entirely surprised it's what MySQL does ;-) but I'd say treating > it as a deletion of the key (just like MSSQL) is somewhat more sensible. > But I admit it's quite subjective. > That's yet another variant, which just reinforces my view that there is no guaranteed-intuitive behaviour here. OTOH, to me, turning jsonb_set into jsonb_delete for some case seems ... odd. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it
>since 9.5. That's five releases ago. So it's a bit late to be coming to
>us telling us it's not safe (according to your preconceptions of what it
>should be doing).
>
>
>We could change it prospectively (i.e. from release 13 on) if we choose.
>But absent an actual bug (i.e. acting contrary to documented behaviour)
>we do not normally backpatch such changes, especially when there is a
>simple workaround for the perceived problem. And it's that policy that
>is in large measure responsible for Postgres' deserved reputation for
>stability.
>
Yeah.
>And if we were to change it I'm not at all sure that we should do it the
>way that's suggested here, which strikes me as no more intuitive than
>the current behaviour. Rather I think we should possibly fill in a json
>null in the indicated place.
>
Not sure, but that seems rather confusing to me, because it's mixing SQL
NULL and JSON null, i.e. it's not clear to me why
But I admit it's quite subjective.
>
>We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it
>since 9.5. That's five releases ago. So it's a bit late to be coming to
>us telling us it's not safe (according to your preconceptions of what it
>should be doing).
>
>
>We could change it prospectively (i.e. from release 13 on) if we choose.
>But absent an actual bug (i.e. acting contrary to documented behaviour)
>we do not normally backpatch such changes, especially when there is a
>simple workaround for the perceived problem. And it's that policy that
>is in large measure responsible for Postgres' deserved reputation for
>stability.
>
Yeah.
>And if we were to change it I'm not at all sure that we should do it the
>way that's suggested here, which strikes me as no more intuitive than
>the current behaviour. Rather I think we should possibly fill in a json
>null in the indicated place.
>
Not sure, but that seems rather confusing to me, because it's mixing SQL
NULL and JSON null, i.e. it's not clear to me why
But I admit it's quite subjective.
On 10/19/19 12:32 PM, David G. Johnston wrote: > On Sat, Oct 19, 2019 at 9:19 AM Tomas Vondra > <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> > wrote: > > > > >We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it > >since 9.5. That's five releases ago. So it's a bit late to be > coming to > >us telling us it's not safe (according to your preconceptions of > what it > >should be doing). > > > > > There have been numerous complaints and questions about this behavior > in those five years; and none of the responses to those defenses has > actually made the current behavior sound beneficial but rather have > simply said "this is how it works, deal with it". I haven't seen a patch, which for most possible solutions should be fairly simple to code. This is open source. Code speaks louder than complaints. > > > > >We could change it prospectively (i.e. from release 13 on) if we > choose. > >But absent an actual bug (i.e. acting contrary to documented > behaviour) > >we do not normally backpatch such changes, especially when there is a > >simple workaround for the perceived problem. And it's that policy > that > >is in large measure responsible for Postgres' deserved reputation for > >stability. > > > > Yeah. > > > Agreed, this is v13 material if enough people come on board to support > making a change. We have changed such things in the past. But maybe a new function might be a better way to go. I haven't given it enough thought yet. > > >And if we were to change it I'm not at all sure that we should do > it the > >way that's suggested here, which strikes me as no more intuitive than > >the current behaviour. Rather I think we should possibly fill in > a json > >null in the indicated place. > > > > Not sure, but that seems rather confusing to me, because it's > mixing SQL > NULL and JSON null, i.e. it's not clear to me why > > [...] > > But I admit it's quite subjective. > > > Providing SQL NULL to this function and asking it to do something with > that is indeed subjective - with no obvious reasonable default, and I > agree that "return a NULL" while possible consistent is probably the > least useful behavior that could have been chosen. We should never > have allowed an SQL NULL to be an acceptable argument in the first > place, and can reasonably safely and effectively prevent it going > forward. Then people will have to explicitly code what they want to > do if their data and queries present this invalid unknown data to the > function. > > How exactly do we prevent a NULL being passed as an argument? The only thing we could do would be to raise an exception, I think. That seems like a fairly ugly thing to do, I'd need a h3eck of a lot of convincing. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 10/19/19 12:32 PM, David G. Johnston wrote: > On Sat, Oct 19, 2019 at 9:19 AM Tomas Vondra > <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> > wrote: > > > > >We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it > >since 9.5. That's five releases ago. So it's a bit late to be > coming to > >us telling us it's not safe (according to your preconceptions of > what it > >should be doing). > > > > > There have been numerous complaints and questions about this behavior > in those five years; and none of the responses to those defenses has > actually made the current behavior sound beneficial but rather have > simply said "this is how it works, deal with it". I haven't seen a patch, which for most possible solutions should be fairly simple to code. This is open source. Code speaks louder than complaints. > > > > >We could change it prospectively (i.e. from release 13 on) if we > choose. > >But absent an actual bug (i.e. acting contrary to documented > behaviour) > >we do not normally backpatch such changes, especially when there is a > >simple workaround for the perceived problem. And it's that policy > that > >is in large measure responsible for Postgres' deserved reputation for > >stability. > > > > Yeah. > > > Agreed, this is v13 material if enough people come on board to support > making a change. We have changed such things in the past. But maybe a new function might be a better way to go. I haven't given it enough thought yet. > > >And if we were to change it I'm not at all sure that we should do > it the > >way that's suggested here, which strikes me as no more intuitive than > >the current behaviour. Rather I think we should possibly fill in > a json > >null in the indicated place. > > > > Not sure, but that seems rather confusing to me, because it's > mixing SQL > NULL and JSON null, i.e. it's not clear to me why > > [...] > > But I admit it's quite subjective. > > > Providing SQL NULL to this function and asking it to do something with > that is indeed subjective - with no obvious reasonable default, and I > agree that "return a NULL" while possible consistent is probably the > least useful behavior that could have been chosen. We should never > have allowed an SQL NULL to be an acceptable argument in the first > place, and can reasonably safely and effectively prevent it going > forward. Then people will have to explicitly code what they want to > do if their data and queries present this invalid unknown data to the > function. > > How exactly do we prevent a NULL being passed as an argument? The only thing we could do would be to raise an exception, I think. That seems like a fairly ugly thing to do, I'd need a h3eck of a lot of convincing. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 10/18/19 7:18 PM, Ariadne Conill wrote: > Hello, > > On Fri, Oct 18, 2019 at 7:04 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote: >> >> On 10/18/19 4:31 PM, Ariadne Conill wrote: >>> Hello, >>> >>> On Fri, Oct 18, 2019 at 6:01 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote: >>>> >>>> On 10/18/19 3:11 PM, Ariadne Conill wrote: >>>>> Hello, >>>>> >>>>> On Fri, Oct 18, 2019 at 5:01 PM David G. Johnston >>>>> <david.g.johnston@gmail.com> wrote: >>>>>> >>>>>> On Fri, Oct 18, 2019 at 2:50 PM Christoph Moench-Tegeder <cmt@burggraben.net> wrote: >>>>>>> >>>>>>> ## Ariadne Conill (ariadne@dereferenced.org): >>>>>>> >>>>>>>> update users set info=jsonb_set(info, '{bar}', info->'foo'); >>>>>>>> >>>>>>>> Typically, this works nicely, except for cases where evaluating >>>>>>>> info->'foo' results in an SQL null being returned. When that happens, >>>>>>>> jsonb_set() returns an SQL null, which then results in data loss.[3] >>>>>>> >>>>>>> So why don't you use the facilities of SQL to make sure to only >>>>>>> touch the rows which match the prerequisites? >>>>>>> >>>>>>> UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo') >>>>>>> WHERE info->'foo' IS NOT NULL; >>>>>>> >>>>>> >>>>>> There are many ways to add code to queries to make working with this function safer - though using them presupposesone remembers at the time of writing the query that there is danger and caveats in using this function. I agreethat we should have (and now) provided sane defined behavior when one of the inputs to the function is null insteadblowing off the issue and defining the function as being strict. Whether that is "ignore and return the originalobject" or "add the key with a json null scalar value" is debatable but either is considerably more useful than returningSQL NULL. >>>>> >>>>> A great example of how we got burned by this last year: Pleroma >>>>> maintains pre-computed counters in JSONB for various types of >>>>> activities (posts, followers, followings). Last year, another counter >>>>> was added, with a migration. But some people did not run the >>>>> migration, because they are users, and that's what users do. This >>>> >>>> So you are more forgiving of your misstep, allowing users to run >>>> outdated code, then of running afoul of Postgres documented behavior: >>> >>> I'm not forgiving of either. >>> >>>> https://www.postgresql.org/docs/11/functions-json.html >>>> " The field/element/path extraction operators return NULL, rather than >>>> failing, if the JSON input does not have the right structure to match >>>> the request; for example if no such element exists" >>> >>> It is known that the extraction operators return NULL. The problem >>> here is jsonb_set() returning NULL when it encounters SQL NULL. >> >> I'm not following. Your original case was: >> >> jsonb_set(info, '{bar}', info->'foo'); >> >> where info->'foo' is equivalent to: >> >> test=# select '{"f1":1,"f2":null}'::jsonb ->'f3'; >> ?column? >> ---------- >> NULL >> >> So you know there is a possibility that a value extraction could return >> NULL and from your wrapper that COALESCE is the way to deal with this. > > You're not following because you don't want to follow. > > It does not matter that info->'foo' is in my example. That's not what > I am talking about. > > What I am talking about is that jsonb_set(..., ..., NULL) returns SQL NULL > > postgres=# \pset null '(null)' > Null display is "(null)". > postgres=# select jsonb_set('{"a":1,"b":2,"c":3}'::jsonb, '{a}', NULL); > jsonb_set > ----------- > (null) > (1 row) > > This behaviour is basically giving an application developer a loaded > shotgun and pointing it at their feet. It is not a good design. It > is a design which has likely lead to many users experiencing > unintentional data loss. create table null_test(fld_1 integer, fld_2 integer); insert into null_test values(1, 2), (3, NULL); select * from null_test ; fld_1 | fld_2 -------+------- 1 | 2 3 | NULL (2 rows) update null_test set fld_1 = fld_1 + fld_2; select * from null_test ; fld_1 | fld_2 -------+------- 3 | 2 NULL | NULL Failure to account for NULL is a generic issue. Given that this only the second post I can find that deals with this, in going on 4 years, I am guessing most users have dealt with it. If you really think this raises to the level of a bug then I would suggest filing a report here: https://www.postgresql.org/account/login/?next=/account/submitbug/ > > Ariadne > -- Adrian Klaver adrian.klaver@aklaver.com
On Sat, Oct 19, 2019 at 12:47:39PM -0400, Andrew Dunstan wrote: > >On 10/19/19 12:32 PM, David G. Johnston wrote: >> On Sat, Oct 19, 2019 at 9:19 AM Tomas Vondra >> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> >> wrote: >> >> > >> >We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it >> >since 9.5. That's five releases ago. So it's a bit late to be >> coming to >> >us telling us it's not safe (according to your preconceptions of >> what it >> >should be doing). >> > >> >> >> There have been numerous complaints and questions about this behavior >> in those five years; and none of the responses to those defenses has >> actually made the current behavior sound beneficial but rather have >> simply said "this is how it works, deal with it". > > >I haven't seen a patch, which for most possible solutions should be >fairly simple to code. This is open source. Code speaks louder than >complaints. > IMHO that might be a bit too harsh - I'm not surprised no one sent a patch when we're repeatedly telling people "you're holding it wrong". Without a clear consensus what the "correct" behavior is, I wouldn't send a patch either. > >> >> > >> >We could change it prospectively (i.e. from release 13 on) if we >> choose. >> >But absent an actual bug (i.e. acting contrary to documented >> behaviour) >> >we do not normally backpatch such changes, especially when there is a >> >simple workaround for the perceived problem. And it's that policy >> that >> >is in large measure responsible for Postgres' deserved reputation for >> >stability. >> > >> >> Yeah. >> >> >> Agreed, this is v13 material if enough people come on board to support >> making a change. > > > >We have changed such things in the past. But maybe a new function might >be a better way to go. I haven't given it enough thought yet. > I think the #1 thing we should certainly do is explaining the behavior in the docs. > > >> >> >And if we were to change it I'm not at all sure that we should do >> it the >> >way that's suggested here, which strikes me as no more intuitive than >> >the current behaviour. Rather I think we should possibly fill in >> a json >> >null in the indicated place. >> > >> >> Not sure, but that seems rather confusing to me, because it's >> mixing SQL >> NULL and JSON null, i.e. it's not clear to me why >> >> [...] >> >> But I admit it's quite subjective. >> >> >> Providing SQL NULL to this function and asking it to do something with >> that is indeed subjective - with no obvious reasonable default, and I >> agree that "return a NULL" while possible consistent is probably the >> least useful behavior that could have been chosen. We should never >> have allowed an SQL NULL to be an acceptable argument in the first >> place, and can reasonably safely and effectively prevent it going >> forward. Then people will have to explicitly code what they want to >> do if their data and queries present this invalid unknown data to the >> function. >> >> > >How exactly do we prevent a NULL being passed as an argument? The only >thing we could do would be to raise an exception, I think. That seems >like a fairly ugly thing to do, I'd need a h3eck of a lot of convincing. > I don't know, but if we don't know what the "right" behavior with NULL is, is raising an exception really that ugly? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, Oct 19, 2019 at 12:47:39PM -0400, Andrew Dunstan wrote: > >On 10/19/19 12:32 PM, David G. Johnston wrote: >> On Sat, Oct 19, 2019 at 9:19 AM Tomas Vondra >> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> >> wrote: >> >> > >> >We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it >> >since 9.5. That's five releases ago. So it's a bit late to be >> coming to >> >us telling us it's not safe (according to your preconceptions of >> what it >> >should be doing). >> > >> >> >> There have been numerous complaints and questions about this behavior >> in those five years; and none of the responses to those defenses has >> actually made the current behavior sound beneficial but rather have >> simply said "this is how it works, deal with it". > > >I haven't seen a patch, which for most possible solutions should be >fairly simple to code. This is open source. Code speaks louder than >complaints. > IMHO that might be a bit too harsh - I'm not surprised no one sent a patch when we're repeatedly telling people "you're holding it wrong". Without a clear consensus what the "correct" behavior is, I wouldn't send a patch either. > >> >> > >> >We could change it prospectively (i.e. from release 13 on) if we >> choose. >> >But absent an actual bug (i.e. acting contrary to documented >> behaviour) >> >we do not normally backpatch such changes, especially when there is a >> >simple workaround for the perceived problem. And it's that policy >> that >> >is in large measure responsible for Postgres' deserved reputation for >> >stability. >> > >> >> Yeah. >> >> >> Agreed, this is v13 material if enough people come on board to support >> making a change. > > > >We have changed such things in the past. But maybe a new function might >be a better way to go. I haven't given it enough thought yet. > I think the #1 thing we should certainly do is explaining the behavior in the docs. > > >> >> >And if we were to change it I'm not at all sure that we should do >> it the >> >way that's suggested here, which strikes me as no more intuitive than >> >the current behaviour. Rather I think we should possibly fill in >> a json >> >null in the indicated place. >> > >> >> Not sure, but that seems rather confusing to me, because it's >> mixing SQL >> NULL and JSON null, i.e. it's not clear to me why >> >> [...] >> >> But I admit it's quite subjective. >> >> >> Providing SQL NULL to this function and asking it to do something with >> that is indeed subjective - with no obvious reasonable default, and I >> agree that "return a NULL" while possible consistent is probably the >> least useful behavior that could have been chosen. We should never >> have allowed an SQL NULL to be an acceptable argument in the first >> place, and can reasonably safely and effectively prevent it going >> forward. Then people will have to explicitly code what they want to >> do if their data and queries present this invalid unknown data to the >> function. >> >> > >How exactly do we prevent a NULL being passed as an argument? The only >thing we could do would be to raise an exception, I think. That seems >like a fairly ugly thing to do, I'd need a h3eck of a lot of convincing. > I don't know, but if we don't know what the "right" behavior with NULL is, is raising an exception really that ugly? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, Oct 19, 2019 at 12:47:39PM -0400, Andrew Dunstan wrote:
>
>On 10/19/19 12:32 PM, David G. Johnston wrote:
>> On Sat, Oct 19, 2019 at 9:19 AM Tomas Vondra
>> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>>
>> wrote:
>>
>> >
>> >We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it
>> >since 9.5. That's five releases ago. So it's a bit late to be
>> coming to
>> >us telling us it's not safe (according to your preconceptions of
>> what it
>> >should be doing).
>> >
>>
>>
>> There have been numerous complaints and questions about this behavior
>> in those five years; and none of the responses to those defenses has
>> actually made the current behavior sound beneficial but rather have
>> simply said "this is how it works, deal with it".
>
>
>I haven't seen a patch, which for most possible solutions should be
>fairly simple to code. This is open source. Code speaks louder than
>complaints.
>
IMHO that might be a bit too harsh - I'm not surprised no one sent a
patch when we're repeatedly telling people "you're holding it wrong".
Without a clear consensus what the "correct" behavior is, I wouldn't
send a patch either.
>
>>
>> >
>> >We could change it prospectively (i.e. from release 13 on) if we
>> choose.
>> >But absent an actual bug (i.e. acting contrary to documented
>> behaviour)
>> >we do not normally backpatch such changes, especially when there is a
>> >simple workaround for the perceived problem. And it's that policy
>> that
>> >is in large measure responsible for Postgres' deserved reputation for
>> >stability.
>> >
>>
>> Yeah.
>>
>>
>> Agreed, this is v13 material if enough people come on board to support
>> making a change.
>
>
>
>We have changed such things in the past. But maybe a new function might
>be a better way to go. I haven't given it enough thought yet.
>
I think the #1 thing we should certainly do is explaining the behavior
in the docs.
>
>
>>
>> >And if we were to change it I'm not at all sure that we should do
>> it the
>> >way that's suggested here, which strikes me as no more intuitive than
>> >the current behaviour. Rather I think we should possibly fill in
>> a json
>> >null in the indicated place.
>> >
>>
>> Not sure, but that seems rather confusing to me, because it's
>> mixing SQL
>> NULL and JSON null, i.e. it's not clear to me why
>>
>> [...]
>>
>> But I admit it's quite subjective.
>>
>>
>> Providing SQL NULL to this function and asking it to do something with
>> that is indeed subjective - with no obvious reasonable default, and I
>> agree that "return a NULL" while possible consistent is probably the
>> least useful behavior that could have been chosen. We should never
>> have allowed an SQL NULL to be an acceptable argument in the first
>> place, and can reasonably safely and effectively prevent it going
>> forward. Then people will have to explicitly code what they want to
>> do if their data and queries present this invalid unknown data to the
>> function.
>>
>>
>
>How exactly do we prevent a NULL being passed as an argument? The only
>thing we could do would be to raise an exception, I think. That seems
>like a fairly ugly thing to do, I'd need a h3eck of a lot of convincing.
>
I don't know, but if we don't know what the "right" behavior with NULL
is, is raising an exception really that ugly?
On Sat, Oct 19, 2019 at 12:47:39PM -0400, Andrew Dunstan wrote:
>
>On 10/19/19 12:32 PM, David G. Johnston wrote:
>> On Sat, Oct 19, 2019 at 9:19 AM Tomas Vondra
>> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>>
>> wrote:
>>
>> >
>> >We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it
>> >since 9.5. That's five releases ago. So it's a bit late to be
>> coming to
>> >us telling us it's not safe (according to your preconceptions of
>> what it
>> >should be doing).
>> >
>>
>>
>> There have been numerous complaints and questions about this behavior
>> in those five years; and none of the responses to those defenses has
>> actually made the current behavior sound beneficial but rather have
>> simply said "this is how it works, deal with it".
>
>
>I haven't seen a patch, which for most possible solutions should be
>fairly simple to code. This is open source. Code speaks louder than
>complaints.
>
IMHO that might be a bit too harsh - I'm not surprised no one sent a
patch when we're repeatedly telling people "you're holding it wrong".
Without a clear consensus what the "correct" behavior is, I wouldn't
send a patch either.
>
>>
>> >
>> >We could change it prospectively (i.e. from release 13 on) if we
>> choose.
>> >But absent an actual bug (i.e. acting contrary to documented
>> behaviour)
>> >we do not normally backpatch such changes, especially when there is a
>> >simple workaround for the perceived problem. And it's that policy
>> that
>> >is in large measure responsible for Postgres' deserved reputation for
>> >stability.
>> >
>>
>> Yeah.
>>
>>
>> Agreed, this is v13 material if enough people come on board to support
>> making a change.
>
>
>
>We have changed such things in the past. But maybe a new function might
>be a better way to go. I haven't given it enough thought yet.
>
I think the #1 thing we should certainly do is explaining the behavior
in the docs.
>
>
>>
>> >And if we were to change it I'm not at all sure that we should do
>> it the
>> >way that's suggested here, which strikes me as no more intuitive than
>> >the current behaviour. Rather I think we should possibly fill in
>> a json
>> >null in the indicated place.
>> >
>>
>> Not sure, but that seems rather confusing to me, because it's
>> mixing SQL
>> NULL and JSON null, i.e. it's not clear to me why
>>
>> [...]
>>
>> But I admit it's quite subjective.
>>
>>
>> Providing SQL NULL to this function and asking it to do something with
>> that is indeed subjective - with no obvious reasonable default, and I
>> agree that "return a NULL" while possible consistent is probably the
>> least useful behavior that could have been chosen. We should never
>> have allowed an SQL NULL to be an acceptable argument in the first
>> place, and can reasonably safely and effectively prevent it going
>> forward. Then people will have to explicitly code what they want to
>> do if their data and queries present this invalid unknown data to the
>> function.
>>
>>
>
>How exactly do we prevent a NULL being passed as an argument? The only
>thing we could do would be to raise an exception, I think. That seems
>like a fairly ugly thing to do, I'd need a h3eck of a lot of convincing.
>
I don't know, but if we don't know what the "right" behavior with NULL
is, is raising an exception really that ugly?
FWIW I've been bitten by this 'feature' more than once as well, accidentally erasing a column. Now I usually write js = jsonb_set(js, coalesce(new_column, 'null'::jsonb)) to prevent erasing the whole column, and instead setting the value to a jsonb null value, but I also found the STRICT behavior very surprising at first..
-Floris
FWIW I've been bitten by this 'feature' more than once as well, accidentally erasing a column. Now I usually write js = jsonb_set(js, coalesce(new_column, 'null'::jsonb)) to prevent erasing the whole column, and instead setting the value to a jsonb null value, but I also found the STRICT behavior very surprising at first..
-Floris
On 19/10/2019 07:52, Ariadne Conill wrote: > > I would say that any thing like > > update whatever set column=jsonb_set(column, '{foo}', NULL) > > should throw an exception. It should do, literally, *anything* else > but blank that column. steve=# create table foo (bar jsonb not null); CREATE TABLE steve=# insert into foo (bar) values ('{"a":"b"}'); INSERT 0 1 steve=# update foo set bar = jsonb_set(bar, '{foo}', NULL); ERROR: null value in column "bar" violates not-null constraint DETAIL: Failing row contains (null). steve=# update foo set bar = jsonb_set(bar, '{foo}', 'null'::jsonb); UPDATE 1 I don't see any behaviour that's particularly surprising there? Though I understand how an app developer who's light on SQL might get it wrong - and I've made similar mistakes in schema upgrade scripts without involving jsonb. Cheers, Steve
On 10/20/19 4:39 AM, Floris Van Nee wrote: > > FWIW I've been bitten by this 'feature' more than once as well, > accidentally erasing a column. Now I usually write js = jsonb_set(js, > coalesce(new_column, 'null'::jsonb)) to prevent erasing the whole > column, and instead setting the value to a jsonb null value, but I > also found the STRICT behavior very surprising at first.. > > > Understood. I think the real question here is what it should do instead when the value is NULL. Your behaviour above is one suggestion, which I personally find intuitive. Another has been to remove the associated key. Another is to return the original target. And yet another is to raise an exception, which is easy to write but really punts the issue back to the application programmer who will have to decide how to ensure they never pass in a NULL parameter. Possibly we could even add an extra parameter to specify what should be done. Also, the question will arise what to do when any of the other parameters are NULL. Should we return NULL in those cases as we do now? cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 10/20/19 4:39 AM, Floris Van Nee wrote: > > FWIW I've been bitten by this 'feature' more than once as well, > accidentally erasing a column. Now I usually write js = jsonb_set(js, > coalesce(new_column, 'null'::jsonb)) to prevent erasing the whole > column, and instead setting the value to a jsonb null value, but I > also found the STRICT behavior very surprising at first.. > > > Understood. I think the real question here is what it should do instead when the value is NULL. Your behaviour above is one suggestion, which I personally find intuitive. Another has been to remove the associated key. Another is to return the original target. And yet another is to raise an exception, which is easy to write but really punts the issue back to the application programmer who will have to decide how to ensure they never pass in a NULL parameter. Possibly we could even add an extra parameter to specify what should be done. Also, the question will arise what to do when any of the other parameters are NULL. Should we return NULL in those cases as we do now? cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Understood. I think the real question here is what it should do instead
when the value is NULL. Your behaviour above is one suggestion, which I
personally find intuitive. Another has been to remove the associated
key. Another is to return the original target. And yet another is to
raise an exception, which is easy to write but really punts the issue
back to the application programmer who will have to decide how to ensure
they never pass in a NULL parameter. Possibly we could even add an extra
parameter to specify what should be done.
Also, the question will arise what to do when any of the other
parameters are NULL. Should we return NULL in those cases as we do now?
Understood. I think the real question here is what it should do instead
when the value is NULL. Your behaviour above is one suggestion, which I
personally find intuitive. Another has been to remove the associated
key. Another is to return the original target. And yet another is to
raise an exception, which is easy to write but really punts the issue
back to the application programmer who will have to decide how to ensure
they never pass in a NULL parameter. Possibly we could even add an extra
parameter to specify what should be done.
Also, the question will arise what to do when any of the other
parameters are NULL. Should we return NULL in those cases as we do now?
And yet another is to
raise an exception, which is easy to write but really punts the issue
back to the application programmer who will have to decide how to ensure
they never pass in a NULL parameter.
Possibly we could even add an extra
parameter to specify what should be done.
Should we return NULL in those cases as we do now?
And yet another is to
raise an exception, which is easy to write but really punts the issue
back to the application programmer who will have to decide how to ensure
they never pass in a NULL parameter.
Possibly we could even add an extra
parameter to specify what should be done.
Should we return NULL in those cases as we do now?
On 10/20/19 1:14 PM, David G. Johnston wrote: > On Sun, Oct 20, 2019 at 5:31 AM Andrew Dunstan > <andrew.dunstan@2ndquadrant.com > <mailto:andrew.dunstan@2ndquadrant.com>> wrote: > > And yet another is to > raise an exception, which is easy to write but really punts the issue > back to the application programmer who will have to decide how to > ensure > they never pass in a NULL parameter. > > > That's kinda the point - if they never pass NULL they won't encounter > any problems but as soon as the data and their application don't see > eye-to-eye the application developer has to decide what they want to > do about it. We are in no position to decide for them and making it > obvious they have a decision to make and implement here doesn't seem > like a improper position to take. The app dev can avoid this problem today by making sure they don't pass a NULL as the value. Or they can use a wrapper function which does that for them. So frankly this doesn't seem like much of an advance. And, as has been noted, it's not consistent with what either MySQL or MSSQL do. In general I'm not that keen on raising an exception for cases like this. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 10/20/19 1:14 PM, David G. Johnston wrote: > On Sun, Oct 20, 2019 at 5:31 AM Andrew Dunstan > <andrew.dunstan@2ndquadrant.com > <mailto:andrew.dunstan@2ndquadrant.com>> wrote: > > And yet another is to > raise an exception, which is easy to write but really punts the issue > back to the application programmer who will have to decide how to > ensure > they never pass in a NULL parameter. > > > That's kinda the point - if they never pass NULL they won't encounter > any problems but as soon as the data and their application don't see > eye-to-eye the application developer has to decide what they want to > do about it. We are in no position to decide for them and making it > obvious they have a decision to make and implement here doesn't seem > like a improper position to take. The app dev can avoid this problem today by making sure they don't pass a NULL as the value. Or they can use a wrapper function which does that for them. So frankly this doesn't seem like much of an advance. And, as has been noted, it's not consistent with what either MySQL or MSSQL do. In general I'm not that keen on raising an exception for cases like this. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sun, Oct 20, 2019 at 03:48:05PM -0400, Andrew Dunstan wrote: > >On 10/20/19 1:14 PM, David G. Johnston wrote: >> On Sun, Oct 20, 2019 at 5:31 AM Andrew Dunstan >> <andrew.dunstan@2ndquadrant.com >> <mailto:andrew.dunstan@2ndquadrant.com>> wrote: >> >> And yet another is to >> raise an exception, which is easy to write but really punts the issue >> back to the application programmer who will have to decide how to >> ensure >> they never pass in a NULL parameter. >> >> >> That's kinda the point - if they never pass NULL they won't encounter >> any problems but as soon as the data and their application don't see >> eye-to-eye the application developer has to decide what they want to >> do about it. We are in no position to decide for them and making it >> obvious they have a decision to make and implement here doesn't seem >> like a improper position to take. > > >The app dev can avoid this problem today by making sure they don't pass >a NULL as the value. Or they can use a wrapper function which does that >for them. So frankly this doesn't seem like much of an advance. And, as >has been noted, it's not consistent with what either MySQL or MSSQL do. >In general I'm not that keen on raising an exception for cases like this. > I think the general premise of this thread is that the application developer does not realize that may be necessary, because it's a bit surprising behavior, particularly when having more experience with other databases that behave differently. It's also pretty easy to not notice this issue for a long time, resulting in significant data loss. Let's say you're used to the MSSQL or MySQL behavior, you migrate your application to PostgreSQL or whatever - how do you find out about this behavior? Users are likely to visit https://www.postgresql.org/docs/12/functions-json.html but that says nothing about how jsonb_set works with NULL values :-( You're right raising an exception may not be the "right behavior" for whatever definition of "right". But I kinda agree with David that it's somewhat reasonable when we don't know what the "universally correct" thing is (or when there's no such thing). IMHO that's better than silently discarding some of the data. FWIW I think the JSON/JSONB part of our code base is amazing, and the fact that various other databases adopted something very similar over the last couple of years just confirms that. And if this is the only speck of dust in the API, I think that's pretty amazing. I'm not sure how significant this issue actually is - it's true we got a couple of complaints over the years (judging by a quick search for jsonb_set and NULL in the archives), but I'm not sure that's enough to justify any changes in backbranches. I'd say no, but I have no idea how many people are affected by this but don't know about it ... regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sun, Oct 20, 2019 at 03:48:05PM -0400, Andrew Dunstan wrote: > >On 10/20/19 1:14 PM, David G. Johnston wrote: >> On Sun, Oct 20, 2019 at 5:31 AM Andrew Dunstan >> <andrew.dunstan@2ndquadrant.com >> <mailto:andrew.dunstan@2ndquadrant.com>> wrote: >> >> And yet another is to >> raise an exception, which is easy to write but really punts the issue >> back to the application programmer who will have to decide how to >> ensure >> they never pass in a NULL parameter. >> >> >> That's kinda the point - if they never pass NULL they won't encounter >> any problems but as soon as the data and their application don't see >> eye-to-eye the application developer has to decide what they want to >> do about it. We are in no position to decide for them and making it >> obvious they have a decision to make and implement here doesn't seem >> like a improper position to take. > > >The app dev can avoid this problem today by making sure they don't pass >a NULL as the value. Or they can use a wrapper function which does that >for them. So frankly this doesn't seem like much of an advance. And, as >has been noted, it's not consistent with what either MySQL or MSSQL do. >In general I'm not that keen on raising an exception for cases like this. > I think the general premise of this thread is that the application developer does not realize that may be necessary, because it's a bit surprising behavior, particularly when having more experience with other databases that behave differently. It's also pretty easy to not notice this issue for a long time, resulting in significant data loss. Let's say you're used to the MSSQL or MySQL behavior, you migrate your application to PostgreSQL or whatever - how do you find out about this behavior? Users are likely to visit https://www.postgresql.org/docs/12/functions-json.html but that says nothing about how jsonb_set works with NULL values :-( You're right raising an exception may not be the "right behavior" for whatever definition of "right". But I kinda agree with David that it's somewhat reasonable when we don't know what the "universally correct" thing is (or when there's no such thing). IMHO that's better than silently discarding some of the data. FWIW I think the JSON/JSONB part of our code base is amazing, and the fact that various other databases adopted something very similar over the last couple of years just confirms that. And if this is the only speck of dust in the API, I think that's pretty amazing. I'm not sure how significant this issue actually is - it's true we got a couple of complaints over the years (judging by a quick search for jsonb_set and NULL in the archives), but I'm not sure that's enough to justify any changes in backbranches. I'd say no, but I have no idea how many people are affected by this but don't know about it ... regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 10/20/19 1:14 PM, David G. Johnston wrote:
> On Sun, Oct 20, 2019 at 5:31 AM Andrew Dunstan
> <andrew.dunstan@2ndquadrant.com
> <mailto:andrew.dunstan@2ndquadrant.com>> wrote:
>
> And yet another is to
> raise an exception, which is easy to write but really punts the issue
> back to the application programmer who will have to decide how to
> ensure
> they never pass in a NULL parameter.
>
>
> That's kinda the point - if they never pass NULL they won't encounter
> any problems but as soon as the data and their application don't see
> eye-to-eye the application developer has to decide what they want to
> do about it. We are in no position to decide for them and making it
> obvious they have a decision to make and implement here doesn't seem
> like a improper position to take.
The app dev can avoid this problem today by making sure they don't pass
a NULL as the value. Or they can use a wrapper function which does that
for them. So frankly this doesn't seem like much of an advance. And, as
has been noted, it's not consistent with what either MySQL or MSSQL do.
In general I'm not that keen on raising an exception for cases like this.
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 10/20/19 1:14 PM, David G. Johnston wrote:
> On Sun, Oct 20, 2019 at 5:31 AM Andrew Dunstan
> <andrew.dunstan@2ndquadrant.com
> <mailto:andrew.dunstan@2ndquadrant.com>> wrote:
>
> And yet another is to
> raise an exception, which is easy to write but really punts the issue
> back to the application programmer who will have to decide how to
> ensure
> they never pass in a NULL parameter.
>
>
> That's kinda the point - if they never pass NULL they won't encounter
> any problems but as soon as the data and their application don't see
> eye-to-eye the application developer has to decide what they want to
> do about it. We are in no position to decide for them and making it
> obvious they have a decision to make and implement here doesn't seem
> like a improper position to take.
The app dev can avoid this problem today by making sure they don't pass
a NULL as the value. Or they can use a wrapper function which does that
for them. So frankly this doesn't seem like much of an advance. And, as
has been noted, it's not consistent with what either MySQL or MSSQL do.
In general I'm not that keen on raising an exception for cases like this.
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, 2019-10-18 at 21:18 -0500, Ariadne Conill wrote: > postgres=# \pset null '(null)' > Null display is "(null)". > postgres=# select jsonb_set('{"a":1,"b":2,"c":3}'::jsonb, '{a}', NULL); > jsonb_set > ----------- > (null) > (1 row) > > This behaviour is basically giving an application developer a loaded > shotgun and pointing it at their feet. It is not a good design. It > is a design which has likely lead to many users experiencing > unintentional data loss. I understand your sentiments, even if you voiced them too drastically for my taste. The basic problem is that SQL NULL and JSON null have different semantics, and while it is surprising for you that a database function returns NULL if an argument is NULL, many database people would be surprised by the opposite. Please have some understanding. That said, I think it is reasonable that a PostgreSQL JSON function behaves in the way that JSON users would expect, so here is my +1 for interpreting an SQL NULL as a JSON null in the above case, so that the result of the above becomes {"a": null, "b": 2, "c": 3} -1 for backpatching such a change. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
> That said, I think it is reasonable that a PostgreSQL JSON function > behaves in the way that JSON users would expect, so here is my +1 for > interpreting an SQL NULL as a JSON null in the above case Just to chime in as another application developer: the current functionality does seem pretty surprising and dangerous to me. Raising an exception seems pretty annoying. Setting the key's value to a JSON null would be fine, but I also like the idea of removing the key entirely, since that gives you strictly more functionality: you can always set the key to a JSON null by passing one in, if that's what you want. But there are lots of other functions that convert SQL NULL to JSON null: postgres=# select row_to_json(row(null)), json_build_object('foo', null), json_object(array['foo', null]), json_object(array['foo'], array[null]); row_to_json | json_build_object | json_object | json_object -------------+-------------------+----------------+---------------- {"f1":null} | {"foo" : null} | {"foo" : null} | {"foo" : null} (1 row) (The jsonb variants give the same results.) I think those functions are very similar to json_set here, and I'd expect json_set to do what they do (i.e. convert SQL NULL to JSON null). Paul
On 10/20/19 4:18 PM, Tomas Vondra wrote: > On Sun, Oct 20, 2019 at 03:48:05PM -0400, Andrew Dunstan wrote: >> >> On 10/20/19 1:14 PM, David G. Johnston wrote: >>> On Sun, Oct 20, 2019 at 5:31 AM Andrew Dunstan >>> <andrew.dunstan@2ndquadrant.com >>> <mailto:andrew.dunstan@2ndquadrant.com>> wrote: >>> >>> And yet another is to >>> raise an exception, which is easy to write but really punts the >>> issue >>> back to the application programmer who will have to decide how to >>> ensure >>> they never pass in a NULL parameter. >>> >>> >>> That's kinda the point - if they never pass NULL they won't encounter >>> any problems but as soon as the data and their application don't see >>> eye-to-eye the application developer has to decide what they want to >>> do about it. We are in no position to decide for them and making it >>> obvious they have a decision to make and implement here doesn't seem >>> like a improper position to take. >> >> >> The app dev can avoid this problem today by making sure they don't pass >> a NULL as the value. Or they can use a wrapper function which does that >> for them. So frankly this doesn't seem like much of an advance. And, as >> has been noted, it's not consistent with what either MySQL or MSSQL do. >> In general I'm not that keen on raising an exception for cases like >> this. >> > > I think the general premise of this thread is that the application > developer does not realize that may be necessary, because it's a bit > surprising behavior, particularly when having more experience with other > databases that behave differently. It's also pretty easy to not notice > this issue for a long time, resulting in significant data loss. > > Let's say you're used to the MSSQL or MySQL behavior, you migrate your > application to PostgreSQL or whatever - how do you find out about this > behavior? Users are likely to visit > > https://www.postgresql.org/docs/12/functions-json.html > > but that says nothing about how jsonb_set works with NULL values :-( We should certainly fix that. I accept some responsibility for the omission. > > You're right raising an exception may not be the "right behavior" for > whatever definition of "right". But I kinda agree with David that it's > somewhat reasonable when we don't know what the "universally correct" > thing is (or when there's no such thing). IMHO that's better than > silently discarding some of the data. I'm not arguing against the idea of improving the situation. But I am arguing against a minimal fix that will not provide much of value to a careful app developer. i.e. I want to do more to support app devs. Ideally they would not need to use wrapper functions. There will be plenty of situations where it is mighty inconvenient to catch an exception thrown by jsonb_set(). And catching exceptions can be expensive. You want to avoid that if possible in your performance-critical plpgsql code. > > FWIW I think the JSON/JSONB part of our code base is amazing, and the > fact that various other databases adopted something very similar over > the last couple of years just confirms that. And if this is the only > speck of dust in the API, I think that's pretty amazing. TY. When I first saw the SQL/JSON spec I thought I should send a request to the SQL standards committee for a royalty payment, since it looked so familiar ;-) > > I'm not sure how significant this issue actually is - it's true we got a > couple of complaints over the years (judging by a quick search for > jsonb_set and NULL in the archives), but I'm not sure that's enough to > justify any changes in backbranches. I'd say no, but I have no idea how > many people are affected by this but don't know about it ... > > No, no backpatching. As I said upthread, this isn't a bug, but it is arguably a POLA violation, which is why we should do something for release 13. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 10/20/19 4:18 PM, Tomas Vondra wrote: > On Sun, Oct 20, 2019 at 03:48:05PM -0400, Andrew Dunstan wrote: >> >> On 10/20/19 1:14 PM, David G. Johnston wrote: >>> On Sun, Oct 20, 2019 at 5:31 AM Andrew Dunstan >>> <andrew.dunstan@2ndquadrant.com >>> <mailto:andrew.dunstan@2ndquadrant.com>> wrote: >>> >>> And yet another is to >>> raise an exception, which is easy to write but really punts the >>> issue >>> back to the application programmer who will have to decide how to >>> ensure >>> they never pass in a NULL parameter. >>> >>> >>> That's kinda the point - if they never pass NULL they won't encounter >>> any problems but as soon as the data and their application don't see >>> eye-to-eye the application developer has to decide what they want to >>> do about it. We are in no position to decide for them and making it >>> obvious they have a decision to make and implement here doesn't seem >>> like a improper position to take. >> >> >> The app dev can avoid this problem today by making sure they don't pass >> a NULL as the value. Or they can use a wrapper function which does that >> for them. So frankly this doesn't seem like much of an advance. And, as >> has been noted, it's not consistent with what either MySQL or MSSQL do. >> In general I'm not that keen on raising an exception for cases like >> this. >> > > I think the general premise of this thread is that the application > developer does not realize that may be necessary, because it's a bit > surprising behavior, particularly when having more experience with other > databases that behave differently. It's also pretty easy to not notice > this issue for a long time, resulting in significant data loss. > > Let's say you're used to the MSSQL or MySQL behavior, you migrate your > application to PostgreSQL or whatever - how do you find out about this > behavior? Users are likely to visit > > https://www.postgresql.org/docs/12/functions-json.html > > but that says nothing about how jsonb_set works with NULL values :-( We should certainly fix that. I accept some responsibility for the omission. > > You're right raising an exception may not be the "right behavior" for > whatever definition of "right". But I kinda agree with David that it's > somewhat reasonable when we don't know what the "universally correct" > thing is (or when there's no such thing). IMHO that's better than > silently discarding some of the data. I'm not arguing against the idea of improving the situation. But I am arguing against a minimal fix that will not provide much of value to a careful app developer. i.e. I want to do more to support app devs. Ideally they would not need to use wrapper functions. There will be plenty of situations where it is mighty inconvenient to catch an exception thrown by jsonb_set(). And catching exceptions can be expensive. You want to avoid that if possible in your performance-critical plpgsql code. > > FWIW I think the JSON/JSONB part of our code base is amazing, and the > fact that various other databases adopted something very similar over > the last couple of years just confirms that. And if this is the only > speck of dust in the API, I think that's pretty amazing. TY. When I first saw the SQL/JSON spec I thought I should send a request to the SQL standards committee for a royalty payment, since it looked so familiar ;-) > > I'm not sure how significant this issue actually is - it's true we got a > couple of complaints over the years (judging by a quick search for > jsonb_set and NULL in the archives), but I'm not sure that's enough to > justify any changes in backbranches. I'd say no, but I have no idea how > many people are affected by this but don't know about it ... > > No, no backpatching. As I said upthread, this isn't a bug, but it is arguably a POLA violation, which is why we should do something for release 13. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Steven Pousty wrote: > I would think though that raising an exception is better than a > default behavior which deletes data. I can't help but feel the need to make the point that the function is not deleting anything. It is just returning null. The deletion of data is being performed by an update statement that uses the function's return value to set a column value. I don't agree that raising an exception in the function is a good idea (perhaps unless it's valid to assume that this function will only ever be used in such a context). Making the column not null (as already suggested) and having the update statement itself raise the exception seems more appropriate if an exception is desirable. But that presumes an accurate understanding of the behaviour of jsonb_set. Really, I think the best fix would be in the documentation so that everyone who finds the function in the documentation understands its behaviour immediately. I didn't even know there was such a thing as a strict function or what it means and the documentation for jsonb_set doesn't mention that it is a strict function and the examples of its use don't demonstrate this behaviour. I'm referring to https://www.postgresql.org/docs/9.5/functions-json.html. All of this contributes to the astonishment encountered here. Least astonishment can probably be achieved with additional documentation but it has to be where the reader is looking when they first encounter the function in the documentation so that their expectations are set correctly and set early. And documentation can be "fixed" sooner than postgresql 13. Perhaps an audit of the documentation for all strict functions would be a good idea to see if they need work. Knowing that a function won't be executed at all and will effectively return null when given a null argument might be important to know for other functions as well. cheers, raf
Steven Pousty wrote: > I would think though that raising an exception is better than a > default behavior which deletes data. I can't help but feel the need to make the point that the function is not deleting anything. It is just returning null. The deletion of data is being performed by an update statement that uses the function's return value to set a column value. I don't agree that raising an exception in the function is a good idea (perhaps unless it's valid to assume that this function will only ever be used in such a context). Making the column not null (as already suggested) and having the update statement itself raise the exception seems more appropriate if an exception is desirable. But that presumes an accurate understanding of the behaviour of jsonb_set. Really, I think the best fix would be in the documentation so that everyone who finds the function in the documentation understands its behaviour immediately. I didn't even know there was such a thing as a strict function or what it means and the documentation for jsonb_set doesn't mention that it is a strict function and the examples of its use don't demonstrate this behaviour. I'm referring to https://www.postgresql.org/docs/9.5/functions-json.html. All of this contributes to the astonishment encountered here. Least astonishment can probably be achieved with additional documentation but it has to be where the reader is looking when they first encounter the function in the documentation so that their expectations are set correctly and set early. And documentation can be "fixed" sooner than postgresql 13. Perhaps an audit of the documentation for all strict functions would be a good idea to see if they need work. Knowing that a function won't be executed at all and will effectively return null when given a null argument might be important to know for other functions as well. cheers, raf
Hello, On Sun, 2019-10-20 at 18:51 -0400, Andrew Dunstan wrote: > On 10/20/19 4:18 PM, Tomas Vondra wrote: > > > > https://www.postgresql.org/docs/12/functions-json.html > > > > but that says nothing about how jsonb_set works with NULL values :- > > ( > > > We should certainly fix that. I accept some responsibility for the > omission. > > > FWIW, if you are able to update the documentation, the current JSON RFC is 8259. https://tools.ietf.org/html/rfc8259 Cheers, Rob
I would argue that only if the target parameter (the actual json value) is NULL should the result be NULL. The function is documented as returning the target, with modifications to a small part of its structure as specified by the other parameters. It is strange for the result to suddenly collapse down to NULL just because another parameter is NULL. Perhaps if the path is NULL, that can mean "don't update". And if create_missing is NULL, that should mean the same as not specifying it. I think. At a minimum, if we don't change it, the documentation needs to get one of those warning boxes alerting people that the functions will destroy their input entirely rather than slightly modifying it if any of the other parameters are NULL.My only doubt about any of this is that by the same argument, functions like replace() should not return NULL if the 2nd or 3rd parameter is NULL. I'm guessing replace() is specified by SQL and also unchanged in many versions so therefore not eligible for re-thinking but it still gives me just a bit of pause.
On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote: > >On 10/20/19 4:18 PM, Tomas Vondra wrote: >> On Sun, Oct 20, 2019 at 03:48:05PM -0400, Andrew Dunstan wrote: >>> >>> On 10/20/19 1:14 PM, David G. Johnston wrote: >>>> On Sun, Oct 20, 2019 at 5:31 AM Andrew Dunstan >>>> <andrew.dunstan@2ndquadrant.com >>>> <mailto:andrew.dunstan@2ndquadrant.com>> wrote: >>>> >>>> And yet another is to >>>> raise an exception, which is easy to write but really punts the >>>> issue >>>> back to the application programmer who will have to decide how to >>>> ensure >>>> they never pass in a NULL parameter. >>>> >>>> >>>> That's kinda the point - if they never pass NULL they won't encounter >>>> any problems but as soon as the data and their application don't see >>>> eye-to-eye the application developer has to decide what they want to >>>> do about it. We are in no position to decide for them and making it >>>> obvious they have a decision to make and implement here doesn't seem >>>> like a improper position to take. >>> >>> >>> The app dev can avoid this problem today by making sure they don't pass >>> a NULL as the value. Or they can use a wrapper function which does that >>> for them. So frankly this doesn't seem like much of an advance. And, as >>> has been noted, it's not consistent with what either MySQL or MSSQL do. >>> In general I'm not that keen on raising an exception for cases like >>> this. >>> >> >> I think the general premise of this thread is that the application >> developer does not realize that may be necessary, because it's a bit >> surprising behavior, particularly when having more experience with other >> databases that behave differently. It's also pretty easy to not notice >> this issue for a long time, resulting in significant data loss. >> >> Let's say you're used to the MSSQL or MySQL behavior, you migrate your >> application to PostgreSQL or whatever - how do you find out about this >> behavior? Users are likely to visit >> >> https://www.postgresql.org/docs/12/functions-json.html >> >> but that says nothing about how jsonb_set works with NULL values :-( > > > >We should certainly fix that. I accept some responsibility for the omission. > +1 > >> >> You're right raising an exception may not be the "right behavior" for >> whatever definition of "right". But I kinda agree with David that it's >> somewhat reasonable when we don't know what the "universally correct" >> thing is (or when there's no such thing). IMHO that's better than >> silently discarding some of the data. > > >I'm not arguing against the idea of improving the situation. But I am >arguing against a minimal fix that will not provide much of value to a >careful app developer. i.e. I want to do more to support app devs. >Ideally they would not need to use wrapper functions. There will be >plenty of situations where it is mighty inconvenient to catch an >exception thrown by jsonb_set(). And catching exceptions can be >expensive. You want to avoid that if possible in your >performance-critical plpgsql code. > True. And AFAIK catching exceptions is not really possible in some code, e.g. in stored procedures (because we can't do subtransactions, so no exception blocks). > >> >> FWIW I think the JSON/JSONB part of our code base is amazing, and the >> fact that various other databases adopted something very similar over >> the last couple of years just confirms that. And if this is the only >> speck of dust in the API, I think that's pretty amazing. > > >TY. When I first saw the SQL/JSON spec I thought I should send a request >to the SQL standards committee for a royalty payment, since it looked so >familiar ;-) > ;-) > >> >> I'm not sure how significant this issue actually is - it's true we got a >> couple of complaints over the years (judging by a quick search for >> jsonb_set and NULL in the archives), but I'm not sure that's enough to >> justify any changes in backbranches. I'd say no, but I have no idea how >> many people are affected by this but don't know about it ... >> >> > >No, no backpatching. As I said upthread, this isn't a bug, but it is >arguably a POLA violation, which is why we should do something for >release 13. > WFM regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote: > >On 10/20/19 4:18 PM, Tomas Vondra wrote: >> On Sun, Oct 20, 2019 at 03:48:05PM -0400, Andrew Dunstan wrote: >>> >>> On 10/20/19 1:14 PM, David G. Johnston wrote: >>>> On Sun, Oct 20, 2019 at 5:31 AM Andrew Dunstan >>>> <andrew.dunstan@2ndquadrant.com >>>> <mailto:andrew.dunstan@2ndquadrant.com>> wrote: >>>> >>>> And yet another is to >>>> raise an exception, which is easy to write but really punts the >>>> issue >>>> back to the application programmer who will have to decide how to >>>> ensure >>>> they never pass in a NULL parameter. >>>> >>>> >>>> That's kinda the point - if they never pass NULL they won't encounter >>>> any problems but as soon as the data and their application don't see >>>> eye-to-eye the application developer has to decide what they want to >>>> do about it. We are in no position to decide for them and making it >>>> obvious they have a decision to make and implement here doesn't seem >>>> like a improper position to take. >>> >>> >>> The app dev can avoid this problem today by making sure they don't pass >>> a NULL as the value. Or they can use a wrapper function which does that >>> for them. So frankly this doesn't seem like much of an advance. And, as >>> has been noted, it's not consistent with what either MySQL or MSSQL do. >>> In general I'm not that keen on raising an exception for cases like >>> this. >>> >> >> I think the general premise of this thread is that the application >> developer does not realize that may be necessary, because it's a bit >> surprising behavior, particularly when having more experience with other >> databases that behave differently. It's also pretty easy to not notice >> this issue for a long time, resulting in significant data loss. >> >> Let's say you're used to the MSSQL or MySQL behavior, you migrate your >> application to PostgreSQL or whatever - how do you find out about this >> behavior? Users are likely to visit >> >> https://www.postgresql.org/docs/12/functions-json.html >> >> but that says nothing about how jsonb_set works with NULL values :-( > > > >We should certainly fix that. I accept some responsibility for the omission. > +1 > >> >> You're right raising an exception may not be the "right behavior" for >> whatever definition of "right". But I kinda agree with David that it's >> somewhat reasonable when we don't know what the "universally correct" >> thing is (or when there's no such thing). IMHO that's better than >> silently discarding some of the data. > > >I'm not arguing against the idea of improving the situation. But I am >arguing against a minimal fix that will not provide much of value to a >careful app developer. i.e. I want to do more to support app devs. >Ideally they would not need to use wrapper functions. There will be >plenty of situations where it is mighty inconvenient to catch an >exception thrown by jsonb_set(). And catching exceptions can be >expensive. You want to avoid that if possible in your >performance-critical plpgsql code. > True. And AFAIK catching exceptions is not really possible in some code, e.g. in stored procedures (because we can't do subtransactions, so no exception blocks). > >> >> FWIW I think the JSON/JSONB part of our code base is amazing, and the >> fact that various other databases adopted something very similar over >> the last couple of years just confirms that. And if this is the only >> speck of dust in the API, I think that's pretty amazing. > > >TY. When I first saw the SQL/JSON spec I thought I should send a request >to the SQL standards committee for a royalty payment, since it looked so >familiar ;-) > ;-) > >> >> I'm not sure how significant this issue actually is - it's true we got a >> couple of complaints over the years (judging by a quick search for >> jsonb_set and NULL in the archives), but I'm not sure that's enough to >> justify any changes in backbranches. I'd say no, but I have no idea how >> many people are affected by this but don't know about it ... >> >> > >No, no backpatching. As I said upthread, this isn't a bug, but it is >arguably a POLA violation, which is why we should do something for >release 13. > WFM regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 10/21/19 2:07 AM, Tomas Vondra wrote: > On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote: >> >>> I think the general premise of this thread is that the application >>> developer does not realize that may be necessary, because it's a bit >>> surprising behavior, particularly when having more experience with >>> other >>> databases that behave differently. It's also pretty easy to not notice >>> this issue for a long time, resulting in significant data loss. >>> >>> Let's say you're used to the MSSQL or MySQL behavior, you migrate your >>> application to PostgreSQL or whatever - how do you find out about this >>> behavior? Users are likely to visit >>> >>> https://www.postgresql.org/docs/12/functions-json.html >>> >>> but that says nothing about how jsonb_set works with NULL values :-( >> >> >> >> We should certainly fix that. I accept some responsibility for the >> omission. >> > > +1 > > So let's add something to the JSON funcs page like this: Note: All the above functions except for json_build_object, json_build_array, json_to_recordset, json_populate_record, and json_populate_recordset and their jsonb equivalents are strict functions. That is, if any argument is NULL the function result will be NULL and the function won't even be called. Particular care should therefore be taken to avoid passing NULL arguments to those functions unless a NULL result is expected. This is particularly true of the jsonb_set and jsonb_insert functions. (We do have a heck of a lot of Note: sections on that page) cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 10/21/19 2:07 AM, Tomas Vondra wrote: > On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote: >> >>> I think the general premise of this thread is that the application >>> developer does not realize that may be necessary, because it's a bit >>> surprising behavior, particularly when having more experience with >>> other >>> databases that behave differently. It's also pretty easy to not notice >>> this issue for a long time, resulting in significant data loss. >>> >>> Let's say you're used to the MSSQL or MySQL behavior, you migrate your >>> application to PostgreSQL or whatever - how do you find out about this >>> behavior? Users are likely to visit >>> >>> https://www.postgresql.org/docs/12/functions-json.html >>> >>> but that says nothing about how jsonb_set works with NULL values :-( >> >> >> >> We should certainly fix that. I accept some responsibility for the >> omission. >> > > +1 > > So let's add something to the JSON funcs page like this: Note: All the above functions except for json_build_object, json_build_array, json_to_recordset, json_populate_record, and json_populate_recordset and their jsonb equivalents are strict functions. That is, if any argument is NULL the function result will be NULL and the function won't even be called. Particular care should therefore be taken to avoid passing NULL arguments to those functions unless a NULL result is expected. This is particularly true of the jsonb_set and jsonb_insert functions. (We do have a heck of a lot of Note: sections on that page) cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 10/20/19 11:07 PM, Tomas Vondra wrote: > On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote: > > True. And AFAIK catching exceptions is not really possible in some code, > e.g. in stored procedures (because we can't do subtransactions, so no > exception blocks). > Can you explain the above to me as I thought there are exception blocks in stored functions and now sub-transactions in stored procedures. -- Adrian Klaver adrian.klaver@aklaver.com
On 10/20/19 11:07 PM, Tomas Vondra wrote: > On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote: > > True. And AFAIK catching exceptions is not really possible in some code, > e.g. in stored procedures (because we can't do subtransactions, so no > exception blocks). > Can you explain the above to me as I thought there are exception blocks in stored functions and now sub-transactions in stored procedures. -- Adrian Klaver adrian.klaver@aklaver.com
I'm not arguing against the idea of improving the situation. But I am
arguing against a minimal fix that will not provide much of value to a
careful app developer. i.e. I want to do more to support app devs.
Ideally they would not need to use wrapper functions. There will be
plenty of situations where it is mighty inconvenient to catch an
exception thrown by jsonb_set(). And catching exceptions can be
expensive. You want to avoid that if possible in your
performance-critical plpgsql code.
I'm not arguing against the idea of improving the situation. But I am
arguing against a minimal fix that will not provide much of value to a
careful app developer. i.e. I want to do more to support app devs.
Ideally they would not need to use wrapper functions. There will be
plenty of situations where it is mighty inconvenient to catch an
exception thrown by jsonb_set(). And catching exceptions can be
expensive. You want to avoid that if possible in your
performance-critical plpgsql code.
Steven Pousty wrote:
> I would think though that raising an exception is better than a
> default behavior which deletes data.
I can't help but feel the need to make the point that
the function is not deleting anything. It is just
returning null. The deletion of data is being performed
by an update statement that uses the function's return
value to set a column value.
I don't agree that raising an exception in the function
is a good idea (perhaps unless it's valid to assume
that this function will only ever be used in such a
context). Making the column not null (as already
suggested) and having the update statement itself raise
the exception seems more appropriate if an exception is
desirable. But that presumes an accurate understanding
of the behaviour of jsonb_set.
Really, I think the best fix would be in the
documentation so that everyone who finds the function
in the documentation understands its behaviour
immediately.
Steven Pousty wrote:
> I would think though that raising an exception is better than a
> default behavior which deletes data.
I can't help but feel the need to make the point that
the function is not deleting anything. It is just
returning null. The deletion of data is being performed
by an update statement that uses the function's return
value to set a column value.
I don't agree that raising an exception in the function
is a good idea (perhaps unless it's valid to assume
that this function will only ever be used in such a
context). Making the column not null (as already
suggested) and having the update statement itself raise
the exception seems more appropriate if an exception is
desirable. But that presumes an accurate understanding
of the behaviour of jsonb_set.
Really, I think the best fix would be in the
documentation so that everyone who finds the function
in the documentation understands its behaviour
immediately.
Turning a JSON null into a SQL null and thereby "deleting" the data is not the path of least surprises.
In what situation does that happen? (If it's already been mentioned I missed it, long thread, sorry).
Cheers,
Steve
On Mon, Oct 21, 2019 at 08:06:46AM -0700, Adrian Klaver wrote: >On 10/20/19 11:07 PM, Tomas Vondra wrote: >>On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote: > >> >>True. And AFAIK catching exceptions is not really possible in some code, >>e.g. in stored procedures (because we can't do subtransactions, so no >>exception blocks). >> > >Can you explain the above to me as I thought there are exception >blocks in stored functions and now sub-transactions in stored >procedures. > Sorry for the confusion - I've not been particularly careful when writing that response. Let me illustrate the issue with this example: CREATE TABLE t (a int); CREATE OR REPLACE PROCEDURE test() LANGUAGE plpgsql AS $$ DECLARE msg TEXT; BEGIN -- SAVEPOINT s1; INSERT INTO t VALUES (1); -- COMMIT; EXCEPTION WHEN others THEN msg := SUBSTR(SQLERRM, 1, 100); RAISE NOTICE 'error: %', msg; END; $$; CALL test(); If you uncomment the SAVEPOINT, you get NOTICE: error: unsupported transaction command in PL/pgSQL because savepoints are not allowed in stored procedures. Fine. If you uncomment the COMMIT, you get NOTICE: error: cannot commit while a subtransaction is active which happens because the EXCEPTION block creates a subtransaction, and we can't commit when it's active. But we can commit outside the exception block: CREATE OR REPLACE PROCEDURE test() LANGUAGE plpgsql AS $$ DECLARE msg TEXT; BEGIN BEGIN INSERT INTO t VALUES (1); EXCEPTION WHEN others THEN msg := SUBSTR(SQLERRM, 1, 100); RAISE NOTICE 'error: %', msg; END; COMMIT; END; $$; regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Oct 21, 2019 at 08:06:46AM -0700, Adrian Klaver wrote: >On 10/20/19 11:07 PM, Tomas Vondra wrote: >>On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote: > >> >>True. And AFAIK catching exceptions is not really possible in some code, >>e.g. in stored procedures (because we can't do subtransactions, so no >>exception blocks). >> > >Can you explain the above to me as I thought there are exception >blocks in stored functions and now sub-transactions in stored >procedures. > Sorry for the confusion - I've not been particularly careful when writing that response. Let me illustrate the issue with this example: CREATE TABLE t (a int); CREATE OR REPLACE PROCEDURE test() LANGUAGE plpgsql AS $$ DECLARE msg TEXT; BEGIN -- SAVEPOINT s1; INSERT INTO t VALUES (1); -- COMMIT; EXCEPTION WHEN others THEN msg := SUBSTR(SQLERRM, 1, 100); RAISE NOTICE 'error: %', msg; END; $$; CALL test(); If you uncomment the SAVEPOINT, you get NOTICE: error: unsupported transaction command in PL/pgSQL because savepoints are not allowed in stored procedures. Fine. If you uncomment the COMMIT, you get NOTICE: error: cannot commit while a subtransaction is active which happens because the EXCEPTION block creates a subtransaction, and we can't commit when it's active. But we can commit outside the exception block: CREATE OR REPLACE PROCEDURE test() LANGUAGE plpgsql AS $$ DECLARE msg TEXT; BEGIN BEGIN INSERT INTO t VALUES (1); EXCEPTION WHEN others THEN msg := SUBSTR(SQLERRM, 1, 100); RAISE NOTICE 'error: %', msg; END; COMMIT; END; $$; regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 10/21/19 12:50 PM, Tomas Vondra wrote: > On Mon, Oct 21, 2019 at 08:06:46AM -0700, Adrian Klaver wrote: >> On 10/20/19 11:07 PM, Tomas Vondra wrote: >>> On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote: >> >>> >>> True. And AFAIK catching exceptions is not really possible in some code, >>> e.g. in stored procedures (because we can't do subtransactions, so no >>> exception blocks). >>> >> >> Can you explain the above to me as I thought there are exception >> blocks in stored functions and now sub-transactions in stored procedures. >> > > Sorry for the confusion - I've not been particularly careful when > writing that response. > > Let me illustrate the issue with this example: > > CREATE TABLE t (a int); > > CREATE OR REPLACE PROCEDURE test() LANGUAGE plpgsql AS $$ > DECLARE > msg TEXT; > BEGIN > -- SAVEPOINT s1; > INSERT INTO t VALUES (1); > -- COMMIT; > EXCEPTION > WHEN others THEN > msg := SUBSTR(SQLERRM, 1, 100); > RAISE NOTICE 'error: %', msg; > END; $$; > > CALL test(); > > If you uncomment the SAVEPOINT, you get > > NOTICE: error: unsupported transaction command in PL/pgSQL > > because savepoints are not allowed in stored procedures. Fine. > > If you uncomment the COMMIT, you get > > NOTICE: error: cannot commit while a subtransaction is active > > which happens because the EXCEPTION block creates a subtransaction, and > we can't commit when it's active. > > But we can commit outside the exception block: > > CREATE OR REPLACE PROCEDURE test() LANGUAGE plpgsql AS $$ > DECLARE > msg TEXT; > BEGIN > BEGIN > INSERT INTO t VALUES (1); > EXCEPTION > WHEN others THEN > msg := SUBSTR(SQLERRM, 1, 100); > RAISE NOTICE 'error: %', msg; > END; > COMMIT; > END; $$; You can do something like the below though: CREATE TABLE t (a int PRIMARY KEY); CREATE OR REPLACE PROCEDURE public.test() LANGUAGE plpgsql AS $procedure$ DECLARE msg TEXT; BEGIN BEGIN INSERT INTO t VALUES (1); EXCEPTION WHEN others THEN msg := SUBSTR(SQLERRM, 1, 100); RAISE NOTICE 'error: %', msg; UPDATE t set a = 2; END; COMMIT; END; $procedure$ test_(postgres)# CALL test(); CALL test_(postgres)# select * from t; a --- 1 (1 row) test_(postgres)# CALL test(); NOTICE: error: duplicate key value violates unique constraint "t_pkey" CALL test_(postgres)# select * from t; a --- 2 (1 row) > > > regards > -- Adrian Klaver adrian.klaver@aklaver.com
On 10/21/19 12:50 PM, Tomas Vondra wrote: > On Mon, Oct 21, 2019 at 08:06:46AM -0700, Adrian Klaver wrote: >> On 10/20/19 11:07 PM, Tomas Vondra wrote: >>> On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote: >> >>> >>> True. And AFAIK catching exceptions is not really possible in some code, >>> e.g. in stored procedures (because we can't do subtransactions, so no >>> exception blocks). >>> >> >> Can you explain the above to me as I thought there are exception >> blocks in stored functions and now sub-transactions in stored procedures. >> > > Sorry for the confusion - I've not been particularly careful when > writing that response. > > Let me illustrate the issue with this example: > > CREATE TABLE t (a int); > > CREATE OR REPLACE PROCEDURE test() LANGUAGE plpgsql AS $$ > DECLARE > msg TEXT; > BEGIN > -- SAVEPOINT s1; > INSERT INTO t VALUES (1); > -- COMMIT; > EXCEPTION > WHEN others THEN > msg := SUBSTR(SQLERRM, 1, 100); > RAISE NOTICE 'error: %', msg; > END; $$; > > CALL test(); > > If you uncomment the SAVEPOINT, you get > > NOTICE: error: unsupported transaction command in PL/pgSQL > > because savepoints are not allowed in stored procedures. Fine. > > If you uncomment the COMMIT, you get > > NOTICE: error: cannot commit while a subtransaction is active > > which happens because the EXCEPTION block creates a subtransaction, and > we can't commit when it's active. > > But we can commit outside the exception block: > > CREATE OR REPLACE PROCEDURE test() LANGUAGE plpgsql AS $$ > DECLARE > msg TEXT; > BEGIN > BEGIN > INSERT INTO t VALUES (1); > EXCEPTION > WHEN others THEN > msg := SUBSTR(SQLERRM, 1, 100); > RAISE NOTICE 'error: %', msg; > END; > COMMIT; > END; $$; You can do something like the below though: CREATE TABLE t (a int PRIMARY KEY); CREATE OR REPLACE PROCEDURE public.test() LANGUAGE plpgsql AS $procedure$ DECLARE msg TEXT; BEGIN BEGIN INSERT INTO t VALUES (1); EXCEPTION WHEN others THEN msg := SUBSTR(SQLERRM, 1, 100); RAISE NOTICE 'error: %', msg; UPDATE t set a = 2; END; COMMIT; END; $procedure$ test_(postgres)# CALL test(); CALL test_(postgres)# select * from t; a --- 1 (1 row) test_(postgres)# CALL test(); NOTICE: error: duplicate key value violates unique constraint "t_pkey" CALL test_(postgres)# select * from t; a --- 2 (1 row) > > > regards > -- Adrian Klaver adrian.klaver@aklaver.com
Steven Pousty wrote: > On Sun, Oct 20, 2019 at 4:31 PM raf <raf@raf.org> wrote: > > > Steven Pousty wrote: > > > > > I would think though that raising an exception is better than a > > > default behavior which deletes data. > > > > I can't help but feel the need to make the point that > > the function is not deleting anything. It is just > > returning null. The deletion of data is being performed > > by an update statement that uses the function's return > > value to set a column value. > > > > I don't agree that raising an exception in the function > > is a good idea (perhaps unless it's valid to assume > > that this function will only ever be used in such a > > context). Making the column not null (as already > > suggested) and having the update statement itself raise > > the exception seems more appropriate if an exception is > > desirable. But that presumes an accurate understanding > > of the behaviour of jsonb_set. > > > > Really, I think the best fix would be in the > > documentation so that everyone who finds the function > > in the documentation understands its behaviour > > immediately. > > > Hey Raf > > In a perfect world I would agree with you. But often users do not read ALL > the documentation before they use the function in their code OR they are > not sure that the condition applies to them (until it does). I'm well aware of that, hence the statement that this information needs to appear at the place in the documentation where the user is first going to encounter the function (i.e. in the table where its examples are). Even putting it in a note box further down the page might not be enough (but hopefully it will be). cheers, raf
Steven Pousty wrote: > On Sun, Oct 20, 2019 at 4:31 PM raf <raf@raf.org> wrote: > > > Steven Pousty wrote: > > > > > I would think though that raising an exception is better than a > > > default behavior which deletes data. > > > > I can't help but feel the need to make the point that > > the function is not deleting anything. It is just > > returning null. The deletion of data is being performed > > by an update statement that uses the function's return > > value to set a column value. > > > > I don't agree that raising an exception in the function > > is a good idea (perhaps unless it's valid to assume > > that this function will only ever be used in such a > > context). Making the column not null (as already > > suggested) and having the update statement itself raise > > the exception seems more appropriate if an exception is > > desirable. But that presumes an accurate understanding > > of the behaviour of jsonb_set. > > > > Really, I think the best fix would be in the > > documentation so that everyone who finds the function > > in the documentation understands its behaviour > > immediately. > > > Hey Raf > > In a perfect world I would agree with you. But often users do not read ALL > the documentation before they use the function in their code OR they are > not sure that the condition applies to them (until it does). I'm well aware of that, hence the statement that this information needs to appear at the place in the documentation where the user is first going to encounter the function (i.e. in the table where its examples are). Even putting it in a note box further down the page might not be enough (but hopefully it will be). cheers, raf
On 2019-10-20 13:20:23 -0700, Steven Pousty wrote: > I would think though that raising an exception is better than a default > behavior which deletes data. > As an app dev I am quite used to all sorts of "APIs" throwing exceptions and > have learned to deal with them. > > This is my way of saying that raising an exception is an improvement over the > current situation. May not be the "best" solution but definitely an > improvement. I somewhat disagree. SQL isn't in general a language which uses exceptions a lot. It does have the value NULL to mean "unknown", and generally unknown combined with something else results in an unknown value again: % psql wds Null display is "(∅)". Line style is unicode. Border style is 2. Unicode border line style is "double". Timing is on. Expanded display is used automatically. psql (11.5 (Ubuntu 11.5-3.pgdg18.04+1)) Type "help" for help. wds=> select 4 + NULL; ╔══════════╗ ║ ?column? ║ ╟──────────╢ ║ (∅) ║ ╚══════════╝ (1 row) Time: 0.924 ms wds=> select replace('steven', 'e', NULL); ╔═════════╗ ║ replace ║ ╟─────────╢ ║ (∅) ║ ╚═════════╝ (1 row) Time: 0.918 ms Throwing an exception for a pure function seems "un-SQLy" to me. In particular, jsonb_set does something similar for json values as replace does for strings, so it should behave similarly. hp -- _ | Peter J. Holzer | we build much bigger, better disasters now |_|_) | | because we have much more sophisticated | | | hjp@hjp.at | management tools. __/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>
Attachment
On 2019-10-21 09:39:13 -0700, Steven Pousty wrote: > Turning a JSON null into a SQL null and thereby "deleting" the data > is not the path of least surprises. But it doesn't do that: A JSON null is perfectly fine: wds=> select jsonb_set('{"a": 1, "b": 2}'::jsonb, '{c}', 'null'::jsonb); ╔═════════════════════════════╗ ║ jsonb_set ║ ╟─────────────────────────────╢ ║ {"a": 1, "b": 2, "c": null} ║ ╚═════════════════════════════╝ (1 row) It is trying to replace a part of the JSON object with an SQL NULL (i.e. unknown) which returns SQL NULL: wds=> select jsonb_set('{"a": 1, "b": 2}'::jsonb, '{c}', NULL); ╔═══════════╗ ║ jsonb_set ║ ╟───────────╢ ║ (∅) ║ ╚═══════════╝ (1 row) hp -- _ | Peter J. Holzer | we build much bigger, better disasters now |_|_) | | because we have much more sophisticated | | | hjp@hjp.at | management tools. __/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>
Attachment
On 2019-10-22 09:16:05 +1100, raf wrote: > Steven Pousty wrote: > > In a perfect world I would agree with you. But often users do not read ALL > > the documentation before they use the function in their code OR they are > > not sure that the condition applies to them (until it does). > > I'm well aware of that, hence the statement that this > information needs to appear at the place in the > documentation where the user is first going to > encounter the function (i.e. in the table where its > examples are). I think this is a real weakness of the tabular format used for documenting functions: While it is quite compact which is nice if you just want to look up a function's name or parameters, it really discourages explanations longer than a single paragraph. Section 9.9 gets around this problem by limiting the in-table description to a few words and "see Section 9.9.x". So you basically have to read the text and not just the table. Maybe that would make sense for the json functions, too? hp -- _ | Peter J. Holzer | we build much bigger, better disasters now |_|_) | | because we have much more sophisticated | | | hjp@hjp.at | management tools. __/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>
Attachment
On 2019-10-20 13:20:23 -0700, Steven Pousty wrote:
> I would think though that raising an exception is better than a default
> behavior which deletes data.
> As an app dev I am quite used to all sorts of "APIs" throwing exceptions and
> have learned to deal with them.
>
> This is my way of saying that raising an exception is an improvement over the
> current situation. May not be the "best" solution but definitely an
> improvement.
I somewhat disagree. SQL isn't in general a language which uses
exceptions a lot. It does have the value NULL to mean "unknown", and
generally unknown combined with something else results in an unknown
value again:
Throwing an exception for a pure function seems "un-SQLy" to me. In
particular, jsonb_set does something similar for json values as replace
does for strings, so it should behave similarly.
David G. Johnston wrote: > Now if only the vast majority of users could have and keep this level of understanding > in mind while writing complex queries so that they remember to always add protections > to compensate for the unique design decision that SQL has taken here... You can only say that if you don't understand NULL (you wouldn't be alone). If I modify a JSON with an unknown value, the result is unknown. This seems very intuitive to me. One could argue that whoever uses SQL should understand SQL. But I believe that it is reasonable to suppose that many people who use JSON in the database are more savvy with JSON than with SQL (they might not have chosen JSON otherwise), so I agree that it makes sense to change this particular behavior. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
David G. Johnston wrote:
> Now if only the vast majority of users could have and keep this level of understanding
> in mind while writing complex queries so that they remember to always add protections
> to compensate for the unique design decision that SQL has taken here...
You can only say that if you don't understand NULL (you wouldn't be alone).
If I modify a JSON with an unknown value, the result is unknown.
This seems very intuitive to me.
One could argue that whoever uses SQL should understand SQL.
But I believe that it is reasonable to suppose that many people who
use JSON in the database are more savvy with JSON than with SQL
(they might not have chosen JSON otherwise), so I agree that it makes
sense to change this particular behavior.
On 2019-10-22 18:06:39 -0700, David G. Johnston wrote: > On Tue, Oct 22, 2019 at 3:55 PM Peter J. Holzer <hjp-pgsql@hjp.at> wrote: > On 2019-10-20 13:20:23 -0700, Steven Pousty wrote: > > I would think though that raising an exception is better than a > > default behavior which deletes data. > > As an app dev I am quite used to all sorts of "APIs" throwing > > exceptions and have learned to deal with them. > > > > This is my way of saying that raising an exception is an > > improvement over the current situation. May not be the "best" > > solution but definitely an improvement. > > I somewhat disagree. SQL isn't in general a language which uses > exceptions a lot. It does have the value NULL to mean "unknown", and > generally unknown combined with something else results in an unknown > value again: > > [...] > > > Throwing an exception for a pure function seems "un-SQLy" to me. In > particular, jsonb_set does something similar for json values as replace > does for strings, so it should behave similarly. > > > Now if only the vast majority of users could have and keep this level of > understanding in mind while writing complex queries so that they remember to > always add protections to compensate for the unique design decision that SQL > has taken here... I grant that SQL NULL takes a bit to get used to. However, it is a core part of the SQL language and everyone who uses SQL must understand it (I don't remember when I first stumbled across "select * from t where c = NULL" returning 0 rows, but it was probably within the first few days of using a database). And personally I find it much easier to deal with concept which are applied consistently across the whole language than those which sometimes apply and sometimes don't seemingly at random, just because a developer thought it would be convenient for the specific use-case they had in mind. hp -- _ | Peter J. Holzer | we build much bigger, better disasters now |_|_) | | because we have much more sophisticated | | | hjp@hjp.at | management tools. __/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>
Attachment
On 10/23/19 5:42 AM, Laurenz Albe wrote: > David G. Johnston wrote: >> Now if only the vast majority of users could have and keep this level of understanding >> in mind while writing complex queries so that they remember to always add protections >> to compensate for the unique design decision that SQL has taken here... > > You can only say that if you don't understand NULL (you wouldn't be alone). > If I modify a JSON with an unknown value, the result is unknown. > This seems very intuitive to me. Would you expect modifying an array value with an unknown would result in the entire array being unknown? > One could argue that whoever uses SQL should understand SQL. > > But I believe that it is reasonable to suppose that many people who > use JSON in the database are more savvy with JSON than with SQL > (they might not have chosen JSON otherwise), so I agree that it makes > sense to change this particular behavior. > > Yours, > Laurenz Albe That (generally) SQL NULL results in NULL for any operation has been brought up multiple times in this thread, including above, as a rationale for the current jsonb behavior. I don't think it is a valid argument. When examples are given, they typically are with scalar values where such behavior makes sense: the resulting scalar value has to be NULL or non-NULL, it can't be both. It is less sensible with compound values where the rule can apply to individual scalar components. And indeed that is what Postgresql does for another compound type: # select array_replace(array[1,2,3],2,NULL); array_replace --------------- {1,NULL,3} The returned value is not NULL. Why the inconsistency between the array type and json type? Are there any cases other than json where the entire compound value is set to NULL as a result of one of its components being NULL?
On Wed, Oct 23, 2019 at 12:01 PM Stuart McGraw <smcg4191@mtneva.com> wrote: > Why the inconsistency between the array > type and json type? Are there any cases other than json where the entire > compound value is set to NULL as a result of one of its components being > NULL? That's a great point. It does look like hstore's delete / minus operator behaves like that, though: =# select 'a=>1,b=>2'::hstore - null; ?column? ---------- (1 row)
Hello, On Wed, 2019-10-23 at 20:33 +0200, Peter J. Holzer wrote: > > I grant that SQL NULL takes a bit to get used to. However, it is a > core > part of the SQL language and everyone who uses SQL must understand it > (I > don't remember when I first stumbled across "select * from t where c > = > NULL" returning 0 rows, but it was probably within the first few days > of > using a database). And personally I find it much easier to deal with > concept which are applied consistently across the whole language than > those which sometimes apply and sometimes don't seemingly at random, > just because a developer thought it would be convenient for the > specific > use-case they had in mind. > > hp > From the JSON spec:- 3. Values A JSON value MUST be an object, array, number, or string, or one of the following three literal names: false null true The literal names MUST be lowercase. No other literal names are allowed. So, you can't set a value associated to a key to SQL NULL. If a key should not have a value then delete that key from the JSON. If you decide your application is going to use one of those three literal names, then you need to code accordingly. My 2 cents.
When examples are given, they typically are with scalar values where
such behavior makes sense: the resulting scalar value has to be NULL
or non-NULL, it can't be both.
It is less sensible with compound values where the rule can apply to
individual scalar components. And indeed that is what Postgresql does
for another compound type:
On Wed, 2019-10-23 at 13:00 -0600, Stuart McGraw wrote: > > You can only say that if you don't understand NULL (you wouldn't be alone). > > If I modify a JSON with an unknown value, the result is unknown. > > This seems very intuitive to me. > > Would you expect modifying an array value with an unknown would result > in the entire array being unknown? Hm, yes, that is less intuitive. I was viewing a JSON as an atomic value above. > > One could argue that whoever uses SQL should understand SQL. > > > > But I believe that it is reasonable to suppose that many people who > > use JSON in the database are more savvy with JSON than with SQL > > (they might not have chosen JSON otherwise), so I agree that it makes > > sense to change this particular behavior. > > That (generally) SQL NULL results in NULL for any operation has been > brought up multiple times in this thread, including above, as a rationale > for the current jsonb behavior. I don't think it is a valid argument. > > When examples are given, they typically are with scalar values where > such behavior makes sense: the resulting scalar value has to be NULL > or non-NULL, it can't be both. > > It is less sensible with compound values where the rule can apply to > individual scalar components. And indeed that is what Postgresql does > for another compound type: > > # select array_replace(array[1,2,3],2,NULL); > array_replace > --------------- > {1,NULL,3} > > The returned value is not NULL. Why the inconsistency between the array > type and json type? Are there any cases other than json where the entire > compound value is set to NULL as a result of one of its components being > NULL? That is a good point. I agree that the behavior should be changed. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Laurenz Albe <laurenz.albe@cybertec.at> writes: > On Wed, 2019-10-23 at 13:00 -0600, Stuart McGraw wrote: >> It is less sensible with compound values where the rule can apply to >> individual scalar components. I agree that JSON can sensibly be viewed as a composite value, but ... >> And indeed that is what Postgresql does >> for another compound type: >> >> # select array_replace(array[1,2,3],2,NULL); >> array_replace >> --------------- >> {1,NULL,3} >> >> The returned value is not NULL. Why the inconsistency between the array >> type and json type? ... the flaw in this argument is that the array element is actually a SQL NULL when we're done. To do something similar in the JSON case, we have to translate SQL NULL to JSON null, and that's cheating to some extent. They're not the same thing (and I'll generally resist proposals to, say, make SELECT 'null'::json IS NULL return true). Maybe it's okay to make this case work like that, but don't be too high and mighty about it being logically clean; it isn't. regards, tom lane
On 10/24/19 2:17 PM, Tom Lane wrote: > Laurenz Albe <laurenz.albe@cybertec.at> writes: >> On Wed, 2019-10-23 at 13:00 -0600, Stuart McGraw wrote: >>> It is less sensible with compound values where the rule can apply to >>> individual scalar components. > > I agree that JSON can sensibly be viewed as a composite value, but ... > >>> And indeed that is what Postgresql does >>> for another compound type: >>> >>> # select array_replace(array[1,2,3],2,NULL); >>> array_replace >>> --------------- >>> {1,NULL,3} >>> >>> The returned value is not NULL. Why the inconsistency between the array >>> type and json type? > > ... the flaw in this argument is that the array element is actually > a SQL NULL when we're done. To do something similar in the JSON case, > we have to translate SQL NULL to JSON null, and that's cheating to > some extent. They're not the same thing (and I'll generally resist > proposals to, say, make SELECT 'null'::json IS NULL return true). > > Maybe it's okay to make this case work like that, but don't be too > high and mighty about it being logically clean; it isn't. > > regards, tom lane Sure, but my point was not that this was a perfect "logically clean" answer, just that the argument, which was made multiple times, that the entire result should be NULL because "that's the way SQL NULLs work" is not really right. It does seem to me that mapping NULL to "null" is likely a workable approach but that's just my uninformed opinion.
On 10/21/19 9:28 AM, Andrew Dunstan wrote: > On 10/21/19 2:07 AM, Tomas Vondra wrote: >> On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote: >>>> I think the general premise of this thread is that the application >>>> developer does not realize that may be necessary, because it's a bit >>>> surprising behavior, particularly when having more experience with >>>> other >>>> databases that behave differently. It's also pretty easy to not notice >>>> this issue for a long time, resulting in significant data loss. >>>> >>>> Let's say you're used to the MSSQL or MySQL behavior, you migrate your >>>> application to PostgreSQL or whatever - how do you find out about this >>>> behavior? Users are likely to visit >>>> >>>> https://www.postgresql.org/docs/12/functions-json.html >>>> >>>> but that says nothing about how jsonb_set works with NULL values :-( >>> >>> >>> We should certainly fix that. I accept some responsibility for the >>> omission. >>> >> +1 >> >> > > So let's add something to the JSON funcs page like this: > > > Note: All the above functions except for json_build_object, > json_build_array, json_to_recordset, json_populate_record, and > json_populate_recordset and their jsonb equivalents are strict > functions. That is, if any argument is NULL the function result will be > NULL and the function won't even be called. Particular care should > therefore be taken to avoid passing NULL arguments to those functions > unless a NULL result is expected. This is particularly true of the > jsonb_set and jsonb_insert functions. > > > > (We do have a heck of a lot of Note: sections on that page) > > For release 13+, I have given some more thought to what should be done. I think the bar for altering the behaviour of a function should be rather higher than we have in the present case, and the longer the function has been sanctioned by time the higher the bar should be. However, I think there is a case to be made for providing a non-strict jsonb_set type function. To advance th4e discussion, attached is a POC patch that does that. This can also be done as an extension, meaning that users of back branches could deploy it immediately. I've tested this against release 12, but I think it could go probably all the way back to 9.5. The new function is named jsonb_ set_lax, but I'm open to bikeshedding. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On 10/21/19 9:28 AM, Andrew Dunstan wrote: > On 10/21/19 2:07 AM, Tomas Vondra wrote: >> On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote: >>>> I think the general premise of this thread is that the application >>>> developer does not realize that may be necessary, because it's a bit >>>> surprising behavior, particularly when having more experience with >>>> other >>>> databases that behave differently. It's also pretty easy to not notice >>>> this issue for a long time, resulting in significant data loss. >>>> >>>> Let's say you're used to the MSSQL or MySQL behavior, you migrate your >>>> application to PostgreSQL or whatever - how do you find out about this >>>> behavior? Users are likely to visit >>>> >>>> https://www.postgresql.org/docs/12/functions-json.html >>>> >>>> but that says nothing about how jsonb_set works with NULL values :-( >>> >>> >>> We should certainly fix that. I accept some responsibility for the >>> omission. >>> >> +1 >> >> > > So let's add something to the JSON funcs page like this: > > > Note: All the above functions except for json_build_object, > json_build_array, json_to_recordset, json_populate_record, and > json_populate_recordset and their jsonb equivalents are strict > functions. That is, if any argument is NULL the function result will be > NULL and the function won't even be called. Particular care should > therefore be taken to avoid passing NULL arguments to those functions > unless a NULL result is expected. This is particularly true of the > jsonb_set and jsonb_insert functions. > > > > (We do have a heck of a lot of Note: sections on that page) > > For release 13+, I have given some more thought to what should be done. I think the bar for altering the behaviour of a function should be rather higher than we have in the present case, and the longer the function has been sanctioned by time the higher the bar should be. However, I think there is a case to be made for providing a non-strict jsonb_set type function. To advance th4e discussion, attached is a POC patch that does that. This can also be done as an extension, meaning that users of back branches could deploy it immediately. I've tested this against release 12, but I think it could go probably all the way back to 9.5. The new function is named jsonb_ set_lax, but I'm open to bikeshedding. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Oct 28, 2019, at 08:52, Andrew Dunstan wrote: > > For release 13+, I have given some more thought to what should be done. > I think the bar for altering the behaviour of a function should be > rather higher than we have in the present case, and the longer the > function has been sanctioned by time the higher the bar should be. > However, I think there is a case to be made for providing a non-strict > jsonb_set type function. To advance th4e discussion, attached is a POC > patch that does that. This can also be done as an extension, meaning > that users of back branches could deploy it immediately. I've tested > this against release 12, but I think it could go probably all the way > back to 9.5. The new function is named jsonb_ set_lax, but I'm open to > bikeshedding. > > Thank you Andrew, and I understand the difficulty in making changes to functions that already exist in production deployments.An additional function like this would be helpful to many. -- Mark Felder ports-secteam & portmgr alumni feld@FreeBSD.org
On Mon, Oct 28, 2019, at 08:52, Andrew Dunstan wrote: > > For release 13+, I have given some more thought to what should be done. > I think the bar for altering the behaviour of a function should be > rather higher than we have in the present case, and the longer the > function has been sanctioned by time the higher the bar should be. > However, I think there is a case to be made for providing a non-strict > jsonb_set type function. To advance th4e discussion, attached is a POC > patch that does that. This can also be done as an extension, meaning > that users of back branches could deploy it immediately. I've tested > this against release 12, but I think it could go probably all the way > back to 9.5. The new function is named jsonb_ set_lax, but I'm open to > bikeshedding. > > Thank you Andrew, and I understand the difficulty in making changes to functions that already exist in production deployments.An additional function like this would be helpful to many. -- Mark Felder ports-secteam & portmgr alumni feld@FreeBSD.org
For release 13+, I have given some more thought to what should be done.
I think the bar for altering the behaviour of a function should be
rather higher than we have in the present case, and the longer the
function has been sanctioned by time the higher the bar should be.
However, I think there is a case to be made for providing a non-strict
jsonb_set type function. To advance th4e discussion, attached is a POC
patch that does that. This can also be done as an extension, meaning
that users of back branches could deploy it immediately. I've tested
this against release 12, but I think it could go probably all the way
back to 9.5. The new function is named jsonb_ set_lax, but I'm open to
bikeshedding.
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("null jsonb value")));
For release 13+, I have given some more thought to what should be done.
I think the bar for altering the behaviour of a function should be
rather higher than we have in the present case, and the longer the
function has been sanctioned by time the higher the bar should be.
However, I think there is a case to be made for providing a non-strict
jsonb_set type function. To advance th4e discussion, attached is a POC
patch that does that. This can also be done as an extension, meaning
that users of back branches could deploy it immediately. I've tested
this against release 12, but I think it could go probably all the way
back to 9.5. The new function is named jsonb_ set_lax, but I'm open to
bikeshedding.
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("null jsonb value")));
On 11/15/19 2:14 PM, Pavel Stehule wrote: > Hi > > > > For release 13+, I have given some more thought to what should be > done. > I think the bar for altering the behaviour of a function should be > rather higher than we have in the present case, and the longer the > function has been sanctioned by time the higher the bar should be. > However, I think there is a case to be made for providing a non-strict > jsonb_set type function. To advance th4e discussion, attached is a POC > patch that does that. This can also be done as an extension, meaning > that users of back branches could deploy it immediately. I've tested > this against release 12, but I think it could go probably all the way > back to 9.5. The new function is named jsonb_ set_lax, but I'm open to > bikeshedding. > > > I am sending a review of this patch > > 1. this patch does what was proposed and it is based on discussion. > > 2. there are not any problem with patching or compilation, all regress > tests passed. > > 4. code looks well and it is well commented. > > 5. the patch has enough regress tests > > My notes: > > a) missing documentation > > b) error message is not finalized > > + ereport(ERROR, > + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("null jsonb value"))); > > Any other looks well, and this function can be very handy. > > Thanks for the review. I will add some docco. What would be a better error message? "null jsonb replacement not permitted"? cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 11/15/19 2:14 PM, Pavel Stehule wrote: > Hi > > > > For release 13+, I have given some more thought to what should be > done. > I think the bar for altering the behaviour of a function should be > rather higher than we have in the present case, and the longer the > function has been sanctioned by time the higher the bar should be. > However, I think there is a case to be made for providing a non-strict > jsonb_set type function. To advance th4e discussion, attached is a POC > patch that does that. This can also be done as an extension, meaning > that users of back branches could deploy it immediately. I've tested > this against release 12, but I think it could go probably all the way > back to 9.5. The new function is named jsonb_ set_lax, but I'm open to > bikeshedding. > > > I am sending a review of this patch > > 1. this patch does what was proposed and it is based on discussion. > > 2. there are not any problem with patching or compilation, all regress > tests passed. > > 4. code looks well and it is well commented. > > 5. the patch has enough regress tests > > My notes: > > a) missing documentation > > b) error message is not finalized > > + ereport(ERROR, > + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("null jsonb value"))); > > Any other looks well, and this function can be very handy. > > Thanks for the review. I will add some docco. What would be a better error message? "null jsonb replacement not permitted"? cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 11/15/19 2:14 PM, Pavel Stehule wrote:
> Hi
>
>
>
> For release 13+, I have given some more thought to what should be
> done.
> I think the bar for altering the behaviour of a function should be
> rather higher than we have in the present case, and the longer the
> function has been sanctioned by time the higher the bar should be.
> However, I think there is a case to be made for providing a non-strict
> jsonb_set type function. To advance th4e discussion, attached is a POC
> patch that does that. This can also be done as an extension, meaning
> that users of back branches could deploy it immediately. I've tested
> this against release 12, but I think it could go probably all the way
> back to 9.5. The new function is named jsonb_ set_lax, but I'm open to
> bikeshedding.
>
>
> I am sending a review of this patch
>
> 1. this patch does what was proposed and it is based on discussion.
>
> 2. there are not any problem with patching or compilation, all regress
> tests passed.
>
> 4. code looks well and it is well commented.
>
> 5. the patch has enough regress tests
>
> My notes:
>
> a) missing documentation
>
> b) error message is not finalized
>
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("null jsonb value")));
>
> Any other looks well, and this function can be very handy.
>
>
Thanks for the review. I will add some docco.
What would be a better error message? "null jsonb replacement not
permitted"?
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 11/15/19 2:14 PM, Pavel Stehule wrote:
> Hi
>
>
>
> For release 13+, I have given some more thought to what should be
> done.
> I think the bar for altering the behaviour of a function should be
> rather higher than we have in the present case, and the longer the
> function has been sanctioned by time the higher the bar should be.
> However, I think there is a case to be made for providing a non-strict
> jsonb_set type function. To advance th4e discussion, attached is a POC
> patch that does that. This can also be done as an extension, meaning
> that users of back branches could deploy it immediately. I've tested
> this against release 12, but I think it could go probably all the way
> back to 9.5. The new function is named jsonb_ set_lax, but I'm open to
> bikeshedding.
>
>
> I am sending a review of this patch
>
> 1. this patch does what was proposed and it is based on discussion.
>
> 2. there are not any problem with patching or compilation, all regress
> tests passed.
>
> 4. code looks well and it is well commented.
>
> 5. the patch has enough regress tests
>
> My notes:
>
> a) missing documentation
>
> b) error message is not finalized
>
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("null jsonb value")));
>
> Any other looks well, and this function can be very handy.
>
>
Thanks for the review. I will add some docco.
What would be a better error message? "null jsonb replacement not
permitted"?
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Nov 15, 2019 at 09:45:59PM +0100, Pavel Stehule wrote: > Maybe ERRCODE_NULL_VALUE_NOT_ALLOWED, and "NULL is not allowed", > errdetail - a exception due setting "null_value_treatment" => > raise_exception > and maybe some errhint - "Maybe you would to use Jsonb NULL - "null"::jsonb" > > I don't know, but in this case, the exception should be verbose. This is > "rich" function with lot of functionality @Andrew: This patch is waiting on input from you for a couple of days now. -- Michael
Attachment
On Fri, Nov 15, 2019 at 09:45:59PM +0100, Pavel Stehule wrote: > Maybe ERRCODE_NULL_VALUE_NOT_ALLOWED, and "NULL is not allowed", > errdetail - a exception due setting "null_value_treatment" => > raise_exception > and maybe some errhint - "Maybe you would to use Jsonb NULL - "null"::jsonb" > > I don't know, but in this case, the exception should be verbose. This is > "rich" function with lot of functionality @Andrew: This patch is waiting on input from you for a couple of days now. -- Michael
On 11/27/19 9:35 PM, Michael Paquier wrote: > On Fri, Nov 15, 2019 at 09:45:59PM +0100, Pavel Stehule wrote: >> Maybe ERRCODE_NULL_VALUE_NOT_ALLOWED, and "NULL is not allowed", >> errdetail - a exception due setting "null_value_treatment" => >> raise_exception >> and maybe some errhint - "Maybe you would to use Jsonb NULL - "null"::jsonb" >> >> I don't know, but in this case, the exception should be verbose. This is >> "rich" function with lot of functionality > @Andrew: This patch is waiting on input from you for a couple of days > now. > Will get to this on Friday - tomorrow is Thanksgiving so I'm unlikely to get to it then. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 11/27/19 9:35 PM, Michael Paquier wrote: > On Fri, Nov 15, 2019 at 09:45:59PM +0100, Pavel Stehule wrote: >> Maybe ERRCODE_NULL_VALUE_NOT_ALLOWED, and "NULL is not allowed", >> errdetail - a exception due setting "null_value_treatment" => >> raise_exception >> and maybe some errhint - "Maybe you would to use Jsonb NULL - "null"::jsonb" >> >> I don't know, but in this case, the exception should be verbose. This is >> "rich" function with lot of functionality > @Andrew: This patch is waiting on input from you for a couple of days > now. > Will get to this on Friday - tomorrow is Thanksgiving so I'm unlikely to get to it then. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Nov 28, 2019 at 2:15 PM Andrew Dunstan <andrew.dunstan@2ndquadrant.com> wrote: > > > On 11/27/19 9:35 PM, Michael Paquier wrote: > > On Fri, Nov 15, 2019 at 09:45:59PM +0100, Pavel Stehule wrote: > >> Maybe ERRCODE_NULL_VALUE_NOT_ALLOWED, and "NULL is not allowed", > >> errdetail - a exception due setting "null_value_treatment" => > >> raise_exception > >> and maybe some errhint - "Maybe you would to use Jsonb NULL - "null"::jsonb" > >> > >> I don't know, but in this case, the exception should be verbose. This is > >> "rich" function with lot of functionality > > @Andrew: This patch is waiting on input from you for a couple of days > > now. > > > > Updated version including docco and better error message. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Thu, Nov 28, 2019 at 2:15 PM Andrew Dunstan <andrew.dunstan@2ndquadrant.com> wrote: > > > On 11/27/19 9:35 PM, Michael Paquier wrote: > > On Fri, Nov 15, 2019 at 09:45:59PM +0100, Pavel Stehule wrote: > >> Maybe ERRCODE_NULL_VALUE_NOT_ALLOWED, and "NULL is not allowed", > >> errdetail - a exception due setting "null_value_treatment" => > >> raise_exception > >> and maybe some errhint - "Maybe you would to use Jsonb NULL - "null"::jsonb" > >> > >> I don't know, but in this case, the exception should be verbose. This is > >> "rich" function with lot of functionality > > @Andrew: This patch is waiting on input from you for a couple of days > > now. > > > > Updated version including docco and better error message. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Nov 28, 2019 at 2:15 PM Andrew Dunstan
<andrew.dunstan@2ndquadrant.com> wrote:
>
>
> On 11/27/19 9:35 PM, Michael Paquier wrote:
> > On Fri, Nov 15, 2019 at 09:45:59PM +0100, Pavel Stehule wrote:
> >> Maybe ERRCODE_NULL_VALUE_NOT_ALLOWED, and "NULL is not allowed",
> >> errdetail - a exception due setting "null_value_treatment" =>
> >> raise_exception
> >> and maybe some errhint - "Maybe you would to use Jsonb NULL - "null"::jsonb"
> >>
> >> I don't know, but in this case, the exception should be verbose. This is
> >> "rich" function with lot of functionality
> > @Andrew: This patch is waiting on input from you for a couple of days
> > now.
> >
>
>
Updated version including docco and better error message.
cheers
andrew
+ errhint("to avoid, either change the null_value_treatment argument or ensure that an SQL NULL is not used")));
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Thu, Nov 28, 2019 at 2:15 PM Andrew Dunstan
<andrew.dunstan@2ndquadrant.com> wrote:
>
>
> On 11/27/19 9:35 PM, Michael Paquier wrote:
> > On Fri, Nov 15, 2019 at 09:45:59PM +0100, Pavel Stehule wrote:
> >> Maybe ERRCODE_NULL_VALUE_NOT_ALLOWED, and "NULL is not allowed",
> >> errdetail - a exception due setting "null_value_treatment" =>
> >> raise_exception
> >> and maybe some errhint - "Maybe you would to use Jsonb NULL - "null"::jsonb"
> >>
> >> I don't know, but in this case, the exception should be verbose. This is
> >> "rich" function with lot of functionality
> > @Andrew: This patch is waiting on input from you for a couple of days
> > now.
> >
>
>
Updated version including docco and better error message.
cheers
andrew
+ errhint("to avoid, either change the null_value_treatment argument or ensure that an SQL NULL is not used")));
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Jan 8, 2020 at 7:08 AM Pavel Stehule <pavel.stehule@gmail.com> wrote: > > Hi > > po 6. 1. 2020 v 22:34 odesílatel Andrew Dunstan <andrew.dunstan@2ndquadrant.com> napsal: >> >> >> Updated version including docco and better error message. >> >> cheers >> >> andrew > > > I think so my objections are solved. I have small objection > > + errdetail("exception raised due to \"null_value_treatment := 'raise_exception'\""), > + errhint("to avoid, either change the null_value_treatment argument or ensure that an SQL NULL is not used"))); > > "null_value_treatment := 'raise_exception'\"" > > it use proprietary PostgreSQL syntax for named parameters. Better to use ANSI/SQL syntax > > "null_value_treatment => 'raise_exception'\"" > > It is fixed in attached patch > > source compilation without warnings, > compilation docs without warnings > check-world passed without any problems > > I'll mark this patch as ready for commiter > > Thank you for your work > Thanks for the review. I propose to commit this shortly. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Jan 8, 2020 at 7:08 AM Pavel Stehule <pavel.stehule@gmail.com> wrote: > > Hi > > po 6. 1. 2020 v 22:34 odesílatel Andrew Dunstan <andrew.dunstan@2ndquadrant.com> napsal: >> >> >> Updated version including docco and better error message. >> >> cheers >> >> andrew > > > I think so my objections are solved. I have small objection > > + errdetail("exception raised due to \"null_value_treatment := 'raise_exception'\""), > + errhint("to avoid, either change the null_value_treatment argument or ensure that an SQL NULL is not used"))); > > "null_value_treatment := 'raise_exception'\"" > > it use proprietary PostgreSQL syntax for named parameters. Better to use ANSI/SQL syntax > > "null_value_treatment => 'raise_exception'\"" > > It is fixed in attached patch > > source compilation without warnings, > compilation docs without warnings > check-world passed without any problems > > I'll mark this patch as ready for commiter > > Thank you for your work > Thanks for the review. I propose to commit this shortly. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Jan 08, 2020 at 05:24:05PM +1030, Andrew Dunstan wrote: >On Wed, Jan 8, 2020 at 7:08 AM Pavel Stehule <pavel.stehule@gmail.com> wrote: >> >> Hi >> >> po 6. 1. 2020 v 22:34 odesílatel Andrew Dunstan <andrew.dunstan@2ndquadrant.com> napsal: >>> >>> >>> Updated version including docco and better error message. >>> >>> cheers >>> >>> andrew >> >> >> I think so my objections are solved. I have small objection >> >> + errdetail("exception raised due to \"null_value_treatment := 'raise_exception'\""), >> + errhint("to avoid, either change the null_value_treatment argument or ensure that an SQL NULL is not used"))); >> >> "null_value_treatment := 'raise_exception'\"" >> >> it use proprietary PostgreSQL syntax for named parameters. Better to use ANSI/SQL syntax >> >> "null_value_treatment => 'raise_exception'\"" >> >> It is fixed in attached patch >> >> source compilation without warnings, >> compilation docs without warnings >> check-world passed without any problems >> >> I'll mark this patch as ready for commiter >> >> Thank you for your work >> > > >Thanks for the review. I propose to commit this shortly. > Now that this was committed, I've updated the patch status accordingly. Thanks! -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Jan 08, 2020 at 05:24:05PM +1030, Andrew Dunstan wrote: >On Wed, Jan 8, 2020 at 7:08 AM Pavel Stehule <pavel.stehule@gmail.com> wrote: >> >> Hi >> >> po 6. 1. 2020 v 22:34 odesílatel Andrew Dunstan <andrew.dunstan@2ndquadrant.com> napsal: >>> >>> >>> Updated version including docco and better error message. >>> >>> cheers >>> >>> andrew >> >> >> I think so my objections are solved. I have small objection >> >> + errdetail("exception raised due to \"null_value_treatment := 'raise_exception'\""), >> + errhint("to avoid, either change the null_value_treatment argument or ensure that an SQL NULL is not used"))); >> >> "null_value_treatment := 'raise_exception'\"" >> >> it use proprietary PostgreSQL syntax for named parameters. Better to use ANSI/SQL syntax >> >> "null_value_treatment => 'raise_exception'\"" >> >> It is fixed in attached patch >> >> source compilation without warnings, >> compilation docs without warnings >> check-world passed without any problems >> >> I'll mark this patch as ready for commiter >> >> Thank you for your work >> > > >Thanks for the review. I propose to commit this shortly. > Now that this was committed, I've updated the patch status accordingly. Thanks! -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello, January 17, 2020 5:21 PM, "Tomas Vondra" <tomas.vondra@2ndquadrant.com> wrote: > On Wed, Jan 08, 2020 at 05:24:05PM +1030, Andrew Dunstan wrote: > >> On Wed, Jan 8, 2020 at 7:08 AM Pavel Stehule <pavel.stehule@gmail.com> wrote: >>> Hi >>> >>> po 6. 1. 2020 v 22:34 odesílatel Andrew Dunstan <andrew.dunstan@2ndquadrant.com> napsal: >> >> Updated version including docco and better error message. >> >> cheers >> >> andrew >>> I think so my objections are solved. I have small objection >>> >>> + errdetail("exception raised due to \"null_value_treatment := 'raise_exception'\""), >>> + errhint("to avoid, either change the null_value_treatment argument or ensure that an SQL NULL is >>> not used"))); >>> >>> "null_value_treatment := 'raise_exception'\"" >>> >>> it use proprietary PostgreSQL syntax for named parameters. Better to use ANSI/SQL syntax >>> >>> "null_value_treatment => 'raise_exception'\"" >>> >>> It is fixed in attached patch >>> >>> source compilation without warnings, >>> compilation docs without warnings >>> check-world passed without any problems >>> >>> I'll mark this patch as ready for commiter >>> >>> Thank you for your work >> >> Thanks for the review. I propose to commit this shortly. > > Now that this was committed, I've updated the patch status accordingly. Thank you very much for coming together and finding a solution to this bug! Ariadne
Hello, January 17, 2020 5:21 PM, "Tomas Vondra" <tomas.vondra@2ndquadrant.com> wrote: > On Wed, Jan 08, 2020 at 05:24:05PM +1030, Andrew Dunstan wrote: > >> On Wed, Jan 8, 2020 at 7:08 AM Pavel Stehule <pavel.stehule@gmail.com> wrote: >>> Hi >>> >>> po 6. 1. 2020 v 22:34 odesílatel Andrew Dunstan <andrew.dunstan@2ndquadrant.com> napsal: >> >> Updated version including docco and better error message. >> >> cheers >> >> andrew >>> I think so my objections are solved. I have small objection >>> >>> + errdetail("exception raised due to \"null_value_treatment := 'raise_exception'\""), >>> + errhint("to avoid, either change the null_value_treatment argument or ensure that an SQL NULL is >>> not used"))); >>> >>> "null_value_treatment := 'raise_exception'\"" >>> >>> it use proprietary PostgreSQL syntax for named parameters. Better to use ANSI/SQL syntax >>> >>> "null_value_treatment => 'raise_exception'\"" >>> >>> It is fixed in attached patch >>> >>> source compilation without warnings, >>> compilation docs without warnings >>> check-world passed without any problems >>> >>> I'll mark this patch as ready for commiter >>> >>> Thank you for your work >> >> Thanks for the review. I propose to commit this shortly. > > Now that this was committed, I've updated the patch status accordingly. Thank you very much for coming together and finding a solution to this bug! Ariadne
On Jan 17, 2020, at 4:28 PM, Ariadne Conill <ariadne@dereferenced.org> wrote:Hello,
January 17, 2020 5:21 PM, "Tomas Vondra" <tomas.vondra@2ndquadrant.com> wrote:
Thank you very much for coming together and finding a solution to this bug!
Ariadne
On Jan 17, 2020, at 4:28 PM, Ariadne Conill <ariadne@dereferenced.org> wrote:Hello,
January 17, 2020 5:21 PM, "Tomas Vondra" <tomas.vondra@2ndquadrant.com> wrote:
Thank you very much for coming together and finding a solution to this bug!
Ariadne