Re: ON ERROR in json_query and the like - Mailing list pgsql-hackers
From | Pavel Stehule |
---|---|
Subject | Re: ON ERROR in json_query and the like |
Date | |
Msg-id | CAFj8pRCx3S4L4Vze-qMpDX1Gtv26wkAQRKBf2VMpF11DgyOXwg@mail.gmail.com Whole thread Raw |
In response to | Re: ON ERROR in json_query and the like (Amit Langote <amitlangote09@gmail.com>) |
List | pgsql-hackers |
pá 21. 6. 2024 v 2:22 odesílatel Amit Langote <amitlangote09@gmail.com> napsal:
On Fri, Jun 21, 2024 at 1:11 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
> On Thu, Jun 20, 2024 at 2:19 AM Amit Langote <amitlangote09@gmail.com> wrote:
>> On Mon, Jun 17, 2024 at 10:07 PM Markus Winand <markus.winand@winand.at> wrote:
>> > > On 17.06.2024, at 08:20, Amit Langote <amitlangote09@gmail.com> wrote:
>> > > Agree that the documentation needs to be clear about this. I'll update
>> > > my patch at [1] to add a note next to table 9.16.3. SQL/JSON Query
>> > > Functions.
>> >
>> > Considering another branch of this thread [1] I think the
>> > "Supported Features” appendix of the docs should mention that as well.
>> >
>> > The way I see it is that the standards defines two overloaded
>> > JSON_QUERY functions, of which PostgreSQL will support only one.
>> > In case of valid JSON, the implied CAST makes it look as though
>> > the second variant of these function was supported as well but that
>> > illusion totally falls apart once the JSON is not valid anymore.
>> >
>> > I think it affects the following feature IDs:
>> >
>> > - T821, Basic SQL/JSON query operators
>> > For JSON_VALUE, JSON_TABLE and JSON_EXISTS
>> > - T828, JSON_QUERY
>> >
>> > Also, how hard would it be to add the functions that accept
>> > character strings? Is there, besides the effort, any thing else
>> > against it? I’m asking because I believe once released it might
>> > never be changed — for backward compatibility.
>>
>> Hmm, I'm starting to think that adding the implied cast to json wasn't
>> such a great idea after all, because it might mislead the users to
>> think that JSON parsing is transparent (respects ON ERROR), which is
>> what you are saying, IIUC.
>>
>
> Actually, the implied cast is exactly the correct thing to do here - the issue is that we aren't using the newly added soft errors infrastructure [1] to catch the result of that cast and instead produce whatever output on error tells us to produce. This seems to be in the realm of doability so we should try in the interest of being standard conforming.
Soft error handling *was* used for catching cast errors in the very
early versions of this patch (long before I got involved and the
infrastructure you mention got added). It was taken out after Pavel
said [1] that he didn't like producing NULL instead of throwing an
error. Not sure if Pavel's around but it would be good to know why he
didn't like it at the time.
I can look into making that work again, but that is not going to make beta2.
> I'd even argue to make this an open item unless and until the attempt is agreed upon to have failed (or it succeeds of course).
OK, adding an open item.
At this time, when I wrote this mail, I didn't exactly notice the standard, so broken format should be handled there too. In this time, there was no support for soft errors ever in Postgres, so handling broken formats was inconsistent.
Standard describes format errors, but exactly doesn't describe if this is error like missing key or broken json format. Maybe wrongly, but intuitively for me, these errors are of different kinds and broken input data is a different case than missing key (but fully valid json). I didn't find the exact sentence in standard when I searched it (but it was four years ago).
My position in this case is not extra strong. The original patch was written and tested to be compatible with Oracle (what is a strong argument and feature). On second hand, some things required subtransactioning what was wrong (soft errors were introduced later). The compatibility with Oracle is a strong argument, but Oracle by itself is not fully compatible with standard, and some cases are special (in Oracle) because empty string in Oracle is NULL, and then it is handled differently. In this time I had motivation to reduce the patch to "safe" minimum to be possible to accept it by committers. The patch was written in 2017 (I think). Handling broken format (input format) was one issue that I thought could be solved later.
The main reason for my mail is fact, so Postgres and Oracle have DIFFERENT correct format of JSON!
'{a:10}' is correct on Oracle, but not correct on Postgres. And with default ON ERROR NULL (what is default), then the Oracle returns 10, and Postgres NULL. I thought this can be very messy and better to just raise an exception.
Regards
Pavel
--
Thanks, Amit Langote
[1] https://www.postgresql.org/message-id/CAFj8pRCnzO2cnHi5ebXciV%3DtuGVvAQOW9uPU%2BDQV1GkL31R%3D-g%40mail.gmail.com
pgsql-hackers by date: