Re: [HACKERS] [PATCH] Generic type subscripting - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: [HACKERS] [PATCH] Generic type subscripting
Date
Msg-id CAFj8pRCGanQPvHZY5QXDtd6oQ8k3kQXcEMt53T3k=JY=+T-UyQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] [PATCH] Generic type subscripting  ("Dian M Fay" <dian.m.fay@gmail.com>)
Responses Re: [HACKERS] [PATCH] Generic type subscripting  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-hackers
Hi


I'm thinking of the update path as a kind of implicit schema. JSON is
intentionally not bound to any schema on creation, so I don't see a
failure to enforce another schema at runtime (and outside the WHERE
clause, at that) as an error exactly.

This concept is not consistent with other implemented behaviour.

1. The schema is dynamically enhanced - so although the path doesn't exists, it is created and data are changed

postgres=# create table foo(a jsonb);
CREATE TABLE
postgres=# insert into foo values('{}');
INSERT 0 1
postgres=# update foo set a['a']['a'][10] = '0';
UPDATE 1
postgres=# select * from foo;
┌───────────────────────────────────────────────────────────────────────────────┐
│                                       a                                       │
╞═══════════════════════════════════════════════════════════════════════════════╡
│ {"a": {"a": [null, null, null, null, null, null, null, null, null, null, 0]}} │
└───────────────────────────────────────────────────────────────────────────────┘
(1 row)

So although the path [a,a,10] was not exists, it was created.

2. this update fails (and it is correct)

postgres=# update foo set a['a']['a']['b'] = '0';
ERROR:  path element at position 3 is not an integer: "b"

although the path [a,a,b] doesn't exists, and it is not ignored.

This implementation doesn't do only UPDATE (and then analogy with WHERE clause isn't fully adequate). It does MERGE. This is necessary, because without it, the behaviour will be pretty unfriendly - because there is not any external schema. I think so this is important - and it can be little bit messy. I am not sure if I use correct technical terms - we try to use LAX update in first step, and if it is not successful, then we try to do LAX insert. This is maybe correct from JSON semantic - but for developer it is unfriendly, because he hasn't possibility to detect if insert was or was not successful. In special JSON functions I can control behave and can specify LAX or STRICT how it is necessity. But in this interface (subscripting) this possibility is missing.

I think so there should be final check (semantically) if value was updated, and if the value was changed. If not, then error should be raised. It should be very similar like RLS update. I know and I understand so there should be more than one possible implementations, but safe is only one - after successful update I would to see new value inside, and when it is not possible, then I expect exception. I think so it is more practical too. I can control filtering with WHERE clause. But I cannot to control MERGE process. Manual recheck after every update can be terrible slow.

Regards

Pavel



But I looked into the bulk case a little further, and "outside the
WHERE clause" cuts both ways. The server reports an update whether or
not the JSON could have been modified, which suggests triggers will
fire for no-op updates. That's more clearly a problem.

insert into j (val) values
 ('{"a": 100}'),
 ('{"a": "200"}'),
 ('{"b": "300"}'),
 ('{"c": {"d": 400}}'),
 ('{"a": {"z": 500}}');

INSERT 0 5
update j set val['a']['z'] = '600' returning *;
                val                 
────────────────────────────────────
 {"a": 100}
 {"a": "200"}
 {"a": {"z": 600}, "b": "300"}
 {"a": {"z": 600}, "c": {"d": 400}}
 {"a": {"z": 600}}
(5 rows)

*UPDATE 5*

pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: Added schema level support for publication.
Next
From: Thomas Munro
Date:
Subject: Re: new heapcheck contrib module