Thread: re-novice coming back to pgsql: porting an SQLite update statement to postgres

The last time I used PostgreSQL for anything was about 15 years ago and
I have only limited SQL background, so please consider this a novice
question.

I have an embedded SQLite database that I would like to port to
PostgreSQL. I have done the majority of this porting, but am stuck on a
final compound statement. This is for a hobby project.

For background I'll give the SQLite schema and the statement in
question, and also the PostgreSQL schema. The SQLite code here is also
available to view in context at the links at the bottom of the post, in
case this is helpful.

SQLite schema:

create table if not exists buckets (
    rowid    INTEGER PRIMARY KEY AUTOINCREMENT,
    id       TEXT UNIQUE NOT NULL,
    name     TEXT,
    type     TEXT NOT NULL,
    client   TEXT NOT NULL,
    hostname TEXT NOT NULL,
    created  TEXT NOT NULL, -- unix micro
    datastr  TEXT NOT NULL  -- JSON text
);
create table if not exists events (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    bucketrow INTEGER NOT NULL,
    starttime INTEGER NOT NULL, -- unix micro
    endtime   INTEGER NOT NULL, -- unix micro
    datastr   TEXT NOT NULL,    -- JSON text
    FOREIGN KEY (bucketrow) REFERENCES buckets(rowid)
);
create index if not exists event_index_id ON events(id);
create index if not exists event_index_starttime ON events(bucketrow, starttime);
create index if not exists event_index_endtime ON events(bucketrow, endtime);


PostgreSQL schema:

create table if not exists buckets (
    rowid    SERIAL PRIMARY KEY,
    id       TEXT UNIQUE NOT NULL,
    name     TEXT,
    type     TEXT NOT NULL,
    client   TEXT NOT NULL,
    hostname TEXT NOT NULL,
    created  TIMESTAMP WITH TIME ZONE NOT NULL,
    datastr  JSONB NOT NULL
);
create table if not exists events (
    id        SERIAL PRIMARY KEY,
    bucketrow INTEGER NOT NULL,
    starttime TIMESTAMP WITH TIME ZONE NOT NULL,
    endtime   TIMESTAMP WITH TIME ZONE NOT NULL,
    datastr   JSONB NOT NULL,
    FOREIGN KEY (bucketrow) REFERENCES buckets(rowid)
);
create index if not exists event_index_id ON events(id);
create index if not exists event_index_starttime ON events(bucketrow, starttime);
create index if not exists event_index_endtime ON events(bucketrow, endtime);


The part that I am having difficulty knowing where to start on deals
with the JSON data in the events table. This contains largely
unstructured data, but can have an array of amendment notes added using
the SQL statements below. These notes are essentially append-only
alterations to the unstructured data, each spanning a period of the
unstructured data within the time bounds of the event table row.
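For concreteness, after an amendment the datastr value looks roughly
like this (the surrounding unstructured fields vary; "app" here is just
a stand-in, the amend/time/msg/replace/start/end/data keys come from
the statements below):

    {
        "app": "example",
        "amend": [
            {
                "time": "2023-06-13T05:24:50+09:30",
                "msg": "testing",
                "replace": [
                    {"start": "2023-06-12T19:54:39Z", "end": "2023-06-12T19:54:51Z", "data": {}}
                ]
            }
        ]
    }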

My question is: where would be the best place for me to look to learn
about how to implement a port of this SQLite, and what would broadly be
the most sensible approach to take (to narrow down what I need to read
through while learning)?

Apologies for the long post, and thank you for any help.
Dan


SQLite statement:

begin transaction;
    -- ensure we have an amend array.
    update events set datastr = json_insert(datastr, '$.amend', json('[]'))
    where
        datetime(starttime, 'subsec') < datetime(?5, 'subsec') and
        datetime(endtime, 'subsec') > datetime(?4, 'subsec') and
        bucketrow = (
            select rowid from buckets where id = ?1
        );
    update events set datastr = json_insert(datastr, '$.amend[#]', json_object('time', ?2, 'msg', ?3, 'replace', (
        -- trim amendments down to original event bounds.
        select json_group_array(json_replace(value,
            '$.start', case
                when datetime(starttime, 'subsec') > datetime(json_extract(value, '$.start'), 'subsec') then
                    starttime
                else
                    json_extract(value, '$.start')
                end,
            '$.end', case
                when datetime(endtime, 'subsec') < datetime(json_extract(value, '$.end'), 'subsec') then
                    endtime
                else
                    json_extract(value, '$.end')
                end
        ))
        from
            json_each(?6)
        where
            datetime(json_extract(value, '$.start'), 'subsec') < datetime(endtime, 'subsec') and
            datetime(json_extract(value, '$.end'), 'subsec') > datetime(starttime, 'subsec')
    )))
    where
        datetime(starttime, 'subsec') < datetime(?5, 'subsec') and
        datetime(endtime, 'subsec') > datetime(?4, 'subsec') and
        bucketrow = (
            select rowid from buckets where id = ?1
        );
commit;

Schema:
https://github.com/kortschak/dex/blob/2ce9ff385443f5f475a5efdf91721903699852fa/cmd/worklog/store/db.go#L203-L223
Amend statement:
https://github.com/kortschak/dex/blob/2ce9ff385443f5f475a5efdf91721903699852fa/cmd/worklog/store/db.go#L791-L828





On Mon, 15 Jul 2024 20:31:13 +0000
Dan Kortschak <dan+pgsql@kortschak.io> wrote:

> My question is: where would be the best place for me to look to learn
> about how to implement a port of this SQLite, and what would broadly be
> the most sensible approach to take (to narrow down what I need to read
> through while learning)?
>

This is the goto page for anything SQL :
https://www.postgresql.org/docs/current/sql-commands.html

For DateTime types :
https://www.postgresql.org/docs/current/datatype-datetime.html

For JSON types :
https://www.postgresql.org/docs/current/datatype-json.html

If your query works in SQLite, all you have to do is read those, and try to port; if it fails, read them again. Also
search the archives of the pgsql-general list, many answers in there.


--
                    Yours sincerely, Vincent Veyron

https://compta.libremen.com
Free software for double-entry general accounting



On 7/23/24 13:11, Vincent Veyron wrote:
> On Mon, 15 Jul 2024 20:31:13 +0000
> Dan Kortschak <dan+pgsql@kortschak.io> wrote:
> 
>> My question is: where would be the best place for me to look to learn
>> about how to implement a port of this SQLite, and what would broadly be
>> the most sensible approach to take (to narrow down what I need to read
>> through while learning)?
>>
> 
> This is the goto page for anything SQL :
> https://www.postgresql.org/docs/current/sql-commands.html
> 
> For DateTime types :
> https://www.postgresql.org/docs/current/datatype-datetime.html
> 
> For JSON types :
> https://www.postgresql.org/docs/current/datatype-json.html

Just know that SQLite does not enforce types, therefore it is entirely 
possible that there are values in fields that are not valid in Postgres.

Think:

select ''::integer
ERROR:  invalid input syntax for type integer: ""
LINE 1: select ''::integer


> 
> If your query works in SQLite, all you have to do is read those, and try to port; if it fails, read them again. Also
> search the archives of the pgsql-general list, many answers in there.
 
> 
> 

-- 
Adrian Klaver
adrian.klaver@aklaver.com




On Tue, Jul 23, 2024 at 10:35 PM Adrian Klaver
<adrian.klaver@aklaver.com> wrote:
> Just know that SQLite does not enforce types [...]

That's true, and applies to the OP's schema.

But for the record, SQLite *can* enforce types these days,
on an opt-in basis, with [STRICT tables][1].
Albeit with a limited type-system. --DD

PS: and it could be done manually even before that, with CHECK
(typeof(col)='blob') for example.
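For example (an untested sketch; STRICT needs SQLite 3.37+):

    -- opt-in enforcement:
    create table t (
        id  INTEGER PRIMARY KEY,
        val TEXT NOT NULL
    ) STRICT;

    -- manual enforcement, works on older versions too:
    create table u (
        val BLOB CHECK (typeof(val) = 'blob')
    );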

[1]: https://www.sqlite.org/stricttables.html



On 7/23/24 13:11, Vincent Veyron wrote:
> On Mon, 15 Jul 2024 20:31:13 +0000
>
> This is the goto page for anything SQL :
> https://www.postgresql.org/docs/current/sql-commands.html
>
> For DateTime types :
> https://www.postgresql.org/docs/current/datatype-datetime.html
>
> For JSON types :
> https://www.postgresql.org/docs/current/datatype-json.html

Thanks, I will work through those.


On Tue, 2024-07-23 at 23:52 +0200, Dominique Devienne wrote:
> On Tue, Jul 23, 2024 at 10:35 PM Adrian Klaver
> <adrian.klaver@aklaver.com> wrote:
> > Just know that SQLite does not enforce types [...]
>
> That's true, and applies to the OP's schema.

Thank you both. Yes, I was aware of this weirdness of the schema (I
inherited it) and was shocked that it worked when I realised. I'll be
happier when types are properly enforced, but I don't think I can
retrospectively enforce that on the SQLite implementation I have.




On 7/23/24 17:23, Dan Kortschak wrote:
> On 7/23/24 13:11, Vincent Veyron wrote:
>> On Mon, 15 Jul 2024 20:31:13 +0000
>>
>> This is the goto page for anything SQL :
>> https://www.postgresql.org/docs/current/sql-commands.html
>>
>> For DateTime types :
>> https://www.postgresql.org/docs/current/datatype-datetime.html
>>
>> For JSON types :
>> https://www.postgresql.org/docs/current/datatype-json.html
> 
> Thanks, I will work through those.
> 
> 
> On Tue, 2024-07-23 at 23:52 +0200, Dominique Devienne wrote:
>> On Tue, Jul 23, 2024 at 10:35 PM Adrian Klaver
>> <adrian.klaver@aklaver.com> wrote:
>>> Just know that SQLite does not enforce types [...]
>>
>> That's true, and applies to the OP's schema.
> 
> Thank you both. Yes, I was aware of this weirdness of the schema (I
> inherited it) and was shocked that it worked when I realised. I'll be
> happier when types are properly enforced, but I don't think I can
> retrospectively enforce that on the SQLite implementation I have.
> 

Which gets back to verifying the data coming from SQLite will work in 
the Postgres tables with the Postgres types specified in the table 
definitions.

You can either:

1) Just import the data into the Postgres tables as defined and see if 
it works and if not what blows up.

2) Create Postgres staging tables that have the column type set to 
varchar or text for every column. Then import the data. Then you could 
do select col::<the_type_desired> from the_table and see what works and 
what fails. A rough sketch of this approach follows.
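For instance (table and file names here are made up):

    create table events_staging (
        id        text,
        bucketrow text,
        starttime text,
        endtime   text,
        datastr   text
    );
    -- \copy events_staging from 'events.csv' with (format csv)

    -- then probe each column; errors will point at the bad values:
    select starttime::timestamptz from events_staging;
    select datastr::jsonb from events_staging;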

-- 
Adrian Klaver
adrian.klaver@aklaver.com




On Wed, 2024-07-24 at 00:23 +0000, Dan Kortschak wrote:
> On 7/23/24 13:11, Vincent Veyron wrote:
> > On Mon, 15 Jul 2024 20:31:13 +0000
> >
> > This is the goto page for anything SQL :
> > https://www.postgresql.org/docs/current/sql-commands.html
> >
> > For DateTime types :
> > https://www.postgresql.org/docs/current/datatype-datetime.html
> >
> > For JSON types :
> > https://www.postgresql.org/docs/current/datatype-json.html
>
> Thanks, I will work through those.

I've had a chance to attack this. The first part of the problem was
that I could not figure out how to get the multiple-statement
transaction that I was using in SQLite to work with PostgreSQL. The
solution was to use the host language's Postgres binding transaction
functions and send the statements separately.

The first part, ensuring the JSON array exists, is solved with

    update
        events
    set
        datastr = jsonb_set(datastr, '{amend}', '[]')
    where
        starttime < $3 and
        endtime > $2 and
        not datastr::jsonb ? 'amend' and
        bucketrow = (
            select rowid from buckets where id = $1
        );
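
The ? there is the jsonb key-existence operator, so the guard only
touches rows that don't already have an amend array:

    select '{"amend": []}'::jsonb ? 'amend';  -- true
    select '{}'::jsonb ? 'amend';             -- false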

I'm still having difficulties with the second part which is to update
the contents of the amend array in the JSON.

So far I'm able to append the relevant details to the amend array, but
I'm unable to correctly select the correct elements from the $6
argument, which is in the form
[{"start":<RFC3339>,"end":<RFC3339>,"data":<object>}, ...]. The first
update statement gives me broadly what I want, but includes elements of
the array that it shouldn't.

    update
        events
    set
        datastr = jsonb_set(
            datastr,
            '{amend}',
            datastr->'amend' || jsonb_build_object(
                'time', $2::TEXT,
                'msg', $3::TEXT,
                'replace', (
                    -- This select is for comparison with the code below.
                    select * from jsonb($6::TEXT)
                )
            )
        )
    where
        starttime < $5 and
        endtime > $4 and
        bucketrow = (
            select rowid from buckets where id = $1
        );

If I filter on the start and end time, I end up with no element coming
through at all and the "replace" field ends up null.

    update
        events
    set
        datastr = jsonb_set(
            datastr,
            '{amend}',
            datastr->'amend' || jsonb_build_object(
                'time', $2::TEXT,
                'msg', $3::TEXT,
                'replace', (
                    select *
                    from
                        jsonb($6::TEXT) as replacement
                    where
                        (replacement->>'start')::TIMESTAMP WITH TIME ZONE < endtime and
                        (replacement->>'end')::TIMESTAMP WITH TIME ZONE > starttime
                )
            )
        )
    where
        starttime < $5 and
        endtime > $4 and
        bucketrow = (
            select rowid from buckets where id = $1
        );

Can anyone suggest what I might be missing? One thing that occurs to me
is that due to the host language the timezone in starttime and endtime
is the local timezone, while the timezone in the elements of the $6
argument is UTC. I've tried forcing the timezones to match and this
does not appear to be the issue.

This can be seen in the case of output from the first, non-filtering
update statement above:

    {
        ...
        "start": "2023-06-13T05:24:50+09:30",
        "end": "2023-06-13T05:24:55+09:30",
        "data": {
            ...
            "amend": [
                {
                    "endtime": "2023-06-13T05:24:55+09:30",
                    "msg": "testing",
                    "replace": [
                        {
                            "data": {
                                ...
                            },
                            "end": "2023-06-12T19:54:51Z",
                            "start": "2023-06-12T19:54:39Z"
                        }
                    ],
                    "starttime": "2023-06-13T05:24:50+09:30",
                    ...
                }
            ]
        }
    },

(I'm hoping there's not a dumb logic error in the statement; its
equivalent works with SQLite, and mapped all to UTC we have
starttime=2023-06-12T19:54:50Z, endtime=2023-06-12T19:54:55Z,
start=2023-06-12T19:54:39Z and end=2023-06-12T19:54:51Z, which is an
overlap).
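
As a sanity check that the mixed offsets are not the problem,
timestamptz comparisons are instant-based, so the two notations above
compare equal:

    select '2023-06-13T05:24:50+09:30'::timestamptz
         = '2023-06-12T19:54:50Z'::timestamptz;  -- true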





> On 14 Sep 2024, at 10:33, Dan Kortschak <dan+pgsql@kortschak.io> wrote:

(…)

> I'm still having difficulties with the second part which is to update
> the contents of the amend array in the JSON.
>
> So far I'm able to append the relevant details to the amend array, but
> I'm unable to correctly select the correct elements from the $6
> argument, which is in the form
> [{"start":<RFC3339>,"end":<RFC3339>,"data":<object>}, ...]. The first
> update statement gives me broadly what I want, but includes elements of
> the array that it shouldn't.

(…)

> If I filter on the start and end time, I end up with no element coming
> through at all and the "replace" field ends up null.
>
> update
>     events
> set
>     datastr = jsonb_set(
>         datastr,
>         '{amend}',
>         datastr->'amend' || jsonb_build_object(
>             'time', $2::TEXT,
>             'msg', $3::TEXT,
>             'replace', (
>                 select *
>                 from
>                     jsonb($6::TEXT) as replacement
>                 where
>                     (replacement->>'start')::TIMESTAMP WITH TIME ZONE < endtime and
>                     (replacement->>'end')::TIMESTAMP WITH TIME ZONE > starttime
>             )
>         )
>     )
> where
>     starttime < $5 and
>     endtime > $4 and
>     bucketrow = (
>         select rowid from buckets where id = $1
>     );

That’s because the replacement data is an array of objects, not a single object.

You need to iterate through the array elements to build your replacement data, something like what I do here with a
select (because that's way easier to play around with):

with dollar6 as (
    select jsonb($$[
        {
            "data": { "foo": 1, "bar": 2 },
            "end": "2023-06-12T19:54:51Z",
            "start": "2023-06-12T19:54:39Z"
        }
    ]$$::text) replacement
)
select *
from dollar6
cross join lateral jsonb_array_elements(replacement) r
where (r->>'start')::timestamptz <= current_timestamp;


There are probably other ways to attack this problem; this is the one I came up with.
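
To fold the filtered elements back into a single jsonb array for your
'replace' field, wrap the elements in jsonb_agg, roughly like this
(untested sketch, reusing the test data from above):

    with dollar6 as (
        select jsonb($$[
            {"start": "2023-06-12T19:54:39Z", "end": "2023-06-12T19:54:51Z"}
        ]$$::text) replacement
    )
    select jsonb_agg(r)
    from dollar6
    cross join lateral jsonb_array_elements(replacement) r
    where (r->>'start')::timestamptz <= current_timestamp;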


Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.




On Sat, 2024-09-14 at 12:05 +0200, Alban Hertroys wrote:
>
> That’s because the replacement data is an array of objects, not a
> single object.
>
> You need to iterate through the array elements to build your
> replacement data, something like what I do here with a select
> (because that’s way easier to play around with):
>
> with dollar6 as (
>     select jsonb($$[
>         {
>             "data": { "foo": 1, "bar": 2 },
>             "end": "2023-06-12T19:54:51Z",
>             "start": "2023-06-12T19:54:39Z"
>         }
>     ]$$::text) replacement
> )
> select *
> from dollar6
> cross join lateral jsonb_array_elements(replacement) r
> where (r->>'start')::timestamptz <= current_timestamp;


Thanks





I have come to what is hopefully my last stumbling point.

I am unable to see a way to express something like this SQLite syntax

select json_group_array(json_replace(value,
  '$.a', case
    when json_extract(value, '$.a') > 2 then
      2
    else
      json_extract(value, '$.a')
    end,
  '$.b', case
    when json_extract(value, '$.b') < -2 then
      -2
    else
      json_extract(value, '$.b')
    end
))
from
  json_each('[{"a":1, "b":-3},{"a":2, "b":-2},{"a":3, "b":-1}]');

(in the repro above, the values are integers, but in the real case,
they are timestamps)

I have worked on multiple statements around the theme of

with t as (
  select jsonb($$[{"a":1, "b":-3},{"a":2, "b":-2},{"a":3, "b":-1}]$$) arr
)
select
  jsonb_array_elements(arr) as arr
from
  t;

The closest that I have come is

with t as (
  select jsonb($$[{"a":1, "b":-3},{"a":2, "b":-2},{"a":3, "b":-1}]$$) arr
)
select jsonb_set(arr, '{a}', case
  when (arr->>'a')::INTEGER > 2 then
    2
  else
    (arr->>'a')::INTEGER
  end
)
from (
  select
    jsonb_array_elements(arr) as arr
  from
    t
) elements;

but this is a million miles from where I want to be (it doesn't work,
though I think the shape of the things that it's working with is maybe
heading in the right direction). I've read through the docs, but I just
don't seem able to get my head around this.

Any help would be greatly appreciated (also some reading direction so
that I'm not floundering so much).

thanks






> On 15 Sep 2024, at 11:07, Dan Kortschak <dan+pgsql@kortschak.io> wrote:
>
> I have come to what is hopefully my last stumbling point.
>
> I am unable to see a way to express something like this SQLite syntax
>
> select json_group_array(json_replace(value,
>  '$.a', case
>    when json_extract(value, '$.a') > 2 then
>      2
>    else
>      json_extract(value, '$.a')
>    end,
>  '$.b', case
>    when json_extract(value, '$.b') < -2 then
>      -2
>    else
>      json_extract(value, '$.b')
>    end
> ))
> from
>  json_each('[{"a":1, "b":-3},{"a":2, "b":-2},{"a":3, "b":-1}]');

What’s the result of that query in SQLite?

I’m guessing it would be: [{"a":1, "b":-2},{"a":2, "b":-2},{"a":2, "b":-1}]


I see basically two approaches. One is to take the objects apart and build them back together again, the other is to
attempt to only replace the values that need replacing.

For the sake of showing how both approaches modify the original, I added an extra field “c” to your objects that should
be in the result unmodified.

The first approach rebuilds the objects:

with t as (
    select jsonb($$[{"a":1, "b":-3, "c":1},{"a":2, "b":-2, "c":2},{"a":3, "b":-1, "c":3},{"a":3, "b":-3, "c":4}]$$) arr
)
select jsonb_agg(jsonb_build_object(
    'a', case when records.a > 2 then 2 else records.a end
,       'b', case when records.b < -2 then -2 else records.b end
,       'c', c
))
from t
cross join lateral jsonb_to_recordset(t.arr) records(a int, b int, c int)
;
                                                  jsonb_agg
--------------------------------------------------------------------------------------------------------------
 [{"a": 1, "b": -2, "c": 1}, {"a": 2, "b": -2, "c": 2}, {"a": 2, "b": -1, "c": 3}, {"a": 2, "b": -2, "c": 4}]
(1 row)


The drawback is that you have to specify all fields and types, but you don’t need to cast the values all the time either.



The replacement approach gets a bit trickier. I don’t see any good method that would replace both ‘a’ and ‘b’ values if
they both go outside bounds in the same object.

The jsonb_set function in PG doesn’t seem to be able to handle setting a value conditionally, let alone setting
multiple values conditionally in one call, so I ended up with replacing either ‘a’ or ‘b’. I did include a case where
both ‘a’ and ‘b’ go out of bounds, replacing both values with their respective replacements, but the syntax for that
approach doesn’t scale well to more combinations of fields and boundaries to check and replace.

Hence I added the problematic case to the test string. As you can see from the previous query, that one does handle
that case properly without much extra hassle.

with t as (
    select jsonb($$[{"a":1, "b":-3, "c":1},{"a":2, "b":-2, "c":2},{"a":3, "b":-1, "c":3},{"a":3, "b":-3, "c":4}]$$) arr
)
select jsonb_agg(
        case
            when (obj->>'a')::INTEGER > 2 and (obj->>'b')::INTEGER < -2
            then jsonb_set(jsonb_set(obj, '{a}', '2') ,'{b}', '-2')
            when (obj->>'a')::INTEGER > 2
            then jsonb_set(obj, '{a}', '2')
            when (obj->>'b')::INTEGER < -2
            then jsonb_set(obj, '{b}', '-2')
            else obj
        end) newArr
from (
select jsonb_array_elements(arr) obj from t
) elements;
                                                    newarr
--------------------------------------------------------------------------------------------------------------
 [{"a": 1, "b": -2, "c": 1}, {"a": 2, "b": -2, "c": 2}, {"a": 2, "b": -1, "c": 3}, {"a": 2, "b": -2, "c": 4}]
(1 row)


For understanding both queries better, it probably helps to take out the jsonb_agg calls to see the separate objects
from the array. Add the original obj back in for comparison, if you like.


I typically use the documentation pages for the JSON functions and the one on aggregate functions, where the JSONB
aggregates are located:

https://www.postgresql.org/docs/16/functions-json.html
https://www.postgresql.org/docs/16/functions-aggregate.html

And if you’re not familiar with dollar quoting:
https://www.postgresql.org/docs/16/sql-syntax-lexical.html#SQL-SYNTAX-DOLLAR-QUOTING
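
For example, these two produce the same string, but the dollar-quoted
one doesn’t need the embedded quote doubled:

    select 'it''s' a, $$it's$$ b;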



Alban Hertroys
--
There is always an exception to always.







On Sun, Sep 15, 2024 at 4:23 AM Alban Hertroys <haramrae@gmail.com> wrote:
>
> > On 15 Sep 2024, at 11:07, Dan Kortschak <dan+pgsql@kortschak.io> wrote:
> >
> > I have come to hopefully my last stumbling point.
> >
> > I am unable to see a way to express something like this SQLite syntax
> >
> > select json_group_array(json_replace(value,
> >  '$.a', case
> >    when json_extract(value, '$.a') > 2 then
> >      2
> >    else
> >      json_extract(value, '$.a')
> >    end,
> >  '$.b', case
> >    when json_extract(value, '$.b') < -2 then
> >      -2
> >    else
> >      json_extract(value, '$.b')
> >    end
> > ))
> > from
> >  json_each('[{"a":1, "b":-3},{"a":2, "b":-2},{"a":3, "b":-1}]');
>
> [...]
>
> I see basically two approaches. One is to take the objects apart [...]
>
> with t as (
>         select jsonb($$[{"a":1, "b":-3, "c":1},{"a":2, "b":-2, "c":2},{"a":3, "b":-1, "c":3},{"a":3, "b":-3, "c":4}]$$) arr
> )
> select jsonb_agg(jsonb_build_object(
>         'a', case when records.a > 2 then 2 else records.a end
> ,       'b', case when records.b < -2 then -2 else records.b end
> ,       'c', c
> ))
> from t
> cross join lateral jsonb_to_recordset(t.arr) records(a int, b int, c int)
> ;
>
> [...]
>
> The drawback is that you have to specify all fields and types, but you don’t need to cast the values all the time either.

Here is a variant of Alban's first method that does not require
specifying all fields and types, and so works with heterogeneous values:

    WITH t AS (
            SELECT jsonb($$[
                {"a": 1, "b": -3, "c": 1},
                {"a": 2, "b": -2, "c": 2},
                {"a": 3, "b": -1, "c": 3},
                {"a": 3, "b": -3, "c": 4}
            ]$$) arr
    )
    SELECT
        jsonb_agg(new_element ORDER BY idx) new_arr
        FROM t, LATERAL (
            SELECT idx, jsonb_object_agg(key, CASE
                WHEN key = 'a'
                    THEN least(old_value::numeric, 2)::text::jsonb
                WHEN key = 'b'
                    THEN greatest(old_value::numeric, -2)::text::jsonb
                ELSE old_value
            END)
            FROM
                jsonb_array_elements(arr)
                    WITH ORDINALITY old_elements(old_element, idx),
                jsonb_each(old_element) each(key, old_value)
            GROUP BY idx
        ) new_elements(idx, new_element)

I also took the liberties of using `least` / `greatest` to simplify the
clamping operations, and using `WITH ORDINALITY` / `ORDER BY` on the
array scan and re-aggregation to make the element ordering explicit
rather than relying on the query engine to not re-order the rows.

https://www.postgresql.org/docs/16/functions-conditional.html#FUNCTIONS-GREATEST-LEAST
https://www.postgresql.org/docs/16/queries-table-expressions.html#QUERIES-TABLEFUNCTIONS
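
For example, WITH ORDINALITY just adds the element's position as an
extra column, which is what the final GROUP BY and ORDER BY key on:

    SELECT elem, idx
    FROM jsonb_array_elements('[{"a": 1}, {"a": 2}]'::jsonb)
        WITH ORDINALITY elems(elem, idx);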



Thank you both. This has been extremely helpful. I still have more work
to do but this has made it possible to start playing with something,
and reading about it when it doesn't work.

On Sun, 2024-09-15 at 10:13 -0700, Willow Chargin wrote:
> On Sun, Sep 15, 2024 at 4:23 AM Alban Hertroys <haramrae@gmail.com>
> wrote:
> >
> > The drawback is that you have to specify all fields and types, but
> > you don’t need to cast the values all the time either.
>
> Here is a variant of Alban's first method that does not require
> specifying all fields and types, and so works with heterogeneous
> values:
>
>     WITH t AS (
>             SELECT jsonb($$[
>                 {"a": 1, "b": -3, "c": 1},
>                 {"a": 2, "b": -2, "c": 2},
>                 {"a": 3, "b": -1, "c": 3},
>                 {"a": 3, "b": -3, "c": 4}
>             ]$$) arr
>     )
>     SELECT
>         jsonb_agg(new_element ORDER BY idx) new_arr
>         FROM t, LATERAL (
>             SELECT idx, jsonb_object_agg(key, CASE
>                 WHEN key = 'a'
>                     THEN least(old_value::numeric, 2)::text::jsonb
>                 WHEN key = 'b'
>                     THEN greatest(old_value::numeric, -2)::text::jsonb
>                 ELSE old_value
>             END)
>             FROM
>                 jsonb_array_elements(arr)
>                     WITH ORDINALITY old_elements(old_element, idx),
>                 jsonb_each(old_element) each(key, old_value)
>             GROUP BY idx
>         ) new_elements(idx, new_element)
>
> I also took the liberties of using `least` / `greatest` to simplify
> the
> clamping operations, and using `WITH ORDINALITY` / `ORDER BY` on the
> array scan and re-aggregation to make the element ordering explicit
> rather than relying on the query engine to not re-order the rows.
>
> https://www.postgresql.org/docs/16/functions-conditional.html#FUNCTIONS-GREATEST-LEAST
> https://www.postgresql.org/docs/16/queries-table-expressions.html#QUERIES-TABLEFUNCTIONS

I found that I needed to use old_value::text::numeric to get Willow's
code to work, but I imagine this is due to the ancientness of the
PostgreSQL I am using.
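
A sketch of the difference, in case it helps someone else (I believe
the direct jsonb-to-numeric cast only arrived in PostgreSQL 11):

    select ('1.5'::jsonb)::numeric;        -- direct cast, newer servers
    select ('1.5'::jsonb)::text::numeric;  -- via text, works on older servers too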

thanks
Dan





This is a toy with the shape of data that will be seen in the
application. The final trick was to use to_jsonb to allow the
timestamptz values to be put back into the jsonb.

WITH replace AS (
    SELECT jsonb($$[
        {"start": "2023-06-12T19:54:39.248859996+10:00", "end": "2023-06-12T19:54:59.248859996+10:00", "data": {"item":1}},
        {"start": "2023-06-12T19:54:50.248859996+10:00", "end": "2023-06-12T19:59:39.248859996+10:00", "data": {"item":2}},
        {"start": "2023-06-12T19:56:39.248859996+10:00", "end": "2023-06-12T19:57:39.248859996+10:00", "data": {"item":3}},
        {"start": "2023-06-12T19:54:39.248859996+10:00", "end": "2023-06-12T20:54:49.248859996+10:00", "data": {"item":4}},
        {"start": "2024-06-12T19:54:39.248859996+10:00", "end": "2024-06-12T19:59:39.248859996+10:00", "data": {"item":5}}
    ]$$) replacements
)
SELECT
    jsonb_agg(new ORDER BY idx) trimmed_replacements
FROM
    replace, LATERAL (
        SELECT idx, jsonb_object_agg(key,
            CASE
                WHEN key = 'start'
                    THEN to_jsonb(greatest(old::text::TIMESTAMPTZ, '2023-06-12T19:54:50+10:00'::TIMESTAMPTZ))
                WHEN key = 'end'
                    THEN to_jsonb(least(old::text::TIMESTAMPTZ, '2023-06-12T19:58:00+10:00'::TIMESTAMPTZ))
                ELSE old
            END
        )
        FROM
            jsonb_array_elements(replacements)
                WITH ORDINALITY rs(r, idx),
            jsonb_each(r) each(key, old)
        WHERE
            (r->>'start')::TIMESTAMPTZ < '2023-06-12T19:58:00+10:00'::TIMESTAMPTZ and
            (r->>'end')::TIMESTAMPTZ > '2023-06-12T19:54:50+10:00'::TIMESTAMPTZ
        GROUP BY idx
    ) news(idx, new);