Thread: create index on a jsonb timestamp field?
I'm playing with DNSTAP (dnstap.info) data and loading it into a database for analysis. when I try to create an index on the query_time field of the json structure I get: ler=# select id,data->'message'->>'query_time' from dns_query limit 2; id | ?column? ----+----------------------------- 2 | 2019-05-13T01:35:59.822984Z 3 | 2019-05-13T01:35:59.829801Z (2 rows) ler=# select id,(data->'message'->>'query_time')::timestamptz from dns_query limit 2; id | timestamptz ----+------------------------------- 2 | 2019-05-12 20:35:59.822984-05 3 | 2019-05-12 20:35:59.829801-05 (2 rows) ler=# ler=# create index dns_query_time_idx on dns_query(((data -> 'message' ->> 'query_time')::text::timestamptz)); ERROR: functions in index expression must be marked IMMUTABLE Is there any easy way to do this? Or, what would the experts recommend here? -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 214-642-9640 E-Mail: ler@lerctr.org US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
On 05/18/2019 5:53 pm, Larry Rosenman wrote: > I'm playing with DNSTAP (dnstap.info) data and loading it into a > database for analysis. > > when I try to create an index on the query_time field of the json > structure I get: > > ler=# select id,data->'message'->>'query_time' from dns_query limit 2; > id | ?column? > ----+----------------------------- > 2 | 2019-05-13T01:35:59.822984Z > 3 | 2019-05-13T01:35:59.829801Z > (2 rows) > > ler=# select id,(data->'message'->>'query_time')::timestamptz from > dns_query limit 2; > id | timestamptz > ----+------------------------------- > 2 | 2019-05-12 20:35:59.822984-05 > 3 | 2019-05-12 20:35:59.829801-05 > (2 rows) > > ler=# > > ler=# create index dns_query_time_idx on dns_query(((data -> 'message' > ->> 'query_time')::text::timestamptz)); > ERROR: functions in index expression must be marked IMMUTABLE > > Is there any easy way to do this? Or, what would the experts recommend > here? I found a work-around in making a column for query time, and populating that in an UPDATE/INSERT trigger, and then making an index on that. -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 214-642-9640 E-Mail: ler@lerctr.org US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
Larry Rosenman <ler@lerctr.org> writes: > when I try to create an index on the query_time field of the json > structure I get: > ler=# create index dns_query_time_idx on dns_query(((data -> 'message' > ->> 'query_time')::text::timestamptz)); > ERROR: functions in index expression must be marked IMMUTABLE Yeah, because the timestamptz input function has dependencies on both the datestyle and timezone GUCs. Given that your input is ISO-format with explicit time zone, you don't really care about either of those things, but the mutability check doesn't know that. > Is there any easy way to do this? Or, what would the experts recommend > here? The sanest way to deal with this IMO is to make a column containing the extracted timestamp, which you could maintain with a trigger, and then index that. You could alternatively make a custom function that you (mis?)label as immutable, but your queries would have to use that same function in order to get matched to the index, so I dunno about that being a user-friendly approach. BTW, I'd had the idea that the GENERATED option in PG v13 would allow setting up this sort of case without bothering with a handwritten trigger, but it seems not: regression=# create table foo(data jsonb, ts timestamptz GENERATED ALWAYS AS ((data->>'ts')::timestamptz) stored); psql: ERROR: generation expression is not immutable I wonder if that's really necessary to insist on? regards, tom lane
On 05/18/2019 8:17 pm, Tom Lane wrote: > Larry Rosenman <ler@lerctr.org> writes: >> when I try to create an index on the query_time field of the json >> structure I get: >> ler=# create index dns_query_time_idx on dns_query(((data -> 'message' >> ->> 'query_time')::text::timestamptz)); >> ERROR: functions in index expression must be marked IMMUTABLE > > Yeah, because the timestamptz input function has dependencies on > both the datestyle and timezone GUCs. Given that your input is > ISO-format with explicit time zone, you don't really care about > either of those things, but the mutability check doesn't know that. > >> Is there any easy way to do this? Or, what would the experts >> recommend >> here? > > The sanest way to deal with this IMO is to make a column containing > the extracted timestamp, which you could maintain with a trigger, > and then index that. You could alternatively make a custom function > that you (mis?)label as immutable, but your queries would have to > use that same function in order to get matched to the index, so > I dunno about that being a user-friendly approach. This is what I wound up figuring out between my original post and yours, and it works great. > > BTW, I'd had the idea that the GENERATED option in PG v13 would allow > setting up this sort of case without bothering with a handwritten > trigger, > but it seems not: > > regression=# create table foo(data jsonb, ts timestamptz GENERATED > ALWAYS AS ((data->>'ts')::timestamptz) stored); > psql: ERROR: generation expression is not immutable > > I wonder if that's really necessary to insist on? Good question. Is that something the project is going to look into? > > regards, tom lane -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 214-642-9640 E-Mail: ler@lerctr.org US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
Larry Rosenman <ler@lerctr.org> writes: > On 05/18/2019 8:17 pm, Tom Lane wrote: >> BTW, I'd had the idea that the GENERATED option in PG v13 would allow >> setting up this sort of case without bothering with a handwritten >> trigger, but it seems not: >> >> regression=# create table foo(data jsonb, ts timestamptz GENERATED >> ALWAYS AS ((data->>'ts')::timestamptz) stored); >> psql: ERROR: generation expression is not immutable >> >> I wonder if that's really necessary to insist on? > Good question. Is that something the project is going to look into? Well, I was just asking the question, not opining on whether it was right or wrong. GENERATED is a SQL-spec feature, and it might be that the semantics the spec calls for wouldn't work without the restriction. I've not looked... regards, tom lane