Thread: Performance on JSONB select
select * from MyTable where ((JsonBField->>'status'='descartada' and To_Date(JsonBField->'descartada'->>'data','yyyy-mm-dd') > Current_Date) or (JsonBField->>'status'='contrato' and To_Date(JsonBField->'contrato'->>'data','yyyy-mm-dd') > Current_Date-7) or (JsonBField->>'status'='naoatribuido')); Considering this table has a million records, I would like to use a proper index, but how can I use an index when using operator >. -- Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
Much of indexing strategy depends on knowing the data like how many distinct values and what the distribution is like. Is JsonBField->>'status' always set? Are those three values mentioned in this query common or rare? Can you re-write this query to avoid using an OR in the where clause? Are you just wanting to add a GIN index for the jsonb paths? Or do you want indexed like below that are a bit stylized to this query?
CREATE INDEX idx_MyTable_status USING btree( JsonBField->>'status' );
CREATE INDEX idx_MyTable_descartada_date USING btree( To_Date(JsonBField->'descartada'->>'data','yyyy-mm-dd') );
CREATE INDEX idx_MyTable_contrato_date USING btree( To_Date(JsonBField->'contrato'->>'data','yyyy-mm-dd') );
ok, my select performed better but I had to create 8 indices to speed up my query. I would love to create just one index using GIN(JsonBField jsonb_ops) but using version 11 I cannot use operators like > and <. I see on docs that version 12 has jsonpath Filter Expression Elements and they include > and <. So, the question is, will be possible to have just one index and use it with "jsonpath Filter Expression Elements ? -- Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
Greetings, * Michael Lewis (mlewis@entrata.com) wrote: > Much of indexing strategy depends on knowing the data like how many > distinct values and what the distribution is like. Is JsonBField->>'status' > always set? Are those three values mentioned in this query common or rare? > Can you re-write this query to avoid using an OR in the where clause? Are > you just wanting to add a GIN index for the jsonb paths? Or do you want > indexed like below that are a bit stylized to this query? If you know a field is going to always be there, you're better off, by far, by just having a regular column for that value and a straight up btree for it. This saves a significant amount of space and makes it much easier to index and work with. Thanks, Stephen
Attachment
I think I solved my problem. I didn´t know it was possible but I´ve created an almost dynamic index, because it´ll index for status and corresponding datevalue of an object with that status value. Postgres, I love you. Just one B-Tree index with 2 fields, Status and DateTime of that respective Status object. My Json has always a status and a respective object of that status with other values. So I know if it has that status it has a Date Value or not on its correspondent object. { "status": "visitadescartada", "contrato": {}, "atribuido": {"datevalue": "2019-09-05 14:47:11.149095-03"}, "trabalhando": {}, "visitaagendada": {"datevalue": "2019-09-05 15:06:24.255548-03","caption": "Agendado"}, "visitadescartada": {"datevalue": "2019-09-12 11:47:17.45782-03","caption": "Desagendado"}, "digitacaodescartada": {} } create index IndexByStatusAndDate on MyTable ( (JsonBField->>$$status$$), castimmutabletimestamp(JsonBField->(JsonBField->>$$status$$)->>'datevalue')) where (JsonBField ? $$status$$); select * from MyTable where (JsonBField ? $$status$$) and case when JsonBField->>$$status$$=$$cadastrodescartado$$ then castimmutabletimestamp(JsonBField->(JsonBField->>$$status$$)->>'datevalue') > castimmutabletimestamp($$2019-10-01$$) when JsonBField->>$$status$$=$$visitadescartada$$ then castimmutabletimestamp(JsonBField->(JsonBField->>$$status$$)->>'datevalue') > castimmutabletimestamp($$2019-10-01$$) when JsonBField->>$$status$$=$$contrato$$ then castimmutabletimestamp(funilvendas->(JsonBField->>$$status$$)->>'data') > castimmutabletimestamp($$2019-10-01$$) when JsonBField->>$$status$$=$$naoatribuido$$ then True end; And performance now is great because I´m using both fields on index. -- Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
I think I solved my problem. I didn´t know it was possible but I´ve created an almost dynamic index, because it´ll index for status and corresponding datevalue of an object with that status value. Postgres, I love you. Just one B-Tree index with 2 fields, Status and DateTime of that respective Status object. My Json has always a status and a respective object of that status with other values. So I know if it has that status it has a Date Value or not on its correspondent object. { "status": "visitadescartada", "contrato": {}, "atribuido": {"datevalue": "2019-09-05 14:47:11.149095-03"}, "trabalhando": {}, "visitaagendada": {"datevalue": "2019-09-05 15:06:24.255548-03","caption": "Agendado"}, "visitadescartada": {"datevalue": "2019-09-12 11:47:17.45782-03","caption": "Desagendado"}, "digitacaodescartada": {} } create index IndexByStatusAndDate on MyTable ( (JsonBField->>$$status$$), castimmutabletimestamp(JsonBField->(JsonBField->>$$status$$)->>'datevalue')) where (JsonBField ? $$status$$); select * from MyTable where (JsonBField ? $$status$$) and case when JsonBField->>$$status$$=$$cadastrodescartado$$ then castimmutabletimestamp(JsonBField->(JsonBField->>$$status$$)->>'datevalue') > castimmutabletimestamp($$2019-10-01$$) when JsonBField->>$$status$$=$$visitadescartada$$ then castimmutabletimestamp(JsonBField->(JsonBField->>$$status$$)->>'datevalue') > castimmutabletimestamp($$2019-10-01$$) when JsonBField->>$$status$$=$$contrato$$ then castimmutabletimestamp(JsonBField->(JsonBField->>$$status$$)->>'data') > castimmutabletimestamp($$2019-10-01$$) when JsonBField->>$$status$$=$$naoatribuido$$ then True end; And performance now is great because I´m using both fields on index. -- Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
Hi
so 5. 10. 2019 v 13:34 odesílatel PegoraroF10 <marcos@f10.com.br> napsal:
I think I solved my problem. I didn´t know it was possible but I´ve created
an almost dynamic index, because it´ll index for status and corresponding
datevalue of an object with that status value.
you created multicolumn functional index (there are no any dynamic index :))
Pavel
Postgres, I love you.
Just one B-Tree index with 2 fields, Status and DateTime of that respective
Status object. My Json has always a status and a respective object of that
status with other values. So I know if it has that status it has a Date
Value or not on its correspondent object.
{
"status": "visitadescartada",
"contrato": {},
"atribuido": {"datevalue": "2019-09-05 14:47:11.149095-03"},
"trabalhando": {},
"visitaagendada": {"datevalue": "2019-09-05
15:06:24.255548-03","caption": "Agendado"},
"visitadescartada": {"datevalue": "2019-09-12
11:47:17.45782-03","caption": "Desagendado"},
"digitacaodescartada": {}
}
create index IndexByStatusAndDate on MyTable (
(JsonBField->>$$status$$),
castimmutabletimestamp(JsonBField->(JsonBField->>$$status$$)->>'datevalue'))
where (JsonBField ? $$status$$);
select * from MyTable where (JsonBField ? $$status$$) and
case when JsonBField->>$$status$$=$$cadastrodescartado$$ then
castimmutabletimestamp(JsonBField->(JsonBField->>$$status$$)->>'datevalue')
> castimmutabletimestamp($$2019-10-01$$)
when JsonBField->>$$status$$=$$visitadescartada$$ then
castimmutabletimestamp(JsonBField->(JsonBField->>$$status$$)->>'datevalue')
> castimmutabletimestamp($$2019-10-01$$)
when JsonBField->>$$status$$=$$contrato$$ then
castimmutabletimestamp(funilvendas->(JsonBField->>$$status$$)->>'data') >
castimmutabletimestamp($$2019-10-01$$)
when JsonBField->>$$status$$=$$naoatribuido$$ then True end;
And performance now is great because I´m using both fields on index.
--
Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
I told it was almost dynamic because it uses DateValue from an object or another, depending on value of staus key. -- Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
Em sáb, 5 de out de 2019 às 11:49, PegoraroF10 <marcos@f10.com.br> escreveu:
-- I told it was almost dynamic because it uses DateValue from an object or
another, depending on value of staus key.
Actually it’s named partial index.
Regards,
Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
No, no, no, partial index is the where clause of it and I´m not talking about that.
I´m talking about datevalue, which will be used one or another, depending on status value
This record iindex will be status and datevalue from "visitadescartada" object
{
"status": "visitadescartada",
"atribuido": {"datevalue": "2019-09-05 14:47:11"},
"visitadescartada": {"datevalue": "2019-09-12 11:47:17"},
"digitacaodescartada": {}
}
And here the index record will be status and datevalue from "atribuido" object
{
"status": "atribuido",
"atribuido": {"datevalue": "2019-09-05 14:47:11"},
"visitadescartada": {"datevalue": "2019-09-12 11:47:17"},
"digitacaodescartada": {}
}
Sent from the PostgreSQL - general mailing list archive at Nabble.com.
I´m talking about datevalue, which will be used one or another, depending on status value
This record iindex will be status and datevalue from "visitadescartada" object
{
"status": "visitadescartada",
"atribuido": {"datevalue": "2019-09-05 14:47:11"},
"visitadescartada": {"datevalue": "2019-09-12 11:47:17"},
"digitacaodescartada": {}
}
And here the index record will be status and datevalue from "atribuido" object
{
"status": "atribuido",
"atribuido": {"datevalue": "2019-09-05 14:47:11"},
"visitadescartada": {"datevalue": "2019-09-12 11:47:17"},
"digitacaodescartada": {}
}
Sent from the PostgreSQL - general mailing list archive at Nabble.com.
My Json has always a status
Why declare the "where" clause when creating the index? It would not seem needed if status is always set and so your index will reference all rows in the table.
Thanks for sharing the trick of having the second column in the index determine the key based on the first column. I don't know if/when I might need this, but an interesting solution. Generically, it seems like the below-
create index idx_mytable_jsonb_dependent_fields on public.mytable ( (JsonBField->>'primary_field_to_filter_on'), JsonBField->(JsonBField->>'primary_field_to_filter_on')->>'secondary_field' );
Sorry, I told you that the status was always populated but not, I need that filter for the index because not all records are using that json structure. When json is not null then yes, Status is always there. I have maybe 20 or 25% of records having json populated, so, I really need that filter. Yes, that solution is fine. Selectivity is fine, speed is fine and the index is "almost dynamic" -- Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html