Thread: index on values stored in a json array
<div dir="ltr">Hi,<br /><br />I have a table with a column of type json, in which arrays of products are stored in the form:<br/><br />{products: [ {id : 123 , name ='product1'}, {id: 214, name: 'product 2}], size: 'XL'}<br /><br />Is is possibleto create an index on the row's products' ids?<br />I know I have to use GIN index, but I don't know how to collectthe product ids out of the array in the create index command.<br /><br />Thanks in advance<br /><br />raph<br /></div>
Hi Raph, On Oct 25, 2013, at 5:17 AM, Raphael Bauduin wrote: > Hi, > > I have a table with a column of type json, in which arrays of products are stored in the form: > > {products: [ {id : 123 , name ='product1'}, {id: 214, name: 'product 2}], size: 'XL'} > > Is is possible to create an index on the row's products' ids? > I know I have to use GIN index, but I don't know how to collect the product ids out of the array in the create index command. So with 9.3 this is unfortunately still nontrivial to create an index on an array nested in a JSON object. GIN indexes arenot supported on JSON yet. If the ids were not in a nested array, you would be able to create an expression index using"json_extract_path_text" pointing to the ids you want to extract. However, another approach would be to pull the array out of the JSON data, extract the integer IDs and put them into an integerarray, and then put the integer array into a separate column. From there you could create a GIN index on the integerarray column. Best, Jonathan
Thanks for your answer, it got me thinking about writing a function to extract the product ids and use it in an index. Here's the current result:
CREATE OR REPLACE FUNCTION product_ids(products json) RETURNS integer[] AS $$
select array_agg(value#>>'{id}')::integer[] from json_array_elements(products);
$$ LANGUAGE SQL IMMUTABLE;
CREATE index product_ids_index on events_2012_05 USING GIN (product_ids(event->'products'));
This allows me to issue a query like this to know which users have looked at a certain product id
select user_id from events_2012_05 where product_ids(event->'products')@>ARRAY[545932]
resulting in this query plan
QUERY PLAN
-----------------------------------------------------------------------------------------
Bitmap Heap Scan on events_2012_05 (cost=63.55..18103.42 rows=5071 width=4)
Recheck Cond: (product_ids((event -> 'products'::text)) @> '{545932}'::integer[])
-> Bitmap Index Scan on product_ids_index (cost=0.00..62.28 rows=5071 width=0)
Index Cond: (product_ids((event -> 'products'::text)) @> '{545932}'::integer[])
Cheers
raph
CREATE OR REPLACE FUNCTION product_ids(products json) RETURNS integer[] AS $$
select array_agg(value#>>'{id}')::integer[] from json_array_elements(products);
$$ LANGUAGE SQL IMMUTABLE;
CREATE index product_ids_index on events_2012_05 USING GIN (product_ids(event->'products'));
This allows me to issue a query like this to know which users have looked at a certain product id
select user_id from events_2012_05 where product_ids(event->'products')@>ARRAY[545932]
resulting in this query plan
QUERY PLAN
-----------------------------------------------------------------------------------------
Bitmap Heap Scan on events_2012_05 (cost=63.55..18103.42 rows=5071 width=4)
Recheck Cond: (product_ids((event -> 'products'::text)) @> '{545932}'::integer[])
-> Bitmap Index Scan on product_ids_index (cost=0.00..62.28 rows=5071 width=0)
Index Cond: (product_ids((event -> 'products'::text)) @> '{545932}'::integer[])
Cheers
raph
On Fri, Oct 25, 2013 at 4:21 PM, Jonathan S. Katz <jonathan.katz@excoventures.com> wrote:
Hi Raph,So with 9.3 this is unfortunately still nontrivial to create an index on an array nested in a JSON object. GIN indexes are not supported on JSON yet. If the ids were not in a nested array, you would be able to create an expression index using "json_extract_path_text" pointing to the ids you want to extract.
On Oct 25, 2013, at 5:17 AM, Raphael Bauduin wrote:
> Hi,
>
> I have a table with a column of type json, in which arrays of products are stored in the form:
>
> {products: [ {id : 123 , name ='product1'}, {id: 214, name: 'product 2}], size: 'XL'}
>
> Is is possible to create an index on the row's products' ids?
> I know I have to use GIN index, but I don't know how to collect the product ids out of the array in the create index command.
However, another approach would be to pull the array out of the JSON data, extract the integer IDs and put them into an integer array, and then put the integer array into a separate column. From there you could create a GIN index on the integer array column.
Best,
Jonathan
--
Web database: http://www.myowndb.com
Free Software Developers Meeting: http://www.fosdem.org