Re: Indexing on JSONB field not working - Mailing list pgsql-bugs

From Zhihong Zhang
Subject Re: Indexing on JSONB field not working
Date
Msg-id E3D6070E-72E0-4FE5-9E30-D114530F383F@gmail.com
Whole thread Raw
In response to Re: Indexing on JSONB field not working  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Indexing on JSONB field not working  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-bugs
Run those 2 EXPLAINs with seqscan off and on. See what difference it makes!


SET enable_seqscan = off;
explain analyze select id, _doc->>'floatValue' from assets where (_doc #> '{floatValue}'::text[])::double precision < 3.0 limit 100;

"Limit  (cost=0.43..313.25 rows=100 width=53) (actual time=0.092..0.236 rows=7 loops=1)"
"  ->  Index Scan using assets_floatvalue_idx on assets  (cost=0.43..2714072.57 rows=867607 width=53) (actual time=0.089..0.230 rows=7 loops=1)"
"        Index Cond: (((_doc #> '{floatValue}'::text[]))::double precision < '3'::double precision)"
"Planning Time: 1.617 ms"
"Execution Time: 0.276 ms"

SET enable_seqscan = on;
explain analyze select id, _doc->>'floatValue' from assets where (_doc #> '{floatValue}'::text[])::double precision < 3.0 limit 100;

"Limit  (cost=0.00..107.95 rows=100 width=53) (actual time=41021.311..313501.746 rows=7 loops=1)"
"  ->  Seq Scan on assets  (cost=0.00..936605.40 rows=867607 width=53) (actual time=41021.309..313501.732 rows=7 loops=1)"
"        Filter: (((_doc #> '{floatValue}'::text[]))::double precision < '3'::double precision)"
"        Rows Removed by Filter: 2602824"
"Planning Time: 0.283 ms"
"Execution Time: 313501.777 ms"



Zhihong

On Dec 20, 2019, at 5:30 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

On Fri, Dec 20, 2019 at 04:52:17PM -0500, Zhihong Zhang wrote:
I have an index on JSONB fields like this,



CREATE INDEX float_number_index_path2

  ON public.assets USING btree

  (((_doc #> '{floatValue}'::text[])::double precision) ASC NULLS LAST)

  TABLESPACE pg_default;



However query doesn't use it,



explain select id, _doc->>'floatValue' from assets where (_doc #>
'{floatValue}'::text[])::double precision < 3.0 limit 3;



Limit  (cost=0.00..3.24 rows=3 width=53)

->  Seq Scan on assets  (cost=0.00..936605.40 rows=867607 width=53)

      Filter: (((_doc #> '{floatValue}'::text[]))::double precision <
'3'::double precision)



The index scan is likely expected to be more expensive than the plain
sequential scan with the LIMIT interrupting it pretty much right away
(it's expected to scan only ~0.0003% of the table.

You can probably push the database to use the index by disabling
sequential scans, i.e.

 SET enable_seqscan = off;

and then doing the explain again.

The interesting question however is which of the plans is faster. It's
quite possible the database is making the right choice - index scans are
not necessarily faster.


The version of the database,



"PostgreSQL 11.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.2
20140120 (Red Hat 4.8.2-16), 64-bit"



However, the index works for text field in JSONB.


Well, the text field probably has different statistics, so the
sequential scan would have so scan much larger part of the table. Who
knows - you haven't shared the execution plans.



Let me know if I can provide more information.


Show us explain analyze for both queries, with both index-scan and
seq-scan (you'll have to use enable_seqscan and enable_indexscan to
force the plan choice).


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 

pgsql-bugs by date:

Previous
From: Zhihong Zhang
Date:
Subject: Re: Indexing on JSONB field not working
Next
From: Michael Paquier
Date:
Subject: Re: BUG #16161: pg_ctl stop fails sometimes (on Windows)