Home > mailing lists

Re: Indexing on JSONB field not working - Mailing list pgsql-bugs

From	Tom Lane
Subject	Re: Indexing on JSONB field not working
Date	December 21, 2019 15:49:16
Msg-id	21855.1576943356@sss.pgh.pa.us Whole thread Raw
In response to	Re: Indexing on JSONB field not working (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses	Re: Indexing on JSONB field not working
List	pgsql-bugs

Tree view

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On Fri, Dec 20, 2019 at 08:15:39PM -0500, Zhihong Zhang wrote:
>> "Limit  (cost=0.43..313.25 rows=100 width=53) (actual time=0.092..0.236 rows=7 loops=1)"
>> "  ->  Index Scan using assets_floatvalue_idx on assets  (cost=0.43..2714072.57 rows=867607 width=53) (actual
time=0.089..0.230rows=7 loops=1)" 
>> "        Index Cond: (((_doc #> '{floatValue}'::text[]))::double precision < '3'::double precision)"

> Well, this confirms what I suspected before - the optimizer believes the
> seqscan plan is a bit cheaper compared to index scan (107 vs. 313) but
> that mismatches the actual performance. The question is why ...

The planner evidently believes that 867607 rows will match the query
condition, so it expects that the scan will stop (after collecting
100 rows) very quickly.  In reality only 7 rows match, so the scan
has to run to completion.  This is what's bollixing the plan choice.

I suspect that 867607 is just a default estimate, but if ANALYZE has
been run then there should be stats for the index column, so why isn't
it doing better?  When I try a similar case here, I get good estimates:

regression=# select version();
                                                 version
---------------------------------------------------------------------------------------------------------
 PostgreSQL 11.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23), 64-bit
(1 row)

regression=# create table public.assets(_doc jsonb);
CREATE TABLE
regression=# CREATE INDEX on assets (((_doc #> '{floatValue}'::text[])::double precision) ASC NULLS LAST);
CREATE INDEX
regression=# insert into assets select ('{"floatValue": ' || x || '}')::jsonb from generate_series(1,10000) x;
INSERT 0 10000
regression=# explain select _doc->>'floatValue' from assets where (_doc #> '{floatValue}'::text[])::double precision <
3.0limit 100; 
                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..7.71 rows=100 width=32)
   ->  Seq Scan on assets  (cost=0.00..293.44 rows=3808 width=32)
         Filter: (((_doc #> '{floatValue}'::text[]))::double precision < '3'::double precision)
(3 rows)

regression=# analyze assets;
ANALYZE
regression=# explain select _doc->>'floatValue' from assets where (_doc #> '{floatValue}'::text[])::double precision <
3.0limit 100; 
                                             QUERY PLAN
----------------------------------------------------------------------------------------------------
 Limit  (cost=0.29..8.33 rows=2 width=32)
   ->  Index Scan using assets_float8_idx on assets  (cost=0.29..8.33 rows=2 width=32)
         Index Cond: (((_doc #> '{floatValue}'::text[]))::double precision < '3'::double precision)
(3 rows)

The "3808" estimate is just a default for '<' with no stats, but
with stats I get a dead-on estimate.

            regards, tom lane

pgsql-bugs by date:

From: Jeff Janes
Date: 21 December 2019, 15:17:26
Subject: Re: Indexing on JSONB field not working

From: Bruce Momjian
Date: 21 December 2019, 18:00:06
Subject: Re: BUG #16151: startup timing problem

Re: Indexing on JSONB field not working - Mailing list pgsql-bugs

Previous

Next