Thread: BUG #18939: GIN index not used when it's expected to be

BUG #18939: GIN index not used when it's expected to be

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      18939
Logged by:          Mateusz Rzeczkowski
Email address:      mateusz.rzeczkowski96@gmail.com
PostgreSQL version: 16.8
Operating system:   Debian 12.2.0 locally and Ubuntu 7.5.0 on prod
Description:

Hi,
In my project we're using jsonb and GIN indexes very extensively. We've
started with version 15.4 and recently upgraded to 16.8. The behavior that
I'll describe also occurs in version 17.4.
I'm using the official Docker image - https://hub.docker.com/_/postgres
Pretty much all our queries involve filtering on jsonb and a timestamptz
column. This includes JOINs based on the contents of jsonb. I know that this
is sub-optimal but we've tested it in version 15.4 and were happy with the
performance we managed to get with some tweaks.
Recently we found that some of our queries have lackluster performance (>1s)
and I found that in version 15.4 the performance was much better (~200ms).
The performance got worse because the GIN index is not used in some
scenarios. Instead of it being used to perform the JOIN a `Join Filter`
operation was performed.
I've also played around with the query and found that when I remove a
json_path filter the GIN index is used to perform the JOIN. In queries when
we use >1 json_path filters on the same table the GIN index is not used for
JOIN.
I'm not sure if it's a bug but this behavior is a bit unexpected because it
seems like for queries like this the GIN index should be used. If this is
not a bug and this behavior is expected I'd like to ask for some guidance on
how we should move forward - right now we're considering using pg_hint_plan
to have more control over query plans.
SPECULATION: I did some digging and I think that this may be caused by this
change -
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=cd9479af2
I've "verified" this hypothesis by doing `SET cpu_operator_cost = 0.0;` and
this caused the query plan to go back to looking the same way as in version
15.4.
Here is an MRE that I've prepared. It relies on some random data so it's not
perfect but I hope it will be helpful. The random data may make it difficult
to reproduce what I've described so I've placed some comments in it
including query plans.
```sql
SELECT VERSION(); -- PostgreSQL 16.8 (Debian 16.8-1.pgdg120+1) on
x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
CREATE TABLE public.entities1 (
    "data" jsonb NOT NULL,
    sys_time timestamptz NULL
);
CREATE TABLE public.entities2 (
    "data" jsonb NOT NULL,
    sys_time timestamptz NULL
);
DO
$do$
BEGIN
    FOR i IN 1..2500 LOOP
        INSERT INTO public.entities1
        VALUES
        (('{' || '"test": ' || i || ' }')::jsonb, null);
        INSERT INTO public.entities2
        VALUES
        (('{' || '"test": ' || i || ' }')::jsonb, null);
    END LOOP;
END
$do$;
DO
$do$
BEGIN
    FOR i IN 1..10000 LOOP
        INSERT INTO public.entities1
        VALUES
        (('{' || '"test": ' || i || ' }')::jsonb, NOW() + (random() *
(NOW()+'365 days' - NOW())) + '30 days');
        INSERT INTO public.entities2
        VALUES
        (('{' || '"test": ' || i || ' }')::jsonb, NOW() + (random() *
(NOW()+'365 days' - NOW())) + '30 days');
    END LOOP;
END
$do$;
CREATE INDEX ix_entities1_data_sys_time_null ON ONLY public.entities1 USING
gin (data jsonb_path_ops) WHERE (sys_time IS NULL);
CREATE INDEX ix_entities1_sys_time ON ONLY public.entities1 USING btree
(sys_time NULLS FIRST);
CREATE INDEX ix_entities2_data_sys_time_null ON ONLY public.entities2 USING
gin (data jsonb_path_ops) WHERE (sys_time IS NULL);
CREATE INDEX ix_entities2_sys_time ON ONLY public.entities2 USING btree
(sys_time NULLS FIRST);
-- We found these settings to be good for our use-case when we were testing
performance with Postges 15.4
SET enable_nestloop TO FALSE;
SET enable_material TO FALSE;
SET enable_sort TO FALSE;
SET enable_seqscan TO FALSE;
-- #1. `Join Filter: (entities2.data @> jsonb_build_object('test',
(entities1.data ->> 'test'::text)))` instead of `Bitmap Index Scan on
ix_entities2_data_sys_time_null`
EXPLAIN (ANALYZE, BUFFERS)
 SELECT count(*) FROM (
 SELECT 1
 FROM public.entities1
 JOIN public.entities2 ON (public.entities2.data@>jsonb_build_object('test',
public.entities1.data->>'test'))
 WHERE (
 public.entities1.data @? '$ ? ($.test <= 5)'
 AND public.entities2.data @? '$ ? ($.test >= 1)')
 AND public.entities2.sys_time IS NULL
 AND public.entities1.sys_time IS NULL
 LIMIT 10001
 ) AS sub_query;
/*
Aggregate  (cost=10000000630.23..10000000630.24 rows=1 width=8) (actual
time=137.099..137.101 rows=1 loops=1)
  Buffers: shared hit=138
  ->  Limit  (cost=10000000094.70..10000000629.61 rows=50 width=4) (actual
time=137.025..137.027 rows=0 loops=1)
        Buffers: shared hit=138
        ->  Nested Loop  (cost=10000000094.70..10000000629.61 rows=50
width=4) (actual time=14.042..14.044 rows=0 loops=1)
              Join Filter: (entities2.data @> jsonb_build_object('test',
(entities1.data ->> 'test'::text)))
              Rows Removed by Join Filter: 12500
              Buffers: shared hit=138
              ->  Bitmap Heap Scan on entities1  (cost=47.04..180.29 rows=2
width=25) (actual time=0.312..3.313 rows=5 loops=1)
                    Recheck Cond: (sys_time IS NULL)
                    Filter: (data @? '$?($."test" <= 5)'::jsonpath)
                    Rows Removed by Filter: 2495
                    Heap Blocks: exact=19
                    Buffers: shared hit=23
                    ->  Bitmap Index Scan on ix_entities1_sys_time
(cost=0.00..47.03 rows=2500 width=0) (actual time=0.201..0.202 rows=2500
loops=1)
                          Index Cond: (sys_time IS NULL)
                          Buffers: shared hit=4
              ->  Bitmap Heap Scan on entities2  (cost=47.66..180.91
rows=2500 width=25) (actual time=0.058..0.756 rows=2500 loops=5)
                    Recheck Cond: (sys_time IS NULL)
                    Filter: (data @? '$?($."test" >= 1)'::jsonpath)
                    Heap Blocks: exact=95
                    Buffers: shared hit=115
                    ->  Bitmap Index Scan on ix_entities2_sys_time
(cost=0.00..47.03 rows=2500 width=0) (actual time=0.047..0.047 rows=2500
loops=5)
                          Index Cond: (sys_time IS NULL)
                          Buffers: shared hit=20
Planning:
  Buffers: shared hit=11
Planning Time: 5.408 ms
JIT:
  Functions: 15
  Options: Inlining true, Optimization true, Expressions true, Deforming
true
  Timing: Generation 16.175 ms, Inlining 33.073 ms, Optimization 50.498 ms,
Emission 39.428 ms, Total 139.173 ms
Execution Time: 154.334 ms
*/
-- SET cpu_operator_cost = 0.0; -- This in most cases forced GIN index usage
in version 16.8. In practice this causes our query plans to look the same
way as they did in version 15.4
/*
Aggregate  (cost=10000000355.88..10000000355.89 rows=1 width=8) (actual
time=47.668..47.669 rows=1 loops=1)
  Buffers: shared hit=43
  ->  Limit  (cost=10000000078.52..10000000355.38 rows=50 width=4) (actual
time=47.658..47.659 rows=0 loops=1)
        Buffers: shared hit=43
        ->  Nested Loop  (cost=10000000078.52..10000000355.38 rows=50
width=4) (actual time=0.963..0.965 rows=0 loops=1)
              Buffers: shared hit=43
              ->  Bitmap Heap Scan on entities1  (cost=40.50..167.50 rows=2
width=25) (actual time=0.068..0.609 rows=5 loops=1)
                    Recheck Cond: (sys_time IS NULL)
                    Filter: (data @? '$?($."test" <= 5)'::jsonpath)
                    Rows Removed by Filter: 2495
                    Heap Blocks: exact=19
                    Buffers: shared hit=23
                    ->  Bitmap Index Scan on ix_entities1_sys_time
(cost=0.00..40.50 rows=2500 width=0) (actual time=0.045..0.045 rows=2500
loops=1)
                          Index Cond: (sys_time IS NULL)
                          Buffers: shared hit=4
              ->  Bitmap Heap Scan on entities2  (cost=38.02..93.69 rows=25
width=25) (actual time=0.068..0.068 rows=0 loops=5)
                    Recheck Cond: ((data @> jsonb_build_object('test',
(entities1.data ->> 'test'::text))) AND (data @? '$?($."test" >=
1)'::jsonpath) AND (sys_time IS NULL))
                    Buffers: shared hit=20
                    ->  Bitmap Index Scan on ix_entities2_data_sys_time_null
(cost=0.00..38.02 rows=25 width=0) (actual time=0.064..0.064 rows=0 loops=5)
                          Index Cond: ((data @> jsonb_build_object('test',
(entities1.data ->> 'test'::text))) AND (data @? '$?($."test" >=
1)'::jsonpath))
                          Buffers: shared hit=20
Planning:
  Buffers: shared hit=11
Planning Time: 0.383 ms
JIT:
  Functions: 11
  Options: Inlining true, Optimization true, Expressions true, Deforming
true
  Timing: Generation 0.741 ms, Inlining 10.864 ms, Optimization 22.096 ms,
Emission 13.735 ms, Total 47.437 ms
Execution Time: 48.541 ms
*/
-- #2. `Bitmap Index Scan on ix_entities2_data_sys_time_null`. Less index
quals for `entities2`?
EXPLAIN (ANALYZE, BUFFERS)
 SELECT count(*) FROM (
 SELECT 1
 FROM public.entities1
 JOIN public.entities2 ON (public.entities2.data@>jsonb_build_object('test',
public.entities1.data->>'test'))
 WHERE (
 public.entities1.data @? '$ ? ($.test <= 5)')
 AND public.entities2.sys_time IS NULL
 AND public.entities1.sys_time IS NULL
 LIMIT 10001
 ) AS sub_query;
/*
Aggregate  (cost=10000000299.88..10000000299.89 rows=1 width=8) (actual
time=69.577..69.579 rows=1 loops=1)
  Buffers: shared hit=38
  ->  Limit  (cost=10000000050.52..10000000299.38 rows=50 width=4) (actual
time=69.561..69.562 rows=0 loops=1)
        Buffers: shared hit=38
        ->  Nested Loop  (cost=10000000050.52..10000000299.38 rows=50
width=4) (actual time=0.923..0.924 rows=0 loops=1)
              Buffers: shared hit=38
              ->  Bitmap Heap Scan on entities1  (cost=40.50..167.50 rows=2
width=25) (actual time=0.150..0.731 rows=5 loops=1)
                    Recheck Cond: (sys_time IS NULL)
                    Filter: (data @? '$?($."test" <= 5)'::jsonpath)
                    Rows Removed by Filter: 2495
                    Heap Blocks: exact=19
                    Buffers: shared hit=23
                    ->  Bitmap Index Scan on ix_entities1_sys_time
(cost=0.00..40.50 rows=2500 width=0) (actual time=0.110..0.110 rows=2500
loops=1)
                          Index Cond: (sys_time IS NULL)
                          Buffers: shared hit=4
              ->  Bitmap Heap Scan on entities2  (cost=10.03..65.69 rows=25
width=25) (actual time=0.036..0.036 rows=0 loops=5)
                    Recheck Cond: ((data @> jsonb_build_object('test',
(entities1.data ->> 'test'::text))) AND (sys_time IS NULL))
                    Buffers: shared hit=15
                    ->  Bitmap Index Scan on ix_entities2_data_sys_time_null
(cost=0.00..10.03 rows=25 width=0) (actual time=0.031..0.031 rows=0 loops=5)
                          Index Cond: (data @> jsonb_build_object('test',
(entities1.data ->> 'test'::text)))
                          Buffers: shared hit=15
Planning:
  Buffers: shared hit=11
Planning Time: 0.864 ms
JIT:
  Functions: 11
  Options: Inlining true, Optimization true, Expressions true, Deforming
true
  Timing: Generation 3.781 ms, Inlining 24.875 ms, Optimization 26.621 ms,
Emission 17.143 ms, Total 72.419 ms
Execution Time: 73.551 ms
*/
-- #3. In my original data by adding more conditions on `entities1` I was
able to cause a similar query plan like for query #1. In this MRE I didn't
manage to reproduce it. More index quals?
EXPLAIN (ANALYZE, BUFFERS)
 SELECT count(*) FROM (
 SELECT 1
 FROM public.entities1
 JOIN public.entities2 ON (public.entities2.data@>jsonb_build_object('test',
public.entities1.data->>'test'))
 WHERE (
 public.entities1.data @? '$ ? ($.test <= 5)'
 AND public.entities1.data @? '$ ? ($.test > 0)')
 AND public.entities2.sys_time IS NULL
 AND public.entities1.sys_time IS NULL
 LIMIT 10001
 ) AS sub_query;
-- #4. This yields a query plan that would be probably best for our
use-case. We'll probably move forward with pg_hint_plan.
CREATE EXTENSION pg_hint_plan;
/*+
BitMapScan(entities1 ix_entities1_data_sys_time_null)
BitMapScan(entities2 ix_entities2_data_sys_time_null)
*/
EXPLAIN (ANALYZE, BUFFERS)
 SELECT count(*) FROM (
 SELECT 1
 FROM public.entities1
 JOIN public.entities2 ON (public.entities2.data@>jsonb_build_object('test',
public.entities1.data->>'test'))
 WHERE (
 public.entities1.data @? '$ ? ($.test <= 5)'
 AND public.entities2.data @? '$ ? ($.test >= 1)')
 AND public.entities2.sys_time IS NULL
 AND public.entities1.sys_time IS NULL
 LIMIT 10001
 ) AS sub_query;
/*
Aggregate  (cost=10000025369.34..10000025369.35 rows=1 width=8) (actual
time=43.379..43.380 rows=1 loops=1)
  Buffers: shared hit=59
  ->  Limit  (cost=10000023486.04..10000025368.71 rows=50 width=4) (actual
time=43.372..43.373 rows=0 loops=1)
        Buffers: shared hit=59
        ->  Nested Loop  (cost=10000023486.04..10000025368.71 rows=50
width=4) (actual time=0.794..0.795 rows=0 loops=1)
              Buffers: shared hit=59
              ->  Bitmap Heap Scan on entities1  (cost=21722.88..21730.06
rows=2 width=25) (actual time=0.203..0.759 rows=5 loops=1)
                    Recheck Cond: ((data @? '$?($."test" <= 5)'::jsonpath)
AND (sys_time IS NULL))
                    Rows Removed by Index Recheck: 2495
                    Heap Blocks: exact=19
                    Buffers: shared hit=39
                    ->  Bitmap Index Scan on ix_entities1_data_sys_time_null
(cost=0.00..21722.88 rows=2 width=0) (actual time=0.181..0.181 rows=2500
loops=1)
                          Index Cond: (data @? '$?($."test" <=
5)'::jsonpath)
                          Buffers: shared hit=20
              ->  Bitmap Heap Scan on entities2  (cost=1763.16..1819.08
rows=25 width=25) (actual time=0.005..0.005 rows=0 loops=5)
                    Recheck Cond: ((data @> jsonb_build_object('test',
(entities1.data ->> 'test'::text))) AND (data @? '$?($."test" >=
1)'::jsonpath) AND (sys_time IS NULL))
                    Buffers: shared hit=20
                    ->  Bitmap Index Scan on ix_entities2_data_sys_time_null
(cost=0.00..1763.15 rows=25 width=0) (actual time=0.002..0.002 rows=0
loops=5)
                          Index Cond: ((data @> jsonb_build_object('test',
(entities1.data ->> 'test'::text))) AND (data @? '$?($."test" >=
1)'::jsonpath))
                          Buffers: shared hit=20
Planning:
  Buffers: shared hit=16
Planning Time: 0.327 ms
JIT:
  Functions: 9
  Options: Inlining true, Optimization true, Expressions true, Deforming
true
  Timing: Generation 0.468 ms, Inlining 8.909 ms, Optimization 21.110 ms,
Emission 12.556 ms, Total 43.044 ms
Execution Time: 43.920 ms
*/
```