Home > mailing lists

Re: JSONB index not in use, but is TOAST the real cause of slow query? - Mailing list pgsql-general

From	Shaheed Haque
Subject	Re: JSONB index not in use, but is TOAST the real cause of slow query?
Date	May 29, 2022 16:03:20
Msg-id	CAHAc2jf04e2kPXbYcNHgcb_K7=PBG-uexk9P8L=e_zv2va9+Qw@mail.gmail.com Whole thread
In response to	Re: JSONB index not in use, but is TOAST the real cause of slow query? (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-general

Tree view

On Sun, 29 May 2022, 15:58 Tom Lane, <tgl@sss.pgh.pa.us> wrote:

Shaheed Haque <shaheedhaque@gmail.com> writes:
> Unfortunately, the real query which I think should behave very
> similarly is still at the several-seconds level despite using the
> index. ...

> -> Bitmap Heap Scan on paiyroll_payrun (cost=26.88..30.91 rows=1
> width=4) (actual time=32.488..2258.891 rows=62 loops=1)
> Recheck Cond: ((company_id = 173) AND ((snapshot ->
> 'employee'::text) ? '16376'::text))
> Filter: (((snapshot #>
> '{employee,16376,last_run_of_employment}'::text[]) <> 'true'::jsonb)
> OR ((snapshot #> '{employee,16376,pay_graph}'::text[]) <> '0'::jsonb)
> OR ((snapshot #> '{employee,16376,state,employment,-1,2}'::text[]) <=
> '0'::jsonb))
> Heap Blocks: exact=5
> -> BitmapAnd (cost=26.88..26.88 rows=1 width=0) (actual
> time=0.038..0.039 rows=0 loops=1)
> -> Bitmap Index Scan on paiyroll_payrun_company_id_ce341888
> (cost=0.00..6.56 rows=304 width=0) (actual time=0.016..0.016 rows=304
> loops=1)
> Index Cond: (company_id = 173)
> -> Bitmap Index Scan on idx1 (cost=0.00..20.07 rows=9
> width=0) (actual time=0.021..0.021 rows=62 loops=1)
> Index Cond: ((snapshot -> 'employee'::text) ? '16376'::text)

> IIUC, at the bottom, the indices are doing their thing, but a couple
> of layers up, the "Bitmap Heap Scan" jumps from ~30ms to 2200ms. But I
> cannot quite see why.

I suppose it's the execution of that "Filter" condition, which will
require perhaps as many as three fetches of the "snapshot" column.

Thanks, that's clearly in the frame.

You really need to rethink that data structure. Sure, you can store tons
of unorganized data in a jsonb column, but you pay for that convenience
with slow access. Normalizing the bits you need frequently into a more
traditional relational schema is the route to better-performing queries.

Ack. Indeed, the current design works very well for all of the access patterns other than this one, which only recently came into view as a problem.

Ahead of contemplating a design change I have been looking at how to optimise this bit. I'm currently mired in a crash course on SQL syntax as pertains to JSONB, jsonpath et. al. And the equally mysterious side effects of "?" and "@>" and so on in terms of the amount of data being fetched etc. (and all wrapped in a dose of ORM for good measure).

I'll write separately with more specific questions if needed on those details.

Thanks again for the kind help.

Shaheed

regards, tom lane

pgsql-general by date:

From: Laurenz Albe
Date: 29 May 2022, 15:02:28
Subject: Re: autovacuum on primary blocking queries on replica?

From: Alastair McKinley
Date: 29 May 2022, 16:46:22
Subject: Function definition regression in 15beta1 when specific parameter name (string) is used

Re: JSONB index not in use, but is TOAST the real cause of slow query? - Mailing list pgsql-general

Previous

Next