Home > mailing lists

Re: Slow "not in array" operation - Mailing list pgsql-performance

From	Jeff Janes
Subject	Re: Slow "not in array" operation
Date	November 13, 2019 14:30:10
Msg-id	CAMkU=1xwG6OZd7tzi3+jCmt=QhCX6KTj4027iN7xAT42nA=LmA@mail.gmail.com Whole thread Raw
In response to	Re: Slow "not in array" operation (Marco Colli <collimarco91@gmail.com>)
Responses	Re: Slow "not in array" operation (Marco Colli <collimarco91@gmail.com>)
List	pgsql-performance

Tree view

On Wed, Nov 13, 2019 at 4:20 AM Marco Colli <collimarco91@gmail.com> wrote:

Replying to the previous questions:
- work_mem = 64MB (there are hundreds of connections)
- the project 123 has more than 7M records, and those that don't have the tag 'en' are 4.8M

What was the plan for the one that took 500ms?

This is the query / plan without the filter on tags:

SELECT COUNT(*) FROM "subscriptions" WHERE "subscriptions"."project_id" = 123 AND "subscriptions"."trashed_at" IS NULL;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=291342.67..291342.68 rows=1 width=8) (actual time=354.556..354.556 rows=1 loops=1)
-> Gather (cost=291342.05..291342.66 rows=6 width=8) (actual time=354.495..374.305 rows=7 loops=1)
Workers Planned: 6
Workers Launched: 6
-> Partial Aggregate (cost=290342.05..290342.06 rows=1 width=8) (actual time=349.799..349.799 rows=1 loops=7)
-> Parallel Index Only Scan using index_subscriptions_on_project_id_and_uid on subscriptions (cost=0.56..287610.27 rows=1092713 width=0) (actual time=0.083..273.018 rows=1030593 loops=7)
Index Cond: (project_id = 123)
Heap Fetches: 280849
Planning Time: 0.753 ms
Execution Time: 374.483 ms
(10 rows)

My previous comment about the bitmap index scan taking half the time was a slip of the eye, I was comparing *cost* of the bitmap index scan to the *time* of the overall plan. But then the question is, why isn't it doing an index-only scan on "index_subscriptions_on_project_id_and_tags"? And the answer is that is because it is a GIN index. Make the same index only as btree, and you should get good performance as it can filter the tags within a given project without visiting the table.

Cheers,

Jeff

pgsql-performance by date:

From: Rick Otten
Date: 13 November 2019, 14:18:16
Subject: Re: Slow "not in array" operation

From: Marco Colli
Date: 13 November 2019, 14:56:22
Subject: Re: Slow "not in array" operation

Re: Slow "not in array" operation - Mailing list pgsql-performance

Previous

Next