Re: [PERFORM] - Mailing list pgsql-performance
From | Pavel Stehule |
---|---|
Subject | Re: [PERFORM] |
Date | |
Msg-id | CAFj8pRBqvyQrEK-msP5DO7aame+Sfc_yCuxRUDQ+xxhXM3tdfQ@mail.gmail.com Whole thread Raw |
In response to | Re: [PERFORM] (Yevhenii Kurtov <yevhenii.kurtov@gmail.com>) |
Responses |
Re: [PERFORM]
|
List | pgsql-performance |
2017-06-28 9:28 GMT+02:00 Yevhenii Kurtov <yevhenii.kurtov@gmail.com>:
SELECT c0."id" FROM "campaign_jobs" AS c0
WHERE (((c0."status" = $1) AND NOT (c0."id" = ANY($2))))
Hello Pavel,Can you please give a tip how to rewrite the query with UNION clause? I didn't use it at all before actually and afraid that will not get it properly from the first time :)
WHERE (((c0."status" = $1) AND NOT (c0."id" = ANY($2))))
UNION SELECT c0."id" FROM "campaign_jobs" AS c0
WHERE ((c0."status" = $3) AND (c0."failed_at" > $4))
UNION SELECT c0."id" FROM "campaign_jobs" AS c0
WHERE ((c0."status" = $5) AND (c0."started_at" < $6))
ORDER BY c0."priority" DESC, c0."times_failed"
LIMIT $7
FOR UPDATE SKIP LOCKED
WHERE ((c0."status" = $5) AND (c0."started_at" < $6))
ORDER BY c0."priority" DESC, c0."times_failed"
LIMIT $7
FOR UPDATE SKIP LOCKED
Something like this
Pavel
On Wed, Jun 28, 2017 at 2:12 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:2017-06-28 8:47 GMT+02:00 Yevhenii Kurtov <yevhenii.kurtov@gmail.com>:Hello,
We have a query that is run almost each second and it's very important to squeeze every other ms out of it. The query is:
SELECT c0."id" FROM "campaign_jobs" AS c0
WHERE (((c0."status" = $1) AND NOT (c0."id" = ANY($2))))
OR ((c0."status" = $3) AND (c0."failed_at" > $4))
OR ((c0."status" = $5) AND (c0."started_at" < $6))
ORDER BY c0."priority" DESC, c0."times_failed"
LIMIT $7
FOR UPDATE SKIP LOCKEDI added following index:CREATE INDEX ON campaign_jobs(id, status, failed_at, started_at, priority DESC, times_failed);And it didn't help at all, even opposite - the planning phase time grew up from ~2ms up to ~40 ms leaving execution time intact:Limit (cost=29780.02..29781.27 rows=100 width=18) (actual time=827.753..828.113 rows=100 loops=1)-> LockRows (cost=29780.02..32279.42 rows=199952 width=18) (actual time=827.752..828.096 rows=100 loops=1)-> Sort (cost=29780.02..30279.90 rows=199952 width=18) (actual time=827.623..827.653 rows=100 loops=1)Sort Key: priority DESC, times_failedSort Method: external sort Disk: 5472kB-> Seq Scan on campaign_jobs c0 (cost=0.00..22138.00 rows=199952 width=18) (actual time=1.072..321.410 rows=200000 loops=1)Filter: (((status = 0) AND (id <> ALL ('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22, 23,24,25,26,27,28,29,30,31,32, 33,34,35,36,37,38,39,40,41,42, 43,44,45,46,47,48}'::integer[] ))) OR ((status = 2) AND (failed_at > '2017-06-22 03:18:09'::timestamp without time zone)) OR ((status = 1) AND (started_at < '2017-06-23 03:11:09'::timestamp without time zone))) Planning time: 40.734 msExecution time: 913.638 ms(9 rows)I see that query still went through the Seq Scan instead of Index Scan. Is it due to poorly crafted index or because of query structure? Is it possible to make this query faster?There are few issuesa) parametrized LIMITb) complex predicate with lot of ORc) slow external sortb) signalize maybe some strange in design .. try to replace "OR" by "UNION" queryc) if you can and you have good enough memory .. try to increase work_mem .. maybe 20MBif you change query to union queries, then you can use conditional indexescreate index(id) where status = 0;create index(failed_at) where status = 2;create index(started_at) where status = 1;RegardsPavelThanks
pgsql-performance by date: