Query optimization help - Mailing list pgsql-performance

From Szymon Kosok
Subject Query optimization help
Date
Msg-id CAOJGGcF7uwUjwxYeUv-8nd351SmkJ=nUFL+eYAZ=QtB7dYk7CQ@mail.gmail.com
Whole thread Raw
Responses Re: Query optimization help
List pgsql-performance
Hello,

I asked that question on StackOverflow, but didn't get any valuable
response, so I'll ask it here. :)

I have such query:

SELECT "spoleczniak_tablica"."id", "spoleczniak_tablica"."postac_id",
"spoleczniak_tablica"."hash", "spoleczniak_tablica"."typ",
"spoleczniak_tablica"."ikona", "spoleczniak_tablica"."opis",
"spoleczniak_tablica"."cel", "spoleczniak_tablica"."data",
"postac_postacie"."id",
"postac_postacie"."user_id", "postac_postacie"."avatar",
"postac_postacie"."ikonka",
"postac_postacie"."imie", "postac_postacie"."nazwisko",
"postac_postacie"."pseudonim",
"postac_postacie"."plec", "postac_postacie"."wzrost", "postac_postacie"."waga",
"postac_postacie"."ur_tydz", "postac_postacie"."ur_rok",
"postac_postacie"."ur_miasto_id",
"postac_postacie"."akt_miasto_id", "postac_postacie"."kasa",
"postac_postacie"."punkty",
"postac_postacie"."zmeczenie", "postac_postacie"."zdrowie",
"postac_postacie"."kariera"
FROM "spoleczniak_tablica" INNER JOIN "postac_postacie" ON
("spoleczniak_tablica"."postac_id" = "postac_postacie"."id") WHERE
spoleczniak_tablica.postac_id = 1 or spoleczniak_tablica.id in(select
wpis_id from
spoleczniak_oznaczone where etykieta_id in(select tag_id from
spoleczniak_subskrypcje where
postac_id = 1)) or (spoleczniak_tablica.postac_id in(select obserwowany_id from
spoleczniak_obserwatorium where obserwujacy_id = 1) and hash not
in('dyskusja', 'kochanie',
'szturniecie')) or (spoleczniak_tablica.cel = 1 and
spoleczniak_tablica.hash in('dyskusja',
'kochanie', 'obserwatorium', 'szturchniecie')) or spoleczniak_tablica.hash =
'administracja-info' or exists(select 1 from spoleczniak_komentarze
where kredka_id =
spoleczniak_tablica.id and postac_id = 1) ORDER BY
"spoleczniak_tablica"."id" DESC LIMIT
21;

and it's real performance bottleneck for us. It's one of the most
often executed query on our site.

Here is EXPLAIN ANALYZE:

 Limit  (cost=52.69..185979.44 rows=21 width=283) (actual
time=5.981..149.110 rows=21 loops=1)
   ->  Nested Loop  (cost=52.69..27867127142.57 rows=3147528
width=283) (actual time=5.981..149.103 rows=21 loops=1)
         ->  Index Scan Backward using spoleczniak_tablica_pkey on
spoleczniak_tablica  (cost=52.69..27866103743.37 rows=3147528
width=194) (actual time=5.971..148.963 rows=21 loops=1)
               Filter: ((postac_id = 1) OR (SubPlan 1) OR ((hashed
SubPlan 2) AND ((hash)::text <> ALL
('{dyskusja,kochanie,szturniecie}'::text[]))) OR ((cel = 1) AND
((hash)::text = ANY
('{dyskusja,kochanie,obserwatorium,szturchniecie}'::text[]))) OR
((hash)::text = 'administracja-info'::text) OR (alternatives: SubPlan
3 or hashed SubPlan 4))
               SubPlan 1
                 ->  Materialize  (cost=13.28..11947.85 rows=1264420
width=4) (actual time=0.000..0.024 rows=485 loops=2137)
                       ->  Nested Loop  (cost=13.28..685.75
rows=1264420 width=4) (actual time=0.119..0.664 rows=485 loops=1)
                             ->  HashAggregate  (cost=5.89..5.90
rows=1 width=4) (actual time=0.015..0.017 rows=7 loops=1)
                                   ->  Index Scan using
spoleczniak_subskrypcje_postac_id on spoleczniak_subskrypcje
(cost=0.00..5.89 rows=2 width=4) (actual time=0.005..0.009 rows=7
loops=1)
                                         Index Cond: (postac_id = 1)
                             ->  Bitmap Heap Scan on
spoleczniak_oznaczone  (cost=7.38..674.96 rows=391 width=8) (actual
time=0.019..0.082 rows=69 loops=7)
                                   Recheck Cond: (etykieta_id =
spoleczniak_subskrypcje.tag_id)
                                   ->  Bitmap Index Scan on
spoleczniak_oznaczone_etykieta_id  (cost=0.00..7.29 rows=391 width=0)
(actual time=0.013..0.013 rows=69 loops=7)
                                         Index Cond: (etykieta_id =
spoleczniak_subskrypcje.tag_id)
               SubPlan 2
                 ->  Index Scan using
spoleczniak_obserwatorium_obserwujacy_id on spoleczniak_obserwatorium
(cost=0.00..39.36 rows=21 width=4) (actual time=0.006..0.030 rows=26
loops=1)
                       Index Cond: (obserwujacy_id = 1)
               SubPlan 3
                 ->  Bitmap Heap Scan on spoleczniak_komentarze
(cost=18.67..20.68 rows=1 width=0) (never executed)
                       Recheck Cond: ((kredka_id =
spoleczniak_tablica.id) AND (postac_id = 1))
                       ->  BitmapAnd  (cost=18.67..18.67 rows=1
width=0) (never executed)
                             ->  Bitmap Index Scan on
spoleczniak_komentarze_kredka_id  (cost=0.00..2.98 rows=24 width=0)
(never executed)
                                   Index Cond: (kredka_id =
spoleczniak_tablica.id)
                             ->  Bitmap Index Scan on
spoleczniak_komentarze_postac_id  (cost=0.00..15.44 rows=890 width=0)
(never executed)
                                   Index Cond: (postac_id = 1)
               SubPlan 4
                 ->  Index Scan using spoleczniak_komentarze_postac_id
on spoleczniak_komentarze  (cost=0.00..1610.46 rows=890 width=4)
(actual time=0.013..2.983 rows=3605 loops=1)
                       Index Cond: (postac_id = 1)
         ->  Index Scan using postac_postacie_pkey on postac_postacie
(cost=0.00..0.31 rows=1 width=89) (actual time=0.004..0.005 rows=1
loops=21)
               Index Cond: (id = spoleczniak_tablica.postac_id)
 Total runtime: 149.211 ms (in rush hours runtime is ~600 ms)

If I delete ORDER BY clause, runtime is less than 30 ms. As you can
see - it's big table, more than 3 000 000 records. Any hints how to
optimize this query?

pgsql-performance by date:

Previous
From: Mark Kirkwood
Date:
Subject: Re: 8.4 optimization regression?
Next
From: Szymon Kosok
Date:
Subject: Query optimization help