Two fast queries get slow when combined - Mailing list pgsql-performance

From cluster
Subject Two fast queries get slow when combined
Date
Msg-id fg85ae$1740$1@news.hub.org
Whole thread Raw
Responses Re: Two fast queries get slow when combined
Re: Two fast queries get slow when combined
List pgsql-performance
I have two small queries which are both very fast to evaluate
separately. The first query, "Query 1", calculates some statistics and
the the second query, "Query 2", finds a subset of relevant keys.
When combined into a single query which calculates statistics from only
the subset of relevant keys the evaluation plan explodes and uses both
seq scans and bitmap heap scans.
How can I improve the performance of the combined query?

Queries and output from EXPLAIN ANALYZE can be seen here with some
syntax highlighting:
    http://rafb.net/p/BJIW4p69.html

I will also paste it here:

=============================================================================
QUERY 1 (very *fast*):
=============================================================================
SELECT keyId, count(1) as num_matches
FROM stats
GROUP BY keyId
LIMIT 50

  Limit  (cost=0.00..23.65 rows=50 width=8) (actual time=0.090..2.312
rows=50 loops=1)
    ->  GroupAggregate  (cost=0.00..4687.46 rows=9912 width=8) (actual
time=0.085..2.145 rows=50 loops=1)
          ->  Index Scan using stats_keyId on stats  (cost=0.00..3820.19
rows=99116 width=8) (actual time=0.031..1.016 rows=481 loops=1)
  Total runtime: 2.451 ms
(4 rows)


=============================================================================
QUERY 2 (very *fast*):
=============================================================================
SELECT keyId, sortNum
FROM items i
WHERE sortNum > 123
ORDER BY sortNum
LIMIT 50

  Limit  (cost=0.01..9.87 rows=50 width=8) (actual time=0.068..0.610
rows=50 loops=1)
    InitPlan
      ->  Limit  (cost=0.00..0.01 rows=1 width=0) (actual
time=0.009..0.025 rows=1 loops=1)
            ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual
time=0.006..0.007 rows=1 loops=1)
    ->  Index Scan using items_sortNum on items i  (cost=0.00..1053.67
rows=5344 width=8) (actual time=0.063..0.455 rows=50 loops=1)
          Index Cond: (sortNum >= $0)
  Total runtime: 0.749 ms
(7 rows)



=============================================================================
COMBINED QUERY (very *slow*):
=============================================================================
   SELECT keyId, sortNum, count(1)
   FROM stats s, items i
   WHERE s.keyId = i.keyId AND i.sortNum > 123
   GROUP BY i.keyId, i.sortNum
   ORDER BY i.sortNum
   LIMIT 50

Limit  (cost=3281.72..3281.84 rows=50 width=16) (actual
time=435.838..436.043 rows=50 loops=1)
    InitPlan
      ->  Limit  (cost=0.00..0.01 rows=1 width=0) (actual
time=0.016..0.021 rows=1 loops=1)
            ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual
time=0.012..0.013 rows=1 loops=1)
    ->  Sort  (cost=3281.71..3289.97 rows=3304 width=16) (actual
time=435.833..435.897 rows=50 loops=1)
          Sort Key: i.sortNum
          ->  Hash Join  (cost=2745.80..3088.59 rows=3304 width=16)
(actual time=364.247..413.164 rows=8490 loops=1)
                Hash Cond: (s.keyId = i.keyId)
                ->  HashAggregate  (cost=2270.53..2394.43 rows=9912
width=8) (actual time=337.869..356.533 rows=9911 loops=1)
                      ->  Seq Scan on items  (cost=0.00..1527.16
rows=99116 width=8) (actual time=0.016..148.118 rows=99116 loops=1)
                ->  Hash  (cost=408.47..408.47 rows=5344 width=12)
(actual time=26.342..26.342 rows=4491 loops=1)
                      ->  Bitmap Heap Scan on items i
(cost=121.67..408.47 rows=5344 width=12) (actual time=5.007..16.898
rows=4491 loops=1)
                            Recheck Cond: (sortNum >= $0)
                            ->  Bitmap Index Scan on items_sortNum
(cost=0.00..120.33 rows=5344 width=0) (actual time=4.273..4.273
rows=13375 loops=1)
                                  Index Cond: (sortNum >= $0)
Total runtime: 436.421 ms
(16 rows)

pgsql-performance by date:

Previous
From: "Steven Flatt"
Date:
Subject: Re: tables with 300+ partitions
Next
From: Heikki Linnakangas
Date:
Subject: Re: Two fast queries get slow when combined