hashjoin chosen over 1000x faster plan - Mailing list pgsql-performance

From Kevin Grittner
Subject hashjoin chosen over 1000x faster plan
Date
Msg-id 470B99BE.EE98.0025.0@wicourts.gov
Whole thread Raw
Responses Re: hashjoin chosen over 1000x faster plan  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-performance
I have a situation where a query is running much slower than I would
expect.  The ANALYZE showed that it is hashing some information which
is rarely needed.  When I set enable_hashjoin = off for the
connection the query run in 1/1000 the time.

This isn't a debilitating level of performance, but it would be nice
to clean it up, and we haven't yet come up with a viable solution.

The runs below are after several identical runs to create a fully
cached situation.  Autovacuum is aggressive and there is a nightly
vacuum analyze of the whole database.  This box has 4 x 2 GHz Xeon
CPUs, 6 GB RAM, RAID 5 with 13 spindles on 256 MB BBU controller.

I simplified the original a bit; sorry it's still kinda big.

-Kevin

listen_addresses = '*'
max_connections = 200
shared_buffers = 160MB
temp_buffers = 50MB
work_mem = 10MB
maintenance_work_mem = 160MB
max_fsm_pages = 800000
bgwriter_lru_percent = 20.0
bgwriter_lru_maxpages = 200
bgwriter_all_percent = 10.0
bgwriter_all_maxpages = 600
wal_buffers = 160kB
checkpoint_segments = 10
random_page_cost = 2.0
effective_cache_size = 5GB
redirect_stderr = on
log_line_prefix = '[%m] %p %q<%u %d %r> '
stats_block_level = on
stats_row_level = on
autovacuum = on
autovacuum_naptime = 10s
autovacuum_vacuum_threshold = 1
autovacuum_analyze_threshold = 1
datestyle = 'iso, mdy'
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
escape_string_warning = off
standard_conforming_strings = on
sql_inheritance = off

bigbird=> select version();
                                       version
-------------------------------------------------------------------------------------
 PostgreSQL 8.2.4 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.3 (SuSE Linux)
(1 row)

bigbird=> explain analyze
SELECT
    "CH"."caseNo",
    "CH"."countyNo",
    "CH"."chargeNo",
    "CH"."statuteCite",
    "CH"."sevClsCode",
    "CH"."modSevClsCode",
    "CH"."descr",
    "CH"."offenseDate",
    "CH"."pleaCode",
    "CH"."pleaDate",
    "CH"."chargeSeqNo",
    "CHST"."eventDate" AS "reopEventDate",
    "CTHE"."descr" AS "reopEventDescr"
  FROM "Charge" "CH"
  LEFT OUTER JOIN "CaseHist" "CHST"
      ON ( "CHST"."countyNo" = "CH"."countyNo"
       AND "CHST"."caseNo" = "CH"."caseNo"
       AND "CHST"."histSeqNo" = "CH"."reopHistSeqNo"
         )
  LEFT OUTER JOIN "CaseTypeHistEvent" "CTHE"
      ON ( "CHST"."eventType" = "CTHE"."eventType"
       AND "CHST"."caseType" = "CTHE"."caseType"
       AND "CHST"."countyNo" = "CTHE"."countyNo"
         )
  WHERE (
        ("CH"."caseNo" = '2005CF000001')
    AND ("CH"."countyNo" = 13))
  ORDER BY
    "chargeNo",
    "chargeSeqNo"
;
                                                                                               QUERY PLAN
                             

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=2554.50..2554.52 rows=7 width=146) (actual time=443.068..443.070 rows=3 loops=1)
   Sort Key: "CH"."chargeNo", "CH"."chargeSeqNo"
   ->  Hash Left Join  (cost=2318.91..2554.40 rows=7 width=146) (actual time=443.004..443.039 rows=3 loops=1)
         Hash Cond: ((("CHST"."eventType")::bpchar = ("CTHE"."eventType")::bpchar) AND (("CHST"."caseType")::bpchar =
("CTHE"."caseType")::bpchar))
         ->  Nested Loop Left Join  (cost=0.00..208.13 rows=7 width=131) (actual time=0.062..0.093 rows=3 loops=1)
               ->  Index Scan using "Charge_pkey" on "Charge" "CH"  (cost=0.00..15.37 rows=7 width=112) (actual
time=0.052..0.059rows=3 loops=1) 
                     Index Cond: ((("countyNo")::smallint = 13) AND (("caseNo")::bpchar = '2005CF000001'::bpchar))
               ->  Index Scan using "CaseHist_pkey" on "CaseHist" "CHST"  (cost=0.00..27.46 rows=6 width=41) (actual
time=0.002..0.002rows=0 loops=3) 
                     Index Cond: ((("CHST"."countyNo")::smallint = 13) AND (("CHST"."caseNo")::bpchar =
'2005CF000001'::bpchar)AND (("CHST"."histSeqNo")::smallint = ("CH"."reopHistSeqNo")::smallint)) 
         ->  Hash  (cost=2084.80..2084.80 rows=15607 width=98) (actual time=442.919..442.919 rows=15607 loops=1)
               ->  Subquery Scan "CTHE"  (cost=1630.43..2084.80 rows=15607 width=98) (actual time=331.665..411.390
rows=15607loops=1) 
                     ->  Merge Right Join  (cost=1630.43..1928.73 rows=15607 width=89) (actual time=331.661..391.999
rows=15607loops=1) 
                           Merge Cond: (((d."countyNo")::smallint = "inner"."?column9?") AND ((d."caseType")::bpchar =
"inner"."?column10?")AND ((d."eventType")::bpchar = "inner"."?column11?")) 
                           ->  Index Scan using "CaseTypeHistEventD_pkey" on "CaseTypeHistEventD" d  (cost=0.00..87.77
rows=2051width=21) (actual time=0.026..0.730 rows=434 loops=1) 
                           ->  Sort  (cost=1630.43..1669.45 rows=15607 width=76) (actual time=331.022..341.450
rows=15607loops=1) 
                                 Sort Key: (c."countyNo")::smallint, (b."caseType")::bpchar, (b."eventType")::bpchar
                                 ->  Nested Loop  (cost=0.00..543.41 rows=15607 width=76) (actual time=0.035..47.206
rows=15607loops=1) 
                                       ->  Index Scan using "ControlRecord_pkey" on "ControlRecord" c  (cost=0.00..4.27
rows=1width=2) (actual time=0.010..0.017 rows=1 loops=1) 
                                             Index Cond: (("countyNo")::smallint = 13)
                                       ->  Seq Scan on "CaseTypeHistEventB" b  (cost=0.00..383.07 rows=15607 width=74)
(actualtime=0.019..14.634 rows=15607 loops=1) 
 Total runtime: 444.452 ms
(21 rows)

bigbird=> set enable_hashjoin = off;
SET
bigbird=> [same query]
                                                                                                  QUERY PLAN
                             

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=3497.26..3497.28 rows=7 width=146) (actual time=0.115..0.117 rows=3 loops=1)
   Sort Key: "CH"."chargeNo", "CH"."chargeSeqNo"
   ->  Merge Left Join  (cost=3380.05..3497.17 rows=7 width=146) (actual time=0.091..0.097 rows=3 loops=1)
         Merge Cond: (("outer"."?column16?" = "inner"."?column5?") AND ("outer"."?column17?" = "inner"."?column6?"))
         ->  Sort  (cost=208.23..208.25 rows=7 width=131) (actual time=0.087..0.089 rows=3 loops=1)
               Sort Key: ("CHST"."caseType")::bpchar, ("CHST"."eventType")::bpchar
               ->  Nested Loop Left Join  (cost=0.00..208.13 rows=7 width=131) (actual time=0.053..0.070 rows=3
loops=1)
                     ->  Index Scan using "Charge_pkey" on "Charge" "CH"  (cost=0.00..15.37 rows=7 width=112) (actual
time=0.043..0.048rows=3 loops=1) 
                           Index Cond: ((("countyNo")::smallint = 13) AND (("caseNo")::bpchar =
'2005CF000001'::bpchar))
                     ->  Index Scan using "CaseHist_pkey" on "CaseHist" "CHST"  (cost=0.00..27.46 rows=6 width=41)
(actualtime=0.001..0.001 rows=0 loops=3) 
                           Index Cond: ((("CHST"."countyNo")::smallint = 13) AND (("CHST"."caseNo")::bpchar =
'2005CF000001'::bpchar)AND (("CHST"."histSeqNo")::smallint = ("CH"."reopHistSeqNo")::smallint)) 
         ->  Sort  (cost=3171.82..3210.84 rows=15607 width=98) (never executed)
               Sort Key: ("CTHE"."caseType")::bpchar, ("CTHE"."eventType")::bpchar
               ->  Subquery Scan "CTHE"  (cost=1630.43..2084.80 rows=15607 width=98) (never executed)
                     ->  Merge Right Join  (cost=1630.43..1928.73 rows=15607 width=89) (never executed)
                           Merge Cond: (((d."countyNo")::smallint = "inner"."?column9?") AND ((d."caseType")::bpchar =
"inner"."?column10?")AND ((d."eventType")::bpchar = "inner"."?column11?")) 
                           ->  Index Scan using "CaseTypeHistEventD_pkey" on "CaseTypeHistEventD" d  (cost=0.00..87.77
rows=2051width=21) (never executed) 
                           ->  Sort  (cost=1630.43..1669.45 rows=15607 width=76) (never executed)
                                 Sort Key: (c."countyNo")::smallint, (b."caseType")::bpchar, (b."eventType")::bpchar
                                 ->  Nested Loop  (cost=0.00..543.41 rows=15607 width=76) (never executed)
                                       ->  Index Scan using "ControlRecord_pkey" on "ControlRecord" c  (cost=0.00..4.27
rows=1width=2) (never executed) 
                                             Index Cond: (("countyNo")::smallint = 13)
                                       ->  Seq Scan on "CaseTypeHistEventB" b  (cost=0.00..383.07 rows=15607 width=74)
(neverexecuted) 
 Total runtime: 0.437 ms
(24 rows)

bigbird=> \d "Charge"
                Table "public.Charge"
       Column       |        Type         | Modifiers
--------------------+---------------------+-----------
 caseNo             | "CaseNoT"           | not null
 chargeSeqNo        | "ChargeSeqNoT"      | not null
 countyNo           | "CountyNoT"         | not null
 areSentCondsMet    | boolean             | not null
 caseType           | "CaseTypeT"         | not null
 chargeNo           | "ChargeNoT"         | not null
 descr              | "StatuteDescrT"     | not null
 isPartyTo          | boolean             | not null
 lastChargeModSeqNo | integer             | not null
 lastJdgmtSeqNo     | "JdgmtSeqNoT"       | not null
 ordStatuteFlag     | character(1)        | not null
 plntfAgencyNo      | "PlntfAgencyNoT"    | not null
 statuteAgencyNo    | "PlntfAgencyNoT"    | not null
 statuteCite        | "StatuteCiteT"      | not null
 statuteEffDate     | "DateT"             |
 arrestCaseNo       | "ArrestCaseNoT"     |
 arrestDate         | "DateT"             |
 arrestTrackingNo   | "ArrestTrackingNoT" |
 bookAgencyNo       | "IssAgencyNoT"      |
 bookCaseNo         | "BookCaseNoT"       |
 chargeId           | "ChargeIdT"         |
 dispoCode          | "DispoCodeT"        |
 issAgencyNo        | "IssAgencyNoT"      |
 modSevClsCode      | "SevClsCodeT"       |
 offenseDate        | "DateT"             |
 offenseDateRange   | "OffenseDateRangeT" |
 pleaCode           | "PleaCodeT"         |
 pleaDate           | "DateT"             |
 reopHistSeqNo      | "HistSeqNoT"        |
 sevClsCode         | "SevClsCodeT"       |
 statuteSevSeqNo    | "StatuteSevSeqNoT"  |
 wcisClsCode        | "WcisClsCodeT"      |
 pleaHistSeqNo      | "HistSeqNoT"        |
 chargeStatusCode   | "ChargeStatusCodeT" |
Indexes:
    "Charge_pkey" PRIMARY KEY, btree ("countyNo", "caseNo", "chargeSeqNo")
    "Charge_ArrestTrackingNo" UNIQUE, btree ("arrestTrackingNo", "countyNo", "caseNo", "chargeSeqNo")
    "Charge_OffenseDate" btree ("offenseDate", "countyNo", "issAgencyNo")

bigbird=> \d "CaseHist"
           Table "public.CaseHist"
    Column     |       Type       | Modifiers
---------------+------------------+-----------
 caseNo        | "CaseNoT"        | not null
 histSeqNo     | "HistSeqNoT"     | not null
 countyNo      | "CountyNoT"      | not null
 caseType      | "CaseTypeT"      |
 eventAmt      | "MoneyT"         |
 eventDate     | "DateT"          |
 eventType     | "EventTypeT"     |
 userId        | "UserIdT"        |
 courtRptrCode | "CtofcNoT"       |
 ctofcNo       | "CtofcNoT"       |
 dktTxt        | "TextT"          |
 prevRespCtofc | "CtofcNoT"       |
 tag           | "TagTypeT"       |
 tapeCounterNo | "TapeCounterNoT" |
 tapeLoc       | "TapeLocT"       |
 wcisReported  | "DateT"          |
 weightPd      | "PdCodeT"        |
 weightTime    | "CalDurationT"   |
 sealCtofcNo   | "CtofcNoT"       |
 sccaCaseNo    | "SccaCaseNoT"    |
Indexes:
    "CaseHist_pkey" PRIMARY KEY, btree ("countyNo", "caseNo", "histSeqNo")
    "CaseHist_CaseHistCibRpt" btree ("countyNo", "eventDate", "eventType", "caseType")

bigbird=> \d "CaseTypeHistEvent"
      View "public.CaseTypeHistEvent"
     Column     |     Type      | Modifiers
----------------+---------------+-----------
 caseType       | "CaseTypeT"   |
 eventType      | "EventTypeT"  |
 descr          | "EventDescrT" |
 isActive       | boolean       |
 isKeyEvent     | boolean       |
 isMoneyEnabled | boolean       |
 keyEventSeqNo  | integer       |
 countyNo       | "CountyNoT"   |
View definition:
 SELECT b."caseType", b."eventType", b.descr, b."isActive",
        CASE
            WHEN d."eventType" IS NOT NULL THEN d."isKeyEvent"
            ELSE b."isKeyEvent"
        END AS "isKeyEvent",
        CASE
            WHEN d."eventType" IS NOT NULL THEN d."isMoneyEnabled"
            ELSE b."isMoneyEnabled"
        END AS "isMoneyEnabled", COALESCE(
        CASE
            WHEN d."eventType" IS NOT NULL THEN d."keyEventSeqNo"::smallint
            ELSE b."keyEventSeqNo"::smallint
        END::integer, 0) AS "keyEventSeqNo", c."countyNo"
   FROM ONLY "CaseTypeHistEventB" b
   JOIN ONLY "ControlRecord" c ON 1 = 1
   LEFT JOIN ONLY "CaseTypeHistEventD" d ON d."caseType"::bpchar = b."caseType"::bpchar AND d."eventType"::bpchar =
b."eventType"::bpcharAND d."countyNo"::smallint = c."countyNo"::smallint; 

bigbird=> \d "CaseTypeHistEventB"
      Table "public.CaseTypeHistEventB"
     Column     |      Type      | Modifiers
----------------+----------------+-----------
 caseType       | "CaseTypeT"    | not null
 eventType      | "EventTypeT"   | not null
 descr          | "EventDescrT"  | not null
 isActive       | boolean        | not null
 isKeyEvent     | boolean        | not null
 isMoneyEnabled | boolean        | not null
 keyEventSeqNo  | "KeyEventSeqT" |
Indexes:
    "CaseTypeHistEventB_pkey" PRIMARY KEY, btree ("caseType", "eventType") CLUSTER

bigbird=> \d "CaseTypeHistEventD"
      Table "public.CaseTypeHistEventD"
     Column     |      Type      | Modifiers
----------------+----------------+-----------
 countyNo       | "CountyNoT"    | not null
 caseType       | "CaseTypeT"    | not null
 eventType      | "EventTypeT"   | not null
 isMoneyEnabled | boolean        | not null
 isKeyEvent     | boolean        | not null
 keyEventSeqNo  | "KeyEventSeqT" |
Indexes:
    "CaseTypeHistEventD_pkey" PRIMARY KEY, btree ("countyNo", "caseType", "eventType")
    "CaseTypeHistEventD_CaseType" btree ("caseType", "eventType")

bigbird=> select count(*), count("reopHistSeqNo") from "Charge";
  count   | count
----------+--------
 14041511 | 141720
(1 row)


pgsql-performance by date:

Previous
From: "Radhika S"
Date:
Subject: Postgres running Very slowly
Next
From: "Radhika S"
Date:
Subject: Shared Buffer setting in postgresql.conf