Re: hashjoin chosen over 1000x faster plan - Mailing list pgsql-performance
From | Kevin Grittner |
---|---|
Subject | Re: hashjoin chosen over 1000x faster plan |
Date | |
Msg-id | 470C9813.EE98.0025.0@wicourts.gov Whole thread Raw |
In response to | Re: hashjoin chosen over 1000x faster plan (Simon Riggs <simon@2ndquadrant.com>) |
Responses |
Re: hashjoin chosen over 1000x faster plan
|
List | pgsql-performance |
>>> On Wed, Oct 10, 2007 at 1:31 AM, in message <1191997904.4233.125.camel@ebony.site>, Simon Riggs <simon@2ndquadrant.com> wrote: > On Tue, 2007-10-09 at 15:09 -0500, Kevin Grittner wrote: > >> I have a situation where a query is running much slower than I would >> expect. The ANALYZE showed that it is hashing some information which >> is rarely needed. When I set enable_hashjoin = off for the >> connection the query run in 1/1000 the time. > > Can you confirm the two queries give identical outputs? I checked; the output is identical. > It isn't clear > to me why the second sort is (never executed) in your second plan, which > I would only expect to see for an inner merge join. I assume that is because there were no rows to sort. The CaseTypeHistEvent view is only needed if there is a link to an event which reopens the charge after it is disposed. This only happens for about 1% of the Charge records. The view is a bit weird, but evolved this way. Originally, there was a table by that name which was maintained statewide by our organization (the Consolidated Court Automation Programs, or CCAP). Then there was a decision to allow counties to override certain columns with their own values. Modifying the central copy and merging the changes into 72 modified copies was nasty, so we split the state-maintained portion and the county overrides into two tables with a B (for Base) and D (for Distributed) suffix, and provided a view to present the merged form to the existing queries. That way only the software to maintain the data needed to be modified, rather than all the references to it. There aren't a lot of rows in the distributed table; most counties take the defaults. The ControlRecord table is joined to the base to show one row of the base data per county in the database. This performance problem is on the central, consolidated copy of all 72 counties. > Can you show the details for ControlRecord also. bigbird=> \d "ControlRecord" Table "public.ControlRecord" Column | Type | Modifiers --------------------+------------------------+----------- countyNo | "CountyNoT" | not null dispEventTime | boolean | not null exportDeletes | boolean | not null standAloneMode | boolean | not null sysMailData | character(1) | not null chargeClsEvent | "EventTypeT" | checkPrinterDriver | character varying(50) | cofcCtofcNo | "CtofcNoT" | ctofcNo | "CtofcNoT" | defaultDaAttyNo | "AttyNoT" | districtNo | "DistrictNoT" | dktFee | "MoneyT" | dotCourtNo | character(8) | initialTrafCal | "ActivityTypeT" | maxToPrint | smallint | postJdgmtStatus | "StatusCodeT" | rcptPrinterDriver | character varying(50) | savedTxtFilePath | character varying(120) | scffAmt | "MoneyT" | scsfAmt | "MoneyT" | taxWarrantNo | "CountyNoT" | dorAgencyNo | character(10) | jurorMailerPrntDrv | character varying(50) | calKioskMessage | "TextT" | autoAssgnCaseEqual | boolean | not null sectionLimit | integer | not null sectionBufferLimit | integer | not null calKioskKeyboard | character(1) | saveCFRdoc | boolean | showAudioRecTab | boolean | weekdayStartTime | "TimeT" | weekdayEndTime | "TimeT" | saturdayStartTime | "TimeT" | saturdayEndTime | "TimeT" | sundayStartTime | "TimeT" | sundayEndTime | "TimeT" | reportStorageDays | integer | Indexes: "ControlRecord_pkey" PRIMARY KEY, btree ("countyNo") -Kevin
pgsql-performance by date: