Re: [GENERAL] Weird performance difference - Mailing list pgsql-general

From Israel Brewster
Subject Re: [GENERAL] Weird performance difference
Date
Msg-id 033D1F44-56D4-4149-B9A4-8B8A8DFE96A5@ravnalaska.net
Whole thread Raw
In response to Re: [GENERAL] Weird performance difference  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: [GENERAL] Weird performance difference
List pgsql-general
On Oct 20, 2017, at 3:31 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:

On Fri, Oct 20, 2017 at 03:08:26PM -0800, Israel Brewster wrote:
Summary: the following query takes around 12 seconds on my test machine. On my production machine, it's at half an hour and counting. What's going on?

which, when run on my test server, has this explain analyze output: https://explain.depesz.com/s/4piv <https://explain.depesz.com/s/4piv>. Around 12 second runtime, which isn't too bad (in the grand scheme of things), although there is probably room for improvement.

Are these cast to ::date cast is really needed (Alternately, do you have an index on column::date ?)
|WHERE outtime::date>='2017-01-01'
|ON outtime::date BETWEEN oag_schedules.startdate
|AND outtime::date BETWEEN oag_batches.eff_from

Possibly not, but outtime is a timestamp while startdate and eff_from are simple date fields. I was being explicit :-) and no, I don't have an index on column::date, that's certainly something I can try to optimize things.

The problem is clearly here:

Merge Join (cost=30,604.12..31,301.12 ROWS=1 width=76) (actual time=1,153.883..9,812.434 ROWS=3,420,235 loops=1)
   Merge Cond: ((((oag_schedules.flightnum)::text) = (legdetail.flightnum)::text) AND ((oag_schedules.origin)::text = (legdetail.legfrom)::text))
   Join Filter: (((legdetail.outtime)::date >= oag_schedules.startdate) AND ((legdetail.outtime)::date <= COALESCE(oag_schedules.enddate, 'infinity'::date)) AND (date_part('isodow'::text, ((legdetail.outtime)::date)::timestamp without time zone) = ANY ((oag_schedules.frequency)::double precision[])))
   ROWS REMOVED BY JOIN FILTER: 6822878

Can you send "explain" (not explain analyze) for the production server?

Not for the full query - it only just completed, after 70 minutes or so, and I wasn't running under EXPLAIN ANALYZE. Running with a shorter date range of only 7 days, as you suggest below: https://explain.depesz.com/s/r80j

I notice that this is a COMPLETELY different query plan from what I got on my test machine, but I'm not sure if that's a good thing or not. The end result is obviously not.


And \d for those tables.

Sure:

flightlogs=# \d legdetail
                                    Table "public.legdetail"
   Column   |           Type           |                       Modifiers                        
------------+--------------------------+--------------------------------------------------------
 id         | integer                  | not null default nextval('legdetail_id_seq'::regclass)
 logid      | integer                  | 
 flightnum  | character varying(32)    | 
 legfrom    | character varying(6)     | 
 legto      | character varying(6)     | 
 pax        | integer                  | default 0
 cargo      | integer                  | default 0
 legdate    | date                     | default now()
 outtime    | timestamp with time zone | not null
 ontime     | timestamp with time zone | 
 offtime    | timestamp with time zone | 
 intime     | timestamp with time zone | 
 blocktime  | interval                 | 
 flighttime | interval                 | 
 mail       | integer                  | default 0
 csom       | character(1)             | 
 nonrevpax  | smallint                 | 
 legsic     | character varying(128)   | 
 nonrevfrt  | integer                  | 
 ti         | boolean                  | default false
Indexes:
    "legdetail_pkey" PRIMARY KEY, btree (id)
    "csom_idx" btree (csom)
    "flightnum_idx" btree (flightnum)
    "legdate_idx" btree (legdate)
    "legfrom_idx" btree (legfrom)
    "legto_idx" btree (legto)
    "logid_idx" btree (logid)
    "outtime_idx" btree (outtime)
Foreign-key constraints:
    "logfk" FOREIGN KEY (logid) REFERENCES logs(id)
Triggers:
    legdetail_audit AFTER UPDATE ON legdetail FOR EACH ROW WHEN (old.* <> new.*) EXECUTE PROCEDURE changeaudit()
    update_hours_trigger AFTER INSERT OR DELETE OR UPDATE ON legdetail FOR EACH ROW EXECUTE PROCEDURE update_hours()
    update_logtime_trigger AFTER INSERT OR DELETE OR UPDATE ON legdetail FOR EACH ROW EXECUTE PROCEDURE update_logtime()
    update_status_legs AFTER INSERT OR DELETE OR UPDATE ON legdetail FOR EACH ROW EXECUTE PROCEDURE acstatupdate_loc()

flightlogs=# \d oag_batches
                     Foreign table "public.oag_batches"
  Column  |           Type           | Modifiers |       FDW Options        
----------+--------------------------+-----------+--------------------------
 batchid  | uuid                     | not null  | (column_name 'batchid')
 name     | character varying(256)   | not null  | (column_name 'name')
 created  | timestamp with time zone |           | (column_name 'created')
 eff_from | date                     |           | (column_name 'eff_from')
 eff_to   | date                     |           | (column_name 'eff_to')
Server: localhost
FDW Options: (schema_name 'public', table_name 'batches')

flightlogs=# \d oag_schedules
                       Foreign table "public.oag_schedules"
    Column    |          Type          | Modifiers |         FDW Options          
--------------+------------------------+-----------+------------------------------
 id           | uuid                   | not null  | (column_name 'id')
 batchid      | uuid                   |           | (column_name 'batchid')
 company      | character varying(3)   |           | (column_name 'company')
 flightnum    | integer                | not null  | (column_name 'flightnum')
 startdate    | date                   |           | (column_name 'startdate')
 enddate      | date                   |           | (column_name 'enddate')
 frequency    | integer[]              |           | (column_name 'frequency')
 origin       | character varying(3)   |           | (column_name 'origin')
 depart_time  | time without time zone |           | (column_name 'depart_time')
 destination  | character varying(3)   |           | (column_name 'destination')
 arrival_time | time without time zone |           | (column_name 'arrival_time')
 equipment    | character varying(3)   |           | (column_name 'equipment')
 flight_type  | character(1)           |           | (column_name 'flight_type')
 num_seats    | smallint               |           | (column_name 'num_seats')
Server: localhost
FDW Options: (schema_name 'public', table_name 'schedules')


And/or EXPLAIN ANALYZE for a query with shorter date range on production (to
confirm it has a similar problem in rowcount estimation).

You can try munging the query to move/change the "Join Filter" components of
the query to see which one is contributing most to the rowcount estimate being
off by a factor of 3e6.

Justin

pgsql-general by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: [GENERAL] Weird performance difference
Next
From: Tom Lane
Date:
Subject: Re: [GENERAL] Weird performance difference