How can I speed up this search? - Mailing list pgsql-general

From Oliver Elphick
Subject How can I speed up this search?
Date
Msg-id 1020782559.13683.2689.camel@linda
Whole thread Raw
Responses Re: How can I speed up this search?
List pgsql-general
I am searching on a FULL JOIN of two tables (using 7.2.1).  VACUUM
ANALYZE has been done.

If I search on either of them separately, the search uses the
appropriate index, but when they are joined, the second table uses a
sequential scan and takes far longer than (I think) it ought.  Can I
reshape the query or add additional indexes to improve the planner's
guess?

    bray=# explain analyze  SELECT '' AS van,
    bray-#        COALESCE(a.year, f.year) AS year,
    bray-#        COALESCE(a.month, f.month) AS month,
    bray-#        COALESCE(f.qty,0) AS fc_qty, COALESCE(a.qty,0) AS
    qty,        COALESCE(a.home_inv_qty,0) AS home_inv_qty,
    bray-#        COALESCE(a.exp_inv_qty,0) AS exp_inv_qty,
    bray-#        COALESCE(f.revenue,0) AS revenue,
    bray-#        COALESCE(a.ccy, 'GBP') AS ccy,
    bray-#        COALESCE(a.val, 0.) AS val,
    bray-#        COALESCE(a.home_inv_val, 0.) AS home_inv_val,
    bray-#        COALESCE(a.exp_inv_val, 0.) AS exp_inv_val
    bray-#   FROM stock_allocation AS a
    bray-#        FULL JOIN sales_forecast AS f
    bray-#          ON a.product = f.product AND
    bray-#             a.year = f.year AND
    bray-#             a.month = f.month AND
    bray-#             a.ccy = f.ccy
    bray-#   WHERE a.product = 'CC009' OR f.product = 'CC009';
    NOTICE:  QUERY PLAN:

    Merge Join  (cost=49104.90..51498.34 rows=241423 width=101) (actual
    time=31518.32..33565.86 rows=48 loops=1)
      ->  Index Scan using sales_forecast_pkey on sales_forecast f
    (cost=0.00..1543.00 rows=27528 width=40) (actual time=0.27..278.38
    rows=27528 loops=1)
      ->  Sort  (cost=49104.90..49104.90 rows=241423 width=61) (actual
    time=29576.37..30185.22 rows=241423 loops=1)
            ->  Seq Scan on stock_allocation a  (cost=0.00..5580.23
    rows=241423 width=61) (actual time=0.07..2772.26 rows=241423
    loops=1)
    Total runtime: 38289.59 msec

    EXPLAIN
    bray=# \d stock_allocation
                     Table "stock_allocation"
        Column    |         Type          |     Modifiers
    --------------+-----------------------+--------------------
     product      | character varying(10) | not null
     year         | smallint              | not null
     month        | smallint              | not null
     ccy          | character(3)          | not null
     qty          | integer               | not null default 0
     val          | numeric(12,2)         | default 0
     home_inv_qty | integer               | default 0
     home_inv_val | numeric(12,2)         | default 0
     exp_inv_qty  | integer               | default 0
     exp_inv_val  | numeric(12,2)         | default 0
    Indexes: product_currency_index
    Primary key: stock_allocation_pkey
    Check constraints: "month in range" (("month" >= 1) AND
    (((float8("year") < date_part('year'::text,
    ('now'::text)::timestamp(6) with time zone)) AND ("month" <= 12)) OR
    ((float8("year") = date_part('year'::text,
    ('now'::text)::timestamp(6) with time zone)) AND (float8("month") <=
    date_part('month'::text, ('now'::text)::timestamp(6) with time
    zone)))))
                       "year in range" (("year" > 1987) AND
    (float8("year") <= date_part('year'::text,
    ('now'::text)::timestamp(6) with time zone)))
    Triggers: RI_ConstraintTrigger_26246616,
              RI_ConstraintTrigger_26246610

    bray=# \d stock_allocation_pkey
      Index "stock_allocation_pkey"
     Column  |         Type
    ---------+-----------------------
     product | character varying(10)
     year    | smallint
     month   | smallint
     ccy     | character(3)
    unique btree (primary key)

    bray=# \d product_currency_index
     Index "product_currency_index"
     Column  |         Type
    ---------+-----------------------
     product | character varying(10)
     ccy     | character(3)
    btree


    bray=# explain analyze select * from stock_allocation where product
    = 'CC009';
    NOTICE:  QUERY PLAN:

    Index Scan using product_currency_index on stock_allocation
    (cost=0.00..204.90 rows=68 width=61) (actual time=0.43..1.33 rows=48
    loops=1)
    Total runtime: 1.54 msec

    EXPLAIN
    bray=# \d sales_forecast
               Table "sales_forecast"
     Column  |         Type          | Modifiers
    ---------+-----------------------+-----------
     product | character varying(10) | not null
     year    | integer               | not null
     month   | integer               | not null
     ccy     | character(3)          | not null
     qty     | integer               | not null
     revenue | numeric(12,2)         | not null
    Primary key: sales_forecast_pkey
    Check constraints: "must be sterling" (ccy = 'GBP'::bpchar)
                       "month in range" (("month" >= 1) AND ("month" <=
    12))
                       "year in range" (("year" > 2001) AND
    (float8("year") < (date_part('year'::text, date('now'::text)) + 3)))
    Triggers: RI_ConstraintTrigger_26246754

    bray=# \d sales_forecast_pkey
       Index "sales_forecast_pkey"
     Column  |         Type
    ---------+-----------------------
     product | character varying(10)
     year    | integer
     month   | integer
     ccy     | character(3)
    unique btree (primary key)


    bray=# explain select * from sales_forecast where product = 'CC009';
    NOTICE:  QUERY PLAN:

    Index Scan using sales_forecast_pkey on sales_forecast
    (cost=0.00..92.57 rows=25 width=40)

    EXPLAIN


Forcing use of the index in the join does not help; it takes slightly
longer, so the planner is correct.  With ENABLE_SEQSCAN off:

    NOTICE:  QUERY PLAN:

    Merge Join  (cost=544454.29..546847.73 rows=241423 width=101)
    (actual time=31523.99..33582.48 rows=48 loops=1)
      ->  Index Scan using sales_forecast_pkey on sales_forecast f
    (cost=0.00..1543.00 rows=27528 width=40) (actual time=0.26..283.75
    rows=27528 loops=1)
      ->  Sort  (cost=544454.29..544454.29 rows=241423 width=61) (actual
    time=29593.48..30177.42 rows=241423 loops=1)
            ->  Index Scan using product_currency_index on
    stock_allocation a  (cost=0.00..500929.62 rows=241423 width=61)
    (actual time=35.60..3539.31 rows=241423 loops=1)
    Total runtime: 38485.57 msec

    EXPLAIN

It seems to me that something is being done in the wrong order, since
there are actually 25 rows to be joined with 48, but the planner is
expecting to join 27528 rows with 241423.  Should it not plan to do the
selection before doing the join?  Or what am I missing?

--
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight                              http://www.lfix.co.uk/oliver
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839  932A 614D 4C34 3E1D 0C1C

     "Dearly beloved, avenge not yourselves, but rather give
      place unto wrath. For it is written, Vengeance is
      mine; I will repay, saith the Lord. Therefore if thine
      enemy hunger, feed him; if he thirst, give him drink;
      for in so doing thou shalt heap coals of fire on his
      head. Be not overcome of evil, but overcome evil with
      good."      Romans 12:19-21

Attachment

pgsql-general by date:

Previous
From: Scott Marlowe
Date:
Subject: Re: Subject: bool / vacuum full bug followup part 2
Next
From: Tom Lane
Date:
Subject: Re: Error with vacuum