Query performance issue - Mailing list pgsql-general

From
Subject Query performance issue
Date
Msg-id 00e001c32940$227effe0$2766f30a@development.greatgulfhomes.com
Whole thread Raw
List pgsql-general
This query runs fast:
SELECT  supplier_id, supplier_id AS aging__item_id, supplier_name AS
aging__item_name
FROM suppliers AS supp
WHERE supp.division_id = 'GGH'
AND supp.supplier_id IN (SELECT DISTINCT master__dt.assigned_supplier_id
FROM deficiency_table AS master__dt WHERE master__dt.deficiency_status_id IN
('LEGAL', 'OPEN', 'PENDIN', 'REOPEN') )
ORDER BY aging__item_name

NOTICE:  QUERY PLAN:
Sort  (cost=111725225.41..111725225.41 rows=2737 width=58)
  ->  Seq Scan on suppliers supp  (cost=0.00..111725069.18 rows=2737
width=58)
        SubPlan
          ->  Materialize  (cost=5348.15..5348.15 rows=308 width=4)
                ->  Unique  (cost=5340.44..5348.15 rows=308 width=4)
                      ->  Sort  (cost=5340.44..5340.44 rows=3083 width=4)
                            ->  Seq Scan on deficiency_table master__dt
(cost=0.00..5161.78 rows=3083 width=4)

And so does the (equivalent) nested one:
SELECT count(dt.lot_id) AS def_count_inner
 FROM deficiency_table AS dt
 WHERE dt.deficiency_status_id IN ('LEGAL', 'OPEN', 'REOPEN')
     AND dt.assigned_supplier_id = '100001'

NOTICE:  QUERY PLAN:
Aggregate  (cost=5161.82..5161.82 rows=1 width=4)
  ->  Seq Scan on deficiency_table dt  (cost=0.00..5161.78 rows=14 width=4)


But when I put the nested one inside, it takes a long time to run.  The
query takes a long time to run:
SELECT  supplier_id, supplier_id AS aging__item_id, supplier_name AS
aging__item_name
, (SELECT count(dt.lot_id) AS def_count_inner
 FROM deficiency_table AS dt
 WHERE dt.deficiency_status_id IN ('LEGAL', 'OPEN', 'REOPEN')
     AND dt.assigned_supplier_id = supp.supplier_id
) AS def_count
FROM suppliers AS supp
WHERE supp.division_id = 'GGH'
AND supp.supplier_id IN (SELECT DISTINCT master__dt.assigned_supplier_id
FROM deficiency_table AS master__dt WHERE master__dt.deficiency_status_id IN
('LEGAL', 'OPEN', 'PENDIN', 'REOPEN') )
ORDER BY aging__item_name

NOTICE:  QUERY PLAN:
Sort  (cost=111725225.41..111725225.41 rows=2737 width=58)
  ->  Seq Scan on suppliers supp  (cost=0.00..111725069.18 rows=2737
width=58)
        SubPlan
          ->  Aggregate  (cost=5161.86..5161.86 rows=1 width=4)
                ->  Seq Scan on deficiency_table dt  (cost=0.00..5161.78
rows=32 width=4)
          ->  Materialize  (cost=5348.15..5348.15 rows=308 width=4)
                ->  Unique  (cost=5340.44..5348.15 rows=308 width=4)
                      ->  Sort  (cost=5340.44..5340.44 rows=3083 width=4)
                            ->  Seq Scan on deficiency_table master__dt
(cost=0.00..5161.78 rows=3083 width=4)

PLEASE NOTE:
I have even tried changing the outer WHERE clause from:
AND supp.supplier_id IN (SELECT DISTINCT master__dt.assigned_supplier_id
FROM deficiency_table AS master__dt WHERE master__dt.deficiency_status_id IN
('LEGAL', 'OPEN', 'PENDIN', 'REOPEN') )
To:
AND supp.supplier_id IN ('0', '100001', '100002', '100004', '100007',
'100018', '100040', '100070', '100130', '100177', '100223', '100284',
'100315', '100350', '100360', '100380', '100392', '100422', '100535',
'100927', '101311', '101359', '101420', '101665', '101686', '101690',
'101711', '101908', '101945', '102014', '102319', '102420', '102425',
'102498', '102542', '102593', '102672', '102683', '102726', '102730',
'102755', '102795', '102888', '102890', '102951', '103000', '103060',
'103163', '103349', '103350', '103361', '103512', '103526', '103694',
'103708', '103710', '103790', '103832', '103945', '103976', '104023',
'104039', '104206', '104289', '104314', '104350', '104359', '104364',
'104478', '104483', '104519', '104521', '104589', '104621', '104633',
'104636', '104642', '104646', '104659', '104662', '104669', '104705',
'104710', '104714', '104722', '104866', '104914', '105061', '105225',
'105233', '105462', '')

Which gives:
NOTICE:  QUERY PLAN:
Sort  (cost=519.17..519.17 rows=24 width=58)
  ->  Index Scan using suppliers_pkey, suppliers_pkey, suppliers_pkey,
suppliers
_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey,
suppliers
_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey,
suppliers
_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey,
suppliers
_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey,
suppliers
_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey,
suppliers
_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey,
suppliers
_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey,
suppliers
_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey,
suppliers
_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey,
suppliers
_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey,
suppliers
_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey,
suppliers
_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey,
suppliers
_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey,
suppliers
_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey,
suppliers
_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey,
suppliers
_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey,
suppliers
_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey, suppliers_pkey,
suppliers
_pkey, suppliers_pkey, suppliers_pkey on suppliers supp  (cost=0.00..518.63
rows
=24 width=58)
        SubPlan
          ->  Aggregate  (cost=5161.86..5161.86 rows=1 width=4)
                ->  Seq Scan on deficiency_table dt  (cost=0.00..5161.78
rows=32 width=4)

SELECT  supplier_id, supplier_id AS aging__item_id, supplier_name AS
aging__item_name
, (SELECT count(dt.lot_id) AS def_count_inner
 FROM deficiency_table AS dt
 WHERE dt.deficiency_status_id IN ('LEGAL', 'OPEN', 'REOPEN')
     AND dt.assigned_supplier_id = supp.supplier_id
) AS def_count
FROM suppliers AS supp
WHERE supp.division_id = 'GGH'
AND supp.supplier_id IN ('100001', '100002', '100004', '100007', '100018',
'100040', '100070', '100130', '100177', '100223', '100284', '100315',
'100350', '100360', '100380', '100392', '100422', '100535', '100927',
'101311', '101359', '101420', '101665', '101686', '101690', '101711',
'101908', '101945', '102014', '102319', '102420', '102425', '102498',
'102542', '102593', '102672', '102683', '102726', '102730', '102755',
'102795', '102888', '102890', '102951', '103000', '103060', '103163',
'103349', '103350', '103361', '103512', '103526', '103694', '103708',
'103710', '103790', '103832', '103945', '103976', '104023', '104039',
'104206', '104289', '104314', '104350', '104359', '104364', '104478',
'104483', '104519', '104521', '104589', '104621', '104633', '104636',
'104642', '104646', '104659', '104662', '104669', '104705', '104710',
'104714', '104722', '104866', '104914', '105061', '105225', '105233',
'105462')
ORDER BY aging__item_name

Which is also very long to run, only a very slight time saving if any.

THE BIG PROBLEM:  The above query only has 1 nested clause for the value
"def_count_inner".  My real query has numerous such clauses, as it is a
breakdown report, and I need one clause for each of < 30 days, 30-60 days,
60-90, over 90, etc.


I realize that my nested query does take a 100-400 milliseconds to run, and
there are about 90 suppliers returned.  So 30 second response time is not
unreasonable, but my response time seems to be almost 2 minutes.

Can anyone help me with optimizing this query?

Is there a better way to write this query?

Postgres 7.2.4 on linux.

Thanks

Terry Fielder
Manager Software Development and Deployment
Great Gulf Homes / Ashton Woods Homes
terry@greatgulfhomes.com
Fax: (416) 441-9085


pgsql-general by date:

Previous
From: "Rod Cope"
Date:
Subject: [ANN] PostgreSQL included in Out-of-the-Box Open Source distribution
Next
From: Thomas Kellerer
Date:
Subject: Re: Powerbuilder with Postgre