Re: The nested view from hell - Restricting a subquerry - Mailing list pgsql-sql

From Nis Jørgensen
Subject Re: The nested view from hell - Restricting a subquerry
Date
Msg-id f7v6na$2gc$1@sea.gmane.org
Whole thread Raw
In response to The nested view from hell - Restricting a subquerry  (Bryce Nesbitt <bryce1@obviously.com>)
Responses Re: The nested view from hell - Restricting a subquerry  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-sql
Bryce Nesbitt skrev:
> I've got a legacy app with a hefty performance problem.  The basic
> problem is stupid design. It takes 10-15 seconds of CPU time to look up
> an invoice.
> Basically it's trying to mash up extra columns on an otherwise simple
> query, and those extra columns are subtotals.  Simplified (this looks
> best in a fixed width font):
> 
>     SELECT  max(order_view.order_id),max(order_view.invoice_id)
> ,sum(order_view.mileage)
>     FROM    (SELECT order_id,invoice_id, 0 as miles FROM eg_order
>              UNION   
>              SELECT order_id,0         ,      miles FROM eg_order_line)
>              order_view GROUP BY order_view.order_id;
> 
> A select by order_id is fast.  The problem is the application uses
> "select * from view where invoice_id=x", and the second part of the
> UNION returns all possible rows in the database.  These get filtered out
> later, but at considerable performance hit.

Just for the record, I believe your simplified example should look like
this (changed "max(order_id)" to "order_id" in outer select , changed
"miles" to "mileage"):
   SELECT  order_id,max(order_view.invoice_id),sum(order_view.mileage)   FROM    (SELECT order_id,invoice_id, 0 as
mileageFROM eg_order            UNION            SELECT order_id, 0, mileage FROM eg_order_line)            order_view
GROUPBY order_view.order_id;
 


It is pretty clear that the problem comes from joining on the result of
an aggregate. PG apparently is not smart enough to recognize that the
result of a max must be one of the values of the column (meaning that it
can use an index)

It is not clear whether there is a FK relation between eg_order and
eg_order_line and what the PK of eg_order is. If there is a FK, you can
do something along the lines of

SELECT  order_id,invoice_idCOALESCE(sum(mileage),0) as mileage   FROM    eg_order LEFT JOIN eg_order_line USING
order_id  GROUP BY order_id, invoice_id
 


If there can be more than one invoice_id per order_id, you might need to add

HAVING invoice_id = (SELECT max(invoice_id) FROM eg_order eo2 WHERE
eg_order.order_id = eo2.order_id)

or similar.

Hope this helps,

Nis



pgsql-sql by date:

Previous
From: Bryce Nesbitt
Date:
Subject: The nested view from hell - Restricting a subquerry
Next
From: Tom Lane
Date:
Subject: Re: The nested view from hell - Restricting a subquerry