Re: [GENERAL] a JOIN to a VIEW seems slow - Mailing list pgsql-general
From | Frank Millman |
---|---|
Subject | Re: [GENERAL] a JOIN to a VIEW seems slow |
Date | |
Msg-id | 65C13571D24B42A5A4044A0A3392B4CA@FrankLaptop Whole thread Raw |
In response to | Re: [GENERAL] a JOIN to a VIEW seems slow (Merlin Moncure <mmoncure@gmail.com>) |
List | pgsql-general |
Merlin Moncure wrote:
On Thu, Sep 21, 2017 at 12:02 PM, Frank Millman <frank@chagford.com> wrote:
>
> SELECT q.cust_row_id,
> SUM(CASE WHEN q.tran_date > '2015-08-31' THEN q.balance ELSE 0 END
> ) AS "balance_curr AS [DECTEXT]",
> SUM(CASE WHEN q.tran_date <= '2015-08-31' AND q.tran_date > '2015-07-31'
> THEN q.balance ELSE 0 END
> ) AS "balance_30 AS [DECTEXT]",
> SUM(CASE WHEN q.tran_date <= '2015-07-31' AND q.tran_date > '2015-06-30'
> THEN q.balance ELSE 0 END
> ) AS "balance_60 AS [DECTEXT]",
> SUM(CASE WHEN q.tran_date <= '2015-06-30' AND q.tran_date > '2015-05-31'
> THEN q.balance ELSE 0 END
> ) AS "balance_90 AS [DECTEXT]",
> SUM(CASE WHEN q.tran_date <= '2015-05-31' THEN q.balance ELSE 0 END
> ) AS "balance_120 AS [DECTEXT]"
> FROM
> (SELECT
> due_trans.cust_row_id,
> due_trans.tran_date,
> trans_due.amount_cust +
> COALESCE((SELECT SUM(trans_alloc.alloc_cust+trans_alloc.disc_cust)
> FROM prop.ar_trans_alloc trans_alloc
> LEFT JOIN prop.ar_trans alloc_trans ON
> alloc_trans.tran_type = trans_alloc.tran_type
> AND alloc_trans.tran_row_id = trans_alloc.tran_row_id
> WHERE trans_alloc.due_row_id = trans_due.row_id
> AND alloc_trans.tran_date <= '2015-09-30'
> ), 0)
> AS balance
> FROM prop.ar_trans_due trans_due
> LEFT JOIN prop.ar_trans due_trans ON
> due_trans.tran_type = trans_due.tran_type
> AND due_trans.tran_row_id = trans_due.tran_row_id
> WHERE due_trans.tran_date <= '2015-09-30'
> ) AS q
> GROUP BY q.cust_row_id
> ORDER BY q.cust_row_id;
> What is the performance with this portion simplified out?
> COALESCE((SELECT SUM(trans_alloc.alloc_cust+trans_alloc.disc_cust)
> FROM prop.ar_trans_alloc trans_alloc
> LEFT JOIN prop.ar_trans alloc_trans ON
> alloc_trans.tran_type = trans_alloc.tran_type
> AND alloc_trans.tran_row_id = trans_alloc.tran_row_id
> WHERE trans_alloc.due_row_id = trans_due.row_id
> AND alloc_trans.tran_date <= '2015-09-30'
> ), 0)
> Change that to just '0' and rerun the query. If timings are good, I
> think we want to explore converting this to LATERAL type join. I
> think (but am not sure) this is defeating the optimizer. Also, is
> this the actual query you want to run quickly? You are not filtering
> on cust_row_id?
>
> SELECT q.cust_row_id,
> SUM(CASE WHEN q.tran_date > '2015-08-31' THEN q.balance ELSE 0 END
> ) AS "balance_curr AS [DECTEXT]",
> SUM(CASE WHEN q.tran_date <= '2015-08-31' AND q.tran_date > '2015-07-31'
> THEN q.balance ELSE 0 END
> ) AS "balance_30 AS [DECTEXT]",
> SUM(CASE WHEN q.tran_date <= '2015-07-31' AND q.tran_date > '2015-06-30'
> THEN q.balance ELSE 0 END
> ) AS "balance_60 AS [DECTEXT]",
> SUM(CASE WHEN q.tran_date <= '2015-06-30' AND q.tran_date > '2015-05-31'
> THEN q.balance ELSE 0 END
> ) AS "balance_90 AS [DECTEXT]",
> SUM(CASE WHEN q.tran_date <= '2015-05-31' THEN q.balance ELSE 0 END
> ) AS "balance_120 AS [DECTEXT]"
> FROM
> (SELECT
> due_trans.cust_row_id,
> due_trans.tran_date,
> trans_due.amount_cust +
> COALESCE((SELECT SUM(trans_alloc.alloc_cust+trans_alloc.disc_cust)
> FROM prop.ar_trans_alloc trans_alloc
> LEFT JOIN prop.ar_trans alloc_trans ON
> alloc_trans.tran_type = trans_alloc.tran_type
> AND alloc_trans.tran_row_id = trans_alloc.tran_row_id
> WHERE trans_alloc.due_row_id = trans_due.row_id
> AND alloc_trans.tran_date <= '2015-09-30'
> ), 0)
> AS balance
> FROM prop.ar_trans_due trans_due
> LEFT JOIN prop.ar_trans due_trans ON
> due_trans.tran_type = trans_due.tran_type
> AND due_trans.tran_row_id = trans_due.tran_row_id
> WHERE due_trans.tran_date <= '2015-09-30'
> ) AS q
> GROUP BY q.cust_row_id
> ORDER BY q.cust_row_id;
> What is the performance with this portion simplified out?
> COALESCE((SELECT SUM(trans_alloc.alloc_cust+trans_alloc.disc_cust)
> FROM prop.ar_trans_alloc trans_alloc
> LEFT JOIN prop.ar_trans alloc_trans ON
> alloc_trans.tran_type = trans_alloc.tran_type
> AND alloc_trans.tran_row_id = trans_alloc.tran_row_id
> WHERE trans_alloc.due_row_id = trans_due.row_id
> AND alloc_trans.tran_date <= '2015-09-30'
> ), 0)
> Change that to just '0' and rerun the query. If timings are good, I
> think we want to explore converting this to LATERAL type join. I
> think (but am not sure) this is defeating the optimizer. Also, is
> this the actual query you want to run quickly? You are not filtering
> on cust_row_id?
It makes a big difference – the query runs in 0.18 seconds.
This query can be used to return the age analysis for a single debtor or for all debtors, so yes I would sometimes run it without filtering.
A couple of comments -
1. My app supports 3 databases, PostgreSQL, SqlServer, and SQLite3. I am trying to keep my SQL as generic as possible. However, if I have to use something that is PostgreSQL-specific, I may have to live with that.
2. This is probably irrelevant but here is the query plan that SQLite3 creates -
3|0|0|SCAN TABLE ar_tran_inv
4|0|0|SCAN TABLE ar_tran_crn
2|0|0|COMPOUND SUBQUERIES 3 AND 4 (UNION ALL)
5|0|0|SCAN TABLE ar_tran_rec
1|0|0|COMPOUND SUBQUERIES 2 AND 5 (UNION ALL)
0|0|0|SCAN TABLE ar_trans_due AS trans_due
0|1|1|SEARCH SUBQUERY 1 AS due_trans USING AUTOMATIC COVERING INDEX (tran_type=? AND tran_row_id=?)
0|0|0|USE TEMP B-TREE FOR GROUP BY
0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 6
9|0|0|SCAN TABLE ar_tran_inv
10|0|0|SCAN TABLE ar_tran_crn
8|0|0|COMPOUND SUBQUERIES 9 AND 10 (UNION ALL)
11|0|0|SCAN TABLE ar_tran_rec
7|0|0|COMPOUND SUBQUERIES 8 AND 11 (UNION ALL)
6|0|0|SEARCH TABLE ar_trans_alloc AS trans_alloc USING INDEX _ar_trans_alloc_due (due_row_id=?)
6|1|1|SEARCH SUBQUERY 7 AS alloc_trans USING AUTOMATIC COVERING INDEX (tran_type=? AND tran_row_id=?)
0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 12
15|0|0|SCAN TABLE ar_tran_inv
16|0|0|SCAN TABLE ar_tran_crn
14|0|0|COMPOUND SUBQUERIES 15 AND 16 (UNION ALL)
17|0|0|SCAN TABLE ar_tran_rec
13|0|0|COMPOUND SUBQUERIES 14 AND 17 (UNION ALL)
12|0|0|SEARCH TABLE ar_trans_alloc AS trans_alloc USING INDEX _ar_trans_alloc_due (due_row_id=?)
12|1|1|SEARCH SUBQUERY 13 AS alloc_trans USING AUTOMATIC COVERING INDEX (tran_type=? AND tran_row_id=?)
0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 18
21|0|0|SCAN TABLE ar_tran_inv
22|0|0|SCAN TABLE ar_tran_crn
20|0|0|COMPOUND SUBQUERIES 21 AND 22 (UNION ALL)
23|0|0|SCAN TABLE ar_tran_rec
19|0|0|COMPOUND SUBQUERIES 20 AND 23 (UNION ALL)
18|0|0|SEARCH TABLE ar_trans_alloc AS trans_alloc USING INDEX _ar_trans_alloc_due (due_row_id=?)
18|1|1|SEARCH SUBQUERY 19 AS alloc_trans USING AUTOMATIC COVERING INDEX (tran_type=? AND tran_row_id=?)
0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 24
27|0|0|SCAN TABLE ar_tran_inv
28|0|0|SCAN TABLE ar_tran_crn
26|0|0|COMPOUND SUBQUERIES 27 AND 28 (UNION ALL)
29|0|0|SCAN TABLE ar_tran_rec
25|0|0|COMPOUND SUBQUERIES 26 AND 29 (UNION ALL)
24|0|0|SEARCH TABLE ar_trans_alloc AS trans_alloc USING INDEX _ar_trans_alloc_due (due_row_id=?)
24|1|1|SEARCH SUBQUERY 25 AS alloc_trans USING AUTOMATIC COVERING INDEX (tran_type=? AND tran_row_id=?)
0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 30
33|0|0|SCAN TABLE ar_tran_inv
34|0|0|SCAN TABLE ar_tran_crn
32|0|0|COMPOUND SUBQUERIES 33 AND 34 (UNION ALL)
35|0|0|SCAN TABLE ar_tran_rec
31|0|0|COMPOUND SUBQUERIES 32 AND 35 (UNION ALL)
30|0|0|SEARCH TABLE ar_trans_alloc AS trans_alloc USING INDEX _ar_trans_alloc_due (due_row_id=?)
30|1|1|SEARCH SUBQUERY 31 AS alloc_trans USING AUTOMATIC COVERING INDEX (tran_type=? AND tran_row_id=?)
I *think* that the important line is the last one (repeated elsewhere in the plan as well) – when joining alloc_trans, it uses an index on tran_type and tran_row_id. This seems to be what PostgreSQL is not doing.
Frank
pgsql-general by date: