Re: Slow SELECT - Mailing list pgsql-general
From | Frank Millman |
---|---|
Subject | Re: Slow SELECT |
Date | |
Msg-id | 2460452e-0f9b-cd33-fd93-8d506b03f2d0@chagford.com Whole thread Raw |
In response to | Re: Slow SELECT (David Rowley <dgrowleyml@gmail.com>) |
Responses |
Re: Slow SELECT
|
List | pgsql-general |
On 2020-05-26 12:04 PM, David Rowley wrote: > On Tue, 26 May 2020 at 19:23, Frank Millman <frank@chagford.com> wrote: >> The table sizes are - >> my_table : 167 rows >> table_1 : 21 rows >> table_2 : 11 rows >> table_3 : 3 rows >> table_4 : 16 rows >> >> Therefore for each tran_date in my_table there are potentially >> 21x11x3x16 = 11088 rows. Most will be null. >> >> I want to select the row_id for the last tran_date for each of those >> potential groups. This is my select - >> >> SELECT ( >> SELECT a.row_id FROM my_table a >> WHERE a.fld_1 = b.row_id >> AND a.fld_2 = c.row_id >> AND a.fld_3 = d.row_id >> AND a.fld_4 = e.row_id >> AND a.deleted_id = 0 >> ORDER BY a.tran_date DESC LIMIT 1 >> ) >> FROM table_1 b, table_2 c, table_3 d, table_4 e >> >> Out of 11088 rows selected, 103 are not null. > > Perhaps SQL Server is doing something to rewrite the subquery in the > target list to a LEFT JOIN. PostgreSQL currently does not do that. > > Since "my_table" is small, you'd likely be much better doing a manual > rewrite of the query to join a subquery containing the required > details from "my_table". It looks like you want the row_id from the > latest tran_date for each fld_N column. So something like: > > SELECT a.row_id > FROM table_1 b > CROSS JOIN table_2 c > CROSS JOIN table_3 d > CROSS JOIN table_4 e > LEFT OUTER JOIN ( > SELECT fld_1,fld_2,fld_3,fld_4,row_id,tran_date, > ROW_NUMBER() OVER (PARTITION BY fld_1,fld_2,fld_3,fld_4 ORDER BY > tran_date DESC) row_num > FROM my_table > WHERE deleted_id = 0 > ) a ON a.fld_1 = b.row_id AND a.fld_2 = c.row_id AND a.fld_3 = > d.row_id AND a.fld_4 = e.row_id AND a.row_num = 1; > > Should do it. You could also perhaps do something with DISTINCT ON > instead of using ROW_NUMBER(). That might be a bit more efficient, but > it's unlikely to matter too much since there are only 167 rows in that > table. > Thank you David. I tried that and it produced the correct result in 53ms, which is what I am looking for. It will take me some time to understand it fully, so I have some homework to do! Much appreciated. Frank
pgsql-general by date: