Re: Slow SELECT - Mailing list pgsql-general

From Frank Millman
Subject Re: Slow SELECT
Date
Msg-id 2460452e-0f9b-cd33-fd93-8d506b03f2d0@chagford.com
Whole thread Raw
In response to Re: Slow SELECT  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Slow SELECT  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-general

On 2020-05-26 12:04 PM, David Rowley wrote:
> On Tue, 26 May 2020 at 19:23, Frank Millman <frank@chagford.com> wrote:
>> The table sizes are -
>>       my_table : 167 rows
>>       table_1 : 21 rows
>>       table_2 : 11 rows
>>       table_3 : 3 rows
>>       table_4 : 16 rows
>>
>> Therefore for each tran_date in my_table there are potentially
>> 21x11x3x16 = 11088 rows. Most will be null.
>>
>> I want to select the row_id for the last tran_date for each of those
>> potential groups. This is my select -
>>
>>       SELECT (
>>           SELECT a.row_id FROM my_table a
>>           WHERE a.fld_1 = b.row_id
>>           AND a.fld_2 = c.row_id
>>           AND a.fld_3 = d.row_id
>>           AND a.fld_4 = e.row_id
>>           AND a.deleted_id = 0
>>           ORDER BY a.tran_date DESC LIMIT 1
>>       )
>>       FROM table_1 b, table_2 c, table_3 d, table_4 e
>>
>> Out of 11088 rows selected, 103 are not null.
> 
> Perhaps SQL Server is doing something to rewrite the subquery in the
> target list to a LEFT JOIN.  PostgreSQL currently does not do that.
> 
> Since "my_table" is small, you'd likely be much better doing a manual
> rewrite of the query to join a subquery containing the required
> details from "my_table".  It looks like you want the row_id from the
> latest tran_date for each fld_N column. So something like:
> 
> SELECT a.row_id
> FROM table_1 b
> CROSS JOIN table_2 c
> CROSS JOIN table_3 d
> CROSS JOIN table_4 e
> LEFT OUTER JOIN (
> SELECT fld_1,fld_2,fld_3,fld_4,row_id,tran_date,
> ROW_NUMBER() OVER (PARTITION BY fld_1,fld_2,fld_3,fld_4 ORDER BY
> tran_date DESC) row_num
> FROM my_table
> WHERE deleted_id = 0
> ) a ON a.fld_1 = b.row_id AND a.fld_2 = c.row_id AND a.fld_3 =
> d.row_id AND a.fld_4 = e.row_id AND a.row_num = 1;
> 
> Should do it. You could also perhaps do something with DISTINCT ON
> instead of using ROW_NUMBER(). That might be a bit more efficient, but
> it's unlikely to matter too much since there are only 167 rows in that
> table.
> 

Thank you David. I tried that and it produced the correct result in 
53ms, which is what I am looking for.

It will take me some time to understand it fully, so I have some 
homework to do!

Much appreciated.

Frank




pgsql-general by date:

Previous
From: Frank Millman
Date:
Subject: Re: Slow SELECT
Next
From: Vik Fearing
Date:
Subject: Re: Slow SELECT