Home > mailing lists

Re: Explain plan changes - IN CLAUSE ( Passing direct values Vs INNERQuery ) - Mailing list pgsql-performance

From	David G. Johnston
Subject	Re: Explain plan changes - IN CLAUSE ( Passing direct values Vs INNERQuery )
Date	May 7, 2020 15:46:51
Msg-id	CAKFQuwa0BdDqPeM+emDDo1XCa+R9RcVinHW6cECPuuWaO4GeZw@mail.gmail.com Whole thread Raw
In response to	Re: Explain plan changes - IN CLAUSE ( Passing direct values Vs INNERQuery ) (Adrian Klaver <adrian.klaver@aklaver.com>)
Responses	Re: Explain plan changes - IN CLAUSE ( Passing direct values Vs INNERQuery ) Re: Explain plan changes - IN CLAUSE ( Passing direct values Vs INNERQuery )
List	pgsql-performance

Tree view

On Thu, May 7, 2020 at 7:40 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 5/7/20 4:19 AM, Amarendra Konda wrote:
> Hi,
>
> PostgreSQL version : PostgreSQL 9.6.2 on x86_64-pc-linux-gnu, compiled
> by gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit
>
> We have noticed huge difference interms of execution plan ( response
> time) , When we pass the direct values Vs inner query to IN clause.
>
> High level details of the use case are as follows
>
> * As part of the SQL there are 2 tables named Process_instance
> (master) and Process_activity ( child)
> * Wanted to fetch TOP 50 rows from Process_activity table for the
> given values of the Process_instance.
> * When we used Inner Join / Inner query ( query1) between parent
> table and child table , LIMIT is not really taking in to account.
> Instead it is fetching more rows and columns that required, and
> finally limiting the result

It is doing what you told it to do which is SELECT all
process_instance_i's for user_id='317079413683604' and app_id =
'427380312000560' and then filtering further. I am going to guess that
if you run the inner query alone you will find it returns ~23496 rows.
You might have better results if you an actual join between
process_activity and process_instance. Something like below(obviously
not tested):

What the OP seems to want is a semi-join:

(not tested)

SELECT pa.process_activity_id

FROM process_activity pa WHERE pa.app_id = '427380312000560' AND pa.created > '1970-01-01 00:00:00'

AND EXISTS (

SELECT 1 FROM process_instance pi WHERE pi.app_id = pa.app_id AND pi.user_id = '317079413683604'

)

ORDER BY

pa.process_instance_id,

pa.created limit 50;

I'm unsure exactly how this will impact the plan choice but it should be an improvement, and in any case more correctly defines what it is you are looking for.

David J.

pgsql-performance by date:

From: Adrian Klaver
Date: 07 May 2020, 14:40:11
Subject: Re: Explain plan changes - IN CLAUSE ( Passing direct values Vs INNERQuery )

From: samhitha g
Date: 07 May 2020, 17:10:55
Subject: pg_attribute, pg_class, pg_depend grow huge in count and size withmultiple tenants.

Re: Explain plan changes - IN CLAUSE ( Passing direct values Vs INNERQuery ) - Mailing list pgsql-performance

Previous

Next