Re: Adding LIMIT changes PostgreSQL plan from good to a bad one - Mailing list pgsql-general

From Michael Korbakov
Subject Re: Adding LIMIT changes PostgreSQL plan from good to a bad one
Date
Msg-id CACLnFeE5X3HQKD7CLBeCW5ZmGm1cmHOUz2+Ouwdr4MiybdkPmg@mail.gmail.com
Whole thread Raw
In response to Re: Adding LIMIT changes PostgreSQL plan from good to a bad one  (Michael Lewis <mlewis@entrata.com>)
List pgsql-general


On November 22, 2019 at 20:28:39, Michael Lewis (mlewis@entrata.com) wrote:

I try to avoid DISTINCT and use GROUP BY when feasible, as well as avoiding OR condition. If you combined anon1 and anon2 with UNION ALL, and did (inner) join instead of left, or even moved all of that to EXISTS, perhaps that gives you better consistent performance. Something like this-


SELECT contacts.id
FROM contacts
    JOIN contacts__aggregated AS contacts__aggregated_1 ON
         contacts__aggregated_1.company_id = contacts.company_id
         AND contacts__aggregated_1.contact_id = contacts.id
         AND contacts__aggregated_1.field_name = 'names'
WHERE contacts.company_id = '\x4c2118ad54397f271b000000'

AND EXISTS (
         SELECT
                     FROM contacts_values
                     WHERE contacts_values.field_id = '\x000000000000000000000000'
                       AND contacts_values.field_name = 'facebook'
                       AND nimble_contact_value_normalize(nimble_skip_long_values(contacts_values.value)) = nimble_contact_value_normalize('http://www.facebook.com/jon.ferrara'))
                       AND contacts_values.company_id = contacts.company_id AND contacts_values.id = contacts.id
         UNION ALL
         SELECT
                     FROM contacts_values
                     WHERE contacts_values.field_id = '\x000000000000000000000000'
                       AND contacts_values.field_name = 'last_name'
                       AND nimble_contact_value_normalize(nimble_skip_long_values(contacts_values.value)) = nimble_contact_value_normalize('Ferrara')
                       AND contacts_values.company_id = contacts.company_id AND contacts_values.id = contacts.id
)

ORDER BY contacts__aggregated_1.value ASC
LIMIT 30 OFFSET 0;

Hi Michael!

Thank you, your suggestion about replacing DISTINCT with GROUP BY improved the plan for my query significantly. It is still not perfect, but at least it is something we can live with.

However, rewriting OR with UNION does not change things. Here's the plan for it: https://explain.depesz.com/s/c6Ec. Judging from the number of loops it uses some form of nested loop internally.

It appears that any way of expressing OR ruins something for the planner. My performance is great for a single criterion and for multiple criteria joined by AND. Adding any OR into the mix results in giant JOINs and misestimations.

pgsql-general by date:

Previous
From: Jill Jade
Date:
Subject: Re: Pgadmin 4 schema visibility
Next
From: İlyas Derse
Date:
Subject: Call Stored Procedure with inout parameters in c#