Adding LIMIT changes PostgreSQL plan from good to a bad one - Mailing list pgsql-general

From Michael Korbakov
Subject Adding LIMIT changes PostgreSQL plan from good to a bad one
Date
Msg-id CACLnFeHgZiWUSz6aUg_jY458HTN=5KXdMWXi1b1UxDGLFp7wcg@mail.gmail.com
Whole thread Raw
Responses Re: Adding LIMIT changes PostgreSQL plan from good to a bad one  (Pavel Stehule <pavel.stehule@gmail.com>)
Re: Adding LIMIT changes PostgreSQL plan from good to a bad one  (Michael Lewis <mlewis@entrata.com>)
List pgsql-general
Hi everybody.

I stumbled upon a weird problem with the query planner. I have a query
on a typical EAV schema:

SELECT contacts.id
FROM contacts
    LEFT OUTER JOIN (SELECT DISTINCT contacts_values.company_id AS
company_id, contacts_values.id AS id
                     FROM contacts_values
                     WHERE contacts_values.field_id =
'\x000000000000000000000000'
                       AND contacts_values.field_name = 'facebook'
                       AND
nimble_contact_value_normalize(nimble_skip_long_values(contacts_values.value))
=

nimble_contact_value_normalize('http://www.facebook.com/jon.ferrara'))
AS anon_1
                    ON anon_1.company_id = contacts.company_id AND
anon_1.id = contacts.id
    LEFT OUTER JOIN (SELECT DISTINCT contacts_values.company_id AS
company_id, contacts_values.id AS id
                     FROM contacts_values
                     WHERE contacts_values.field_id =
'\x000000000000000000000000'
                       AND contacts_values.field_name = 'last_name'
                       AND
nimble_contact_value_normalize(nimble_skip_long_values(contacts_values.value))
=
                           nimble_contact_value_normalize('Ferrara')) AS anon_2
                    ON anon_2.company_id = contacts.company_id AND
anon_2.id = contacts.id
    JOIN contacts__aggregated AS contacts__aggregated_1
         ON contacts__aggregated_1.company_id = contacts.company_id AND
            contacts__aggregated_1.contact_id = contacts.id AND
contacts__aggregated_1.field_name = 'names'
WHERE contacts.company_id = '\x4c2118ad54397f271b000000'
  AND (anon_1.id IS NOT NULL OR anon_2.id IS NOT NULL)
ORDER BY contacts__aggregated_1.value ASC
LIMIT 30 OFFSET 0;

My problem is that the LIMIT clause in this query makes the planner
choose a bad plan
with nested loops: https://explain.depesz.com/s/Mute. Running the same
query after
SET ENABLE_NESTLOOP TO OFF I am getting a much more efficient plan:
https://explain.depesz.com/s/b5kn. Removing the LIMIT from the query
results in a
similar plan: https://explain.depesz.com/s/wDqE.

One thing that concerns me is that the cost of the LIMIT node in the
bad nested loop
plan is just a fraction of the cost of its subnode. But for the better
merge join
plan LIMIT node has the same cost as its subnode. How could it be this
way? And what
can I do to make the planner pick up a better plan?

We are running PostgreSQL 10.10.

Sincerely,

-- Michael Korbakov



pgsql-general by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: Tablespace setup issue
Next
From: P V Tekawade
Date:
Subject: ON COMMIT options for non temporary tables