Re: Query is taking 5 HOURS to Complete on 8.1 version - Mailing list pgsql-performance

From Richard Huxton
Subject Re: Query is taking 5 HOURS to Complete on 8.1 version
Date
Msg-id 468AB115.4070504@archonet.com
Whole thread Raw
In response to Re: Query is taking 5 HOURS to Complete on 8.1 version  (smiley2211 <smiley2211@yahoo.com>)
List pgsql-performance
smiley2211 wrote:
> Here are the VIEWS in question: query = (explain analyze select id from
> people_consent LIMIT 1;)

First thing I notice - you don't have any ordering, so the LIMIT isn't
returning a well-defined record. Might not matter in your particular
context.

> CREATE OR REPLACE VIEW temp_consent AS
>  SELECT p.id, max(en.enrolled_at) AS daterecorded, a.answer
>    FROM people p, enrollments en, encounters ec,
> encounters_questions_answers eqa, questions_answers qa, answers a
>   WHERE (qa.question_tag::text = 'consentTransfer'::text OR
> qa.question_tag::text = 'shareWithEval'::text) AND eqa.question_answer_id =
> qa.id AND ec.id = eqa.encounter_id AND ec.enrollment_id = en.id AND p.id =
> en.person_id AND qa.answer_id = a.id
>   GROUP BY p.id, a.answer
> UNION

I think you might be able to make this "UNION ALL" - a UNION will check
for duplicates and eliminate them. That's a match on
(id,daterecorded,answer) from both sub-queries - can that happen and do
you care?

>  SELECT p.id, max(c.entered_at) AS daterecorded, a.answer
>    FROM people p, ctccalls c, ctccalls_questions_answers cqa,
> questions_answers qa, answers a
>   WHERE (qa.question_tag::text = 'consentTransfer'::text OR
> qa.question_tag::text = 'shareWithEval'::text) AND cqa.question_answer_id =
> qa.id AND c.id = cqa.call_id AND p.id = c.person_id AND qa.answer_id = a.id
>   GROUP BY p.id, a.answer;
>
>
> CREATE OR REPLACE VIEW temp_consent2 AS
>  SELECT DISTINCT temp_consent.id, temp_consent.daterecorded
>    FROM temp_consent
>   WHERE temp_consent.answer::text = 'Yes'::text
>   ORDER BY temp_consent.daterecorded DESC, temp_consent.id;

Not sure what the DISTINCT is doing for us here. You've eliminated
duplicates in the previous view and so you can't have more than one
(id,daterecorded) for any given answer. (Assuming you keep the previous
UNION in)

> CREATE OR REPLACE VIEW people_consent AS
>  SELECT people.id, people.firstname, people.lastname, people.homephone,
> people.workphone, people.altphone, people.eligibilityzipcode,
> people.address1, people.address2, people.city, people.state,
> people.zipcode1, people.zipcode2, people.email, people.dayofbirth,
> people.monthofbirth, people.yearofbirth, people.ethnic_detail,
> people.external_id, people.highestlevelofeducation_id,
> people.ethnicgroup_id, people.ethnicotherrace, people.entered_at,
> people.entered_by, people.besttimetoreach_id, people.language_id,
> people.otherlanguage, people.gender_id, people.hispaniclatino_id,
> people.canscheduleapt_id, people.mayweleaveamessage_id, people.ethnictribe,
> people.ethnicasian, people.ethnicislander
>    FROM people
>   WHERE (people.id IN ( SELECT temp_consent2.id
>            FROM temp_consent2))
> UNION
>  SELECT people.id, '***MASKED***' AS firstname, '***MASKED***' AS lastname,
> '***MASKED***' AS homephone, '***MASKED***' AS workphone, '***MASKED***' AS
> altphone, '***MASKED***' AS eligibilityzipcode, '***MASKED***' AS address1,
> '***MASKED***' AS address2, '***MASKED***' AS city, '***MASKED***' AS state,
> '***MASKED***' AS zipcode1, '***MASKED***' AS zipcode2, people.email,
> '***MASKED***' AS dayofbirth, '***MASKED***' AS monthofbirth, '***MASKED***'
> AS yearofbirth, people.ethnic_detail, people.external_id,
> people.highestlevelofeducation_id, people.ethnicgroup_id,
> people.ethnicotherrace, people.entered_at, people.entered_by> people.besttimetoreach_id, people.language_id,
people.otherlanguage,
> people.gender_id, people.hispaniclatino_id, people.canscheduleapt_id,
> people.mayweleaveamessage_id, people.ethnictribe, people.ethnicasian,
> people.ethnicislander
>    FROM people
>   WHERE NOT (people.id IN ( SELECT temp_consent2.id
>            FROM temp_consent2));

OK, well the UNION here can certainly be UNION ALL.
1. You're using "***MASKED***" for a bunch of fields, so unless they're
occurring naturally in "people" you won't get duplicates.
2. Your WHERE clauses are the complement of each other.

One other point NOT (people.id IN...) would perhaps be usually written
as "people.id NOT IN (...)". The planner should realise they're the same
though.

However, there's one obvious thing you can do. As it stands you're
testing against temp_consent2 twice. You could rewrite the query
something like:

SELECT
   people.id,
   CASE WHEN temp_consent2.id IS NULL
     THEN '***MASKED***'
     ELSE people.firstname
   END AS firstname
   ...
FROM
   people LEFT JOIN temp_consent2 ON people.id=temp_consent2.id
;

You might want to try these tweaks, but I'd start by working with
temp_consent and seeing how long that takes to execute. Then work out.

--
   Richard Huxton
   Archonet Ltd

pgsql-performance by date:

Previous
From: smiley2211
Date:
Subject: Re: Query is taking 5 HOURS to Complete on 8.1 version
Next
From: Richard Huxton
Date:
Subject: Re: Query is taking 5 HOURS to Complete on 8.1 version