Re: Very slow query (3-4mn) on a table with 25millions rows - Mailing list pgsql-performance

From Félix GERZAGUET
Subject Re: Very slow query (3-4mn) on a table with 25millions rows
Date
Msg-id CANVwZts5xFa9Wioz46eJfATtGrU_U0sR4wP0Y3LkR02fNgmULQ@mail.gmail.com
Whole thread Raw
In response to Re: Very slow query (3-4mn) on a table with 25millions rows  (Abadie Lana <Lana.Abadie@iter.org>)
Responses Re: Very slow query (3-4mn) on a table with 25millions rows  (Félix GERZAGUET <felix.gerzaguet@gmail.com>)
Re: Very slow query (3-4mn) on a table with 25millions rows  (Abadie Lana <Lana.Abadie@iter.org>)
List pgsql-performance
Hello Lana,

On Wed, Jul 27, 2016 at 8:03 AM, Abadie Lana <Lana.Abadie@iter.org> wrote:
Here the result of explain (analyse, buffer). Thanks for your help and let me know if you need more information.

I noticed 3 things in your query:

1. In the second part (after the except), the 2 tables utva and utv are not joined against the others table. Is there a missing join somewhere ?

Let that snipset:

select s.attvalue
  from functionalvarattributes s
     , tags t
     , variableattributetypes vat
 where t.id=s.tag_id
   and t.status!='Internal'
   and vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK')
   and vat.id=s.atttype_id
   and split_part(split_part(s.attvalue,' ',1),'.',1) in ( select e.name
                                                             from functionalvariables e
                                                                , usertemplatevariable ut
                                                            where e.usertemplatevar_id=ut.id
                                                              and ut.usertempl_id=15
                                                           )

be called A

Let that snipset:

select *
  from usertemplvarattribute utva
     , usertemplatevariable utv
  where utv.id=utva.usertempvariable_fk
    and utv.usertempl_id=15

be called B

Then you query is:

A
except
A CROSS JOIN B

If B is not the empty set, than the above query is guaranteed to always have 0 row.

2. Assuming your query is right (even if I failed to understand its point), we could only do the A snipset once instead of twice using a with clause as in:

with filtered_s as (
select s.attvalue
  from functionalvarattributes s
     , tags t
     , variableattributetypes vat
 where t.id=s.tag_id
   and t.status!='Internal'
   and vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK')
   and vat.id=s.atttype_id
   and split_part(split_part(s.attvalue,' ',1),'.',1) in ( select e.name
                                                             from functionalvariables e
                                                                , usertemplatevariable ut
                                                            where e.usertemplatevar_id=ut.id
                                                              and ut.usertempl_id=15
                                                           )
)
select s.attvalue
  from filtered_s s
except
select s.attvalue
  from filtered_s s
     , usertemplvarattribute utva
     , usertemplatevariable utv
  where utv.id=utva.usertempvariable_fk
    and utv.usertempl_id=15
;

This rewritten query should run about 2x. faster.

3. The planner believe that the e.name subselect will give 4926 rows (instead of 16 in reality), due to this wrong estimate it will consider the vat_funcvaratt_multi_idx index as not usefull. I don't know how to give the planner more accurate info ...

--
Félix

pgsql-performance by date:

Previous
From: Abadie Lana
Date:
Subject: Re: Very slow query (3-4mn) on a table with 25millions rows
Next
From: Félix GERZAGUET
Date:
Subject: Re: Very slow query (3-4mn) on a table with 25millions rows