Re: limiting results makes the query slower - Mailing list pgsql-performance

From Robert Haas
Subject Re: limiting results makes the query slower
Date
Msg-id 603c8f070908231441n19440d94vef2e32e28e514076@mail.gmail.com
Whole thread Raw
In response to limiting results makes the query slower  (Jaime Casanova <jcasanov@systemguards.com.ec>)
List pgsql-performance
On Thu, Aug 20, 2009 at 9:50 PM, Jaime
Casanova<jcasanov@systemguards.com.ec> wrote:
> in a web app we have a query that we want to show in limited results
> at a time, this one executes in 10 seconds if i use limit but executes
> in 300ms if i remove it.
> why is that happening? the query is using and index for avoiding the
> sort so the nestloop should go only for the first 20 records on
> tgen_persona, no?
> below some more info
>
> postgresql 8.3.7
> ram 32GB
> shared_buffers 8GB
> work_mem 8MB
>
> tgen_persona has 185732 records and tcom_invitacion is a partitioned
> (by dates: 1 month every partition) table and has more than 29million
> records in the partitions
>
> explain analyze here: http://explain.depesz.com/s/B4
>
> the situation improves if i disable nestloops, explain analyze with
> nestloop off here: http://explain.depesz.com/s/Jv
>
> select Per.razon_social as MAIL,inv.cata_esta_calificacion,
>       inv.observa_calificacion,
>       to_char(inv.fech_crea,'YYYY:MM:DD') as fech_crea,
>       case when (( select cod_estado FROM TPRO_PROVEEDOR
>                     WHERE id_proveedor = (select max(a.id_proveedor)
>                                             from tpro_proveedor a
>                                            where persona_id = Inv.persona_id )
>                  )='Habilitado')
>            then 'Habilitado'
>            else 'Deshabilitado'
>       end as empresa_id
>  from tgen_persona Per, tcom_invitacion Inv
>  where Per.persona_id = Inv.persona_id
>   and inv.id_soli_compra = '60505'
>  ORDER BY Per.razon_social asc limit 20 offset 0

This is pretty common.   Tom Lane pointed out in a message I don't
feel like searching for right now that LIMIT tends to magnify the
effect of bad selectivity estimates.  In this case, the join
selectivity is off by more than 3 orders of magnitude right here:

Nested Loop (cost=0.00..4280260.77 rows=8675 width=588) (actual
time=4835.934..11335.731 rows=2 loops=1)

I'm not familiar with how we estimate join selectivity in this case,
but it's obviously giving really, really wrong answers.  The problem
may be here:

Append (cost=0.00..22.00 rows=23 width=560) (actual time=0.055..0.055
rows=0 loops=185732)

It appears that we're estimating 23 rows because we have 23
partitions, and we're estimating one row for each.  There are a lot of
places in the planner where we round off to an integer with a floor of
1, which may be part of the problem here... but I don't know without
looking at the code.

...Robert

pgsql-performance by date:

Previous
From: Jaime Casanova
Date:
Subject: Re: [PERFORMANCE] how to set wal_buffers
Next
From: Jeff Janes
Date:
Subject: Re: [PERFORMANCE] how to set wal_buffers