Re: [pgsql-hackers-win32] Poor Performance for large queries in functions - Mailing list pgsql-performance

From Tom Lane
Subject Re: [pgsql-hackers-win32] Poor Performance for large queries in functions
Date
Msg-id 29954.1096474693@sss.pgh.pa.us
Whole thread Raw
In response to Re: [pgsql-hackers-win32] Poor Performance for large queries in functions  (John Meinel <john@johnmeinel.com>)
Responses Re: [pgsql-hackers-win32] Poor Performance for large queries
List pgsql-performance
[ enlarging on Richard's response a bit ]

John Meinel <john@johnmeinel.com> writes:
> jfmeinel=> explain analyze execute myget(30000);
>                                                   QUERY PLAN
> --------------------------------------------------------------------
>   Seq Scan on tdata  (cost=0.00..9773.10 rows=181923 width=4)
>     (actual time=1047.000..1047.000 rows=0 loops=1)
>     Filter: (project_id = $1)
>   Total runtime: 1047.000 ms

> jfmeinel=> explain analyze select id from tdata where project_id = 30000;
>                                                           QUERY PLAN

> -------------------------------------------------------------------------
>   Index Scan using tdata_project_id_idx on tdata  (cost=0.00..4.20
> rows=1 width=4) (actual time=0.000..0.000 rows=0 loops =1)
>     Index Cond: (project_id = 30000)
>   Total runtime: 0.000 ms

> So notice that when doing the actual select it is able to do the index
> query. But for some reason with a prepared statement, it is not able to
> do it.

This isn't a "can't do it" situation, it's a "doesn't want to do it"
situation, and it's got nothing whatever to do with null or not null.
The issue is the estimated row count, which in the first case is so high
as to make the seqscan approach look cheaper.  So the real question here
is what are the statistics on the column that are making the planner
guess such a large number when it has no specific information about the
compared-to value.  Do you have one extremely common value in the column?
Have you done an ANALYZE recently on the table, and if so can you show
us the pg_stats row for the column?

            regards, tom lane

pgsql-performance by date:

Previous
From: Dirk Lutzebäck
Date:
Subject: why does explain analyze differ so much from estimated explain?
Next
From: John Meinel
Date:
Subject: Re: [pgsql-hackers-win32] Poor Performance for large queries