On Sun, 16 Mar 2008, Tom Lane wrote:
> > I have a weird query execution plan problem I am trying to debug on
> > Postgresql 8.2.6. I have a query that joins against a temporary table that
> > has very few rows.
>
> Is it possible that the temp table ever has exactly zero rows?
Ah, that is indeed a possibility. If I understand correctly, there is no way
for the query planner to distinguish between an un-analyzed table and a
zero-sized analyzed table? Looks like I'll have to do a "select count(*)"
before running the query to avoid falling into this trap. (That feels a bit
suboptimal, since the conary repository code does extensive work with/through
temporary tables, and this could very well end up not being the only section
affected...)
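Something along these lines is what I have in mind - just a rough sketch,
assuming a DB-API style cursor; the joined table and column names below are
placeholders, not the actual conary repository code:

    # Rough sketch: skip the join entirely when the temp table is empty,
    # so the planner never sees the misleading zero-rows case.
    def fetch_instances(cur):
        # Check the temp table size first (tmpinstanceid is the temp table
        # from the plan above; the other names are made up for illustration).
        cur.execute("SELECT count(*) FROM tmpinstanceid")
        (count,) = cur.fetchone()
        if count == 0:
            # Joining against an empty temp table can only return an empty
            # set, so avoid handing the planner a query it will mis-estimate.
            return []
        cur.execute("""
            SELECT i.instanceId, i.data
              FROM instances i
              JOIN tmpinstanceid t ON t.instanceId = i.instanceId
        """)
        return cur.fetchall()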
> That's entirely the wrong way to think about it. The planner is
> choosing a good plan based on its estimates of table sizes, which
> are wildly different in the two cases:
>
> > -> Seq Scan on tmpinstanceid (cost=0.00..1.02 rows=2 width=8) (actual time=0.005..0.007 rows=2 loops=1)
>
> > -> Seq Scan on tmpinstanceid (cost=0.00..29.40 rows=1940 width=8)
In this particular case it would be nice if there were a way to differentiate
between "estimated size 0" and "estimated size unknown".
> The only idea I have for how the planner could "ignore" a previous
> analyze result is if the analyze found the table to be of zero size.
> Then the heuristic would still be applied because relpages == 0.
For now I will run with the assumption that the massive sequential scans are
caused by joining an empty temp table in the query and try to work my way
around it - unless there is some trick to tell the planner that such a query
would be much better optimized away instead of causing a massive IO storm.
Thanks,
Cristian
--
Cristian Gafton
rPath, Inc.