Home > mailing lists

Re: performance of IN (subquery) - Mailing list pgsql-general

From	Tom Lane
Subject	Re: performance of IN (subquery)
Date	August 27, 2004 14:20:55
Msg-id	10390.1093616450@sss.pgh.pa.us Whole thread Raw
In response to	Re: performance of IN (subquery) (Greg Stark <gsstark@mit.edu>)
Responses	Re: performance of IN (subquery) Re: performance of IN (subquery)
List	pgsql-general

Tree view

Greg Stark <gsstark@mit.edu> writes:
> I'm not about to run analyze in the middle of the data generation
> (which wouldn't work anyways since it's in a transaction).

Since 7.3 or 7.4, you *can* run ANALYZE in the middle of a transaction.
The cached-plan business is a problem, I agree, but I think it's
orthogonal to this particular discussion (and you can always use EXECUTE
if you have to).

>> In the absence of any ANALYZE data the tuples-per-page estimate might be
>> pretty bogus, but it couldn't be off by more than an order of magnitude or
>> so either way.

> I don't see why it couldn't. If you have a table badly in need of vacuuming
> (or had one at the time of the last analyze) it could be off by way more than
> an order of magnitude.

Well, I was actually thinking of the physical tuples-per-page stat
(perhaps better expressed as an average tuple size), but you are right
that the fraction of dead tuples is also something to think about.
We don't model that explicitly ATM but maybe we should.  The original
VACUUM-based stats code couldn't really do much with it, since VACUUM
would leave no dead tuples behind in the first place; but separate
ANALYZE could definitely make an estimate of the fraction of dead tuples.

> Ideally I would want a guarantee that every query would *always*
> result in the same plan. Once I've tested them and approved the plans
> I want to know that only those approved plans will ever run, and I
> want to be present and be able to verify new plans before they go into
> production.

> I doubt I'm going to convince anyone today,

Nope, you aren't.  The above seems to me to be a recipe for degradation
of performance over time, precisely because the plans wouldn't change in
the face of changes in the situation.  I've resisted adding "planner
hints" to the language for this reason, and I'm certainly not eager to
offer any hard guarantees about plans not changing.

            regards, tom lane

pgsql-general by date:

From: Tom Lane
Date: 27 August 2004, 13:57:46
Subject: Re: performance of IN (subquery)

From: Bruce Momjian
Date: 27 August 2004, 14:47:48
Subject: Re: performance of IN (subquery)

Re: performance of IN (subquery) - Mailing list pgsql-general

Previous

Next