Home > mailing lists

Re: LIMIT causes SEQSCAN in subselect - Mailing list pgsql-performance

From	Tom Lane
Subject	Re: LIMIT causes SEQSCAN in subselect
Date	December 11, 2004 18:09:03
Msg-id	17528.1102788383@sss.pgh.pa.us Whole thread Raw
In response to	Re: LIMIT causes SEQSCAN in subselect (Josh Berkus <josh@agliodbs.com>)
List	pgsql-performance

Tree view

Josh Berkus <josh@agliodbs.com> writes:
> The fact that the estimator knows that the LIMIT is pointless because there
> are less rows in the subselect than the LIMIT will return is not something we
> want to count on; sometimes the estimator has innaccurate information.

However, when the estimator is basing that estimate on the existence of
a unique index for the column, the estimate could be trusted.  There are
a couple of reasons that we don't perform that optimization at present,
though:

1. If the finished query plan doesn't actually *use* the index in
question, then dropping the index would not directly invalidate the
query plan, but nonetheless the query would be broken.  You could
subsequently get silently-wrong answers.

2. For the particular point at hand, there's an implementation problem,
which is that decisions about whether to flatten subqueries are taken
before we do any rowcount estimation.  So even if we discarded the LIMIT
clause once we realized it was redundant, it'd be too late to get the
optimal overall plan.

Point #1 is something I would like to fix whenever we get around to
implementing proper invalidation of cached plans.  There would need to
be a way to list "indirect" as well as direct dependencies of a plan.

            regards, tom lane

pgsql-performance by date:

From: Alvaro Nunes Melo
Date: 11 December 2004, 17:01:42
Subject: Very different index usage on similar tables

From: tomas@nocrew.org (Tomas Skäre)
Date: 12 December 2004, 08:32:40
Subject: Re: [GENERAL] Query is not using index when it should

Re: LIMIT causes SEQSCAN in subselect - Mailing list pgsql-performance

Previous

Next