Thread: LIMIT causes SEQSCAN in subselect
First off, WOO-HOO! The lists are back and I can finally get my PG fix!!! Now, on to the business at hand...

I have four query plans below: the first two help explain my question, and the last two are about a speculative alternative. The first query uses subselects that are generated by a middleware layer and are then joined to create the final result set. In order to keep total execution time down, the middleware imposes a LIMIT clause on each.

I'm using the fact that Postgres can elevate a subselect-join to a simple join when there are no aggregates involved, and I think I remember there has been some work recently on elevating subselects that contain a LIMIT, so I went back and ran the plans without the LIMITs to see what would happen. Well, the LIMIT killed the subselect elevation. I'm not too worried about the relative execution times, since it's very fast either way; I'm more curious about the plan that was chosen.

It seems that the planner knows that the results from subselect 'b' will contain just one row, because the index it is scanning is unique. Would it not make sense to discard the LIMIT clause on that subselect? That would result in the third plan, which performs better than the generated query and is guaranteed to return the same results, since the index in use is unique. Also, wouldn't it make sense for subselect 'a' to be elevated sans LIMIT, just to see if there is a unique index it might be able to use?

I realize this is a rather specialized case and not really great form. But because PG can, in some cases, elevate subselects, writing middleware to join results becomes pretty easy: just a matter of defining result sets independently and creating a simple wrapper to join them.

In any case, I will probably end up just detecting the subselect condition in the middleware and dropping the LIMIT when there are WHERE clauses on the inner query. I just thought I'd bring up a possible optimization for the future, and was curious what the gurus might have to say!
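By "elevate" I mean the planner flattening a subselect-join into a plain join. A minimal sketch with made-up table names (foo and bar are not from my schema):

    -- what the middleware emits:
    select a.id, b.val
      from (select * from foo) a,
           (select * from bar) b
     where b.foo_id = a.id;

    -- what the planner can treat it as when nothing blocks the pull-up:
    select a.id, b.val
      from foo a, bar b
     where b.foo_id = a.id;

A LIMIT inside either subselect blocks that rewrite, as the plans below show.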
-- Version info and queries in question.

oils4=# select version();
                                                           version
---------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 8.0.0beta4 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 3.3.4 20040623 (Gentoo Linux 3.3.4-r1, ssp-3.3.2-2, pie-8.7.6)
(1 row)

-- query 1: the query generated by middleware

oils4=# EXPLAIN ANALYZE select a.record, b.control
          from (select * from biblio.record where id = 100000 limit 1000) b,
               (select * from biblio.metarecord_field_entry limit 1000) a
         where a.source = b.id;
                                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=3.68..44.49 rows=5 width=40) (actual time=2.066..2.066 rows=0 loops=1)
   Hash Cond: ("outer".source = "inner".id)
   ->  Subquery Scan a  (cost=0.00..35.75 rows=1000 width=16) (actual time=0.005..1.295 rows=1000 loops=1)
         ->  Limit  (cost=0.00..25.75 rows=1000 width=87) (actual time=0.003..0.641 rows=1000 loops=1)
               ->  Seq Scan on metarecord_field_entry  (cost=0.00..43379.75 rows=1684575 width=87) (actual time=0.003..0.435 rows=1000 loops=1)
   ->  Hash  (cost=3.68..3.68 rows=1 width=40) (actual time=0.039..0.039 rows=0 loops=1)
         ->  Subquery Scan b  (cost=0.00..3.68 rows=1 width=40) (actual time=0.031..0.033 rows=1 loops=1)
               ->  Limit  (cost=0.00..3.67 rows=1 width=1070) (actual time=0.029..0.030 rows=1 loops=1)
                     ->  Index Scan using biblio_record_pkey on record  (cost=0.00..3.67 rows=1 width=1070) (actual time=0.027..0.028 rows=1 loops=1)
                           Index Cond: (id = 100000)
 Total runtime: 2.171 ms
(11 rows)

-- query 2: the fast query; no limit allows elevation of the subselects

oils4=# EXPLAIN ANALYZE select a.record, b.control
          from (select * from biblio.record where id = 100000) b,
               (select * from biblio.metarecord_field_entry) a
         where a.source = b.id;
                                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..19.95 rows=9 width=22) (actual time=0.043..0.055 rows=7 loops=1)
   ->  Index Scan using biblio_record_pkey on record  (cost=0.00..3.67 rows=1 width=22) (actual time=0.025..0.026 rows=1 loops=1)
         Index Cond: (id = 100000)
   ->  Index Scan using metarecord_field_entry_source_idx on metarecord_field_entry  (cost=0.00..16.19 rows=9 width=16) (actual time=0.011..0.018 rows=7 loops=1)
         Index Cond: (source = 100000)
 Total runtime: 0.101 ms
(6 rows)

-- query 3: if we were to drop the limit, since we're using a unique index

oils4=# EXPLAIN ANALYZE select a.record, b.control
          from (select * from biblio.record where id = 100000) b,
               (select * from biblio.metarecord_field_entry limit 1000) a
         where a.source = b.id;
                                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..41.97 rows=5 width=22) (actual time=1.169..1.169 rows=0 loops=1)
   ->  Index Scan using biblio_record_pkey on record  (cost=0.00..3.67 rows=1 width=22) (actual time=0.036..0.038 rows=1 loops=1)
         Index Cond: (id = 100000)
   ->  Subquery Scan a  (cost=0.00..38.25 rows=5 width=16) (actual time=1.126..1.126 rows=0 loops=1)
         Filter: (source = 100000)
         ->  Limit  (cost=0.00..25.75 rows=1000 width=87) (actual time=0.005..0.673 rows=1000 loops=1)
               ->  Seq Scan on metarecord_field_entry  (cost=0.00..43379.75 rows=1684575 width=87) (actual time=0.004..0.424 rows=1000 loops=1)
 Total runtime: 1.243 ms
(8 rows)
-- query 4: what I would like the seqscan in query 3 to become...

oils4=# EXPLAIN ANALYZE select * from biblio.metarecord_field_entry
         where source = 100000 limit 1000;
                                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..16.19 rows=9 width=87) (actual time=0.026..0.035 rows=7 loops=1)
   ->  Index Scan using metarecord_field_entry_source_idx on metarecord_field_entry  (cost=0.00..16.19 rows=9 width=87) (actual time=0.025..0.032 rows=7 loops=1)
         Index Cond: (source = 100000)
 Total runtime: 0.069 ms
(4 rows)

--
Mike Rylander
mrylander@gmail.com
GPLS -- PINES Development
Database Developer
http://open-ils.org
Mike,

> I'm using the fact that Postgres can elevate a subselect-join to a
> simple join when there are no aggregates involved, and I think I
> remember there has been some work recently on elevating subselects
> that contain a LIMIT, so I went back and ran the plans without the
> LIMITs to see what would happen. Well, the LIMIT killed the subselect
> elevation.

Actually, this makes sense. A LIMIT requires the data to be ordered first and then cut based on that order, so it prevents collapsing the subselect into the main query. Some sort of materialization is necessary, even in cases like yours where the limit is inherently meaningless because you've neglected to use an ORDER BY.

The fact that the estimator knows that the LIMIT is pointless, because there are fewer rows in the subselect than the LIMIT will return, is not something we want to count on; sometimes the estimator has inaccurate information. The UNIQUE index makes this more certain, except that I'm not sure that the planner distinguishes between actual UNIQUE indexes and columns which are merely estimated to be unique (per pg_stats). And I think you can see in your case that there's quite a difference between a column we're CERTAIN is unique and a column we THINK is unique.

> I realize this is a rather specialized case and not really great form.

Exactly. You've grasped the main issue: this has not been optimized because it's bizarre and not very sensible query writing. Someday we'll get around to optimizing the really weird queries, but there's still a lot of work to be done on the common ones (like count(*) ...).

Keep in mind that the only reason we support LIMIT inside subqueries in the first place is as a workaround for slow aggregates, and as a way to do RANK. It's certainly not SQL-standard.

> just a matter of defining result
> sets independently and creating a simple wrapper to join them.

Well, if you think so, you know where to submit patches ...

--
Josh Berkus
Aglio Database Solutions
San Francisco
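For concreteness, here is a minimal sketch of the two idioms Josh mentions, against hypothetical tables (orders, scores, players -- none of these are from Mike's schema):

    -- workaround for a slow aggregate: max(id) fetched via an
    -- index-friendly ORDER BY ... LIMIT instead of max(id)
    select id from orders order by id desc limit 1;

    -- a poor man's RANK: the top 10 scores, joined back out for detail
    select p.name, t.score
      from (select player_id, score
              from scores
             order by score desc
             limit 10) t,
           players p
     where p.id = t.player_id;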
On Fri, 10 Dec 2004 21:40:18 -0800, Josh Berkus <josh@agliodbs.com> wrote:
> Mike,
>
> The fact that the estimator knows that the LIMIT is pointless, because
> there are fewer rows in the subselect than the LIMIT will return, is not
> something we want to count on; sometimes the estimator has inaccurate
> information. The UNIQUE index makes this more certain, except that I'm
> not sure that the planner distinguishes between actual UNIQUE indexes
> and columns which are merely estimated to be unique (per pg_stats). And
> I think you can see in your case that there's quite a difference between
> a column we're CERTAIN is unique and a column we THINK is unique.

Absolutely. At first I was going to ask if perhaps using the stats to discard the LIMIT would be possible, but since the stats are only guidelines I dropped that. The stats are just so tempting!

> > I realize this is a rather specialized case and not really great form.
>
> Exactly. You've grasped the main issue: this has not been optimized
> because it's bizarre and not very sensible query writing. Someday we'll
> get around to optimizing the really weird queries, but there's still a
> lot of work to be done on the common ones (like count(*) ...).

Absolutely. And if I can help out with the common cases to gain some Karmic currency, I will. ;) After thinking about it some more, though, I don't think those queries were really all that wacky. The problem with the example is that the generated query is very simple; real-world queries used in the subselects would be much more complex, and row estimation on them would be untrustworthy without a UNIQUE index.

> Keep in mind that the only reason we support LIMIT inside subqueries in
> the first place is as a workaround for slow aggregates, and as a way to
> do RANK. It's certainly not SQL-standard.

No, it's not, but then nobody ever accused the authors of the SQL spec of being omniscient... I can't think of another way to get, say, a 'top 10' list from a subselect, or to use a paging iterator (LIMIT .. OFFSET ..) as the seed for an outer query. Well, other than an SRF, of course.

> > just a matter of defining result
> > sets independently and creating a simple wrapper to join them.
>
> Well, if you think so, you know where to submit patches ...

Well, I do, but I was talking about it being 'easy' in the middleware: just let PG handle optimizing the subselects. For example, say you have a pile of predefined SELECTs that don't know they are related and are used for simple lookups. You tell the SQL generator thingy that it should use two of those, queries A and B, that they are related on x, and that you want to see the 'id' from A and the 'value' from B. Instead of having to preplan every possible combination of JOINs, the SQL generator can toss the preplanned queries into subselects and join them in the outer query, rather than ripping them apart and calculating the join syntax. And yes, I know that views will take care of most of that for me... :)

Thanks for all your comments. Pretty much what I expected, but I thought I'd raise a use case. I'll just have to give the query builder more smarts.

--
Mike Rylander
mrylander@gmail.com
GPLS -- PINES Development
Database Developer
http://open-ils.org
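A minimal sketch of that paging idiom, again with hypothetical tables (items, item_details): page 3 of a listing at 20 rows per page, used as the seed for an outer join.

    select i.id, d.value
      from (select id
              from items
             order by id
             limit 20 offset 40) i,
           item_details d
     where d.item_id = i.id;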
Josh Berkus <josh@agliodbs.com> writes:
> The fact that the estimator knows that the LIMIT is pointless, because
> there are fewer rows in the subselect than the LIMIT will return, is not
> something we want to count on; sometimes the estimator has inaccurate
> information.

However, when the estimator is basing that estimate on the existence of a unique index for the column, the estimate could be trusted. There are a couple of reasons that we don't perform that optimization at present, though:

1. If the finished query plan doesn't actually *use* the index in question, then dropping the index would not directly invalidate the query plan, but nonetheless the query would be broken. You could subsequently get silently-wrong answers.

2. For the particular point at hand, there's an implementation problem, which is that decisions about whether to flatten subqueries are taken before we do any rowcount estimation. So even if we discarded the LIMIT clause once we realized it was redundant, it'd be too late to get the optimal overall plan.

Point #1 is something I would like to fix whenever we get around to implementing proper invalidation of cached plans. There would need to be a way to list "indirect" as well as direct dependencies of a plan.

			regards, tom lane
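To make point #1 concrete, a sketch of the hazard with a hypothetical table, supposing the planner did discard a LIMIT on the strength of a unique index (PG does not actually do this today):

    create table rec (id integer);
    create unique index rec_id_key on rec (id);

    -- if the planner trusted rec_id_key to prove "at most one row" and
    -- threw away the LIMIT, the cached plan would contain no reference
    -- to the index at all:
    select * from (select * from rec where id = 42 limit 1) s;

    -- so dropping the index and inserting duplicates would not
    -- invalidate that plan, yet the plan could now silently return two
    -- rows where the query text promises at most one:
    drop index rec_id_key;
    insert into rec values (42);
    insert into rec values (42);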
> The fact that the estimator knows that the LIMIT is pointless, because
> there are fewer rows in the subselect than the LIMIT will return, is not
> something we want to count on; sometimes the estimator has inaccurate
> information. The UNIQUE index makes this more certain, except that I'm
> not sure that the planner distinguishes between actual UNIQUE indexes
> and columns which are merely estimated to be unique (per pg_stats). And
> I think you can see in your case that there's quite a difference between
> a column we're CERTAIN is unique and a column we THINK is unique.

I think a UNIQUE constraint can still permit several 'different' NULL values... better to say "UNIQUE NOT NULL"?
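A quick demonstration of that point:

    create table t (x integer unique);
    insert into t values (null);
    insert into t values (null);  -- both succeed: NULLs never conflict
                                  -- under a plain UNIQUE constraint
    select count(*) from t where x is null;  -- returns 2

Though for an equality condition like "id = 100000" a NULL can never match anyway, so the unique index alone does bound that case to one row.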