Thread: Query optimizer bug
Hi, There is a bug in Postgresql query optimizer that makes the query that's supposed to return an empty result perform considerably slower when LIMIT clause is given. For exaple, the following query: select * from login_history where account = 570 order by timestamp; is performed in 0,28 msec and the following one: select * from login_history where account = 570 order by timestamp limit 1; is performed in 16022.19 msec (!?). Details are given below. Let's take a table Column | Type | Modifiers -----------+--------------------------+----------- account | integer | not null service | smallint | not null failed | boolean | not null timestamp | timestamp with time zone | not null client_ip | integer | Indexes: login_history_acct_idx Primary key: login_history_pkey Triggers: RI_ConstraintTrigger_5615540 with indexes Index "login_history_pkey" Column | Type -----------+-------------------------- timestamp | timestamp with time zone unique btree (primary key) and Index "login_history_acct_idx" Column | Type ---------+--------- account | integer btree The table contains some 4,7 milion rows. Let's try to have look for entries with account = 570: explain analyze select count(*) from login_history where account = 570; NOTICE: QUERY PLAN: Aggregate (cost=16705.35..16705.35 rows=1 width=0) (actual time=0.13..0.13 rows=1 loops=1) -> Index Scan using login_history_acct_idx on login_history (cost=0.00..16694.67 rows=4275 width=0) (actual time=0.12..0.12 rows=0 loops=1) Total runtime: 0.20 msec There's no such entries. Let's try perform SELECT, anyway. We want the result ordered by 'timestamp': explain analyze select * from login_history where account = 570 order by timestamp; NOTICE: QUERY PLAN: Sort (cost=16952.48..16952.48 rows=4275 width=19) (actual time=0.21..0.21 rows=0 loops=1) -> Index Scan using login_history_acct_idx on login_history (cost=0.00..16694.67 rows=4275 width=19) (actual time=0.13..0.13 rows=0 loops=1) Total runtime: 0.28 msec The response is given immediately. However, when we add LIMIT clause to the query, we'll have to wait for 16 seconds to get the very same, empty result (!?): explain analyze select * from login_history where account = 570 order by timestamp limit 1; NOTICE: QUERY PLAN: Limit (cost=0.00..27.03 rows=1 width=19) (actual time=16022.11..16022.11 rows=0 loops=1) -> Index Scan using login_history_pkey on login_history (cost=0.00..115531.35 rows=4275 width=19) (actual time=16022.10..16022.10 rows=0 loops=1) Total runtime: 16022.19 msec P.S. Sorry if the bug is known. P.S. 2 Without ORDER BY clause everything works fine. Best regards, -- Szymon Juraszczyk, szymon@juraszczyk.com
On Mon, 7 Oct 2002, Szymon Juraszczyk wrote: > The table contains some 4,7 milion rows. > > Let's try to have look for entries with account = 570: > It looks to me it's estimating that 4275 rows will match account=570. If you're using 7.2 and have analyzed, you may want to up the number of buckets the analyzer uses in order to get a better sampling. I think if it had a reasonable idea of how many rows it was returning, it'd probably pick the correct index. (As a side note, an index on account,timestamp (or is it timestamp, account) would possibly give the best results.) > There's no such entries. Let's try perform SELECT, anyway. We want the > result ordered by 'timestamp': > > explain analyze select * from login_history where account = 570 order by > timestamp; > NOTICE: QUERY PLAN: > > Sort (cost=16952.48..16952.48 rows=4275 width=19) (actual time=0.21..0.21 > rows=0 loops=1) > -> Index Scan using login_history_acct_idx on login_history > (cost=0.00..16694.67 rows=4275 width=19) (actual time=0.13..0.13 rows=0 > loops=1) > Total runtime: 0.28 msec > > The response is given immediately. However, when we add LIMIT clause to > the query, we'll have to wait for 16 seconds to get the very same, empty > result (!?): > > explain analyze select * from login_history where account = 570 order by > timestamp limit 1; > NOTICE: QUERY PLAN: > > Limit (cost=0.00..27.03 rows=1 width=19) (actual time=16022.11..16022.11 > rows=0 loops=1) > -> Index Scan using login_history_pkey on login_history > (cost=0.00..115531.35 rows=4275 width=19) (actual time=16022.10..16022.10 > rows=0 loops=1) > Total runtime: 16022.19 msec