Thread: slow SELECT ... LIMIT query
Hi, I have a fairly large table (1 million records) with the following structure... sampleid int4 unitid int4 datetimestamp timestamp data1 float8 data2 float8 btree indexes on sampleid, unitid, and datetimestamp. I want to be able to pull out the most recent record for a certain unit. the query that seemed most reasonable was SELECT * FROM MYTABLE WHERE UNITID = unit_id ORDER BY DATETIMESTAMP DESC LIMIT 1; some strange results... 1) for units that have a lot of records (100K), the query is reasonably fast (~ 0.5 seconds) but for units with not too many records (100) the query is REALLY SLOW, like 15 seconds. the explain plan says INDEX SCAN BACKWARDS so I guess it is using the index. 2) when I take away the LIMIT 1 and run the query it returns the results almost immediately no matter how many records the unit has. It looks like the indexes are not being used properly when the LIMIT clause is inserted. I didn't do anything fancy when I created the indexes, just the standard CREATE INDEX statement. Ran VACUUM ANALYZE and everything. Is this common when you use the LIMIT clause? Is there a work around? Thanks, Simon
What query plans are you getting for these various combinations? regards, tom lane
On Wed, Jul 18, 2001 at 10:49:29AM -0700, Simon Stanlake wrote: > Hi, > > I have a fairly large table (1 million records) with the following > structure... > [snip] Well, it's fai;rly hard to guess at questions like this unless you provide some more details. Do you think you could post EXPLAIN outputs for both queries? > Is this common when you use the LIMIT clause? Is there a work around? It shouldn't happen but something that might work is to create a cursor and then do fetch 1; -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > It would be nice if someone came up with a certification system that > actually separated those who can barely regurgitate what they crammed over > the last few weeks from those who command secret ninja networking powers.
Thanks for the response... here's the explain for the two queries... with limit postgres=# explain select * from myTable where unitid = 2 order by datetimestamp desc limit 1; NOTICE: QUERY PLAN: Limit (cost=0.00..29.33 rows=1 width=32) -> Index Scan Backward using datetimestamp_idx on myTable (cost=0.00..159464.46 rows=5436 width=32) EXPLAIN and without limit... postgres=# explain select * from myTable where unitid = 2 order by datetimestamp desc; NOTICE: QUERY PLAN: Sort (cost=15977.53..15977.53 rows=5436 width=32) -> Index Scan using unitid_idx on myTable (cost=0.00..15640.27 rows=5436 width=32) hmmm, looks like either using the datetimestamp index or doing the backward scan could be messing it up. tried select * from (select * from myTable where unitid = 2 order by datetimestamp desc) a limit 1; and it's super fast. the explain for this one is... Limit (cost=15977.53..15977.53 rows=1 width=32) -> Subquery Scan a (cost=15977.53..15977.53 rows=5436 width=32) -> Sort (cost=15977.53..15977.53 rows=5436 width=32) -> Index Scan using unitid_idx on locationevent (cost=0.00..15640.27 rows=5436 width=32 EXPLAIN can anyone tell me what slowed down the first query? thanks, simon -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: Wednesday, July 18, 2001 2:44 PM To: Simon Stanlake Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] slow SELECT ... LIMIT query What query plans are you getting for these various combinations? regards, tom lane
"Simon Stanlake" <stanlake@hi.ca> writes: > can anyone tell me what slowed down the first query? Well, the first one is going to scan (backwards) in datetimestamp order until it finds a row with unitid = 2. The second one is going to use the unitid index to pick out just the rows with unitid = 2, and then sort them by timestamp. Depending on how many rows have unitid = 2 and how new the latest one is, I could see either of these being way faster than the other. regards, tom lane