Thread: Which indexes does a query use?
In pg v7.4.5, I have this query:- select * from activities where activity_user_id = 2 and activity_type = 1 and activity_ts < now() order by activity_ts desc limit 1; where activity_user_id is a non-unique index and activity_ts is a non-unique index. When I explain it, I get:- Limit (cost=46.33..46.34 rows=1 width=58) -> Sort (cost=46.33..46.34 rows=1 width=58) Sort Key: activity_ts -> Seq Scan on activities (cost=0.00..46.32 rows=1 width=58) Filter: ((activity_user_id = 2) AND (activity_type = 1) AND ((activity_ts)::timestamp with time zone < now())) If I'm reading this right, it's telling me that is NOT using any indexes. Clearly, this is wrong. I would have expected that index on activity_user_id would have been used to help find all the records efficiently. Chris -- Chris Velevitch Manager - Sydney Flash Platform Developers Group www.flashdev.org.au
Chris Velevitch wrote: > In pg v7.4.5, I have this query:- > > select * > from activities > where activity_user_id = 2 and activity_type = 1 and activity_ts < now() > order by activity_ts desc > limit 1; > > where activity_user_id is a non-unique index and activity_ts is a > non-unique index. > > When I explain it, I get:- > > Limit (cost=46.33..46.34 rows=1 width=58) > -> Sort (cost=46.33..46.34 rows=1 width=58) > Sort Key: activity_ts > -> Seq Scan on activities (cost=0.00..46.32 rows=1 width=58) > Filter: ((activity_user_id = 2) AND (activity_type = 1) > AND ((activity_ts)::timestamp with time zone < now())) > > If I'm reading this right, it's telling me that is NOT using any indexes. > > Clearly, this is wrong. I would have expected that index on > activity_user_id would have been used to help find all the records > efficiently. Not necessarily. How many rows are there in the table at the moment. If pg uses and index, it first has to get the index page, then get the heap page. So if you have a small number of blocks in the heap it's actually cheaper to just scan the heap. I would guess the heap is small by the fact that the seq scan only find one row, and finds it in 46.32 units. The row width isn't high either and that means you get good block packing. Probably 80-100 row's per block. If you post "explain analyze" instead of "explain" and possibly the number row in that table, we might be able to help further, but that is my best guess from the information given. Regards Russell Smith > > > Chris > -- > Chris Velevitch > Manager - Sydney Flash Platform Developers Group > www.flashdev.org.au > > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq > >
Russell Smith <mr-russ@pws.com.au> writes: > Chris Velevitch wrote: >> If I'm reading this right, it's telling me that is NOT using any indexes. >> Clearly, this is wrong. > Not necessarily. > ... > If you post "explain analyze" instead of "explain" and possibly the > number row in that table, we might be able to help further, but that is > my best guess from the information given. Also, has the table been vacuumed/analyzed lately? The planner clearly thinks it is small, but that information could be out of date (and 7.x releases aren't going to notice if the table has gotten bigger since the last vacuum or analyze). regards, tom lane
On 2/27/06, Russell Smith <mr-russ@pws.com.au> wrote: > Not necessarily. How many rows are there in the table at the moment. > If pg uses and index, it first has to get the index page, then get the > heap page. So if you have a small number of blocks in the heap it's > actually cheaper to just scan the heap. I would guess the heap is small > by the fact that the seq scan only find one row, and finds it in 46.32 > units. The row width isn't high either and that means you get good > block packing. Probably 80-100 row's per block. It's about 200 records. Are you saying that the strategy pg uses is dynamic, in that as the size of the table grows the strategy changes? (here is the explain analyze output) Limit (cost=46.33..46.34 rows=1 width=58) (actual time=2.000..2.000 rows=1 loops=1) -> Sort (cost=46.33..46.34 rows=1 width=58) (actual time=2.000..2.000 rows=1 loops=1) Sort Key: activity_ts -> Seq Scan on activities (cost=0.00..46.32 rows=1 width=58) (actual time=0.000..2.000 rows=207 loops=1) Filter: ((activity_user_id = 2) AND (activity_type = 1) AND ((activity_ts)::timestamp with time zone < now())) Total runtime: 3.000 ms (6 rows) Chris -- Chris Velevitch Manager - Sydney Flash Platform Developers Group www.flashdev.org.au
Chris Velevitch wrote: > On 2/27/06, Russell Smith <mr-russ@pws.com.au> wrote: > >>Not necessarily. How many rows are there in the table at the moment. >>If pg uses and index, it first has to get the index page, then get the >>heap page. So if you have a small number of blocks in the heap it's >>actually cheaper to just scan the heap. I would guess the heap is small >>by the fact that the seq scan only find one row, and finds it in 46.32 >>units. The row width isn't high either and that means you get good >>block packing. Probably 80-100 row's per block. > > It's about 200 records. > > Are you saying that the strategy pg uses is dynamic, in that as the > size of the table grows the strategy changes? Yes, it does. For 200 rows, it's just as quick for it to look at every row. If you have 20,000 - then that's a different case. -- Postgresql & php tutorials http://www.designmagick.com/
Chris Velevitch wrote: > Are you saying that the strategy pg uses is dynamic, in that as the > size of the table grows the strategy changes? The planner is quite dynamic, and what strategy it comes up with will depend not just on the size of the table, but other things as well, even on the particular constants in your comparisons. For instance, I think your original query had: where activity_user_id = 2 If the table statistics show that 2 is a very common value for that column, the planner will not use the index, as it will not save enough. (I think I have that right.) - John D. Burger MITRE