Thread: Multi-key index not beeing used - bug?
Look at this: NBET=> explain select * from account_transaction where users_id=123456 order by created desc limit 10; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..27.40 rows=10 width=213) -> Index Scan Backward using account_transaction_on_user_and_timestamp on account_transaction (cost=0.00..1189.19 rows=434width=213) Index Cond: (users_id = 123456) (3 rows) NBET=> explain select * from account_transaction where users_id=123456 order by created desc, id desc limit 10; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------ Limit (cost=1114.02..1114.04 rows=10 width=213) -> Sort (cost=1114.02..1115.10 rows=434 width=213) Sort Key: created, id -> Index Scan using account_transaction_by_users_id on account_transaction (cost=0.00..1095.01 rows=434 width=213) Index Cond: (users_id = 123456) (5 rows) In case the explains doesn't explain themself good enough: we have a transaction table with ID (primary key, serial), created (a timestamp) and a users_id. Some of the users have generated thousands of transactions, and the above query is a simplified version of the query used to show the users their last transactions. Since we have a large user base hammering our servers with this request, the speed is significant. We have indices on the users_id field and the (users_id, created)-tuple. The timestamp is set by the application and has a resolution of 1 second - so there may easily be several transactions sharing the same timestamp, but this is an exception not the rule. I suppose the developers needed to add the ID to the sort list to come around a bug, but still prefering to have the primary sorting by created to be able to use the index. One workaround here is to order only by id desc and create a new index on (users_id, id) - but I really don't like adding more indices to the transaction table.
Thanks Tobias. The difference here though, is that in terms of your database I am doing a query to select the most recent transaction for EACH user at once, not just for one user. If I do a similar query to yours to get the last transaction for a single user, my query is fast like yours. It's when I'm doing a query to get the results for all users at once that it is slow. If you try a query to get the most recent transaction of all useres at once you will run into the same problem I am having. Graham. Tobias Brox wrote: >Look at this: > >NBET=> explain select * from account_transaction where users_id=123456 order by created desc limit 10; > QUERY PLAN >------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=0.00..27.40 rows=10 width=213) > -> Index Scan Backward using account_transaction_on_user_and_timestamp on account_transaction (cost=0.00..1189.19 rows=434width=213) > Index Cond: (users_id = 123456) >(3 rows) > >NBET=> explain select * from account_transaction where users_id=123456 order by created desc, id desc limit 10; > QUERY PLAN >------------------------------------------------------------------------------------------------------------------------------ > Limit (cost=1114.02..1114.04 rows=10 width=213) > -> Sort (cost=1114.02..1115.10 rows=434 width=213) > Sort Key: created, id > -> Index Scan using account_transaction_by_users_id on account_transaction (cost=0.00..1095.01 rows=434 width=213) > Index Cond: (users_id = 123456) >(5 rows) > >In case the explains doesn't explain themself good enough: we have a >transaction table with ID (primary key, serial), created (a timestamp) >and a users_id. Some of the users have generated thousands of >transactions, and the above query is a simplified version of the query >used to show the users their last transactions. Since we have a large >user base hammering our servers with this request, the speed is >significant. > >We have indices on the users_id field and the (users_id, created)-tuple. > >The timestamp is set by the application and has a resolution of 1 second >- so there may easily be several transactions sharing the same >timestamp, but this is an exception not the rule. I suppose the >developers needed to add the ID to the sort list to come around a bug, >but still prefering to have the primary sorting by created to be able to >use the index. One workaround here is to order only by id desc and >create a new index on (users_id, id) - but I really don't like adding >more indices to the transaction table. > > >---------------------------(end of broadcast)--------------------------- >TIP 6: explain analyze is your friend > > -- Graham Davis Refractions Research Inc. gdavis@refractions.net
Tobias Brox <tobias@nordicbet.com> writes: > NBET=> explain select * from account_transaction where users_id=123456 order by created desc, id desc limit 10; > We have indices on the users_id field and the (users_id, created)-tuple. Neither of those indexes can provide the sort order the query is asking for. regards, tom lane
[Tom Lane - Wed at 04:33:54PM -0400] > > We have indices on the users_id field and the (users_id, created)-tuple. > > Neither of those indexes can provide the sort order the query is asking > for. Ah; that's understandable - the planner have two options, to do a index traversion without any extra sorting, or to take out everything and then sort. What I'd like postgres to do is to traverse the index and do some sorting for every unique value of created. Maybe such a feature can be found in future releases - like Postgres 56.3? ;-)