Thread: trick the query optimiser to skip some optimisations
Hi!
I have a table called 'feed'. It's a big table accessed by many types of queries, so I have quite a lot of indices on it.Those that are relevant looks like this:
"feed_user_id_active_id_added_idx" btree (user_id, active_id, added)
"feed_user_id_added_idx" btree (user_id, added DESC)
"feed_user_id_added_idx2" btree (user_id, added DESC) WHERE active_id = user_id AND type = 1
last one is very small and tailored for the specific query.
"added" field is timestamp, everything else is integers.That specific query looks like this:
But it doesn't use the last index. EXPLAIN shows this:
Limit (cost=0.00..463.18 rows=31 width=50)
-> Index Scan Backward using feed_user_id_active_id_added_idx on user_feed (cost=0.00..851.66 rows=57 width=50)
Index Cond: ((user_id = 7) AND (active_id = 7))
Filter: (type = 1)
Limit (cost=0.00..463.18 rows=31 width=50)
-> Index Scan Backward using feed_user_id_active_id_added_idx on user_feed (cost=0.00..851.66 rows=57 width=50)
Index Cond: ((user_id = 7) AND (active_id = 7))
Filter: (type = 1)
So as we can see optimiser changes "active_id = user_id" to "active_id = <whatever value user_id takes>". And it brokes my nice fast partial index :(
Can I do something here so optimiser would use the feed_user_id_added_idx2 index? It's around ten times smaller than the 'generic' feed_user_id_active_id_added_idx index.
I have PostgreSQL 9.2.6 on Debian.
Dmitriy Shalashov
On Wed, Jan 29, 2014 at 3:38 PM, Дмитрий Шалашов <skaurus@gmail.com> wrote:
"feed_user_id_added_idx2" btree (user_id, added DESC) WHERE active_id = user_id AND type = 1
...
SELECT * FROM feed WHERE user_id = ? AND type = 1 AND active_id = user_id ORDER BY added DESC LIMIT 31;But it doesn't use the last index. EXPLAIN shows this:
Limit (cost=0.00..463.18 rows=31 width=50)
-> Index Scan Backward using feed_user_id_active_id_added_idx on user_feed (cost=0.00..851.66 rows=57 width=50)
Index Cond: ((user_id = 7) AND (active_id = 7))
Filter: (type = 1)So as we can see optimiser changes "active_id = user_id" to "active_id = <whatever value user_id takes>". And it brokes my nice fast partial index :(Can I do something here so optimiser would use the feed_user_id_added_idx2 index? It's around ten times smaller than the 'generic' feed_user_id_active_id_added_idx index.
How about "where user_id+0=?"
Cheers,
Jeff
Thanks for the tip!
Well, index is now used but...Limit (cost=264291.67..264291.75 rows=31 width=50)
-> Sort (cost=264291.67..264292.80 rows=453 width=50)
Sort Key: added
-> Bitmap Heap Scan on feed (cost=1850.99..264278.18 rows=453 width=50)
Recheck Cond: ((active_id = user_id) AND (type = 1))
Filter: ((user_id + 0) = 7)
-> Bitmap Index Scan on feed_user_id_added_idx2 (cost=0.00..1850.88 rows=90631 width=0)
Dmitriy Shalashov
2014-01-30 Jeff Janes <jeff.janes@gmail.com>
On Wed, Jan 29, 2014 at 3:38 PM, Дмитрий Шалашов <skaurus@gmail.com> wrote:"feed_user_id_added_idx2" btree (user_id, added DESC) WHERE active_id = user_id AND type = 1...SELECT * FROM feed WHERE user_id = ? AND type = 1 AND active_id = user_id ORDER BY added DESC LIMIT 31;But it doesn't use the last index. EXPLAIN shows this:
Limit (cost=0.00..463.18 rows=31 width=50)
-> Index Scan Backward using feed_user_id_active_id_added_idx on user_feed (cost=0.00..851.66 rows=57 width=50)
Index Cond: ((user_id = 7) AND (active_id = 7))
Filter: (type = 1)So as we can see optimiser changes "active_id = user_id" to "active_id = <whatever value user_id takes>". And it brokes my nice fast partial index :(Can I do something here so optimiser would use the feed_user_id_added_idx2 index? It's around ten times smaller than the 'generic' feed_user_id_active_id_added_idx index.How about "where user_id+0=?"Cheers,Jeff
On Wed, Jan 29, 2014 at 3:38 PM, Дмитрий Шалашов <skaurus@gmail.com> wrote: > I have a table called 'feed'. It's a big table accessed by many types of > queries, so I have quite a lot of indices on it. > > Those that are relevant looks like this: > > "feed_user_id_active_id_added_idx" btree (user_id, active_id, added) > "feed_user_id_added_idx" btree (user_id, added DESC) > "feed_user_id_added_idx2" btree (user_id, added DESC) WHERE active_id = > user_id AND type = 1 > > last one is very small and tailored for the specific query. > "added" field is timestamp, everything else is integers. [..] > Limit (cost=0.00..463.18 rows=31 width=50) > -> Index Scan Backward using feed_user_id_active_id_added_idx on > user_feed (cost=0.00..851.66 rows=57 width=50) > Index Cond: ((user_id = 7) AND (active_id = 7)) > Filter: (type = 1) [...] > Can I do something here so optimiser would use the feed_user_id_added_idx2 > index? It's around ten times smaller than the 'generic' > feed_user_id_active_id_added_idx index. > > I have PostgreSQL 9.2.6 on Debian. Could you please show EXPLAIN ANALYZE for both cases, the current one and with feed_user_id_active_id_added_idx dropped? -- Kind regards, Sergey Konoplev PostgreSQL Consultant and DBA http://www.linkedin.com/in/grayhemp +1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979 gray.ru@gmail.com
On Wed, Jan 29, 2014 at 4:17 PM, Дмитрий Шалашов <skaurus@gmail.com> wrote:
Thanks for the tip!Well, index is now used but...
Limit (cost=264291.67..264291.75 rows=31 width=50)
-> Sort (cost=264291.67..264292.80 rows=453 width=50)
Sort Key: added
-> Bitmap Heap Scan on feed (cost=1850.99..264278.18 rows=453 width=50)
Recheck Cond: ((active_id = user_id) AND (type = 1))
Filter: ((user_id + 0) = 7)
-> Bitmap Index Scan on feed_user_id_added_idx2 (cost=0.00..1850.88 rows=90631 width=0)
Ah, of course. It prevents the optimization you want, as well as the one you don't want.
This is getting very ugly, but maybe change the index to match the degenerate query:
"feed_user_id_added_idx3" btree ((user_id + 0), added DESC) WHERE active_id = user_id AND type = 1
Long term I would probably look into refactoring the table so that "active_id = user_id" is not a magical condition, like it seems to be for you currently. Maybe introduce a boolean column.
Cheers,
Jeff