Thread: trick the query optimiser to skip some optimisations

trick the query optimiser to skip some optimisations

From

Дмитрий Шалашов

Date:

29 January 2014, 23:38:26

Hi!

I have a table called 'feed'. It's a big table accessed by many types of queries, so I have quite a lot of indices on it.

Those that are relevant looks like this:

"feed_user_id_active_id_added_idx" btree (user_id, active_id, added)
"feed_user_id_added_idx" btree (user_id, added DESC)
"feed_user_id_added_idx2" btree (user_id, added DESC) WHERE active_id = user_id AND type = 1

last one is very small and tailored for the specific query.

"added" field is timestamp, everything else is integers.

That specific query looks like this:

SELECT * FROM feed WHERE user_id = ? AND type = 1 AND active_id = user_id ORDER BY added DESC LIMIT 31;

But it doesn't use the last index. EXPLAIN shows this:

Limit (cost=0.00..463.18 rows=31 width=50)
   -> Index Scan Backward using feed_user_id_active_id_added_idx on user_feed (cost=0.00..851.66 rows=57 width=50)
         Index Cond: ((user_id = 7) AND (active_id = 7))
         Filter: (type = 1)

So as we can see optimiser changes "active_id = user_id" to "active_id = <whatever value user_id takes>". And it brokes my nice fast partial index :(

Can I do something here so optimiser would use the feed_user_id_added_idx2 index? It's around ten times smaller than the 'generic' feed_user_id_active_id_added_idx index.

I have PostgreSQL 9.2.6 on Debian.

Best regards,
Dmitriy Shalashov

Re: trick the query optimiser to skip some optimisations

From

Jeff Janes

Date:

29 January 2014, 23:50:05

On Wed, Jan 29, 2014 at 3:38 PM, Дмитрий Шалашов <skaurus@gmail.com> wrote:

"feed_user_id_added_idx2" btree (user_id, added DESC) WHERE active_id = user_id AND type = 1

...

SELECT * FROM feed WHERE user_id = ? AND type = 1 AND active_id = user_id ORDER BY added DESC LIMIT 31;

But it doesn't use the last index. EXPLAIN shows this:

Limit (cost=0.00..463.18 rows=31 width=50)
   -> Index Scan Backward using feed_user_id_active_id_added_idx on user_feed (cost=0.00..851.66 rows=57 width=50)
         Index Cond: ((user_id = 7) AND (active_id = 7))
         Filter: (type = 1)

So as we can see optimiser changes "active_id = user_id" to "active_id = <whatever value user_id takes>". And it brokes my nice fast partial index :(
Can I do something here so optimiser would use the feed_user_id_added_idx2 index? It's around ten times smaller than the 'generic' feed_user_id_active_id_added_idx index.

How about "where user_id+0=?"

Cheers,

Jeff

Re: trick the query optimiser to skip some optimisations

From

Дмитрий Шалашов

Date:

30 January 2014, 00:17:48

Thanks for the tip!

Well, index is now used but...

Limit (cost=264291.67..264291.75 rows=31 width=50)
   -> Sort (cost=264291.67..264292.80 rows=453 width=50)
         Sort Key: added
         -> Bitmap Heap Scan on feed (cost=1850.99..264278.18 rows=453 width=50)
               Recheck Cond: ((active_id = user_id) AND (type = 1))
               Filter: ((user_id + 0) = 7)
               -> Bitmap Index Scan on feed_user_id_added_idx2 (cost=0.00..1850.88 rows=90631 width=0)

Best regards,
Dmitriy Shalashov

2014-01-30 Jeff Janes <jeff.janes@gmail.com>

On Wed, Jan 29, 2014 at 3:38 PM, Дмитрий Шалашов <skaurus@gmail.com> wrote:

"feed_user_id_added_idx2" btree (user_id, added DESC) WHERE active_id = user_id AND type = 1

...

SELECT * FROM feed WHERE user_id = ? AND type = 1 AND active_id = user_id ORDER BY added DESC LIMIT 31;

But it doesn't use the last index. EXPLAIN shows this:

Limit (cost=0.00..463.18 rows=31 width=50)
   -> Index Scan Backward using feed_user_id_active_id_added_idx on user_feed (cost=0.00..851.66 rows=57 width=50)
         Index Cond: ((user_id = 7) AND (active_id = 7))
         Filter: (type = 1)

So as we can see optimiser changes "active_id = user_id" to "active_id = <whatever value user_id takes>". And it brokes my nice fast partial index :(
Can I do something here so optimiser would use the feed_user_id_added_idx2 index? It's around ten times smaller than the 'generic' feed_user_id_active_id_added_idx index.

How about "where user_id+0=?"

Cheers,

Jeff

Re: trick the query optimiser to skip some optimisations

From

Sergey Konoplev

Date:

30 January 2014, 20:36:51

On Wed, Jan 29, 2014 at 3:38 PM, Дмитрий Шалашов <skaurus@gmail.com> wrote:
> I have a table called 'feed'. It's a big table accessed by many types of
> queries, so I have quite a lot of indices on it.
>
> Those that are relevant looks like this:
>
> "feed_user_id_active_id_added_idx" btree (user_id, active_id, added)
> "feed_user_id_added_idx" btree (user_id, added DESC)
> "feed_user_id_added_idx2" btree (user_id, added DESC) WHERE active_id =
> user_id AND type = 1
>
> last one is very small and tailored for the specific query.
> "added" field is timestamp, everything else is integers.
[..]
>  Limit  (cost=0.00..463.18 rows=31 width=50)
>    ->  Index Scan Backward using feed_user_id_active_id_added_idx on
> user_feed  (cost=0.00..851.66 rows=57 width=50)
>          Index Cond: ((user_id = 7) AND (active_id = 7))
>          Filter: (type = 1)
[...]
> Can I do something here so optimiser would use the feed_user_id_added_idx2
> index? It's around ten times smaller than the 'generic'
> feed_user_id_active_id_added_idx index.
>
> I have PostgreSQL 9.2.6 on Debian.

Could you please show EXPLAIN ANALYZE for both cases, the current one
and with feed_user_id_active_id_added_idx dropped?

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray.ru@gmail.com

Re: trick the query optimiser to skip some optimisations

From

Jeff Janes

Date:

30 January 2014, 22:03:39

On Wed, Jan 29, 2014 at 4:17 PM, Дмитрий Шалашов <skaurus@gmail.com> wrote:

Thanks for the tip!

Well, index is now used but...

Limit (cost=264291.67..264291.75 rows=31 width=50)
   -> Sort (cost=264291.67..264292.80 rows=453 width=50)
         Sort Key: added
         -> Bitmap Heap Scan on feed (cost=1850.99..264278.18 rows=453 width=50)
               Recheck Cond: ((active_id = user_id) AND (type = 1))
               Filter: ((user_id + 0) = 7)
               -> Bitmap Index Scan on feed_user_id_added_idx2 (cost=0.00..1850.88 rows=90631 width=0)

Ah, of course. It prevents the optimization you want, as well as the one you don't want.

This is getting very ugly, but maybe change the index to match the degenerate query:

"feed_user_id_added_idx3" btree ((user_id + 0), added DESC) WHERE active_id = user_id AND type = 1

Long term I would probably look into refactoring the table so that "active_id = user_id" is not a magical condition, like it seems to be for you currently. Maybe introduce a boolean column.

Cheers,

Jeff