Re: massive performance hit when using "Limit 1" - Mailing list pgsql-general

From Richard Huxton
Subject Re: massive performance hit when using "Limit 1"
Date
Msg-id 4395875A.6080303@archonet.com
Whole thread Raw
In response to massive performance hit when using "Limit 1"  (Rich Doughty <rich@opusvl.com>)
Responses Re: massive performance hit when using "Limit 1"
List pgsql-general
Rich Doughty wrote:
>
> This one goes nuts and doesn't return. is there any way i can
> force a query plan similar to the one above?
>
>   EXPLAIN SELECT _t.* FROM
>        tokens.ta_tokens       _t INNER JOIN
>        tokens.ta_tokens_stock _s ON _t.token_id = _s.token_id
>   WHERE
>        _s.retailer_id = '96599' AND
>        _t.value       = '10'
>   ORDER BY
>        _t.number ASC
>   LIMIT '1';
>                                                     QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------
>
>  Limit  (cost=0.00..14967.39 rows=1 width=27)
>    ->  Nested Loop  (cost=0.00..22316378.56 rows=1491 width=27)
>          ->  Index Scan using ta_tokens_number_key on ta_tokens _t
> (cost=0.00..15519868.33 rows=1488768 width=27)
>                Filter: ((value)::numeric = 10::numeric)
>          ->  Index Scan using ta_tokens_stock_pkey on ta_tokens_stock
> _s  (cost=0.00..4.55 rows=1 width=4)
>                Index Cond: (("outer".token_id)::integer =
> (_s.token_id)::integer)
>                Filter: ((retailer_id)::integer = 96599)

I *think* what's happening here is that PG thinks it will use the index
on _t.number (since you are going to sort by that anyway) and pretty
soon find a row that will:
   1. have value=10
   2. join to a row in _s with the right retailer_id
It turns out that isn't the case, and so the query takes forever.
Without knowing what "value" and "number" mean it's difficult to be
sure, but I'd guess it's the "token_id" join part that's the problem,
since at a guess a high-numbered retailer will have tokens with
high-numbered "retailer_id".

If you'd posted EXPLAIN ANALYSE then we'd be able to see what actually
did happen.

Try the same query but with a low retailer_id (100 or something) and see
if it goes a lot quicker. If that is what the problem is, try changing
the ORDER BY to something like "_s.retailer_id, _t.value, _t.number" and
see if that gives the planner a nudge in the right direction.

Failing that, a change to your indexes will almost certainly help.
--
   Richard Huxton
   Archonet Ltd

pgsql-general by date:

Previous
From: Howard Cole
Date:
Subject: Re: Unicode Corruption and upgrading to 8.0.4. to 8.1
Next
From: "Markus Wollny"
Date:
Subject: Re: Unicode Corruption and upgrading to 8.0.4. to 8.1