Thread: limit order by performance issue

limit order by performance issue

From
Pedro Jiménez
Date:
Hello,
   I'm trying to do a simple SQL query over Postgresl 9.0 running on Ubuntu.

I have a large table (over 100 million records) with three fields,
id_signal (bigint), time_stamp (timestamp) and var_value (float).

My query looks like this:

select var_value from ism_floatvalues where id_signal = 29660 order by
time_stamp desc limit 1;

So I want to select the last value from a determinated ID (is_signal).

This query runs FOREVER, while if I delete "limit 1" it runs instantly....

Any help?

Regards.




Re: limit order by performance issue

From
Pavel Stehule
Date:
2012/10/15 Pedro Jiménez <p.jimenez@ismsolar.com>:
> Hello,
>   I'm trying to do a simple SQL query over Postgresl 9.0 running on Ubuntu.
>
> I have a large table (over 100 million records) with three fields, id_signal
> (bigint), time_stamp (timestamp) and var_value (float).
>
> My query looks like this:
>
> select var_value from ism_floatvalues where id_signal = 29660 order by
> time_stamp desc limit 1;
>
> So I want to select the last value from a determinated ID (is_signal).
>
> This query runs FOREVER, while if I delete "limit 1" it runs instantly....

did you ANALYZE your tables?

Can you send EXPLAIN ANALYZE result of both queries?

Regards

Pavel Stehule


>
> Any help?
>
> Regards.
>
>
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance


Re: limit order by performance issue

From
Shaun Thomas
Date:
On 10/15/2012 12:44 PM, Pedro Jiménez wrote:

> select var_value from ism_floatvalues where id_signal = 29660 order by
> time_stamp desc limit 1;

Well, we'd have to see an EXPLAIN plan to really know what's going on
here, but it often boils down to the planner being overly optimistic
when low limits are specified. I bet you have an index on time_stamp,
don't you?

In that case, the planner would reverse index-scan that index,
estimating that the chances of it finding ID 29660 are less expensive
than fetching all of the rows that match the ID directly, and throwing
away all but 1 row. Remember, it would have to read all of those values
to know which is the most recent.

You can fix this a couple of ways:

1. Put a two-column index on these values:

CREATE INDEX idx_ordered_signal
     ON ism_floatvalues (id_signal, time_stamp DESC);

Which turns any request for that particular combo into a single index fetch.

2. You can trick the planner by introducing an optimization fence:

SELECT var_value
   FROM (
          SELECT var_value, time_stamp
            FROM ism_floatvalues
           WHERE id_signal = 29660
          OFFSET 0
        )
  ORDER BY time_stamp DESC
  LIMIT 1;

Quite a few people will probably grouse at me for giving you that as an
option, but it does work better than LIMIT 1 more often than it probably
should.


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
sthomas@optionshouse.com

______________________________________________

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email


Re: limit order by performance issue

From
Karl Denninger
Date:
Put an index on time_stamp (I assume there is one on id_signal already)

On 10/15/2012 12:44 PM, Pedro Jiménez wrote:
Hello,
  I'm trying to do a simple SQL query over Postgresl 9.0 running on Ubuntu.

I have a large table (over 100 million records) with three fields, id_signal (bigint), time_stamp (timestamp) and var_value (float).

My query looks like this:

select var_value from ism_floatvalues where id_signal = 29660 order by time_stamp desc limit 1;

So I want to select the last value from a determinated ID (is_signal).

This query runs FOREVER, while if I delete "limit 1" it runs instantly....

Any help?

Regards.





--
-- Karl Denninger
The Market Ticker ®
Cuda Systems LLC

Re: limit order by performance issue

From
Marti Raudsepp
Date:
On Tue, Oct 16, 2012 at 10:47 PM, Karl Denninger <karl@denninger.net> wrote:
> Put an index on time_stamp (I assume there is one on id_signal already)

Well the optimal index for this particular query would include both columns:
(id_signal, time_stamp) -- in this order.

Additionally, if you want to take advantage of the index-only scans
feature, add the SELECTed column too:
(id_signal, time_stamp, var_value)

Regards,
Marti


Re: limit order by performance issue

From
Pedro Jiménez Pérez
Date:
For this query:

select var_value from ism_floatvalues where id_signal = 29660 order by time_stamp desc limit 1;

This is what EXPLAIN returns (can't make EXPLAIN ANALYZE because it "never" ends):

"Limit  (cost=0.00..258.58 rows=1 width=16)"
"  ->  Index Scan Backward using ism_floatvalues_index_time_stamp on ism_floatvalues  (cost=0.00..8912076.82 rows=34466 width=16)"
"        Filter: (id_signal = 29660)"

This is EXPLAIN ANALYZE without "limit 1":

"Sort  (cost=93683.39..93769.56 rows=34466 width=16) (actual time=188.643..188.650 rows=1 loops=1)"
"  Sort Key: time_stamp"
"  Sort Method:  quicksort  Memory: 17kB"
"  ->  Index Scan using ism_floatvalues_index on ism_floatvalues  (cost=0.00..90494.38 rows=34466 width=16) (actual time=188.019..188.030 rows=1 loops=1)"
"        Index Cond: (id_signal = 29660)"
"Total runtime: 189.033 ms"

Note that I have created two indexes, the first on id_signal and the second on time_stamp.
Regards.

El 16/10/2012 21:23, Pavel Stehule escribió:
2012/10/15 Pedro Jiménez <p.jimenez@ismsolar.com>:
Hello, I'm trying to do a simple SQL query over Postgresl 9.0 running on Ubuntu.

I have a large table (over 100 million records) with three fields, id_signal
(bigint), time_stamp (timestamp) and var_value (float).

My query looks like this:

select var_value from ism_floatvalues where id_signal = 29660 order by
time_stamp desc limit 1;

So I want to select the last value from a determinated ID (is_signal).

This query runs FOREVER, while if I delete "limit 1" it runs instantly....
did you ANALYZE your tables?

Can you send EXPLAIN ANALYZE result of both queries?

Regards

Pavel Stehule


Any help?

Regards.




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

--

Pedro Jiménez Pérez
p.jimenez@ismsolar.com

Innovación en Sistemas de Monitorización, S.L.
Edificio Hevimar
C/ Iván Pavlov
2 y 4 - Parcela 4 2ª Planta Local 9
Parque Tecnológico de Andalucía
29590 Campanillas (Málaga)
Tlfno. 952 02 07 13
contacto@ismsolar.com

firma_gpt.jpg, 1 kB

Antes de imprimir, piensa en tu responsabilidad y compromiso con el MEDIO AMBIENTE!

Before printing, think about your responsibility and commitment with the ENVIRONMENT!

CLÁUSULA DE CONFIDENCIALIDAD.- Este mensaje, y en su caso, cualquier fichero anexo al mismo, puede contener información confidencial o legalmente protegida (LOPD 15/1999 de 13 de Diciembre), siendo para uso exclusivo del destinatario. No hay renuncia a la confidencialidad o secreto profesional por cualquier transmisión defectuosa o errónea, y queda expresamente prohibida su divulgación, copia o distribución a terceros sin la autorización expresa del remitente. Si ha recibido este mensaje por error, se ruega lo notifique al remitente enviando un mensaje al correo electrónico contacto@ismsolar.com y proceda inmediatamente al borrado del mensaje original y de todas sus copias. Gracias por su colaboración.

Attachment

Re: limit order by performance issue

From
Claudio Freire
Date:


On Wed, Oct 17, 2012 at 6:14 AM, Pedro Jiménez Pérez <p.jimenez@ismsolar.com> wrote:
select var_value from ism_floatvalues where id_signal = 29660 order by time_stamp desc limit 1;

This is what EXPLAIN returns (can't make EXPLAIN ANALYZE because it "never" ends):

"Limit  (cost=0.00..258.58 rows=1 width=16)"
"  ->  Index Scan Backward using ism_floatvalues_index_time_stamp on ism_floatvalues  (cost=0.00..8912076.82 rows=34466 width=16)"
"        Filter: (id_signal = 29660)"

This is EXPLAIN ANALYZE without "limit 1":

Add (or modify the existing) an index on id_signal, time_stamp desc, and you're done.

It must be a case of descending time stamps not hitting the filter condition (id_signal) soon enough.