Thread: [PERFORM] Query with no result set, really really slow adding ORBDER BY / LIMIT clause

[PERFORM] Query with no result set, really really slow adding ORBDER BY / LIMIT clause

From

Marco Renzi

Date:

20 April 2017, 10:19:23

Hi!, i've currently a big problem using ORBDER BY / LIMIT in a query with no result set.
If i add the order by/limit clause it runs really really slow.

QUERY 1 FAST:
--------------------------------

SELECT  fase.id
FROM            tipofase
JOIN       fase
ON         (fase.tipofase = tipofase.id)
WHERE   tipofase.agendafrontoffice = true

EXPLAIN ANALYZE:

Nested Loop  (cost=0.43..790.19 rows=14462 width=4) (actual time=0.079..0.079 rows=0 loops=1)

  ->  Seq Scan on tipofase  (cost=0.00..3.02 rows=1 width=4) (actual time=0.077..0.077 rows=0 loops=1)
        Filter: agendafrontoffice
        Rows Removed by Filter: 102
  ->  Index Only Scan using fase_test_prova_4 on fase  (cost=0.43..595.59 rows=19158 width=8) (never executed)
        Index Cond: (tipofase = tipofase.id)
        Heap Fetches: 0
Planning time: 0.669 ms
Execution time: 0.141 ms

---

It's perfect because it starts from tipofase, where there are no agendafrontoffice = true

fase_test_prova_4 is a btree index ON (fase.tipofase, fase.id)
fase.id is PRIMARY key on fase, 
tipofase.id is PRIMARY key on tipofase, 
fase.tipofase is FK on tipofase.id
and tipofase.agendafrontoffice is a boolean.

I've also created a btree index on tipofase.agendafrontoffice.

**fase** is a large table with 1.475.146 records. There are no rows in the table matching tipofase.agendafrontoffice = true, so the result set is empty(QUERY 1)




QUERY 2 SLOW(WITH limit and order by):
--------------------------------


SELECT  fase.id
FROM            tipofase
JOIN       fase
ON         (fase.tipofase = tipofase.id)
WHERE   tipofase.agendafrontoffice = true
ORDER BY        fase.id DESC limit 10 offset 0

Limit  (cost=0.43..149.66 rows=10 width=4) (actual time=173853.131..173853.131 rows=0 loops=1)
  ->  Nested Loop  (cost=0.43..215814.25 rows=14462 width=4) (actual time=173853.130..173853.130 rows=0 loops=1)
        Join Filter: (fase.tipofase = tipofase.id)
        ->  Index Scan Backward using test_prova_2 on fase  (cost=0.43..193684.04 rows=1475146 width=8) (actual time=1.336..173128.418 rows=1475146 loops=1)
        ->  Materialize  (cost=0.00..3.02 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=1475146)
              ->  Seq Scan on tipofase  (cost=0.00..3.02 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=1)
                    Filter: agendafrontoffice
                    Rows Removed by Filter: 102
Planning time: 0.685 ms
Execution time: 173853.221 ms

Really really slow..... looks like the planner is not doing a good job.
PostgreSQL 9.4.1, compiled by Visual C++ build 1800, 64-bit


I also run VACUUM  AND VACUUM ANALYZE on both table
I tried to play with the 
"alter table tipofase alter column agendafrontoffice set statistics 2"
but nothing.

Thanks in advance Marco

-------------------------------------------------------------------------------------------------------------------------------------------
Ing. Marco Renzi
OCA - Oracle Certified Associate Java SE7 Programmer
OCP - Oracle Certified Mysql 5 Developer

via Zegalara 57
62014 Corridonia(MC)
Mob: 3208377271

"The fastest way to change yourself is to hang out with people who are already the way you want to be" Reid Hoffman

Re: [PERFORM] Query with no result set, really really slow addingORBDER BY / LIMIT clause

From

Marco Renzi

Date:

20 April 2017, 14:16:26

Thanks Philip, yes i tried, but that is not solving, still slow. Take a look at the log.

--------------------------------------------------------------------------------------------------------------------------
Limit (cost=3.46..106.87 rows=10 width=4) (actual time=396555.327..396555.327 rows=0 loops=1)
-> Nested Loop (cost=3.46..214781.07 rows=20770 width=4) (actual time=396555.326..396555.326 rows=0 loops=1)
        Join Filter: (tipofase.id = fase.tipofase)
        -> Index Scan Backward using test_prova_2 on fase (cost=0.43..192654.24 rows=1474700 width=8) (actual time=1.147..395710.190 rows=1475146 loops=1)
        -> Materialize (cost=3.03..6.34 rows=1 width=8) (actual time=0.000..0.000 rows=0 loops=1475146)
              -> Hash Semi Join (cost=3.03..6.33 rows=1 width=8) (actual time=0.081..0.081 rows=0 loops=1)
                    Hash Cond: (tipofase.id = tipofase_1.id)
                    -> Seq Scan on tipofase (cost=0.00..3.02 rows=102 width=4) (actual time=0.003..0.003 rows=1 loops=1)
                    -> Hash (cost=3.02..3.02 rows=1 width=4) (actual time=0.064..0.064 rows=0 loops=1)
                          Buckets: 1024 Batches: 1 Memory Usage: 0kB
                          -> Seq Scan on tipofase tipofase_1 (cost=0.00..3.02 rows=1 width=4) (actual time=0.063..0.063 rows=0 loops=1)
                                Filter: agendafrontoffice
                                Rows Removed by Filter: 102
Planning time: 1.254 ms
Execution time: 396555.499 ms

--------------------------------------------------------------------------------------------------------------------------

The only way to speedup i found is this one

SELECT fase.id
FROM            tipofase
JOIN       fase
ON         (fase.tipofase = (SELECT tipofase.id FROM tipofase WHERE tipofase.agendafrontoffice = true))

ORDER BY        fase.id DESC   limit 10 offset 0

--------------------------------------------------------------------------------------------------------------------------

Limit (cost=3.45..3.58 rows=10 width=4) (actual time=0.082..0.082 rows=0 loops=1)
InitPlan 1 (returns $0)
    -> Seq Scan on tipofase tipofase_1 (cost=0.00..3.02 rows=1 width=4) (actual time=0.072..0.072 rows=0 loops=1)
          Filter: agendafrontoffice
          Rows Removed by Filter: 102
-> Nested Loop (cost=0.43..27080.93 rows=2118540 width=4) (actual time=0.081..0.081 rows=0 loops=1)
        -> Index Only Scan Backward using fase_test_prova_4 on fase (cost=0.43..595.90 rows=20770 width=4) (actual time=0.080..0.080 rows=0 loops=1)
              Index Cond: (tipofase = $0)
              Heap Fetches: 0
        -> Materialize (cost=0.00..3.53 rows=102 width=0) (never executed)
              -> Seq Scan on tipofase (cost=0.00..3.02 rows=102 width=0) (never executed)
Planning time: 0.471 ms
Execution time: 0.150 ms

--------------------------------------------------------------------------------------------------------------------------

Anyone knows?

I'm a bit worried about performance in my web app beacause sometimes filters are written dinamically at the end, and i would like to avoid these problems.

Re: [PERFORM] Query with no result set, really really slow addingORBDER BY / LIMIT clause

From

vinny

Date:

20 April 2017, 14:54:47

On 2017-04-20 13:16, Marco Renzi wrote:
> Thanks Philip, yes i tried, but that is not solving, still slow. Take
> a look at the log.
>
>
--------------------------------------------------------------------------------------------------------------------------
> Limit  (cost=3.46..106.87 rows=10 width=4) (actual
> time=396555.327..396555.327 rows=0 loops=1)
>   ->  Nested Loop  (cost=3.46..214781.07 rows=20770 width=4) (actual
> time=396555.326..396555.326 rows=0 loops=1)
>         Join Filter: (tipofase.id [1] = fase.tipofase)
>         ->  Index Scan Backward using test_prova_2 on fase
> (cost=0.43..192654.24 rows=1474700 width=8) (actual
> time=1.147..395710.190 rows=1475146 loops=1)
>         ->  Materialize  (cost=3.03..6.34 rows=1 width=8) (actual
> time=0.000..0.000 rows=0 loops=1475146)
>               ->  Hash Semi Join  (cost=3.03..6.33 rows=1 width=8)
> (actual time=0.081..0.081 rows=0 loops=1)
>                     Hash Cond: (tipofase.id [1] = tipofase_1.id [2])
>                     ->  Seq Scan on tipofase  (cost=0.00..3.02
> rows=102 width=4) (actual time=0.003..0.003 rows=1 loops=1)
>                     ->  Hash  (cost=3.02..3.02 rows=1 width=4) (actual
> time=0.064..0.064 rows=0 loops=1)
>                           Buckets: 1024  Batches: 1  Memory Usage: 0kB
>                           ->  Seq Scan on tipofase tipofase_1
> (cost=0.00..3.02 rows=1 width=4) (actual time=0.063..0.063 rows=0
> loops=1)
>                                 Filter: agendafrontoffice
>                                 Rows Removed by Filter: 102
> Planning time: 1.254 ms
> Execution time: 396555.499 ms
>
>
--------------------------------------------------------------------------------------------------------------------------
>
> THE ONLY WAY TO SPEEDUP I FOUND IS THIS ONE
>
> SELECT  fase.id [3]
> FROM            tipofase
> JOIN       fase
> ON         (fase.tipofase = (SELECT tipofase.id [1] FROM tipofase
> WHERE tipofase.agendafrontoffice = true))
>
> ORDER BY        fase.id [3] DESC   limit 10 offset 0
>
>
--------------------------------------------------------------------------------------------------------------------------
>
> Limit  (cost=3.45..3.58 rows=10 width=4) (actual time=0.082..0.082
> rows=0 loops=1)
>   InitPlan 1 (returns $0)
>     ->  Seq Scan on tipofase tipofase_1  (cost=0.00..3.02 rows=1
> width=4) (actual time=0.072..0.072 rows=0 loops=1)
>           Filter: agendafrontoffice
>           Rows Removed by Filter: 102
>   ->  Nested Loop  (cost=0.43..27080.93 rows=2118540 width=4) (actual
> time=0.081..0.081 rows=0 loops=1)
>         ->  Index Only Scan Backward using fase_test_prova_4 on fase
> (cost=0.43..595.90 rows=20770 width=4) (actual time=0.080..0.080
> rows=0 loops=1)
>               Index Cond: (tipofase = $0)
>               Heap Fetches: 0
>         ->  Materialize  (cost=0.00..3.53 rows=102 width=0) (never
> executed)
>               ->  Seq Scan on tipofase  (cost=0.00..3.02 rows=102
> width=0) (never executed)
> Planning time: 0.471 ms
> Execution time: 0.150 ms
>
>
--------------------------------------------------------------------------------------------------------------------------
>
> Anyone knows?
> I'm a bit worried about performance in my web app beacause sometimes
> filters are written dinamically at the end, and i would like to avoid
> these problems.
>


What was it that Philip suggested? I can't find his reply in the list
and you didn't quote it...

Did you try reversing the order of the tables, so join fase to tipofase,
instead of tipofase to fase.
Also, did you try a partial index on tipofase.id where
tipofase.agendafrontoffice = true?

Re: [PERFORM] Query with no result set, really really slow addingORBDER BY / LIMIT clause

From

Marco Renzi

Date:

20 April 2017, 15:16:41

Sorry Vinny, this was what Philip suggested:

Have you tried changing your query to:

SELECT id

FROM fase

WHERE tipofase IN (SELECT ID from tipofase WHERE agendafrontoffice = true)

ORDER BY id DESC

LIMIT 10 OFFSET 0

And this is my log:

--------------------------------------------------------------------------------------------------------------------------
Limit (cost=3.46..106.87 rows=10 width=4) (actual
time=396555.327..396555.327 rows=0 loops=1)
-> Nested Loop (cost=3.46..214781.07 rows=20770 width=4) (actual
time=396555.326..396555.326 rows=0 loops=1)
Join Filter: (tipofase.id [1] = fase.tipofase)
-> Index Scan Backward using test_prova_2 on fase
(cost=0.43..192654.24 rows=1474700 width=8) (actual
time=1.147..395710.190 rows=1475146 loops=1)
-> Materialize (cost=3.03..6.34 rows=1 width=8) (actual
time=0.000..0.000 rows=0 loops=1475146)
-> Hash Semi Join (cost=3.03..6.33 rows=1 width=8)
(actual time=0.081..0.081 rows=0 loops=1)
Hash Cond: (tipofase.id [1] = tipofase_1.id [2])
-> Seq Scan on tipofase (cost=0.00..3.02
rows=102 width=4) (actual time=0.003..0.003 rows=1 loops=1)
-> Hash (cost=3.02..3.02 rows=1 width=4) (actual
time=0.064..0.064 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 0kB
-> Seq Scan on tipofase tipofase_1
(cost=0.00..3.02 rows=1 width=4) (actual time=0.063..0.063 rows=0
loops=1)
Filter: agendafrontoffice
Rows Removed by Filter: 102
Planning time: 1.254 ms
Execution time: 396555.499 ms

--------------------------------------------------------------------------------------------------------------------------

2017-04-20 13:54 GMT+02:00 vinny <vinny@xs4all.nl>:

On 2017-04-20 13:16, Marco Renzi wrote:
Thanks Philip, yes i tried, but that is not solving, still slow. Take
a look at the log.

--------------------------------------------------------------------------------------------------------------------------
Limit (cost=3.46..106.87 rows=10 width=4) (actual
time=396555.327..396555.327 rows=0 loops=1)
-> Nested Loop (cost=3.46..214781.07 rows=20770 width=4) (actual
time=396555.326..396555.326 rows=0 loops=1)
Join Filter: (tipofase.id [1] = fase.tipofase)
-> Index Scan Backward using test_prova_2 on fase
(cost=0.43..192654.24 rows=1474700 width=8) (actual
time=1.147..395710.190 rows=1475146 loops=1)
-> Materialize (cost=3.03..6.34 rows=1 width=8) (actual
time=0.000..0.000 rows=0 loops=1475146)
-> Hash Semi Join (cost=3.03..6.33 rows=1 width=8)
(actual time=0.081..0.081 rows=0 loops=1)
Hash Cond: (tipofase.id [1] = tipofase_1.id [2])
-> Seq Scan on tipofase (cost=0.00..3.02
rows=102 width=4) (actual time=0.003..0.003 rows=1 loops=1)
-> Hash (cost=3.02..3.02 rows=1 width=4) (actual
time=0.064..0.064 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 0kB
-> Seq Scan on tipofase tipofase_1
(cost=0.00..3.02 rows=1 width=4) (actual time=0.063..0.063 rows=0
loops=1)
Filter: agendafrontoffice
Rows Removed by Filter: 102
Planning time: 1.254 ms
Execution time: 396555.499 ms

--------------------------------------------------------------------------------------------------------------------------

THE ONLY WAY TO SPEEDUP I FOUND IS THIS ONE

SELECT fase.id [3]
FROM tipofase
JOIN fase
ON (fase.tipofase = (SELECT tipofase.id [1] FROM tipofase
WHERE tipofase.agendafrontoffice = true))

ORDER BY fase.id [3] DESC limit 10 offset 0

--------------------------------------------------------------------------------------------------------------------------

Limit (cost=3.45..3.58 rows=10 width=4) (actual time=0.082..0.082
rows=0 loops=1)
InitPlan 1 (returns $0)
-> Seq Scan on tipofase tipofase_1 (cost=0.00..3.02 rows=1
width=4) (actual time=0.072..0.072 rows=0 loops=1)
Filter: agendafrontoffice
Rows Removed by Filter: 102
-> Nested Loop (cost=0.43..27080.93 rows=2118540 width=4) (actual
time=0.081..0.081 rows=0 loops=1)
-> Index Only Scan Backward using fase_test_prova_4 on fase
(cost=0.43..595.90 rows=20770 width=4) (actual time=0.080..0.080
rows=0 loops=1)
Index Cond: (tipofase = $0)
Heap Fetches: 0
-> Materialize (cost=0.00..3.53 rows=102 width=0) (never
executed)
-> Seq Scan on tipofase (cost=0.00..3.02 rows=102
width=0) (never executed)
Planning time: 0.471 ms
Execution time: 0.150 ms

--------------------------------------------------------------------------------------------------------------------------

Anyone knows?
I'm a bit worried about performance in my web app beacause sometimes
filters are written dinamically at the end, and i would like to avoid
these problems.

What was it that Philip suggested? I can't find his reply in the list and you didn't quote it...

Did you try reversing the order of the tables, so join fase to tipofase, instead of tipofase to fase.
Also, did you try a partial index on tipofase.id where tipofase.agendafrontoffice = true?

"The fastest way to change yourself is to hang out with people who are already the way you want to be" Reid Hoffman

Re: [PERFORM] Query with no result set, really really slow addingORBDER BY / LIMIT clause

From

Pavel Stehule

Date:

20 April 2017, 18:57:02

2017-04-20 9:19 GMT+02:00 Marco Renzi <renzi.mrc@gmail.com>:

Hi!, i've currently a big problem using ORBDER BY / LIMIT in a query with no result set.
If i add the order by/limit clause it runs really really slow.

QUERY 1 FAST:
--------------------------------

SELECT  fase.id
FROM            tipofase
JOIN       fase
ON         (fase.tipofase = tipofase.id)
WHERE   tipofase.agendafrontoffice = true

EXPLAIN ANALYZE:

Nested Loop  (cost=0.43..790.19 rows=14462 width=4) (actual time=0.079..0.079 rows=0 loops=1)

  ->  Seq Scan on tipofase  (cost=0.00..3.02 rows=1 width=4) (actual time=0.077..0.077 rows=0 loops=1)
        Filter: agendafrontoffice
        Rows Removed by Filter: 102
  ->  Index Only Scan using fase_test_prova_4 on fase  (cost=0.43..595.59 rows=19158 width=8) (never executed)
        Index Cond: (tipofase = tipofase.id)
        Heap Fetches: 0
Planning time: 0.669 ms
Execution time: 0.141 ms

---

It's perfect because it starts from tipofase, where there are no agendafrontoffice = true

fase_test_prova_4 is a btree index ON (fase.tipofase, fase.id)
fase.id is PRIMARY key on fase, 
tipofase.id is PRIMARY key on tipofase, 
fase.tipofase is FK on tipofase.id
and tipofase.agendafrontoffice is a boolean.

I've also created a btree index on tipofase.agendafrontoffice.

**fase** is a large table with 1.475.146 records. There are no rows in the table matching tipofase.agendafrontoffice = true, so the result set is empty(QUERY 1)




QUERY 2 SLOW(WITH limit and order by):
--------------------------------


SELECT  fase.id
FROM            tipofase
JOIN       fase
ON         (fase.tipofase = tipofase.id)
WHERE   tipofase.agendafrontoffice = true
ORDER BY        fase.id DESC limit 10 offset 0

Limit  (cost=0.43..149.66 rows=10 width=4) (actual time=173853.131..173853.131 rows=0 loops=1)
  ->  Nested Loop  (cost=0.43..215814.25 rows=14462 width=4) (actual time=173853.130..173853.130 rows=0 loops=1)
        Join Filter: (fase.tipofase = tipofase.id)
        ->  Index Scan Backward using test_prova_2 on fase  (cost=0.43..193684.04 rows=1475146 width=8) (actual time=1.336..173128.418 rows=1475146 loops=1)
        ->  Materialize  (cost=0.00..3.02 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=1475146)
              ->  Seq Scan on tipofase  (cost=0.00..3.02 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=1)
                    Filter: agendafrontoffice
                    Rows Removed by Filter: 102
Planning time: 0.685 ms
Execution time: 173853.221 ms

I am afraid so is not possible to solve this issue by one query. In this case the planner expects early stop due finding few values. But because there are not any value, the LIMIT clause has not any benefit in executor time, but the planner is messed. Maybe try to increase LIMIT to some higher value .. 1000, 10000 so planner don't fall to this trap. PostgreSQL statistics are about most common values, but the values without any occurrence are not well registered by statistics.

Regards

Pavel

Really really slow..... looks like the planner is not doing a good job.
PostgreSQL 9.4.1, compiled by Visual C++ build 1800, 64-bit


I also run VACUUM  AND VACUUM ANALYZE on both table
I tried to play with the 
"alter table tipofase alter column agendafrontoffice set statistics 2"
but nothing.
Thanks in advance Marco
--
-------------------------------------------------------------------------------------------------------------------------------------------
Ing. Marco Renzi
OCA - Oracle Certified Associate Java SE7 Programmer
OCP - Oracle Certified Mysql 5 Developer

via Zegalara 57
62014 Corridonia(MC)
Mob: 3208377271

"The fastest way to change yourself is to hang out with people who are already the way you want to be" Reid Hoffman

Re: [PERFORM] Query with no result set, really really slow addingORBDER BY / LIMIT clause

From

Pavel Stehule

Date:

20 April 2017, 19:05:59

2017-04-20 17:57 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:

2017-04-20 9:19 GMT+02:00 Marco Renzi <renzi.mrc@gmail.com>:

Hi!, i've currently a big problem using ORBDER BY / LIMIT in a query with no result set.
If i add the order by/limit clause it runs really really slow.

QUERY 1 FAST:
--------------------------------

SELECT  fase.id
FROM            tipofase
JOIN       fase
ON         (fase.tipofase = tipofase.id)
WHERE   tipofase.agendafrontoffice = true

EXPLAIN ANALYZE:

Nested Loop  (cost=0.43..790.19 rows=14462 width=4) (actual time=0.079..0.079 rows=0 loops=1)

  ->  Seq Scan on tipofase  (cost=0.00..3.02 rows=1 width=4) (actual time=0.077..0.077 rows=0 loops=1)
        Filter: agendafrontoffice
        Rows Removed by Filter: 102
  ->  Index Only Scan using fase_test_prova_4 on fase  (cost=0.43..595.59 rows=19158 width=8) (never executed)
        Index Cond: (tipofase = tipofase.id)
        Heap Fetches: 0
Planning time: 0.669 ms
Execution time: 0.141 ms

---

It's perfect because it starts from tipofase, where there are no agendafrontoffice = true

fase_test_prova_4 is a btree index ON (fase.tipofase, fase.id)
fase.id is PRIMARY key on fase, 
tipofase.id is PRIMARY key on tipofase, 
fase.tipofase is FK on tipofase.id
and tipofase.agendafrontoffice is a boolean.

I've also created a btree index on tipofase.agendafrontoffice.

**fase** is a large table with 1.475.146 records. There are no rows in the table matching tipofase.agendafrontoffice = true, so the result set is empty(QUERY 1)




QUERY 2 SLOW(WITH limit and order by):
--------------------------------


SELECT  fase.id
FROM            tipofase
JOIN       fase
ON         (fase.tipofase = tipofase.id)
WHERE   tipofase.agendafrontoffice = true
ORDER BY        fase.id DESC limit 10 offset 0

Limit  (cost=0.43..149.66 rows=10 width=4) (actual time=173853.131..173853.131 rows=0 loops=1)
  ->  Nested Loop  (cost=0.43..215814.25 rows=14462 width=4) (actual time=173853.130..173853.130 rows=0 loops=1)
        Join Filter: (fase.tipofase = tipofase.id)
        ->  Index Scan Backward using test_prova_2 on fase  (cost=0.43..193684.04 rows=1475146 width=8) (actual time=1.336..173128.418 rows=1475146 loops=1)
        ->  Materialize  (cost=0.00..3.02 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=1475146)
              ->  Seq Scan on tipofase  (cost=0.00..3.02 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=1)
                    Filter: agendafrontoffice
                    Rows Removed by Filter: 102
Planning time: 0.685 ms
Execution time: 173853.221 ms

Regards

It can looks strange, but it can work

SELECT *

FROM (your query ORDER BY .. OFFSET 0 LIMIT 10000) s

ORDER BY ...

LIMIT 10;

Regards

Pavel

Pavel

Really really slow..... looks like the planner is not doing a good job.
PostgreSQL 9.4.1, compiled by Visual C++ build 1800, 64-bit

I also run VACUUM AND VACUUM ANALYZE on both table
I tried to play with the
"alter table tipofase alter column agendafrontoffice set statistics 2"
but nothing.

Thanks in advance Marco

--
-------------------------------------------------------------------------------------------------------------------------------------------
Ing. Marco Renzi
OCA - Oracle Certified Associate Java SE7 Programmer
OCP - Oracle Certified Mysql 5 Developer

via Zegalara 57
62014 Corridonia(MC)
Mob: 3208377271

"The fastest way to change yourself is to hang out with people who are already the way you want to be" Reid Hoffman

Re: [PERFORM] Query with no result set, really really slow addingORBDER BY / LIMIT clause

From

Marco Renzi

Date:

21 April 2017, 09:49:54

This could look strange, but is fast as hell!

The main problem is:

Is everytime ok doing query like this with order by and limit? Is ok using an upperlimit to 1.000.000.000 records?

SELECT * FROM (
SELECT fase.id
FROM            tipofase
JOIN       fase
ON         (fase.tipofase = tipofase.id)
WHERE agendafrontoffice = true
ORDER BY        fase.id DESC   limit 1000000000 offset 0
) A
ORDER BY        A.id DESC   limit 10 offset 0

2017-04-20 18:05 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:

I am afraid so is not possible to solve this issue by one query. In this case the planner expects early stop due finding few values. But because there are not any value, the LIMIT clause has not any benefit in executor time, but the planner is messed. Maybe try to increase LIMIT to some higher value .. 1000, 10000 so planner don't fall to this trap. PostgreSQL statistics are about most common values, but the values without any occurrence are not well registered by statistics.

Regards

It can looks strange, but it can work

SELECT *
FROM (your query ORDER BY .. OFFSET 0 LIMIT 10000) s
ORDER BY ...
LIMIT 10;

Regards

Pavel

Re: [PERFORM] Query with no result set, really really slow addingORBDER BY / LIMIT clause

From

Pavel Stehule

Date:

21 April 2017, 09:56:17

2017-04-21 8:49 GMT+02:00 Marco Renzi <renzi.mrc@gmail.com>:

This could look strange, but is fast as hell!
The main problem is:
Is everytime ok doing query like this with order by and limit? Is ok using an upperlimit to 1.000.000.000 records?

I am thinking so limit 10000 should be ok. Too big number can be messy for optimizer similarly like too small number.

The planner is driven by statistics - and the statistics are not perfect - usually it is working on 80% - like weather forecasting.

Usually it is working, but sometimes not.

Regards

Pavel

SELECT * FROM (
SELECT fase.id
FROM            tipofase
JOIN       fase
ON         (fase.tipofase = tipofase.id)
WHERE agendafrontoffice = true
ORDER BY        fase.id DESC   limit 1000000000 offset 0
) A
ORDER BY        A.id DESC   limit 10 offset 0

2017-04-20 18:05 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:

I am afraid so is not possible to solve this issue by one query. In this case the planner expects early stop due finding few values. But because there are not any value, the LIMIT clause has not any benefit in executor time, but the planner is messed. Maybe try to increase LIMIT to some higher value .. 1000, 10000 so planner don't fall to this trap. PostgreSQL statistics are about most common values, but the values without any occurrence are not well registered by statistics.

Regards

It can looks strange, but it can work

SELECT *
FROM (your query ORDER BY .. OFFSET 0 LIMIT 10000) s
ORDER BY ...
LIMIT 10;

Regards

Pavel

Re: [PERFORM] Query with no result set, really really slow addingORBDER BY / LIMIT clause

From

Marco Renzi

Date:

21 April 2017, 10:05:36

I am thinking so limit 10000 should be ok. Too big number can be messy for optimizer similarly like too small number.

The planner is driven by statistics - and the statistics are not perfect - usually it is working on 80% - like weather forecasting.

Usually it is working, but sometimes not.

Regards

Pavel

Thanks Pavel, i almost found two solutions at the end:

One is to use an inner limit as you said, and the other, when you just know what the filter is,
is to try to join with SELECTS that have to be executed first from the planner.

Eg
SELECT fase.id
FROM            tipofase
JOIN       fase
ON         (fase.tipofase = (SELECT tipofase.id FROM tipofase WHERE tipofase.agendafrontoffice = true))

ORDER BY        fase.id DESC   limit 10 offset 0

Thanks for the help

--

"The fastest way to change yourself is to hang out with people who are already the way you want to be" Reid Hoffman

Re: [PERFORM] Query with no result set, really really slow addingORBDER BY / LIMIT clause

From

Pavel Stehule

Date:

21 April 2017, 10:12:53

2017-04-21 9:05 GMT+02:00 Marco Renzi <renzi.mrc@gmail.com>:

I am thinking so limit 10000 should be ok. Too big number can be messy for optimizer similarly like too small number.

The planner is driven by statistics - and the statistics are not perfect - usually it is working on 80% - like weather forecasting.

Usually it is working, but sometimes not.

Regards

Pavel

Thanks Pavel, i almost found two solutions at the end:
One is to use an inner limit as you said, and the other, when you just know what the filter is,
is to try to join with SELECTS that have to be executed first from the planner.

Eg
SELECT fase.id
FROM            tipofase
JOIN       fase
ON         (fase.tipofase = (SELECT tipofase.id FROM tipofase WHERE tipofase.agendafrontoffice = true))

ORDER BY        fase.id DESC   limit 10 offset 0

Thanks for the help

yes, sometimes when the data are not homogeneous more queries are necessary

Regards

Pavel

--
-------------------------------------------------------------------------------------------------------------------------------------------
Ing. Marco Renzi
OCA - Oracle Certified Associate Java SE7 Programmer
OCP - Oracle Certified Mysql 5 Developer

via Zegalara 57
62014 Corridonia(MC)
Mob: 3208377271

"The fastest way to change yourself is to hang out with people who are already the way you want to be" Reid Hoffman