Thread: [PERFORM] Query with no result set, really really slow adding ORBDER BY / LIMIT clause
[PERFORM] Query with no result set, really really slow adding ORBDER BY / LIMIT clause
From
Marco Renzi
Date:
Hi!, i've currently a big problem using ORBDER BY / LIMIT in a query with no result set.
If i add the order by/limit clause it runs really really slow.
**fase** is a large table with 1.475.146 records. There are no rows in the table matching tipofase.agendafrontoffice = true, so the result set is empty(QUERY 1)
--
If i add the order by/limit clause it runs really really slow.
QUERY 1 FAST:
--------------------------------
SELECT fase.id
FROM tipofase
JOIN fase
ON (fase.tipofase = tipofase.id)
WHERE tipofase.agendafrontoffice = true
EXPLAIN ANALYZE:
Nested Loop (cost=0.43..790.19 rows=14462 width=4) (actual time=0.079..0.079 rows=0 loops=1)
-> Seq Scan on tipofase (cost=0.00..3.02 rows=1 width=4) (actual time=0.077..0.077 rows=0 loops=1)
Filter: agendafrontoffice
Rows Removed by Filter: 102
-> Index Only Scan using fase_test_prova_4 on fase (cost=0.43..595.59 rows=19158 width=8) (never executed)
Index Cond: (tipofase = tipofase.id)
Heap Fetches: 0
Planning time: 0.669 ms
Execution time: 0.141 ms
---
It's perfect because it starts from tipofase, where there are no agendafrontoffice = true
fase_test_prova_4 is a btree index ON (fase.tipofase, fase.id)
fase.id is PRIMARY key on fase,
tipofase.id is PRIMARY key on tipofase,
fase.tipofase is FK on tipofase.id
and tipofase.agendafrontoffice is a boolean.
I've also created a btree index on tipofase.agendafrontoffice.
**fase** is a large table with 1.475.146 records. There are no rows in the table matching tipofase.agendafrontoffice = true, so the result set is empty(QUERY 1)
QUERY 2 SLOW(WITH limit and order by):
--------------------------------
SELECT fase.id
FROM tipofase
JOIN fase
ON (fase.tipofase = tipofase.id)
WHERE tipofase.agendafrontoffice = true
ORDER BY fase.id DESC limit 10 offset 0
Limit (cost=0.43..149.66 rows=10 width=4) (actual time=173853.131..173853.131 rows=0 loops=1)
-> Nested Loop (cost=0.43..215814.25 rows=14462 width=4) (actual time=173853.130..173853.130 rows=0 loops=1)
Join Filter: (fase.tipofase = tipofase.id)
-> Index Scan Backward using test_prova_2 on fase (cost=0.43..193684.04 rows=1475146 width=8) (actual time=1.336..173128.418 rows=1475146 loops=1)
-> Materialize (cost=0.00..3.02 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=1475146)
-> Seq Scan on tipofase (cost=0.00..3.02 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=1)
Filter: agendafrontoffice
Rows Removed by Filter: 102
Planning time: 0.685 ms
Execution time: 173853.221 ms
Really really slow..... looks like the planner is not doing a good job.
PostgreSQL 9.4.1, compiled by Visual C++ build 1800, 64-bit
I also run VACUUM AND VACUUM ANALYZE on both table
I tried to play with the
"alter table tipofase alter column agendafrontoffice set statistics 2"
but nothing.
Thanks in advance Marco
--
-------------------------------------------------------------------------------------------------------------------------------------------
Ing. Marco Renzi
OCA - Oracle Certified Associate Java SE7 Programmer
OCP - Oracle Certified Mysql 5 Developer
via Zegalara 57
62014 Corridonia(MC)
Mob: 3208377271
"The fastest way to change yourself is to hang out with people who are already the way you want to be" Reid HoffmanIng. Marco Renzi
OCA - Oracle Certified Associate Java SE7 Programmer
OCP - Oracle Certified Mysql 5 Developer
via Zegalara 57
62014 Corridonia(MC)
Mob: 3208377271
Re: [PERFORM] Query with no result set, really really slow addingORBDER BY / LIMIT clause
From
Marco Renzi
Date:
Thanks Philip, yes i tried, but that is not solving, still slow. Take a look at the log.
------------------------------ ------------------------------ ------------------------------ ------------------------------ --
Limit (cost=3.46..106.87 rows=10 width=4) (actual time=396555.327..396555.327 rows=0 loops=1)
-> Nested Loop (cost=3.46..214781.07 rows=20770 width=4) (actual time=396555.326..396555.326 rows=0 loops=1)
Join Filter: (tipofase.id = fase.tipofase)
-> Index Scan Backward using test_prova_2 on fase (cost=0.43..192654.24 rows=1474700 width=8) (actual time=1.147..395710.190 rows=1475146 loops=1)
-> Materialize (cost=3.03..6.34 rows=1 width=8) (actual time=0.000..0.000 rows=0 loops=1475146)
-> Hash Semi Join (cost=3.03..6.33 rows=1 width=8) (actual time=0.081..0.081 rows=0 loops=1)
Hash Cond: (tipofase.id = tipofase_1.id)
-> Seq Scan on tipofase (cost=0.00..3.02 rows=102 width=4) (actual time=0.003..0.003 rows=1 loops=1)
-> Hash (cost=3.02..3.02 rows=1 width=4) (actual time=0.064..0.064 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 0kB
-> Seq Scan on tipofase tipofase_1 (cost=0.00..3.02 rows=1 width=4) (actual time=0.063..0.063 rows=0 loops=1)
Filter: agendafrontoffice
Rows Removed by Filter: 102
Planning time: 1.254 ms
Execution time: 396555.499 ms
------------------------------ ------------------------------ ------------------------------ ------------------------------ --
The only way to speedup i found is this one------------------------------
Limit (cost=3.46..106.87 rows=10 width=4) (actual time=396555.327..396555.327 rows=0 loops=1)
-> Nested Loop (cost=3.46..214781.07 rows=20770 width=4) (actual time=396555.326..396555.326 rows=0 loops=1)
Join Filter: (tipofase.id = fase.tipofase)
-> Index Scan Backward using test_prova_2 on fase (cost=0.43..192654.24 rows=1474700 width=8) (actual time=1.147..395710.190 rows=1475146 loops=1)
-> Materialize (cost=3.03..6.34 rows=1 width=8) (actual time=0.000..0.000 rows=0 loops=1475146)
-> Hash Semi Join (cost=3.03..6.33 rows=1 width=8) (actual time=0.081..0.081 rows=0 loops=1)
Hash Cond: (tipofase.id = tipofase_1.id)
-> Seq Scan on tipofase (cost=0.00..3.02 rows=102 width=4) (actual time=0.003..0.003 rows=1 loops=1)
-> Hash (cost=3.02..3.02 rows=1 width=4) (actual time=0.064..0.064 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 0kB
-> Seq Scan on tipofase tipofase_1 (cost=0.00..3.02 rows=1 width=4) (actual time=0.063..0.063 rows=0 loops=1)
Planning time: 1.254 ms
Execution time: 396555.499 ms
------------------------------
SELECT fase.id
FROM tipofase
JOIN fase
ON (fase.tipofase = (SELECT tipofase.id FROM tipofase WHERE tipofase.agendafrontoffice = true))
ORDER BY fase.id DESC limit 10 offset 0
------------------------------
Limit (cost=3.45..3.58 rows=10 width=4) (actual time=0.082..0.082 rows=0 loops=1)
InitPlan 1 (returns $0)
-> Seq Scan on tipofase tipofase_1 (cost=0.00..3.02 rows=1 width=4) (actual time=0.072..0.072 rows=0 loops=1)
Filter: agendafrontoffice
Rows Removed by Filter: 102
-> Nested Loop (cost=0.43..27080.93 rows=2118540 width=4) (actual time=0.081..0.081 rows=0 loops=1)
-> Index Only Scan Backward using fase_test_prova_4 on fase (cost=0.43..595.90 rows=20770 width=4) (actual time=0.080..0.080 rows=0 loops=1)
Index Cond: (tipofase = $0)
Heap Fetches: 0
-> Materialize (cost=0.00..3.53 rows=102 width=0) (never executed)
-> Seq Scan on tipofase (cost=0.00..3.02 rows=102 width=0) (never executed)
Planning time: 0.471 ms
Execution time: 0.150 ms
------------------------------ ------------------------------ ------------------------------ ------------------------------ --
InitPlan 1 (returns $0)
-> Seq Scan on tipofase tipofase_1 (cost=0.00..3.02 rows=1 width=4) (actual time=0.072..0.072 rows=0 loops=1)
Filter: agendafrontoffice
Rows Removed by Filter: 102
-> Nested Loop (cost=0.43..27080.93 rows=2118540 width=4) (actual time=0.081..0.081 rows=0 loops=1)
-> Index Only Scan Backward using fase_test_prova_4 on fase (cost=0.43..595.90 rows=20770 width=4) (actual time=0.080..0.080 rows=0 loops=1)
Index Cond: (tipofase = $0)
Heap Fetches: 0
-> Materialize (cost=0.00..3.53 rows=102 width=0) (never executed)
-> Seq Scan on tipofase (cost=0.00..3.02 rows=102 width=0) (never executed)
Planning time: 0.471 ms
Execution time: 0.150 ms
------------------------------
Anyone knows?
Re: [PERFORM] Query with no result set, really really slow addingORBDER BY / LIMIT clause
From
vinny
Date:
On 2017-04-20 13:16, Marco Renzi wrote: > Thanks Philip, yes i tried, but that is not solving, still slow. Take > a look at the log. > > -------------------------------------------------------------------------------------------------------------------------- > Limit (cost=3.46..106.87 rows=10 width=4) (actual > time=396555.327..396555.327 rows=0 loops=1) > -> Nested Loop (cost=3.46..214781.07 rows=20770 width=4) (actual > time=396555.326..396555.326 rows=0 loops=1) > Join Filter: (tipofase.id [1] = fase.tipofase) > -> Index Scan Backward using test_prova_2 on fase > (cost=0.43..192654.24 rows=1474700 width=8) (actual > time=1.147..395710.190 rows=1475146 loops=1) > -> Materialize (cost=3.03..6.34 rows=1 width=8) (actual > time=0.000..0.000 rows=0 loops=1475146) > -> Hash Semi Join (cost=3.03..6.33 rows=1 width=8) > (actual time=0.081..0.081 rows=0 loops=1) > Hash Cond: (tipofase.id [1] = tipofase_1.id [2]) > -> Seq Scan on tipofase (cost=0.00..3.02 > rows=102 width=4) (actual time=0.003..0.003 rows=1 loops=1) > -> Hash (cost=3.02..3.02 rows=1 width=4) (actual > time=0.064..0.064 rows=0 loops=1) > Buckets: 1024 Batches: 1 Memory Usage: 0kB > -> Seq Scan on tipofase tipofase_1 > (cost=0.00..3.02 rows=1 width=4) (actual time=0.063..0.063 rows=0 > loops=1) > Filter: agendafrontoffice > Rows Removed by Filter: 102 > Planning time: 1.254 ms > Execution time: 396555.499 ms > > -------------------------------------------------------------------------------------------------------------------------- > > THE ONLY WAY TO SPEEDUP I FOUND IS THIS ONE > > SELECT fase.id [3] > FROM tipofase > JOIN fase > ON (fase.tipofase = (SELECT tipofase.id [1] FROM tipofase > WHERE tipofase.agendafrontoffice = true)) > > ORDER BY fase.id [3] DESC limit 10 offset 0 > > -------------------------------------------------------------------------------------------------------------------------- > > Limit (cost=3.45..3.58 rows=10 width=4) (actual time=0.082..0.082 > rows=0 loops=1) > InitPlan 1 (returns $0) > -> Seq Scan on tipofase tipofase_1 (cost=0.00..3.02 rows=1 > width=4) (actual time=0.072..0.072 rows=0 loops=1) > Filter: agendafrontoffice > Rows Removed by Filter: 102 > -> Nested Loop (cost=0.43..27080.93 rows=2118540 width=4) (actual > time=0.081..0.081 rows=0 loops=1) > -> Index Only Scan Backward using fase_test_prova_4 on fase > (cost=0.43..595.90 rows=20770 width=4) (actual time=0.080..0.080 > rows=0 loops=1) > Index Cond: (tipofase = $0) > Heap Fetches: 0 > -> Materialize (cost=0.00..3.53 rows=102 width=0) (never > executed) > -> Seq Scan on tipofase (cost=0.00..3.02 rows=102 > width=0) (never executed) > Planning time: 0.471 ms > Execution time: 0.150 ms > > -------------------------------------------------------------------------------------------------------------------------- > > Anyone knows? > I'm a bit worried about performance in my web app beacause sometimes > filters are written dinamically at the end, and i would like to avoid > these problems. > What was it that Philip suggested? I can't find his reply in the list and you didn't quote it... Did you try reversing the order of the tables, so join fase to tipofase, instead of tipofase to fase. Also, did you try a partial index on tipofase.id where tipofase.agendafrontoffice = true?
Re: [PERFORM] Query with no result set, really really slow addingORBDER BY / LIMIT clause
From
Marco Renzi
Date:
Sorry Vinny, this was what Philip suggested:
Have you tried changing your query to:
And this is my log:Have you tried changing your query to:
SELECT id
FROM fase
WHERE tipofase IN (SELECT ID from tipofase WHERE agendafrontoffice = true)
ORDER BY id DESC
LIMIT 10 OFFSET 0------------------------------
Limit (cost=3.46..106.87 rows=10 width=4) (actual
time=396555.327..396555.327 rows=0 loops=1)
-> Nested Loop (cost=3.46..214781.07 rows=20770 width=4) (actual
time=396555.326..396555.326 rows=0 loops=1)
Join Filter: (tipofase.id [1] = fase.tipofase)
-> Index Scan Backward using test_prova_2 on fase
(cost=0.43..192654.24 rows=1474700 width=8) (actual
time=1.147..395710.190 rows=1475146 loops=1)
-> Materialize (cost=3.03..6.34 rows=1 width=8) (actual
time=0.000..0.000 rows=0 loops=1475146)
-> Hash Semi Join (cost=3.03..6.33 rows=1 width=8)
(actual time=0.081..0.081 rows=0 loops=1)
Hash Cond: (tipofase.id [1] = tipofase_1.id [2])
-> Seq Scan on tipofase (cost=0.00..3.02
rows=102 width=4) (actual time=0.003..0.003 rows=1 loops=1)
-> Hash (cost=3.02..3.02 rows=1 width=4) (actual
time=0.064..0.064 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 0kB
-> Seq Scan on tipofase tipofase_1
(cost=0.00..3.02 rows=1 width=4) (actual time=0.063..0.063 rows=0
loops=1)
Filter: agendafrontoffice
Rows Removed by Filter: 102
Planning time: 1.254 ms
Execution time: 396555.499 ms
------------------------------
2017-04-20 13:54 GMT+02:00 vinny <vinny@xs4all.nl>:
On 2017-04-20 13:16, Marco Renzi wrote:Thanks Philip, yes i tried, but that is not solving, still slow. Take
a look at the log.
------------------------------------------------------------ ------------------------------ ------------------------------ --
Limit (cost=3.46..106.87 rows=10 width=4) (actual
time=396555.327..396555.327 rows=0 loops=1)
-> Nested Loop (cost=3.46..214781.07 rows=20770 width=4) (actual
time=396555.326..396555.326 rows=0 loops=1)
Join Filter: (tipofase.id [1] = fase.tipofase)
-> Index Scan Backward using test_prova_2 on fase
(cost=0.43..192654.24 rows=1474700 width=8) (actual
time=1.147..395710.190 rows=1475146 loops=1)
-> Materialize (cost=3.03..6.34 rows=1 width=8) (actual
time=0.000..0.000 rows=0 loops=1475146)
-> Hash Semi Join (cost=3.03..6.33 rows=1 width=8)
(actual time=0.081..0.081 rows=0 loops=1)
Hash Cond: (tipofase.id [1] = tipofase_1.id [2])
-> Seq Scan on tipofase (cost=0.00..3.02
rows=102 width=4) (actual time=0.003..0.003 rows=1 loops=1)
-> Hash (cost=3.02..3.02 rows=1 width=4) (actual
time=0.064..0.064 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 0kB
-> Seq Scan on tipofase tipofase_1
(cost=0.00..3.02 rows=1 width=4) (actual time=0.063..0.063 rows=0
loops=1)
Filter: agendafrontoffice
Rows Removed by Filter: 102
Planning time: 1.254 ms
Execution time: 396555.499 ms
------------------------------------------------------------ ------------------------------ ------------------------------ --
THE ONLY WAY TO SPEEDUP I FOUND IS THIS ONE
SELECT fase.id [3]
FROM tipofase
JOIN fase
ON (fase.tipofase = (SELECT tipofase.id [1] FROM tipofase
WHERE tipofase.agendafrontoffice = true))
ORDER BY fase.id [3] DESC limit 10 offset 0
------------------------------------------------------------ ------------------------------ ------------------------------ --
Limit (cost=3.45..3.58 rows=10 width=4) (actual time=0.082..0.082
rows=0 loops=1)
InitPlan 1 (returns $0)
-> Seq Scan on tipofase tipofase_1 (cost=0.00..3.02 rows=1
width=4) (actual time=0.072..0.072 rows=0 loops=1)
Filter: agendafrontoffice
Rows Removed by Filter: 102
-> Nested Loop (cost=0.43..27080.93 rows=2118540 width=4) (actual
time=0.081..0.081 rows=0 loops=1)
-> Index Only Scan Backward using fase_test_prova_4 on fase
(cost=0.43..595.90 rows=20770 width=4) (actual time=0.080..0.080
rows=0 loops=1)
Index Cond: (tipofase = $0)
Heap Fetches: 0
-> Materialize (cost=0.00..3.53 rows=102 width=0) (never
executed)
-> Seq Scan on tipofase (cost=0.00..3.02 rows=102
width=0) (never executed)
Planning time: 0.471 ms
Execution time: 0.150 ms
------------------------------------------------------------ ------------------------------ ------------------------------ --
Anyone knows?
I'm a bit worried about performance in my web app beacause sometimes
filters are written dinamically at the end, and i would like to avoid
these problems.
What was it that Philip suggested? I can't find his reply in the list and you didn't quote it...
Did you try reversing the order of the tables, so join fase to tipofase, instead of tipofase to fase.
Also, did you try a partial index on tipofase.id where tipofase.agendafrontoffice = true?
--
-------------------------------------------------------------------------------------------------------------------------------------------
Ing. Marco Renzi
OCA - Oracle Certified Associate Java SE7 Programmer
OCP - Oracle Certified Mysql 5 Developer
via Zegalara 57
62014 Corridonia(MC)
Mob: 3208377271
"The fastest way to change yourself is to hang out with people who are already the way you want to be" Reid HoffmanIng. Marco Renzi
OCA - Oracle Certified Associate Java SE7 Programmer
OCP - Oracle Certified Mysql 5 Developer
via Zegalara 57
62014 Corridonia(MC)
Mob: 3208377271
Re: [PERFORM] Query with no result set, really really slow addingORBDER BY / LIMIT clause
From
Pavel Stehule
Date:
2017-04-20 9:19 GMT+02:00 Marco Renzi <renzi.mrc@gmail.com>:
Hi!, i've currently a big problem using ORBDER BY / LIMIT in a query with no result set.
If i add the order by/limit clause it runs really really slow.QUERY 1 FAST:
--------------------------------
SELECT fase.id
FROM tipofase
JOIN fase
ON (fase.tipofase = tipofase.id)
WHERE tipofase.agendafrontoffice = true
EXPLAIN ANALYZE:
Nested Loop (cost=0.43..790.19 rows=14462 width=4) (actual time=0.079..0.079 rows=0 loops=1)-> Seq Scan on tipofase (cost=0.00..3.02 rows=1 width=4) (actual time=0.077..0.077 rows=0 loops=1)
Filter: agendafrontoffice
Rows Removed by Filter: 102
-> Index Only Scan using fase_test_prova_4 on fase (cost=0.43..595.59 rows=19158 width=8) (never executed)
Index Cond: (tipofase = tipofase.id)
Heap Fetches: 0
Planning time: 0.669 ms
Execution time: 0.141 ms
---
It's perfect because it starts from tipofase, where there are no agendafrontoffice = truefase_test_prova_4 is a btree index ON (fase.tipofase, fase.id)
fase.id is PRIMARY key on fase,
tipofase.id is PRIMARY key on tipofase,
fase.tipofase is FK on tipofase.id
and tipofase.agendafrontoffice is a boolean.I've also created a btree index on tipofase.agendafrontoffice.
**fase** is a large table with 1.475.146 records. There are no rows in the table matching tipofase.agendafrontoffice = true, so the result set is empty(QUERY 1)
QUERY 2 SLOW(WITH limit and order by):
--------------------------------
SELECT fase.id
FROM tipofase
JOIN fase
ON (fase.tipofase = tipofase.id)
WHERE tipofase.agendafrontoffice = true
ORDER BY fase.id DESC limit 10 offset 0
Limit (cost=0.43..149.66 rows=10 width=4) (actual time=173853.131..173853.131 rows=0 loops=1)
-> Nested Loop (cost=0.43..215814.25 rows=14462 width=4) (actual time=173853.130..173853.130 rows=0 loops=1)
Join Filter: (fase.tipofase = tipofase.id)
-> Index Scan Backward using test_prova_2 on fase (cost=0.43..193684.04 rows=1475146 width=8) (actual time=1.336..173128.418 rows=1475146 loops=1)
-> Materialize (cost=0.00..3.02 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=1475146)
-> Seq Scan on tipofase (cost=0.00..3.02 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=1)
Filter: agendafrontoffice
Rows Removed by Filter: 102
Planning time: 0.685 ms
Execution time: 173853.221 ms
I am afraid so is not possible to solve this issue by one query. In this case the planner expects early stop due finding few values. But because there are not any value, the LIMIT clause has not any benefit in executor time, but the planner is messed. Maybe try to increase LIMIT to some higher value .. 1000, 10000 so planner don't fall to this trap. PostgreSQL statistics are about most common values, but the values without any occurrence are not well registered by statistics.
Regards
Pavel
Really really slow..... looks like the planner is not doing a good job.
PostgreSQL 9.4.1, compiled by Visual C++ build 1800, 64-bit
I also run VACUUM AND VACUUM ANALYZE on both table
I tried to play with the
"alter table tipofase alter column agendafrontoffice set statistics 2"
but nothing.Thanks in advance Marco
--------------------------------"The fastest way to change yourself is to hang out with people who are already the way you want to be" Reid Hoffman------------------------------ ------------------------------ ------------------------------ -------------------
Ing. Marco Renzi
OCA - Oracle Certified Associate Java SE7 Programmer
OCP - Oracle Certified Mysql 5 Developer
via Zegalara 57
62014 Corridonia(MC)
Mob: 3208377271
Re: [PERFORM] Query with no result set, really really slow addingORBDER BY / LIMIT clause
From
Pavel Stehule
Date:
2017-04-20 17:57 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:
2017-04-20 9:19 GMT+02:00 Marco Renzi <renzi.mrc@gmail.com>:Hi!, i've currently a big problem using ORBDER BY / LIMIT in a query with no result set.
If i add the order by/limit clause it runs really really slow.QUERY 1 FAST:
--------------------------------
SELECT fase.id
FROM tipofase
JOIN fase
ON (fase.tipofase = tipofase.id)
WHERE tipofase.agendafrontoffice = true
EXPLAIN ANALYZE:
Nested Loop (cost=0.43..790.19 rows=14462 width=4) (actual time=0.079..0.079 rows=0 loops=1)-> Seq Scan on tipofase (cost=0.00..3.02 rows=1 width=4) (actual time=0.077..0.077 rows=0 loops=1)
Filter: agendafrontoffice
Rows Removed by Filter: 102
-> Index Only Scan using fase_test_prova_4 on fase (cost=0.43..595.59 rows=19158 width=8) (never executed)
Index Cond: (tipofase = tipofase.id)
Heap Fetches: 0
Planning time: 0.669 ms
Execution time: 0.141 ms
---
It's perfect because it starts from tipofase, where there are no agendafrontoffice = truefase_test_prova_4 is a btree index ON (fase.tipofase, fase.id)
fase.id is PRIMARY key on fase,
tipofase.id is PRIMARY key on tipofase,
fase.tipofase is FK on tipofase.id
and tipofase.agendafrontoffice is a boolean.I've also created a btree index on tipofase.agendafrontoffice.
**fase** is a large table with 1.475.146 records. There are no rows in the table matching tipofase.agendafrontoffice = true, so the result set is empty(QUERY 1)
QUERY 2 SLOW(WITH limit and order by):
--------------------------------
SELECT fase.id
FROM tipofase
JOIN fase
ON (fase.tipofase = tipofase.id)
WHERE tipofase.agendafrontoffice = true
ORDER BY fase.id DESC limit 10 offset 0
Limit (cost=0.43..149.66 rows=10 width=4) (actual time=173853.131..173853.131 rows=0 loops=1)
-> Nested Loop (cost=0.43..215814.25 rows=14462 width=4) (actual time=173853.130..173853.130 rows=0 loops=1)
Join Filter: (fase.tipofase = tipofase.id)
-> Index Scan Backward using test_prova_2 on fase (cost=0.43..193684.04 rows=1475146 width=8) (actual time=1.336..173128.418 rows=1475146 loops=1)
-> Materialize (cost=0.00..3.02 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=1475146)
-> Seq Scan on tipofase (cost=0.00..3.02 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=1)
Filter: agendafrontoffice
Rows Removed by Filter: 102
Planning time: 0.685 ms
Execution time: 173853.221 msI am afraid so is not possible to solve this issue by one query. In this case the planner expects early stop due finding few values. But because there are not any value, the LIMIT clause has not any benefit in executor time, but the planner is messed. Maybe try to increase LIMIT to some higher value .. 1000, 10000 so planner don't fall to this trap. PostgreSQL statistics are about most common values, but the values without any occurrence are not well registered by statistics.Regards
It can looks strange, but it can work
SELECT *
FROM (your query ORDER BY .. OFFSET 0 LIMIT 10000) s
ORDER BY ...
LIMIT 10;
Regards
Pavel
PavelReally really slow..... looks like the planner is not doing a good job.
PostgreSQL 9.4.1, compiled by Visual C++ build 1800, 64-bit
I also run VACUUM AND VACUUM ANALYZE on both table
I tried to play with the
"alter table tipofase alter column agendafrontoffice set statistics 2"
but nothing.Thanks in advance Marco
--------------------------------"The fastest way to change yourself is to hang out with people who are already the way you want to be" Reid Hoffman------------------------------ ------------------------------ ------------------------------ -------------------
Ing. Marco Renzi
OCA - Oracle Certified Associate Java SE7 Programmer
OCP - Oracle Certified Mysql 5 Developer
via Zegalara 57
62014 Corridonia(MC)
Mob: 3208377271
Re: [PERFORM] Query with no result set, really really slow addingORBDER BY / LIMIT clause
From
Marco Renzi
Date:
This could look strange, but is fast as hell!
The main problem is:
Is everytime ok doing query like this with order by and limit? Is ok using an upperlimit to 1.000.000.000 records?
SELECT * FROM (
SELECT fase.id
FROM tipofase
JOIN fase
ON (fase.tipofase = tipofase.id)
WHERE agendafrontoffice = true
ORDER BY fase.id DESC limit 1000000000 offset 0
) A
ORDER BY A.id DESC limit 10 offset 0
2017-04-20 18:05 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:
I am afraid so is not possible to solve this issue by one query. In this case the planner expects early stop due finding few values. But because there are not any value, the LIMIT clause has not any benefit in executor time, but the planner is messed. Maybe try to increase LIMIT to some higher value .. 1000, 10000 so planner don't fall to this trap. PostgreSQL statistics are about most common values, but the values without any occurrence are not well registered by statistics.RegardsIt can looks strange, but it can workSELECT *FROM (your query ORDER BY .. OFFSET 0 LIMIT 10000) sORDER BY ...LIMIT 10;RegardsPavel
Re: [PERFORM] Query with no result set, really really slow addingORBDER BY / LIMIT clause
From
Pavel Stehule
Date:
2017-04-21 8:49 GMT+02:00 Marco Renzi <renzi.mrc@gmail.com>:
This could look strange, but is fast as hell!The main problem is:Is everytime ok doing query like this with order by and limit? Is ok using an upperlimit to 1.000.000.000 records?
I am thinking so limit 10000 should be ok. Too big number can be messy for optimizer similarly like too small number.
The planner is driven by statistics - and the statistics are not perfect - usually it is working on 80% - like weather forecasting.
Usually it is working, but sometimes not.
Regards
Pavel
SELECT * FROM (
SELECT fase.id
FROM tipofase
JOIN fase
ON (fase.tipofase = tipofase.id)
WHERE agendafrontoffice = true
ORDER BY fase.id DESC limit 1000000000 offset 0
) A
ORDER BY A.id DESC limit 10 offset 02017-04-20 18:05 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:I am afraid so is not possible to solve this issue by one query. In this case the planner expects early stop due finding few values. But because there are not any value, the LIMIT clause has not any benefit in executor time, but the planner is messed. Maybe try to increase LIMIT to some higher value .. 1000, 10000 so planner don't fall to this trap. PostgreSQL statistics are about most common values, but the values without any occurrence are not well registered by statistics.RegardsIt can looks strange, but it can workSELECT *FROM (your query ORDER BY .. OFFSET 0 LIMIT 10000) sORDER BY ...LIMIT 10;RegardsPavel
Re: [PERFORM] Query with no result set, really really slow addingORBDER BY / LIMIT clause
From
Marco Renzi
Date:
I am thinking so limit 10000 should be ok. Too big number can be messy for optimizer similarly like too small number.The planner is driven by statistics - and the statistics are not perfect - usually it is working on 80% - like weather forecasting.Usually it is working, but sometimes not.RegardsPavel
Thanks Pavel, i almost found two solutions at the end:
One is to use an inner limit as you said, and the other, when you just know what the filter is,is to try to join with SELECTS that have to be executed first from the planner.
Eg
SELECT fase.id
FROM tipofase
JOIN fase
ON (fase.tipofase = (SELECT tipofase.id FROM tipofase WHERE tipofase.agendafrontoffice = true))
ORDER BY fase.id DESC limit 10 offset 0
--
-------------------------------------------------------------------------------------------------------------------------------------------
Ing. Marco Renzi
OCA - Oracle Certified Associate Java SE7 Programmer
OCP - Oracle Certified Mysql 5 Developer
via Zegalara 57
62014 Corridonia(MC)
Mob: 3208377271
"The fastest way to change yourself is to hang out with people who are already the way you want to be" Reid HoffmanIng. Marco Renzi
OCA - Oracle Certified Associate Java SE7 Programmer
OCP - Oracle Certified Mysql 5 Developer
via Zegalara 57
62014 Corridonia(MC)
Mob: 3208377271
Re: [PERFORM] Query with no result set, really really slow addingORBDER BY / LIMIT clause
From
Pavel Stehule
Date:
2017-04-21 9:05 GMT+02:00 Marco Renzi <renzi.mrc@gmail.com>:
I am thinking so limit 10000 should be ok. Too big number can be messy for optimizer similarly like too small number.The planner is driven by statistics - and the statistics are not perfect - usually it is working on 80% - like weather forecasting.Usually it is working, but sometimes not.RegardsPavel
Thanks for the helpThanks Pavel, i almost found two solutions at the end:One is to use an inner limit as you said, and the other, when you just know what the filter is,
is to try to join with SELECTS that have to be executed first from the planner.
Eg
SELECT fase.id
FROM tipofase
JOIN fase
ON (fase.tipofase = (SELECT tipofase.id FROM tipofase WHERE tipofase.agendafrontoffice = true))
ORDER BY fase.id DESC limit 10 offset 0
yes, sometimes when the data are not homogeneous more queries are necessary
Regards
Pavel
--------------------------------"The fastest way to change yourself is to hang out with people who are already the way you want to be" Reid Hoffman------------------------------ ------------------------------ ------------------------------ -------------------
Ing. Marco Renzi
OCA - Oracle Certified Associate Java SE7 Programmer
OCP - Oracle Certified Mysql 5 Developer
via Zegalara 57
62014 Corridonia(MC)
Mob: 3208377271