Thread: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT
Hello
I have a database with few 60gb tables. Tables rows are requested with multiple ANY or IN operators. I am not able to find an easy way to make DB able to use indexes. I often hit the index, but see a a spike of 200mb of IO or disk read.
I am using version 13 but soon 14.
I wrote a reproduction script on version 14 with plans included. https://gist.github.com/benoittgt/ab72dc4cfedea2a0c6a5ee809d16e04d
I also have plans on a snapshot of the DB with real data.
- The current query that I try to improve : https://explain.dalibo.com/plan/8b8f6e0he9feb551
- I added the DB schema + index in query view. As you can see I have many indexes for testing purpose and try what the planner can do.
- The optimized query when I have only one ANY and migrate to UNION ALL for each parameter of the ANY operator https://explain.dalibo.com/plan/427gg053d07328ga . Query is fast as I would like but it means generate some merge to be able to get a fast result.
- The new issue I have when I have a new ANY operator on the previous optimized query. Big IO/read https://explain.dalibo.com/plan/e7ha9g637b4eh946
It seems to me quite undoable to generate for every parameters a query that will then merge. I have sometimes 3-4 ANY operators with up to 15 elements in an array.
Is there a misusage of my indexes?
Is there a limitation when using ANY or IN operators and ordered LIMIT behind?
Thanks a lot
Re: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT
On Mon, Jun 12, 2023 at 1:17 PM benoit <benoit@hopsandfork.com> wrote: > Is there a misusage of my indexes? > > Is there a limitation when using ANY or IN operators and ordered LIMIT behind? It's complicated. Do you find that you get satisfactory performance if you force a bitmap index scan? In other words, what is the effect of "set enable_indexscan = off" on your original query? Does that speed up execution at all? (I think that this approach ought to produce a plan that uses a bitmap index scan in place of the index scan, without changing anything else.) -- Peter Geoghegan
Sadly it doesn't help to disable indexscan. The plan : https://explain.dalibo.com/plan/3b3gfce5b29c3hh4
Envoyé : lundi 12 juin 2023 22:34:50
À : benoit
Cc : pgsql-performance@lists.postgresql.org
Objet : Re: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT
> Is there a misusage of my indexes?
>
> Is there a limitation when using ANY or IN operators and ordered LIMIT behind?
It's complicated. Do you find that you get satisfactory performance if
you force a bitmap index scan? In other words, what is the effect of
"set enable_indexscan = off" on your original query? Does that speed
up execution at all? (I think that this approach ought to produce a
plan that uses a bitmap index scan in place of the index scan, without
changing anything else.)
--
Peter Geoghegan
Re: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT
SELECT * FROM docs WHERE status IN ('draft', 'sent') AND sender_reference IN ('Custom/1175', 'Client/362', 'Custom/280') ORDER BY sent_at DESC |
On Jun 12, 2023, at 4:17 PM, benoit <benoit@hopsandfork.com> wrote:HelloI have a database with few 60gb tables. Tables rows are requested with multiple ANY or IN operators. I am not able to find an easy way to make DB able to use indexes. I often hit the index, but see a a spike of 200mb of IO or disk read.I am using version 13 but soon 14.I wrote a reproduction script on version 14 with plans included. https://gist.github.com/benoittgt/ab72dc4cfedea2a0c6a5ee809d16e04dI also have plans on a snapshot of the DB with real data.- The current query that I try to improve : https://explain.dalibo.com/plan/8b8f6e0he9feb551- I added the DB schema + index in query view. As you can see I have many indexes for testing purpose and try what the planner can do.- The optimized query when I have only one ANY and migrate to UNION ALL for each parameter of the ANY operator https://explain.dalibo.com/plan/427gg053d07328ga . Query is fast as I would like but it means generate some merge to be able to get a fast result.- The new issue I have when I have a new ANY operator on the previous optimized query. Big IO/read https://explain.dalibo.com/plan/e7ha9g637b4eh946It seems to me quite undoable to generate for every parameters a query that will then merge. I have sometimes 3-4 ANY operators with up to 15 elements in an array.Is there a misusage of my indexes?Is there a limitation when using ANY or IN operators and ordered LIMIT behind?Thanks a lot
This new index is used but still the read is 230mb.
https://explain.dalibo.com/plan/b0f28a9e8a136afd
Envoyé : lundi 12 juin 2023 22:55
À : benoit
Cc : pgsql-performance@lists.postgresql.org
Objet : Re: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT
SELECT * FROM docs WHERE status IN ('draft', 'sent') AND sender_reference IN ('Custom/1175', 'Client/362', 'Custom/280') ORDER BY sent_at DESC |
On Jun 12, 2023, at 4:17 PM, benoit <benoit@hopsandfork.com> wrote:HelloI have a database with few 60gb tables. Tables rows are requested with multiple ANY or IN operators. I am not able to find an easy way to make DB able to use indexes. I often hit the index, but see a a spike of 200mb of IO or disk read.I am using version 13 but soon 14.I wrote a reproduction script on version 14 with plans included. https://gist.github.com/benoittgt/ab72dc4cfedea2a0c6a5ee809d16e04dI also have plans on a snapshot of the DB with real data.- The current query that I try to improve : https://explain.dalibo.com/plan/8b8f6e0he9feb551- I added the DB schema + index in query view. As you can see I have many indexes for testing purpose and try what the planner can do.- The optimized query when I have only one ANY and migrate to UNION ALL for each parameter of the ANY operator https://explain.dalibo.com/plan/427gg053d07328ga . Query is fast as I would like but it means generate some merge to be able to get a fast result.- The new issue I have when I have a new ANY operator on the previous optimized query. Big IO/read https://explain.dalibo.com/plan/e7ha9g637b4eh946It seems to me quite undoable to generate for every parameters a query that will then merge. I have sometimes 3-4 ANY operators with up to 15 elements in an array.Is there a misusage of my indexes?Is there a limitation when using ANY or IN operators and ordered LIMIT behind?Thanks a lot
Re: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT
On Jun 12, 2023, at 5:34 PM, benoit <benoit@hopsandfork.com> wrote:This new index is used but still the read is 230mb.De : Chris Hoover <chrish@aweber.com>
Envoyé : lundi 12 juin 2023 22:55
À : benoit
Cc : pgsql-performance@lists.postgresql.org
Objet : Re: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMITI normally create my indexes to match the where clause of the query. While technically, it should not matter, I find a lot of time, it does.I would create an index on (status, sender_reference, sent_at) and see if the improves your query performance.
SELECT * FROM docs WHERE status IN ('draft', 'sent') AND sender_reference IN ('Custom/1175', 'Client/362', 'Custom/280') ORDER BY sent_at DESC Thanks,Chris HooverSenior DBAAWeber.comCell: (803) 528-2269Email: chrish@aweber.comOn Jun 12, 2023, at 4:17 PM, benoit <benoit@hopsandfork.com> wrote:HelloI have a database with few 60gb tables. Tables rows are requested with multiple ANY or IN operators. I am not able to find an easy way to make DB able to use indexes. I often hit the index, but see a a spike of 200mb of IO or disk read.I am using version 13 but soon 14.I wrote a reproduction script on version 14 with plans included. https://gist.github.com/benoittgt/ab72dc4cfedea2a0c6a5ee809d16e04dI also have plans on a snapshot of the DB with real data.- The current query that I try to improve : https://explain.dalibo.com/plan/8b8f6e0he9feb551- I added the DB schema + index in query view. As you can see I have many indexes for testing purpose and try what the planner can do.- The optimized query when I have only one ANY and migrate to UNION ALL for each parameter of the ANY operator https://explain.dalibo.com/plan/427gg053d07328ga . Query is fast as I would like but it means generate some merge to be able to get a fast result.- The new issue I have when I have a new ANY operator on the previous optimized query. Big IO/read https://explain.dalibo.com/plan/e7ha9g637b4eh946It seems to me quite undoable to generate for every parameters a query that will then merge. I have sometimes 3-4 ANY operators with up to 15 elements in an array.Is there a misusage of my indexes?Is there a limitation when using ANY or IN operators and ordered LIMIT behind?Thanks a lot
Hi,
Do you really need to do “select *”?
In other words, is it necessary to have all columns in the result?
Michel SALAIS
De : benoit <benoit@hopsandfork.com>
Envoyé : lundi 12 juin 2023 23:35
À : Chris Hoover <chrish@aweber.com>
Cc : pgsql-performance@lists.postgresql.org
Objet : RE: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT
This new index is used but still the read is 230mb.
https://explain.dalibo.com/plan/b0f28a9e8a136afd
De : Chris Hoover <chrish@aweber.com>
Envoyé : lundi 12 juin 2023 22:55
À : benoit
Cc : pgsql-performance@lists.postgresql.org
Objet : Re: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT
I normally create my indexes to match the where clause of the query. While technically, it should not matter, I find a lot of time, it does.
I would create an index on (status, sender_reference, sent_at) and see if the improves your query performance.
SELECT * FROM docs WHERE status IN ('draft', 'sent') AND sender_reference IN ('Custom/1175', 'Client/362', 'Custom/280') ORDER BY sent_at DESC
|
On Jun 12, 2023, at 4:17 PM, benoit <benoit@hopsandfork.com> wrote:
Hello
I have a database with few 60gb tables. Tables rows are requested with multiple ANY or IN operators. I am not able to find an easy way to make DB able to use indexes. I often hit the index, but see a a spike of 200mb of IO or disk read.
I am using version 13 but soon 14.
I wrote a reproduction script on version 14 with plans included. https://gist.github.com/benoittgt/ab72dc4cfedea2a0c6a5ee809d16e04d
I also have plans on a snapshot of the DB with real data.
- The current query that I try to improve : https://explain.dalibo.com/plan/8b8f6e0he9feb551
- I added the DB schema + index in query view. As you can see I have many indexes for testing purpose and try what the planner can do.
- The optimized query when I have only one ANY and migrate to UNION ALL for each parameter of the ANY operator https://explain.dalibo.com/plan/427gg053d07328ga . Query is fast as I would like but it means generate some merge to be able to get a fast result.
- The new issue I have when I have a new ANY operator on the previous optimized query. Big IO/read https://explain.dalibo.com/plan/e7ha9g637b4eh946
It seems to me quite undoable to generate for every parameters a query that will then merge. I have sometimes 3-4 ANY operators with up to 15 elements in an array.
Is there a misusage of my indexes?
Is there a limitation when using ANY or IN operators and ordered LIMIT behind?
Thanks a lot
Re: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT
@font-face {font-family:"Cambria Math"; panose-1:2 4 5 3 5 4 6 3 2 4;}@font-face {font-family:Calibri; panose-1:2 15 5 2 2 2 4 3 2 4;}@font-face {font-family:Consolas; panose-1:2 11 6 9 2 2 4 3 2 4;}p.MsoNormal, li.MsoNormal, div.MsoNormal {margin:0cm; font-size:11.0pt; font-family:"Calibri",sans-serif;}a:link, span.MsoHyperlink {mso-style-priority:99; color:blue; text-decoration:underline;}span.pl-k {mso-style-name:pl-k;}span.pl-s {mso-style-name:pl-s;}span.pl-pds {mso-style-name:pl-pds;}span.EmailStyle23 {mso-style-type:personal-reply; font-family:"Calibri",sans-serif; color:windowtext;}.MsoChpDefault {mso-style-type:export-only; font-size:10.0pt; mso-ligatures:none;}div.WordSection1 {page:WordSection1;} Hi,
Do you really need to do “select *”?
In other words, is it necessary to have all columns in the result?
Michel SALAIS
De : benoit <benoit@hopsandfork.com>
Envoyé : lundi 12 juin 2023 23:35
À : Chris Hoover <chrish@aweber.com>
Cc : pgsql-performance@lists.postgresql.org
Objet : RE: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT
This new index is used but still the read is 230mb.
https://explain.dalibo.com/plan/b0f28a9e8a136afd
De : Chris Hoover <chrish@aweber.com>
Envoyé : lundi 12 juin 2023 22:55
À : benoit
Cc : pgsql-performance@lists.postgresql.org
Objet : Re: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT
I normally create my indexes to match the where clause of the query. While technically, it should not matter, I find a lot of time, it does.
I would create an index on (status, sender_reference, sent_at) and see if the improves your query performance.
SELECT * FROM docs WHERE status
IN ('draft',
'sent')
AND sender_reference
IN ('Custom/1175',
'Client/362',
'Custom/280')
ORDER BY sent_at DESC
On Jun 12, 2023, at 4:17 PM, benoit <benoit@hopsandfork.com> wrote:
Hello
I have a database with few 60gb tables. Tables rows are requested with multiple ANY or IN operators. I am not able to find an easy way to make DB able to use indexes. I often hit the index, but see a a spike of 200mb of IO or disk read.
I am using version 13 but soon 14.
I wrote a reproduction script on version 14 with plans included. https://gist.github.com/benoittgt/ab72dc4cfedea2a0c6a5ee809d16e04d
I also have plans on a snapshot of the DB with real data.
- The current query that I try to improve : https://explain.dalibo.com/plan/8b8f6e0he9feb551
- I added the DB schema + index in query view. As you can see I have many indexes for testing purpose and try what the planner can do.
- The optimized query when I have only one ANY and migrate to UNION ALL for each parameter of the ANY operator https://explain.dalibo.com/plan/427gg053d07328ga . Query is fast as I would like but it means generate some merge to be able to get a fast result.
- The new issue I have when I have a new ANY operator on the previous optimized query. Big IO/read https://explain.dalibo.com/plan/e7ha9g637b4eh946
It seems to me quite undoable to generate for every parameters a query that will then merge. I have sometimes 3-4 ANY operators with up to 15 elements in an array.
Is there a misusage of my indexes?
Is there a limitation when using ANY or IN operators and ordered LIMIT behind?
Thanks a lot