Thread: Measuring the Query Optimizer Effect: Turning off the QO?

Measuring the Query Optimizer Effect: Turning off the QO?

From

Tom Mercha

Date:

07 July 2019, 23:22:21

Hi All

As we know, a query goes through number of stages before it is executed. 
One of these stages is query optimization (QO).

There are various parameters to try and influence optimizer decisions 
and costs. But I wanted to measure the effect of such a stage by turning 
it off completely and I can't find such a parameter which explicitly 
does that. Then I could execute a query to get the effect of "QO active 
and "QO inactive" and compare.

Obviously, I know well what the results would generally look like but I 
am just interested in measuring the differences for various types of 
queries. I am also aware that this is a simple comparison - there are 
more interesting comparisons to perform with QO tweaks, but right now I 
am interested in something basic.

So how would one shut down QO? Or at least, obtaining the guarantee of 
generating the worst plan possible, ideally without touching many 
parameters?

Best,
Tom

Re: Measuring the Query Optimizer Effect: Turning off the QO?

From

Rob Sargent

Date:

07 July 2019, 23:46:28


> On Jul 7, 2019, at 5:22 PM, Tom Mercha <mercha_t@hotmail.com> wrote:
> 
> Hi All
> 
> As we know, a query goes through number of stages before it is executed. 
> One of these stages is query optimization (QO).
> 
> There are various parameters to try and influence optimizer decisions 
> and costs. But I wanted to measure the effect of such a stage by turning 
> it off completely and I can't find such a parameter which explicitly 
> does that. Then I could execute a query to get the effect of "QO active 
> and "QO inactive" and compare.
> 
> Obviously, I know well what the results would generally look like but I 
> am just interested in measuring the differences for various types of 
> queries. I am also aware that this is a simple comparison - there are 
> more interesting comparisons to perform with QO tweaks, but right now I 
> am interested in something basic.
> 
> So how would one shut down QO? Or at least, obtaining the guarantee of 
> generating the worst plan possible, ideally without touching many 
> parameters?
> 
> Best,
> Tom

Drop all indices?

Re: Measuring the Query Optimizer Effect: Turning off the QO?

From

Tom Mercha

Date:

07 July 2019, 23:49:40

On 08/07/2019 01:46, Rob Sargent wrote:
> 
> 
>> On Jul 7, 2019, at 5:22 PM, Tom Mercha <mercha_t@hotmail.com> wrote:
>>
>> Hi All
>>
>> As we know, a query goes through number of stages before it is executed.
>> One of these stages is query optimization (QO).
>>
>> There are various parameters to try and influence optimizer decisions
>> and costs. But I wanted to measure the effect of such a stage by turning
>> it off completely and I can't find such a parameter which explicitly
>> does that. Then I could execute a query to get the effect of "QO active
>> and "QO inactive" and compare.
>>
>> Obviously, I know well what the results would generally look like but I
>> am just interested in measuring the differences for various types of
>> queries. I am also aware that this is a simple comparison - there are
>> more interesting comparisons to perform with QO tweaks, but right now I
>> am interested in something basic.
>>
>> So how would one shut down QO? Or at least, obtaining the guarantee of
>> generating the worst plan possible, ideally without touching many
>> parameters?
>>
>> Best,
>> Tom
> 
> Drop all indices?
> 

Sorry, maybe my question wasn't clear enough.

A query can be rewritten in various ways by applying rules and costs of 
relational algebra operators, as well as their parallelisation. I am 
talking about turning off this query optimization, so I am already 
assuming that indexes aren't present.

Re: Measuring the Query Optimizer Effect: Turning off the QO?

From

Rob Sargent

Date:

08 July 2019, 00:01:51

On Jul 7, 2019, at 5:49 PM, Tom Mercha <mercha_t@hotmail.com> wrote:

On 08/07/2019 01:46, Rob Sargent wrote:

On Jul 7, 2019, at 5:22 PM, Tom Mercha <mercha_t@hotmail.com> wrote:

Hi All

As we know, a query goes through number of stages before it is executed.
One of these stages is query optimization (QO).

There are various parameters to try and influence optimizer decisions
and costs. But I wanted to measure the effect of such a stage by turning
it off completely and I can't find such a parameter which explicitly
does that. Then I could execute a query to get the effect of "QO active
and "QO inactive" and compare.

Obviously, I know well what the results would generally look like but I
am just interested in measuring the differences for various types of
queries. I am also aware that this is a simple comparison - there are
https://gitlab.com/camplab/jpsgcs interesting comparisons to perform with QO tweaks, but right now I
am interested in something basic.

So how would one shut down QO? Or at least, obtaining the guarantee of
generating the worst plan possible, ideally without touching many
parameters?

Best,
Tom

Drop all indices?

Sorry, maybe my question wasn't clear enough.

A query can be rewritten in various ways by applying rules and costs of
relational algebra operators, as well as their parallelisation. I am
talking about turning off this query optimization, so I am already
assuming that indexes aren't present.

Have you played with any of these settings?

postgres=# select version();

                                                 version                                                 

---------------------------------------------------------------------------------------------------------

 PostgreSQL 10.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit

(1 row)

postgres=# select name, setting, unit,short_desc from pg_settings where name ~ 'para';

              name               | setting | unit |                                             short_desc                                             

---------------------------------+---------+------+----------------------------------------------------------------------------------------------------

 force_parallel_mode             | off     |      | Forces use of parallel query facilities.

 max_parallel_workers            | 16      |      | Sets the maximum number of parallel workers that can be active at one time.

 max_parallel_workers_per_gather | 8       |      | Sets the maximum number of parallel processes per executor node.

 min_parallel_index_scan_size    | 64      | 8kB  | Sets the minimum amount of index data for a parallel scan.

 min_parallel_table_scan_size    | 1024    | 8kB  | Sets the minimum amount of table data for a parallel scan.

 parallel_setup_cost             | 1000    |      | Sets the planner's estimate of the cost of starting up worker processes for parallel query.

 parallel_tuple_cost             | 0.1     |      | Sets the planner's estimate of the cost of passing each tuple (row) from worker to master backend.

 ssl_dh_params_file              |         |      | Location of the SSL DH parameters file.

(8 rows)

Re: Measuring the Query Optimizer Effect: Turning off the QO?

From

Rob Sargent

Date:

08 July 2019, 00:02:33

On Jul 7, 2019, at 6:01 PM, Rob Sargent <robjsargent@gmail.com> wrote:

On Jul 7, 2019, at 5:49 PM, Tom Mercha <mercha_t@hotmail.com> wrote:

On 08/07/2019 01:46, Rob Sargent wrote:

On Jul 7, 2019, at 5:22 PM, Tom Mercha <mercha_t@hotmail.com> wrote:

Hi All

As we know, a query goes through number of stages before it is executed.
One of these stages is query optimization (QO).

There are various parameters to try and influence optimizer decisions
and costs. But I wanted to measure the effect of such a stage by turning
it off completely and I can't find such a parameter which explicitly
does that. Then I could execute a query to get the effect of "QO active
and "QO inactive" and compare.

Obviously, I know well what the results would generally look like but I
am just interested in measuring the differences for various types of
queries. I am also aware that this is a simple comparison - there are
https://gitlab.com/camplab/jpsgcs interesting comparisons to perform with QO tweaks, but right now I
am interested in something basic.

So how would one shut down QO? Or at least, obtaining the guarantee of
generating the worst plan possible, ideally without touching many
parameters?

Best,
Tom

Drop all indices?

Sorry, maybe my question wasn't clear enough.

A query can be rewritten in various ways by applying rules and costs of
relational algebra operators, as well as their parallelisation. I am
talking about turning off this query optimization, so I am already
assuming that indexes aren't present.

Have you played with any of these settings?

postgres=# select version();
version
---------------------------------------------------------------------------------------------------------
PostgreSQL 10.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit
(1 row)

postgres=# select name, setting, unit,short_desc from pg_settings where name ~ 'para';
name | setting | unit | short_desc
---------------------------------+---------+------+----------------------------------------------------------------------------------------------------
force_parallel_mode | off | | Forces use of parallel query facilities.
max_parallel_workers | 16 | | Sets the maximum number of parallel workers that can be active at one time.
max_parallel_workers_per_gather | 8 | | Sets the maximum number of parallel processes per executor node.
min_parallel_index_scan_size | 64 | 8kB | Sets the minimum amount of index data for a parallel scan.
min_parallel_table_scan_size | 1024 | 8kB | Sets the minimum amount of table data for a parallel scan.
parallel_setup_cost | 1000 | | Sets the planner's estimate of the cost of starting up worker processes for parallel query.
parallel_tuple_cost | 0.1 | | Sets the planner's estimate of the cost of passing each tuple (row) from worker to master backend.
ssl_dh_params_file | | | Location of the SSL DH parameters file.
(8 rows)

Well not the last one of course.

Re: Measuring the Query Optimizer Effect: Turning off the QO?

From

Rob Sargent

Date:

08 July 2019, 00:04:57

On Jul 7, 2019, at 6:02 PM, Rob Sargent <robjsargent@gmail.com> wrote:

On Jul 7, 2019, at 6:01 PM, Rob Sargent <robjsargent@gmail.com> wrote:

On Jul 7, 2019, at 5:49 PM, Tom Mercha <mercha_t@hotmail.com> wrote:

On 08/07/2019 01:46, Rob Sargent wrote:

On Jul 7, 2019, at 5:22 PM, Tom Mercha <mercha_t@hotmail.com> wrote:

Hi All

As we know, a query goes through number of stages before it is executed.
One of these stages is query optimization (QO).

There are various parameters to try and influence optimizer decisions
and costs. But I wanted to measure the effect of such a stage by turning
it off completely and I can't find such a parameter which explicitly
does that. Then I could execute a query to get the effect of "QO active
and "QO inactive" and compare.

Obviously, I know well what the results would generally look like but I
am just interested in measuring the differences for various types of
queries. I am also aware that this is a simple comparison - there are
https://gitlab.com/camplab/jpsgcs interesting comparisons to perform with QO tweaks, but right now I
am interested in something basic.

So how would one shut down QO? Or at least, obtaining the guarantee of
generating the worst plan possible, ideally without touching many
parameters?

Best,
Tom

Drop all indices?

Sorry, maybe my question wasn't clear enough.

A query can be rewritten in various ways by applying rules and costs of
relational algebra operators, as well as their parallelisation. I am
talking about turning off this query optimization, so I am already
assuming that indexes aren't present.

Have you played with any of these settings?

postgres=# select version();
version
---------------------------------------------------------------------------------------------------------
PostgreSQL 10.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit
(1 row)

postgres=# select name, setting, unit,short_desc from pg_settings where name ~ 'para';
name | setting | unit | short_desc
---------------------------------+---------+------+----------------------------------------------------------------------------------------------------
force_parallel_mode | off | | Forces use of parallel query facilities.
max_parallel_workers | 16 | | Sets the maximum number of parallel workers that can be active at one time.
max_parallel_workers_per_gather | 8 | | Sets the maximum number of parallel processes per executor node.
min_parallel_index_scan_size | 64 | 8kB | Sets the minimum amount of index data for a parallel scan.
min_parallel_table_scan_size | 1024 | 8kB | Sets the minimum amount of table data for a parallel scan.
parallel_setup_cost | 1000 | | Sets the planner's estimate of the cost of starting up worker processes for parallel query.
parallel_tuple_cost | 0.1 | | Sets the planner's estimate of the cost of passing each tuple (row) from worker to master backend.
ssl_dh_params_file | | | Location of the SSL DH parameters file.
(8 rows)

Well not the last one of course.

Better yet, “where category ~* ‘planner’"

Re: Measuring the Query Optimizer Effect: Turning off the QO?

From

Tom Mercha

Date:

08 July 2019, 00:14:59

On 08/07/2019 02:04, Rob Sargent wrote:
> 
> 
>> On Jul 7, 2019, at 6:02 PM, Rob Sargent <robjsargent@gmail.com> wrote:
>>
>>
>>
>>> On Jul 7, 2019, at 6:01 PM, Rob Sargent <robjsargent@gmail.com <mailto:robjsargent@gmail.com>> wrote:
>>>
>>>
>>>
>>>> On Jul 7, 2019, at 5:49 PM, Tom Mercha <mercha_t@hotmail.com <mailto:mercha_t@hotmail.com>> wrote:
>>>>
>>>> On 08/07/2019 01:46, Rob Sargent wrote:
>>>>>
>>>>>
>>>>>> On Jul 7, 2019, at 5:22 PM, Tom Mercha <mercha_t@hotmail.com <mailto:mercha_t@hotmail.com>> wrote:
>>>>>>
>>>>>> Hi All
>>>>>>
>>>>>> As we know, a query goes through number of stages before it is executed.
>>>>>> One of these stages is query optimization (QO).
>>>>>>
>>>>>> There are various parameters to try and influence optimizer decisions
>>>>>> and costs. But I wanted to measure the effect of such a stage by turning
>>>>>> it off completely and I can't find such a parameter which explicitly
>>>>>> does that. Then I could execute a query to get the effect of "QO active
>>>>>> and "QO inactive" and compare.
>>>>>>
>>>>>> Obviously, I know well what the results would generally look like but I
>>>>>> am just interested in measuring the differences for various types of
>>>>>> queries. I am also aware that this is a simple comparison - there are
>>>>>> https://gitlab.com/camplab/jpsgcs <https://gitlab.com/camplab/jpsgcs> interesting comparisons to perform with QO
tweaks,but right now I
 
>>>>>> am interested in something basic.
>>>>>>
>>>>>> So how would one shut down QO? Or at least, obtaining the guarantee of
>>>>>> generating the worst plan possible, ideally without touching many
>>>>>> parameters?
>>>>>>
>>>>>> Best,
>>>>>> Tom
>>>>>
>>>>> Drop all indices?
>>>>>
>>>>
>>>> Sorry, maybe my question wasn't clear enough.
>>>>
>>>> A query can be rewritten in various ways by applying rules and costs of
>>>> relational algebra operators, as well as their parallelisation. I am
>>>> talking about turning off this query optimization, so I am already
>>>> assuming that indexes aren't present.
>>>
>>> Have you played with any of these settings?
>>>
>>> postgres=# select version();
>>>                                                   version
>>> ---------------------------------------------------------------------------------------------------------
>>>   PostgreSQL 10.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit
>>> (1 row)
>>>
>>> postgres=# select name, setting, unit,short_desc from pg_settings where name ~ 'para';
>>>                name               | setting | unit |                                             short_desc
>>>
---------------------------------+---------+------+----------------------------------------------------------------------------------------------------
>>>   force_parallel_mode             | off     |      | Forces use of parallel query facilities.
>>>   max_parallel_workers            | 16      |      | Sets the maximum number of parallel workers that can be active
atone time.
 
>>>   max_parallel_workers_per_gather | 8       |      | Sets the maximum number of parallel processes per executor
node.
>>>   min_parallel_index_scan_size    | 64      | 8kB  | Sets the minimum amount of index data for a parallel scan.
>>>   min_parallel_table_scan_size    | 1024    | 8kB  | Sets the minimum amount of table data for a parallel scan.
>>>   parallel_setup_cost             | 1000    |      | Sets the planner's estimate of the cost of starting up worker
processesfor parallel query.
 
>>>   parallel_tuple_cost             | 0.1     |      | Sets the planner's estimate of the cost of passing each tuple
(row)from worker to master backend.
 
>>>   ssl_dh_params_file              |         |      | Location of the SSL DH parameters file.
>>> (8 rows)
>>>
>> Well not the last one of course.
> 
> Better yet, “where category ~* ‘planner’"
> 

Hi

Yes, I've taken a look. I'm just interested in turning off the whole 
module and there is no parameter to do just that - an on and off switch 
- from what I can understand. What I can do instead is to go over the 
entire list of parameters and configure them each to generate a bad 
plan, but I'm not sure in general how to make a configuration which 
gives me the worst plan possible under all situations.

I was hoping that someone has the on/off switch I'm looking for or a 
parameter template which has been used for the same purpose or something 
along these lines...

This document describes the module I'm talking about: 
https://www.postgresql.org/docs/current/planner-optimizer.html

Re: Measuring the Query Optimizer Effect: Turning off the QO?

From

Rob Sargent

Date:

08 July 2019, 00:29:47

O
Sorry, maybe my question wasn't clear enough.

A query can be rewritten in various ways by applying rules and costs of
relational algebra operators, as well as their parallelisation. I am
talking about turning off this query optimization, so I am already
assuming that indexes aren't present.

Have you played with any of these settings?

postgres=# select version();
                                                 version
---------------------------------------------------------------------------------------------------------
PostgreSQL 10.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit
(1 row)

postgres=# select name, setting, unit,short_desc from pg_settings where name ~ 'para';
              name               | setting | unit |                                             short_desc
---------------------------------+---------+------+----------------------------------------------------------------------------------------------------
force_parallel_mode             | off     |      | Forces use of parallel query facilities.
max_parallel_workers            | 16      |      | Sets the maximum number of parallel workers that can be active at one time.
max_parallel_workers_per_gather | 8       |      | Sets the maximum number of parallel processes per executor node.
min_parallel_index_scan_size    | 64      | 8kB | Sets the minimum amount of index data for a parallel scan.
min_parallel_table_scan_size    | 1024    | 8kB | Sets the minimum amount of table data for a parallel scan.
parallel_setup_cost             | 1000    |      | Sets the planner's estimate of the cost of starting up worker processes for parallel query.
parallel_tuple_cost             | 0.1     |      | Sets the planner's estimate of the cost of passing each tuple (row) from worker to master backend.
ssl_dh_params_file              |         |      | Location of the SSL DH parameters file.
(8 rows)

Well not the last one of course.

Better yet, “where category ~* ‘planner’"

Hi

Yes, I've taken a look. I'm just interested in turning off the whole
module and there is no parameter to do just that - an on and off switch
- from what I can understand. What I can do instead is to go over the
entire list of parameters and configure them each to generate a bad
plan, but I'm not sure in general how to make a configuration which
gives me the worst plan possible under all situations.

I was hoping that someone has the on/off switch I'm looking for or a
parameter template which has been used for the same purpose or something
along these lines...

This document describes the module I'm talking about:
https://www.postgresql.org/docs/current/planner-optimizer.html

Yeah, I guess I read that to say postgres will optimize generically (after geqo_threshold). Now maybe that’s the worst plan? Per force a lot of seq-scans in the absence of any indices. And if it isn’t the worst possible plan, so what, you’re using postgres, you won’t get a worse plan without lying to the optimizer.

But I’m clearly in over my head. The pros will be along shortly, I’m sure.

Re: Measuring the Query Optimizer Effect: Turning off the QO?

From

Rob Sargent

Date:

08 July 2019, 00:31:36

On Jul 7, 2019, at 6:29 PM, Rob Sargent <robjsargent@gmail.com> wrote:

O
Sorry, maybe my question wasn't clear enough.

A query can be rewritten in various ways by applying rules and costs of
relational algebra operators, as well as their parallelisation. I am
talking about turning off this query optimization, so I am already
assuming that indexes aren't present.

Have you played with any of these settings?

postgres=# select version();
                                                 version
---------------------------------------------------------------------------------------------------------
PostgreSQL 10.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit
(1 row)

postgres=# select name, setting, unit,short_desc from pg_settings where name ~ 'para';
              name               | setting | unit |                                             short_desc
---------------------------------+---------+------+----------------------------------------------------------------------------------------------------
force_parallel_mode             | off     |      | Forces use of parallel query facilities.
max_parallel_workers            | 16      |      | Sets the maximum number of parallel workers that can be active at one time.
max_parallel_workers_per_gather | 8       |      | Sets the maximum number of parallel processes per executor node.
min_parallel_index_scan_size    | 64      | 8kB | Sets the minimum amount of index data for a parallel scan.
min_parallel_table_scan_size    | 1024    | 8kB | Sets the minimum amount of table data for a parallel scan.
parallel_setup_cost             | 1000    |      | Sets the planner's estimate of the cost of starting up worker processes for parallel query.
parallel_tuple_cost             | 0.1     |      | Sets the planner's estimate of the cost of passing each tuple (row) from worker to master backend.
ssl_dh_params_file              |         |      | Location of the SSL DH parameters file.
(8 rows)

Well not the last one of course.

Better yet, “where category ~* ‘planner’"

Hi

Yes, I've taken a look. I'm just interested in turning off the whole
module and there is no parameter to do just that - an on and off switch
- from what I can understand. What I can do instead is to go over the
entire list of parameters and configure them each to generate a bad
plan, but I'm not sure in general how to make a configuration which
gives me the worst plan possible under all situations.

I was hoping that someone has the on/off switch I'm looking for or a
parameter template which has been used for the same purpose or something
along these lines...

This document describes the module I'm talking about:
https://www.postgresql.org/docs/current/planner-optimizer.html

Yeah, I guess I read that to say postgres will optimize generically (after geqo_threshold). Now maybe that’s the worst plan? Per force a lot of seq-scans in the absence of any indices. And if it isn’t the worst possible plan, so what, you’re using postgres, you won’t get a worse plan without lying to the optimizer.
But I’m clearly in over my head. The pros will be along shortly, I’m sure.

And I didn’t notice I had pasted a url to a git project. Totally irrelevant to the issue at hand. My apologies.

Re: Measuring the Query Optimizer Effect: Turning off the QO?

From

Ron

Date:

08 July 2019, 00:39:02

On 7/7/19 6:49 PM, Tom Mercha wrote:

On 08/07/2019 01:46, Rob Sargent wrote:

On Jul 7, 2019, at 5:22 PM, Tom Mercha <mercha_t@hotmail.com> wrote:

Hi All

As we know, a query goes through number of stages before it is executed.
One of these stages is query optimization (QO).

There are various parameters to try and influence optimizer decisions
and costs. But I wanted to measure the effect of such a stage by turning
it off completely and I can't find such a parameter which explicitly
does that. Then I could execute a query to get the effect of "QO active
and "QO inactive" and compare.

Obviously, I know well what the results would generally look like but I
am just interested in measuring the differences for various types of
queries. I am also aware that this is a simple comparison - there are
more interesting comparisons to perform with QO tweaks, but right now I
am interested in something basic.

So how would one shut down QO? Or at least, obtaining the guarantee of
generating the worst plan possible, ideally without touching many
parameters?

Best,
Tom

Drop all indices?

Sorry, maybe my question wasn't clear enough.

A query can be rewritten in various ways by applying rules and costs of 
relational algebra operators, as well as their parallelisation. I am 
talking about turning off this query optimization, so I am already 
assuming that indexes aren't present.

It seems as though you're asking what the "First Approximation" plan is, before it tries to get Too Clever. However, I don't think there's a separation between Query Planner and Query Optimizer. https://www.postgresql.org/docs/9.6/planner-optimizer.html

--
Angular momentum makes the world go 'round.

Re: Measuring the Query Optimizer Effect: Turning off the QO?

From

Tom Lane

Date:

08 July 2019, 01:26:44

Tom Mercha <mercha_t@hotmail.com> writes:
> As we know, a query goes through number of stages before it is executed. 
> One of these stages is query optimization (QO).
> There are various parameters to try and influence optimizer decisions 
> and costs. But I wanted to measure the effect of such a stage by turning 
> it off completely and I can't find such a parameter which explicitly 
> does that. Then I could execute a query to get the effect of "QO active 
> and "QO inactive" and compare.

There is no such parameter because the code can't support that.  For
efficiency reasons, query optimization is bound up pretty tightly with
essential plan-preparation activities.  As an example, you can't turn
off constant-folding because eval_const_expressions also takes care of
some non-optional activities like filling in default parameter values
in function calls.

However, there are some knobs you can twiddle, as others have already
pointed out.  Two I'd particularly draw your attention to are
join_collapse_limit and from_collapse_limit --- if you set both to 1,
that'll effectively disable searching for a good join order, causing
the join order to match the syntactic structure of the FROM clause.
For instance "FROM a,b,c" will always be done by joining a to b first
then joining to c.  The code will still consider all possible ways
to do each of those joins, though you can shut off consideration of
some possibilities with parameters like enable_hashjoin.

            regards, tom lane

Re: Measuring the Query Optimizer Effect: Turning off the QO?

From

Andrew Gierth

Date:

08 July 2019, 14:23:34

>>>>> "Tom" == Tom Mercha <mercha_t@hotmail.com> writes:

 Tom> Hi All

 Tom> As we know, a query goes through number of stages before it is
 Tom> executed. One of these stages is query optimization (QO).

That's not really true at all. One of the stages is query _planning_,
which takes the (rewritten) query as input and converts it to something
that the executor can take action on. There isn't actually any separate
"optimization" phase.

-- 
Andrew (irc:RhodiumToad)

Re: Measuring the Query Optimizer Effect: Turning off the QO?

From

Andrew Gierth

Date:

08 July 2019, 16:14:51

>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 Tom> Two I'd particularly draw your attention to are
 Tom> join_collapse_limit and from_collapse_limit --- if you set both to
 Tom> 1, that'll effectively disable searching for a good join order,
 Tom> causing the join order to match the syntactic structure of the
 Tom> FROM clause. For instance "FROM a,b,c" will always be done by
 Tom> joining a to b first

FROM a,b,c can always be planned in any join order. If you want to force
the join order you have to set join_collapse_limit=1 AND write it as
FROM a JOIN b ON ... JOIN c ON ...

For an example, try:

explain select * from onek o1, tenk1 t, onek o2
         where o1.unique1=t.unique1 and t.unique1=o2.unique1
           and o1.unique2<10 and o2.unique2<10;

which (at least for me) joins o1 and o2 together first even with the
collapse limits set to 1.

-- 
Andrew (irc:RhodiumToad)

Re: Measuring the Query Optimizer Effect: Turning off the QO?

From

Tom Lane

Date:

08 July 2019, 16:29:28

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>  Tom> Two I'd particularly draw your attention to are
>  Tom> join_collapse_limit and from_collapse_limit --- if you set both to
>  Tom> 1, that'll effectively disable searching for a good join order,
>  Tom> causing the join order to match the syntactic structure of the
>  Tom> FROM clause. For instance "FROM a,b,c" will always be done by
>  Tom> joining a to b first

> FROM a,b,c can always be planned in any join order.

Ah, right, -ENOCAFFEINE.  What from_collapse_limit really does is
prevent flattening sub-SELECTs when doing so would add more join-order
freedom in the parent query.  But ignoring sub-SELECTs, using explicit
JOIN syntax with join_collapse_limit=1 will let you control the
join order.

            regards, tom lane

Re: Measuring the Query Optimizer Effect: Turning off the QO?

From

Tom Mercha

Date:

08 July 2019, 19:17:00

On 08/07/2019 18:29, Tom Lane wrote:
> Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>>   Tom> Two I'd particularly draw your attention to are
>>   Tom> join_collapse_limit and from_collapse_limit --- if you set both to
>>   Tom> 1, that'll effectively disable searching for a good join order,
>>   Tom> causing the join order to match the syntactic structure of the
>>   Tom> FROM clause. For instance "FROM a,b,c" will always be done by
>>   Tom> joining a to b first
> 
>> FROM a,b,c can always be planned in any join order.
> 
> Ah, right, -ENOCAFFEINE.  What from_collapse_limit really does is
> prevent flattening sub-SELECTs when doing so would add more join-order
> freedom in the parent query.  But ignoring sub-SELECTs, using explicit
> JOIN syntax with join_collapse_limit=1 will let you control the
> join order.
> 
>             regards, tom lane
> 

Thanks for the clarification. I really appreciate taking the time to 
provide such in depth info about my query!

Re: Measuring the Query Optimizer Effect: Turning off the QO?

From

Tom Mercha

Date:

08 July 2019, 19:20:25

On 08/07/2019 16:23, Andrew Gierth wrote:
>>>>>> "Tom" == Tom Mercha <mercha_t@hotmail.com> writes:
> 
>   Tom> Hi All
> 
>   Tom> As we know, a query goes through number of stages before it is
>   Tom> executed. One of these stages is query optimization (QO).
> 
> That's not really true at all. One of the stages is query _planning_,
> which takes the (rewritten) query as input and converts it to something
> that the executor can take action on. There isn't actually any separate
> "optimization" phase.
> 

Hi

I was just loosely speaking - I am merely referring to the concept that 
a query can be optimized vs unoptimized. Of course it follows that we 
have a phase or a subset of stages which are responsible for this purpose.