Thread: BUG #15232: Query execution changes based on using 'explain analyze' or not

BUG #15232: Query execution changes based on using 'explain analyze' or not

From: PG Bug reporting form
The following bug has been logged on the website:

Bug reference:      15232
Logged by:          durandal mk2
Email address:      jgardner@comprehend.com
PostgreSQL version: 10.4
Operating system:   ubuntu 18.04
Description:

Testing behavior of parallel scans in 10.4 (looking to upgrade from 9.4) and
I'm only seeing parallel execution if I use "explain analyze".

Table setup is document_id int, version_id string; 46 million records.

select
        document_id, 
        max(version_id) version_id 
from
        document_metadata
group by 1 

18 seconds, only a single process running

explain analyze
select
        document_id, 
        max(version_id) version_id 
from
        document_metadata
group by 1

Same query but with explain analyze in front: I see multiple bgworker
processes spawn, 4 cores fully loaded, and the query returns in 9 seconds.

Finalize HashAggregate  (cost=927376.47..927830.76 rows=45429 width=38) (actual time=8724.608..9001.714 rows=702294 loops=1)
  Group Key: document_id
  ->  Gather  (cost=907842.00..926467.89 rows=181716 width=38) (actual time=7162.089..7962.289 rows=1460816 loops=1)
        Workers Planned: 4
        Workers Launched: 3
        ->  Partial HashAggregate  (cost=906842.00..907296.29 rows=45429 width=38) (actual time=7157.322..7303.906 rows=365204 loops=4)
              Group Key: document_id
              ->  Parallel Seq Scan on document_metadata  (cost=0.00..849274.00 rows=11513600 width=16) (actual time=0.043..3380.886 rows=11513600 loops=4)
Planning time: 0.122 ms
Execution time: 9032.855 ms

Scanning the lists and googling didn't really turn up anything; if this is
unexpected I'll try regressing this against 10.3.

No combination of the following parameters seems to impact the query planner
when not using explain analyze.  Thanks in advance.

set max_parallel_degree = 4;
set min_parallel_relation_size = 0;
set parallel_setup_cost to 0;
set parallel_tuple_cost to 0;
set force_parallel_mode to on;


PG Bug reporting form <noreply@postgresql.org> writes:
> Testing behavior of parallel scans in 10.4 (looking to upgrade from 9.4) and
> I'm only seeing parallel execution if I use "explain analyze".

I can't reproduce this here, at least not when submitting the queries
through psql.  I speculate that you're using some client-side environment
that is somehow disabling parallelism for DML but not within a utility
command like EXPLAIN ... but it's not very clear what that could be.
Please provide a more worked-out example, and be concrete about how
you're submitting both the EXPLAIN and the actual query.

            regards, tom lane


Jud Gardner <jgardner@comprehend.com> writes:
> When submitted through psql I don't see the issue.
> I also don't see it when submitting via jdbc (42.2.2) directly.

> I'm seeing this via jdbc through dbVisualizer, haven't seen this kind of
> behavior before.

Interesting.  It could be that dbVisualizer is setting some parameter
or transaction mode that prevents parallelism.  SERIALIZABLE mode
does so, for example, because the serializable-transaction logic
hasn't been taught about parallelism yet.

            regards, tom lane


On Fri, Jun 8, 2018 at 5:06 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jud Gardner <jgardner@comprehend.com> writes:
>> When submitted through psql I don't see the issue.
>> I also don't see it when submitting via jdbc (42.2.2) directly.
>
>> I'm seeing this via jdbc through dbVisualizer, haven't seen this kind of
>> behavior before.
>
> Interesting.  It could be that dbVisualizer is setting some parameter
> or transaction mode that prevents parallelism.  SERIALIZABLE mode
> does so, for example, because the serializable-transaction logic
> hasn't been taught about parallelism yet.

It plans a parallel query, but dbVisualizer is sending an Execute message
with max_rows = 1000.  That causes PortalRun() to be called with
run_once = false, which causes execute_once to be false here:

        /*
         * If the plan might potentially be executed multiple times, we must
         * force it to run without parallelism, because we might exit early.
         */
        if (!execute_once)
                use_parallel_mode = false;
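
For what it's worth, here is a minimal PgJDBC sketch (not from this thread;
the connection details are placeholders) of how a client-side row limit like
dbVisualizer's default 1000 turns into such a limited Execute message,
assuming its "Max Rows" setting maps onto Statement.setMaxRows():

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class MaxRowsParallelismDemo
{
    public static void main(String[] args) throws Exception
    {
        // Placeholder connection details, not from the original report.
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/postgres", "postgres", "");

        Statement stmt = conn.createStatement();

        // A client-side row limit is passed along as the max_rows field of
        // the Execute ('E') message, so PortalRun() is called with
        // run_once = false and the parallel plan runs without workers.
        stmt.setMaxRows(1000);

        ResultSet rs = stmt.executeQuery(
                "select document_id, max(version_id) version_id "
                + "from document_metadata group by 1");
        int n = 0;
        while (rs.next())
            n++;                      // at most 1000 rows come back
        System.out.println(n + " rows received");

        rs.close();
        stmt.close();
        conn.close();
    }
}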

-- 
Thomas Munro
http://www.enterprisedb.com


On Fri, Jun 8, 2018 at 1:24 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Fri, Jun 8, 2018 at 5:06 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Jud Gardner <jgardner@comprehend.com> writes:
>>> When submitted through psql I don't see the issue.
>>> I also don't see it when submitting via jdbc (42.2.2) directly.
>>
>>> I'm seeing this via jdbc through dbVisualizer, haven't seen this kind of
>>> behavior before.
>>
>> Interesting.  It could be that dbVisualizer is setting some parameter
>> or transaction mode that prevents parallelism.  SERIALIZABLE mode
>> does so, for example, because the serializable-transaction logic
>> hasn't been taught about parallelism yet.
>
> It plans a parallel query, but dbVisualizer is sending an Execute message
> with max_rows = 1000.  That causes PortalRun() to be called with
> run_once = false [...]

The quick fix for dbVisualizer is to type -1 into the "Max Rows" box
in the top right hand corner of the SQL query window.
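
If you are driving PgJDBC yourself rather than going through dbVisualizer,
the equivalent (an assumption, not something tested here) is simply to leave
the limit at its default, since a maxRows of 0 means "no limit" and the
driver then asks for the whole result in one Execute:

import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

class NoRowLimit
{
    // Assumed helper: with maxRows = 0 and fetchSize = 0 the driver requests
    // the entire result in a single Execute, so execute_once stays true and
    // parallel workers can be used.
    static Statement unlimitedStatement(Connection conn) throws SQLException
    {
        Statement stmt = conn.createStatement();
        stmt.setMaxRows(0);    // 0 = no client-side row limit
        stmt.setFetchSize(0);  // 0 = fetch everything at once
        return stmt;
    }
}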

The execute_once flag came in with commit 691b8d59281b, which added
this logic for 'E' messages:

       !execute_is_fetch && max_rows == FETCH_ALL

In JDBC there are two separate concepts, statement.setFetchSize() and
statement.setMaxRows().  The first is for fetching in batches
(implying that you might need to execute more than once) and the
second is for imposing a total row limit (possibly executing only once
and discarding the rest, the case here AFAICT).  It doesn't look like
there is any way to tell the difference from our end, because they
both finish up as max_rows in an 'E' message (in our JDBC driver
sendOneQuery() and fetch() both call sendExecute()).  So... I don't
see how we could improve this situation without changing the wire
protocol so that the client could state an intention to execute only
once (independently of max_rows).  I hadn't looked into that code
before today; maybe someone with more knowledge of the protocol has an
idea.
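
To make the contrast concrete, here is a client-side sketch of the two calls
(placeholder connection; the batched case assumes PgJDBC's cursor-based
fetching, which requires autocommit to be off):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class FetchSizeVsMaxRows
{
    public static void main(String[] args) throws Exception
    {
        // Placeholder connection details.
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/postgres", "postgres", "");
        conn.setAutoCommit(false);

        // Batched fetching: the driver reads through a portal, issuing an
        // Execute with max_rows = 1000 per batch; the portal may be run
        // more than once, so the server must disable parallelism.
        Statement batched = conn.createStatement();
        batched.setFetchSize(1000);

        // Total row limit: the query is executed only once and the rest is
        // discarded, but the limit is still sent as max_rows in the Execute
        // message, so the server sees the same thing as the batched case.
        Statement capped = conn.createStatement();
        capped.setMaxRows(1000);

        ResultSet rs = batched.executeQuery(
                "select document_id, max(version_id) version_id "
                + "from document_metadata group by 1");
        while (rs.next())
            ;                         // rows arrive 1000 at a time
        rs.close();

        conn.rollback();
        conn.close();
    }
}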

-- 
Thomas Munro
http://www.enterprisedb.com