Thread: Volatile Functions in Parallel Plans
Hi,
I test some SQL in the latest Postgres master branch code (we find these issues when
developing Greenplum database in the PR https://github.com/greenplum-db/gpdb/pull/10418,
and my colleague come up with the following cases in Postgres):
create table t3 (c1 text, c2 text);CREATE TABLEinsert into t3select'fhufehwiohewiuewhuhwiufhwifhweuhfwu', --random data'fiowehufwhfyegygfewpfwwfeuhwhufwh' --random datafrom generate_series(1, 10000000) i;INSERT 0 10000000analyze t3;ANALYZEcreate table t4 (like t3);CREATE TABLEinsert into t4 select * from t4;INSERT 0 0insert into t4 select * from t3;INSERT 0 10000000analyze t4;ANALYZEset enable_hashjoin to off;SETexplain (costs off)select count(*) from t3, t4where t3.c1 like '%sss'and timeofday() = t4.c1 and t3.c1 = t4.c1;QUERY PLAN--------------------------------------------------------Finalize Aggregate-> GatherWorkers Planned: 2-> Partial Aggregate-> Nested LoopJoin Filter: (t3.c1 = t4.c1)-> Parallel Seq Scan on t3Filter: (c1 ~~ '%sss'::text)-> Seq Scan on t4Filter: (timeofday() = c1)(10 rows)explain (verbose, costs off)select count(*)fromt3,(select *, timeofday() as x from t4 ) t4where t3.c1 like '%sss' andtimeofday() = t4.c1 and t3.c1 = t4.c1;QUERY PLAN------------------------------------------------------------------Finalize AggregateOutput: count(*)-> GatherOutput: (PARTIAL count(*))Workers Planned: 2-> Partial AggregateOutput: PARTIAL count(*)-> Nested LoopJoin Filter: (t3.c1 = t4.c1)-> Parallel Seq Scan on public.t3Output: t3.c1, t3.c2Filter: (t3.c1 ~~ '%sss'::text)-> Seq Scan on public.t4Output: t4.c1, NULL::text, timeofday()Filter: (timeofday() = t4.c1)(15 rows)
Focus on the last two plans, the function timeofday is
volatile but paralle-safe. And Postgres outputs two parallel
plan.
The first plan:
The join's left tree is parallel scan and the right tree is seq scan.Finalize Aggregate-> GatherWorkers Planned: 2-> Partial Aggregate-> Nested LoopJoin Filter: (t3.c1 = t4.c1)-> Parallel Seq Scan on t3Filter: (c1 ~~ '%sss'::text)-> Seq Scan on t4Filter: (timeofday() = c1)
This algorithm is correct using the distribute distributive law of
distributed join:
A = [A1 A2 A3...An], B then A join B = gather( (A1 join B) (A2 join B) ... (An join B) )
The correctness of the above law should have a pre-assumption:
The data set of B is the same in each join: (A1 join B) (A2 join B) ... (An join B)
But things get complicated when volatile functions come in. Timeofday is just
an example to show the idea. The core is volatile functions can return different
results on successive calls with the same arguments. Thus the following piece,
the right tree of the join
-> Seq Scan on t4
Filter: (timeofday() = c1)
can not be considered consistent everywhere in the scan workers.
The second plan
Finalize AggregateOutput: count(*)-> GatherOutput: (PARTIAL count(*))Workers Planned: 2-> Partial AggregateOutput: PARTIAL count(*)-> Nested LoopJoin Filter: (t3.c1 = t4.c1)-> Parallel Seq Scan on public.t3Output: t3.c1, t3.c2Filter: (t3.c1 ~~ '%sss'::text)-> Seq Scan on public.t4Output: t4.c1, NULL::text, timeofday()Filter: (timeofday() = t4.c1)
-> Seq Scan on public.t4
Output: t4.c1, NULL::text, timeofday()
Filter: (timeofday() = t4.c1)
It should not be taken as consistent in different workers.
------------------------------------------------------------------------------------------
The above are just two cases we find today. And it should be enough to
show the core issue to have a discussion here.
The question is, should we consider volatile functions when generating
parallel plans?
------------------------------------------------------------------------------------------
FYI, some plan diffs of Greenplum can be found here: https://www.diffnow.com/report/etulf
On Wed, Jul 15, 2020 at 6:14 PM Zhenghua Lyu <zlyu@vmware.com> wrote: > > > The first plan: > > Finalize Aggregate > -> Gather > Workers Planned: 2 > -> Partial Aggregate > -> Nested Loop > Join Filter: (t3.c1 = t4.c1) > -> Parallel Seq Scan on t3 > Filter: (c1 ~~ '%sss'::text) > -> Seq Scan on t4 > Filter: (timeofday() = c1) > > The join's left tree is parallel scan and the right tree is seq scan. > This algorithm is correct using the distribute distributive law of > distributed join: > A = [A1 A2 A3...An], B then A join B = gather( (A1 join B) (A2 join B) ... (An join B) ) > > The correctness of the above law should have a pre-assumption: > The data set of B is the same in each join: (A1 join B) (A2 join B) ... (An join B) > > But things get complicated when volatile functions come in. Timeofday is just > an example to show the idea. The core is volatile functions can return different > results on successive calls with the same arguments. Thus the following piece, > the right tree of the join > -> Seq Scan on t4 > Filter: (timeofday() = c1) > can not be considered consistent everywhere in the scan workers. > But this won't be consistent even for non-parallel plans. I mean to say for each loop of join the "Seq Scan on t4" would give different results. Currently, we don't consider volatile functions as parallel-safe by default. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Hi, thanks for your reply.
But this won't be consistent even for non-parallel plans.
If we do not use the distributed law of parallel join, it seems
OK.
If we generate a parallel plan using the distributed law of the join,
then this transformation's pre-assumption might be broken.
Currently, we don't consider volatile functions as
parallel-safe by default.
I run the SQL in pg12:
zlv=# select count(proname) from pg_proc where provolatile = 'v' and proparallel ='s';count-------100(1 row)zlv=# select proname from pg_proc where provolatile = 'v' and proparallel ='s';proname----------------------------------------timeofdaybthandlerhashhandlergisthandlerginhandlerspghandlerbrinhandler
It seems there are many functions which is both volatile and parallel safe.
From: Amit Kapila <amit.kapila16@gmail.com>
Sent: Thursday, July 16, 2020 12:07 PM
To: Zhenghua Lyu <zlyu@vmware.com>
Cc: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Subject: Re: Volatile Functions in Parallel Plans
Sent: Thursday, July 16, 2020 12:07 PM
To: Zhenghua Lyu <zlyu@vmware.com>
Cc: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Subject: Re: Volatile Functions in Parallel Plans
On Wed, Jul 15, 2020 at 6:14 PM Zhenghua Lyu <zlyu@vmware.com> wrote:
>
>
> The first plan:
>
> Finalize Aggregate
> -> Gather
> Workers Planned: 2
> -> Partial Aggregate
> -> Nested Loop
> Join Filter: (t3.c1 = t4.c1)
> -> Parallel Seq Scan on t3
> Filter: (c1 ~~ '%sss'::text)
> -> Seq Scan on t4
> Filter: (timeofday() = c1)
>
> The join's left tree is parallel scan and the right tree is seq scan.
> This algorithm is correct using the distribute distributive law of
> distributed join:
> A = [A1 A2 A3...An], B then A join B = gather( (A1 join B) (A2 join B) ... (An join B) )
>
> The correctness of the above law should have a pre-assumption:
> The data set of B is the same in each join: (A1 join B) (A2 join B) ... (An join B)
>
> But things get complicated when volatile functions come in. Timeofday is just
> an example to show the idea. The core is volatile functions can return different
> results on successive calls with the same arguments. Thus the following piece,
> the right tree of the join
> -> Seq Scan on t4
> Filter: (timeofday() = c1)
> can not be considered consistent everywhere in the scan workers.
>
But this won't be consistent even for non-parallel plans. I mean to
say for each loop of join the "Seq Scan on t4" would give different
results. Currently, we don't consider volatile functions as
parallel-safe by default.
--
With Regards,
Amit Kapila.
EnterpriseDB: https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.enterprisedb.com%2F&data=02%7C01%7Czlyu%40vmware.com%7C825aa0c2259c4da0112008d8293dcd1c%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C637304692698598521&sdata=LWZnJ43KQML3EBwB2DoPGE0KHA2t6A3%2FIS9KSLx%2Bcn4%3D&reserved=0
>
>
> The first plan:
>
> Finalize Aggregate
> -> Gather
> Workers Planned: 2
> -> Partial Aggregate
> -> Nested Loop
> Join Filter: (t3.c1 = t4.c1)
> -> Parallel Seq Scan on t3
> Filter: (c1 ~~ '%sss'::text)
> -> Seq Scan on t4
> Filter: (timeofday() = c1)
>
> The join's left tree is parallel scan and the right tree is seq scan.
> This algorithm is correct using the distribute distributive law of
> distributed join:
> A = [A1 A2 A3...An], B then A join B = gather( (A1 join B) (A2 join B) ... (An join B) )
>
> The correctness of the above law should have a pre-assumption:
> The data set of B is the same in each join: (A1 join B) (A2 join B) ... (An join B)
>
> But things get complicated when volatile functions come in. Timeofday is just
> an example to show the idea. The core is volatile functions can return different
> results on successive calls with the same arguments. Thus the following piece,
> the right tree of the join
> -> Seq Scan on t4
> Filter: (timeofday() = c1)
> can not be considered consistent everywhere in the scan workers.
>
But this won't be consistent even for non-parallel plans. I mean to
say for each loop of join the "Seq Scan on t4" would give different
results. Currently, we don't consider volatile functions as
parallel-safe by default.
--
With Regards,
Amit Kapila.
EnterpriseDB: https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.enterprisedb.com%2F&data=02%7C01%7Czlyu%40vmware.com%7C825aa0c2259c4da0112008d8293dcd1c%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C637304692698598521&sdata=LWZnJ43KQML3EBwB2DoPGE0KHA2t6A3%2FIS9KSLx%2Bcn4%3D&reserved=0
Hi Zhenghua, On Wed, Jul 15, 2020 at 5:44 AM Zhenghua Lyu wrote: > > Hi, > I test some SQL in the latest Postgres master branch code (we find these issues when > developing Greenplum database in the PR https://github.com/greenplum-db/gpdb/pull/10418, > and my colleague come up with the following cases in Postgres): > > > create table t3 (c1 text, c2 text); > CREATE TABLE > insert into t3 > select > 'fhufehwiohewiuewhuhwiufhwifhweuhfwu', --random data > 'fiowehufwhfyegygfewpfwwfeuhwhufwh' --random data > from generate_series(1, 10000000) i; > INSERT 0 10000000 > analyze t3; > ANALYZE > create table t4 (like t3); > CREATE TABLE > insert into t4 select * from t4; > INSERT 0 0 > insert into t4 select * from t3; > INSERT 0 10000000 > analyze t4; > ANALYZE > set enable_hashjoin to off; > SET > explain (costs off) > select count(*) from t3, t4 > where t3.c1 like '%sss' > and timeofday() = t4.c1 and t3.c1 = t4.c1; > QUERY PLAN > -------------------------------------------------------- > Finalize Aggregate > -> Gather > Workers Planned: 2 > -> Partial Aggregate > -> Nested Loop > Join Filter: (t3.c1 = t4.c1) > -> Parallel Seq Scan on t3 > Filter: (c1 ~~ '%sss'::text) > -> Seq Scan on t4 > Filter: (timeofday() = c1) > (10 rows) > > explain (verbose, costs off) > select count(*) > from > t3, > (select *, timeofday() as x from t4 ) t4 > where t3.c1 like '%sss' and > timeofday() = t4.c1 and t3.c1 = t4.c1; > QUERY PLAN > ------------------------------------------------------------------ > Finalize Aggregate > Output: count(*) > -> Gather > Output: (PARTIAL count(*)) > Workers Planned: 2 > -> Partial Aggregate > Output: PARTIAL count(*) > -> Nested Loop > Join Filter: (t3.c1 = t4.c1) > -> Parallel Seq Scan on public.t3 > Output: t3.c1, t3.c2 > Filter: (t3.c1 ~~ '%sss'::text) > -> Seq Scan on public.t4 > Output: t4.c1, NULL::text, timeofday() > Filter: (timeofday() = t4.c1) > (15 rows) > > > > Focus on the last two plans, the function timeofday is > volatile but paralle-safe. And Postgres outputs two parallel > plan. > > > The first plan: > > Finalize Aggregate > -> Gather > Workers Planned: 2 > -> Partial Aggregate > -> Nested Loop > Join Filter: (t3.c1 = t4.c1) > -> Parallel Seq Scan on t3 > Filter: (c1 ~~ '%sss'::text) > -> Seq Scan on t4 > Filter: (timeofday() = c1) > > The join's left tree is parallel scan and the right tree is seq scan. > This algorithm is correct using the distribute distributive law of > distributed join: > A = [A1 A2 A3...An], B then A join B = gather( (A1 join B) (A2 join B) ... (An join B) ) > > The correctness of the above law should have a pre-assumption: > The data set of B is the same in each join: (A1 join B) (A2 join B) ... (An join B) > > But things get complicated when volatile functions come in. Timeofday is just > an example to show the idea. The core is volatile functions can return different > results on successive calls with the same arguments. Thus the following piece, > the right tree of the join > -> Seq Scan on t4 > Filter: (timeofday() = c1) > can not be considered consistent everywhere in the scan workers. > > The second plan > > Finalize Aggregate > Output: count(*) > -> Gather > Output: (PARTIAL count(*)) > Workers Planned: 2 > -> Partial Aggregate > Output: PARTIAL count(*) > -> Nested Loop > Join Filter: (t3.c1 = t4.c1) > -> Parallel Seq Scan on public.t3 > Output: t3.c1, t3.c2 > Filter: (t3.c1 ~~ '%sss'::text) > -> Seq Scan on public.t4 > Output: t4.c1, NULL::text, timeofday() > Filter: (timeofday() = t4.c1) > > > have voltile projections in the right tree of the nestloop: > > -> Seq Scan on public.t4 > Output: t4.c1, NULL::text, timeofday() > Filter: (timeofday() = t4.c1) > > It should not be taken as consistent in different workers. You are right, no they are not consistent. But Neither plans is incorrect: 1. In the first query, it's semantically permissible to evaluate timeofday() for each pair of (c1, c2), and the plan reflects that. (Notice that the parallel nature of the plan is just noise here, the planner could have gone with a Nested Loop of which the inner side is _not_ materialized). 2. In the second query -- again -- in a canonical "outside-in" evaluation, it's perfectly permissible to evaluate the subquery for each value of t3. Again, the parallelism here is hardly relevant, a serial plan without a material node on the inner side of a nested loop would just as well (or as badly as you would feel) project different timeofday() values for the same tuple from t4. In short, the above plans seem fine. P.S. the two plans you posted look identical to me, maybe I'm blind late at night? Cheers, Jesse
Hi Jesse,
you are right.
For the nestloop case, they are identical.
I do not come up with hash join or mergejoin case in pg now.
From: Jesse Zhang <sbjesse@gmail.com>
Sent: Thursday, July 16, 2020 2:16 PM
To: Zhenghua Lyu <zlyu@vmware.com>
Cc: Amit Kapila <amit.kapila16@gmail.com>; PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Subject: Re: Volatile Functions in Parallel Plans
Sent: Thursday, July 16, 2020 2:16 PM
To: Zhenghua Lyu <zlyu@vmware.com>
Cc: Amit Kapila <amit.kapila16@gmail.com>; PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Subject: Re: Volatile Functions in Parallel Plans
Hi Zhenghua,
On Wed, Jul 15, 2020 at 5:44 AM Zhenghua Lyu wrote:
>
> Hi,
> I test some SQL in the latest Postgres master branch code (we find these issues when
> developing Greenplum database in the PR https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fgreenplum-db%2Fgpdb%2Fpull%2F10418&data=02%7C01%7Czlyu%40vmware.com%7C41eeef401fb746757bc108d8294fe8d5%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C637304770468321177&sdata=XTxtrbJMO2d15WwQH9q4tXMzLPGFyk5hF8Tzs%2Bj3KlA%3D&reserved=0,
> and my colleague come up with the following cases in Postgres):
>
>
> create table t3 (c1 text, c2 text);
> CREATE TABLE
> insert into t3
> select
> 'fhufehwiohewiuewhuhwiufhwifhweuhfwu', --random data
> 'fiowehufwhfyegygfewpfwwfeuhwhufwh' --random data
> from generate_series(1, 10000000) i;
> INSERT 0 10000000
> analyze t3;
> ANALYZE
> create table t4 (like t3);
> CREATE TABLE
> insert into t4 select * from t4;
> INSERT 0 0
> insert into t4 select * from t3;
> INSERT 0 10000000
> analyze t4;
> ANALYZE
> set enable_hashjoin to off;
> SET
> explain (costs off)
> select count(*) from t3, t4
> where t3.c1 like '%sss'
> and timeofday() = t4.c1 and t3.c1 = t4.c1;
> QUERY PLAN
> --------------------------------------------------------
> Finalize Aggregate
> -> Gather
> Workers Planned: 2
> -> Partial Aggregate
> -> Nested Loop
> Join Filter: (t3.c1 = t4.c1)
> -> Parallel Seq Scan on t3
> Filter: (c1 ~~ '%sss'::text)
> -> Seq Scan on t4
> Filter: (timeofday() = c1)
> (10 rows)
>
> explain (verbose, costs off)
> select count(*)
> from
> t3,
> (select *, timeofday() as x from t4 ) t4
> where t3.c1 like '%sss' and
> timeofday() = t4.c1 and t3.c1 = t4.c1;
> QUERY PLAN
> ------------------------------------------------------------------
> Finalize Aggregate
> Output: count(*)
> -> Gather
> Output: (PARTIAL count(*))
> Workers Planned: 2
> -> Partial Aggregate
> Output: PARTIAL count(*)
> -> Nested Loop
> Join Filter: (t3.c1 = t4.c1)
> -> Parallel Seq Scan on public.t3
> Output: t3.c1, t3.c2
> Filter: (t3.c1 ~~ '%sss'::text)
> -> Seq Scan on public.t4
> Output: t4.c1, NULL::text, timeofday()
> Filter: (timeofday() = t4.c1)
> (15 rows)
>
>
>
> Focus on the last two plans, the function timeofday is
> volatile but paralle-safe. And Postgres outputs two parallel
> plan.
>
>
> The first plan:
>
> Finalize Aggregate
> -> Gather
> Workers Planned: 2
> -> Partial Aggregate
> -> Nested Loop
> Join Filter: (t3.c1 = t4.c1)
> -> Parallel Seq Scan on t3
> Filter: (c1 ~~ '%sss'::text)
> -> Seq Scan on t4
> Filter: (timeofday() = c1)
>
> The join's left tree is parallel scan and the right tree is seq scan.
> This algorithm is correct using the distribute distributive law of
> distributed join:
> A = [A1 A2 A3...An], B then A join B = gather( (A1 join B) (A2 join B) ... (An join B) )
>
> The correctness of the above law should have a pre-assumption:
> The data set of B is the same in each join: (A1 join B) (A2 join B) ... (An join B)
>
> But things get complicated when volatile functions come in. Timeofday is just
> an example to show the idea. The core is volatile functions can return different
> results on successive calls with the same arguments. Thus the following piece,
> the right tree of the join
> -> Seq Scan on t4
> Filter: (timeofday() = c1)
> can not be considered consistent everywhere in the scan workers.
>
> The second plan
>
> Finalize Aggregate
> Output: count(*)
> -> Gather
> Output: (PARTIAL count(*))
> Workers Planned: 2
> -> Partial Aggregate
> Output: PARTIAL count(*)
> -> Nested Loop
> Join Filter: (t3.c1 = t4.c1)
> -> Parallel Seq Scan on public.t3
> Output: t3.c1, t3.c2
> Filter: (t3.c1 ~~ '%sss'::text)
> -> Seq Scan on public.t4
> Output: t4.c1, NULL::text, timeofday()
> Filter: (timeofday() = t4.c1)
>
>
> have voltile projections in the right tree of the nestloop:
>
> -> Seq Scan on public.t4
> Output: t4.c1, NULL::text, timeofday()
> Filter: (timeofday() = t4.c1)
>
> It should not be taken as consistent in different workers.
You are right, no they are not consistent. But Neither plans is
incorrect:
1. In the first query, it's semantically permissible to evaluate
timeofday() for each pair of (c1, c2), and the plan reflects that.
(Notice that the parallel nature of the plan is just noise here, the
planner could have gone with a Nested Loop of which the inner side is
_not_ materialized).
2. In the second query -- again -- in a canonical "outside-in"
evaluation, it's perfectly permissible to evaluate the subquery for each
value of t3. Again, the parallelism here is hardly relevant, a serial
plan without a material node on the inner side of a nested loop would
just as well (or as badly as you would feel) project different
timeofday() values for the same tuple from t4.
In short, the above plans seem fine.
P.S. the two plans you posted look identical to me, maybe I'm blind late
at night?
Cheers,
Jesse
On Wed, Jul 15, 2020 at 5:44 AM Zhenghua Lyu wrote:
>
> Hi,
> I test some SQL in the latest Postgres master branch code (we find these issues when
> developing Greenplum database in the PR https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fgreenplum-db%2Fgpdb%2Fpull%2F10418&data=02%7C01%7Czlyu%40vmware.com%7C41eeef401fb746757bc108d8294fe8d5%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C637304770468321177&sdata=XTxtrbJMO2d15WwQH9q4tXMzLPGFyk5hF8Tzs%2Bj3KlA%3D&reserved=0,
> and my colleague come up with the following cases in Postgres):
>
>
> create table t3 (c1 text, c2 text);
> CREATE TABLE
> insert into t3
> select
> 'fhufehwiohewiuewhuhwiufhwifhweuhfwu', --random data
> 'fiowehufwhfyegygfewpfwwfeuhwhufwh' --random data
> from generate_series(1, 10000000) i;
> INSERT 0 10000000
> analyze t3;
> ANALYZE
> create table t4 (like t3);
> CREATE TABLE
> insert into t4 select * from t4;
> INSERT 0 0
> insert into t4 select * from t3;
> INSERT 0 10000000
> analyze t4;
> ANALYZE
> set enable_hashjoin to off;
> SET
> explain (costs off)
> select count(*) from t3, t4
> where t3.c1 like '%sss'
> and timeofday() = t4.c1 and t3.c1 = t4.c1;
> QUERY PLAN
> --------------------------------------------------------
> Finalize Aggregate
> -> Gather
> Workers Planned: 2
> -> Partial Aggregate
> -> Nested Loop
> Join Filter: (t3.c1 = t4.c1)
> -> Parallel Seq Scan on t3
> Filter: (c1 ~~ '%sss'::text)
> -> Seq Scan on t4
> Filter: (timeofday() = c1)
> (10 rows)
>
> explain (verbose, costs off)
> select count(*)
> from
> t3,
> (select *, timeofday() as x from t4 ) t4
> where t3.c1 like '%sss' and
> timeofday() = t4.c1 and t3.c1 = t4.c1;
> QUERY PLAN
> ------------------------------------------------------------------
> Finalize Aggregate
> Output: count(*)
> -> Gather
> Output: (PARTIAL count(*))
> Workers Planned: 2
> -> Partial Aggregate
> Output: PARTIAL count(*)
> -> Nested Loop
> Join Filter: (t3.c1 = t4.c1)
> -> Parallel Seq Scan on public.t3
> Output: t3.c1, t3.c2
> Filter: (t3.c1 ~~ '%sss'::text)
> -> Seq Scan on public.t4
> Output: t4.c1, NULL::text, timeofday()
> Filter: (timeofday() = t4.c1)
> (15 rows)
>
>
>
> Focus on the last two plans, the function timeofday is
> volatile but paralle-safe. And Postgres outputs two parallel
> plan.
>
>
> The first plan:
>
> Finalize Aggregate
> -> Gather
> Workers Planned: 2
> -> Partial Aggregate
> -> Nested Loop
> Join Filter: (t3.c1 = t4.c1)
> -> Parallel Seq Scan on t3
> Filter: (c1 ~~ '%sss'::text)
> -> Seq Scan on t4
> Filter: (timeofday() = c1)
>
> The join's left tree is parallel scan and the right tree is seq scan.
> This algorithm is correct using the distribute distributive law of
> distributed join:
> A = [A1 A2 A3...An], B then A join B = gather( (A1 join B) (A2 join B) ... (An join B) )
>
> The correctness of the above law should have a pre-assumption:
> The data set of B is the same in each join: (A1 join B) (A2 join B) ... (An join B)
>
> But things get complicated when volatile functions come in. Timeofday is just
> an example to show the idea. The core is volatile functions can return different
> results on successive calls with the same arguments. Thus the following piece,
> the right tree of the join
> -> Seq Scan on t4
> Filter: (timeofday() = c1)
> can not be considered consistent everywhere in the scan workers.
>
> The second plan
>
> Finalize Aggregate
> Output: count(*)
> -> Gather
> Output: (PARTIAL count(*))
> Workers Planned: 2
> -> Partial Aggregate
> Output: PARTIAL count(*)
> -> Nested Loop
> Join Filter: (t3.c1 = t4.c1)
> -> Parallel Seq Scan on public.t3
> Output: t3.c1, t3.c2
> Filter: (t3.c1 ~~ '%sss'::text)
> -> Seq Scan on public.t4
> Output: t4.c1, NULL::text, timeofday()
> Filter: (timeofday() = t4.c1)
>
>
> have voltile projections in the right tree of the nestloop:
>
> -> Seq Scan on public.t4
> Output: t4.c1, NULL::text, timeofday()
> Filter: (timeofday() = t4.c1)
>
> It should not be taken as consistent in different workers.
You are right, no they are not consistent. But Neither plans is
incorrect:
1. In the first query, it's semantically permissible to evaluate
timeofday() for each pair of (c1, c2), and the plan reflects that.
(Notice that the parallel nature of the plan is just noise here, the
planner could have gone with a Nested Loop of which the inner side is
_not_ materialized).
2. In the second query -- again -- in a canonical "outside-in"
evaluation, it's perfectly permissible to evaluate the subquery for each
value of t3. Again, the parallelism here is hardly relevant, a serial
plan without a material node on the inner side of a nested loop would
just as well (or as badly as you would feel) project different
timeofday() values for the same tuple from t4.
In short, the above plans seem fine.
P.S. the two plans you posted look identical to me, maybe I'm blind late
at night?
Cheers,
Jesse
Jesse Zhang <sbjesse@gmail.com> writes: > You are right, no they are not consistent. But Neither plans is > incorrect: > 1. In the first query, it's semantically permissible to evaluate > timeofday() for each pair of (c1, c2), and the plan reflects that. > (Notice that the parallel nature of the plan is just noise here, the > planner could have gone with a Nested Loop of which the inner side is > _not_ materialized). Yeah, exactly. The short answer here is that refusing to parallelize the plan would not make things any more consistent. In general, using a volatile function in a WHERE clause is problematic because we make no guarantees about how often it will be evaluated. It could be anywhere between "never" and "once per row of the cross-product of the FROM tables". AFAIR, the only concession we've made to make that less unpredictable is to avoid using volatile functions in index quals. But even that will only make things noticeably more predictable for single-table queries. As soon as you get into join cases, you don't have much control over when the function will get evaluated. regards, tom lane
Hi Tom and Zhenghua, On Thu, Jul 16, 2020 at 8:18 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Jesse Zhang <sbjesse@gmail.com> writes: > > You are right, no they are not consistent. But Neither plans is > > incorrect: > > > 1. In the first query, it's semantically permissible to evaluate > > timeofday() for each pair of (c1, c2), and the plan reflects that. > > (Notice that the parallel nature of the plan is just noise here, the > > planner could have gone with a Nested Loop of which the inner side is > > _not_ materialized). > > Yeah, exactly. The short answer here is that refusing to parallelize > the plan would not make things any more consistent. > > In general, using a volatile function in a WHERE clause is problematic > because we make no guarantees about how often it will be evaluated. > It could be anywhere between "never" and "once per row of the > cross-product of the FROM tables". AFAIR, the only concession we've made > to make that less unpredictable is to avoid using volatile functions in > index quals. But even that will only make things noticeably more > predictable for single-table queries. As soon as you get into join cases, > you don't have much control over when the function will get evaluated. > > regards, tom lane For more kicks, I don't even think this is restricted to volatile functions only. To stir the pot, it's conceivable that planner might produce the following plan Seq Scan on pg_temp_3.foo Output: foo.a Filter: (SubPlan 1) SubPlan 1 -> WindowAgg Output: sum(bar.d) OVER (?) -> Seq Scan on pg_temp_3.bar Output: bar.d For the following query SELECT a FROM foo WHERE b = ALL ( SELECT sum(d) OVER (ROWS UNBOUNDED PRECEDING) FROM bar ); N.B. that the WindowAgg might produce a different set of numbers each time depending on the scan order of bar, which means that for two different "foo" tuples of equal b value, one might be rejected by the filter whereas another survives. I think the crux of the discussion should be whether we can reasonably expect a subquery (subquery-like structure, for example the inner side of nest loops upthread) to be evaluated only once. IMHO, no. The SQL standard only broadly mandates that each "execution" of a subquery to be "atomic". Zhenghua and Tom, would you suggest the above plan is wrong (not suboptimal, but wrong) just because we don't materialize the WindowAgg under the subplan? Cheers, Jesse
Jesse Zhang <sbjesse@gmail.com> writes: > For more kicks, I don't even think this is restricted to volatile > functions only. To stir the pot, it's conceivable that planner might > produce the following plan > Seq Scan on pg_temp_3.foo > Output: foo.a > Filter: (SubPlan 1) > SubPlan 1 > -> WindowAgg > Output: sum(bar.d) OVER (?) > -> Seq Scan on pg_temp_3.bar > Output: bar.d > For the following query > SELECT a FROM foo WHERE b = ALL ( > SELECT sum(d) OVER (ROWS UNBOUNDED PRECEDING) FROM bar > ); Interesting example. Normally you'd expect that repeated executions of the inner seqscan would produce the same output in the same order ... but if the table were big enough to allow the synchronize_seqscans logic to kick in, that might not be true. You could argue about whether or not synchronize_seqscans breaks any fundamental SQL guarantees, but my feeling is that it doesn't: if the above query produces unstable results, that's the user's fault for having written an underspecified windowing query. > Zhenghua and Tom, would you suggest the above plan is wrong (not > suboptimal, but wrong) just because we don't materialize the WindowAgg > under the subplan? I would not, per above: the query is buggy, not the implementation. (In standard-ese, the results of that query are undefined, not implementation-defined, meaning that we don't have to produce consistent results.) regards, tom lane