Unwanted expression simplification in PG12b2 - Mailing list pgsql-hackers

From Darafei "Komяpa" Praliaskouski
Subject Unwanted expression simplification in PG12b2
Msg-id CAC8Q8tJkKaG8CirjKV_7bHBXJYcwdW11faTLyZDGB5CFKXTzQg@mail.gmail.com
Responses Re: Unwanted expression simplification in PG12b2  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

Many thanks for the parallel improvements in Postgres 12. Here is one of the cases where a costly function gets moved from a parallel worker into the main process, making spatial processing single-core once again on some queries. Perhaps the assumption "expressions should be mashed together as much as possible" should be reviewed in favor of something along the lines of "the biggest part of an expression should be pushed down into the parallel workers"?

PostgreSQL 12beta2 (Ubuntu 12~beta2-1.pgdg19.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0, 64-bit

Here is a reproducer:


-- setup
create extension postgis;
create table postgis_test_table (a geometry, b geometry, id int);
set force_parallel_mode to on;
insert into postgis_test_table (select 'POINT EMPTY', 'POINT EMPTY', generate_series(0,1000) );


-- unwanted inlining moves difference and unary union calculation into master worker
21:43:06 [gis] > explain verbose select ST_Collect(geom), id from (select ST_Difference(a,ST_UnaryUnion(b)) as geom, id from postgis_test_table) z group by id;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Gather  (cost=159.86..42668.93 rows=200 width=36)
   Output: (st_collect(st_difference(postgis_test_table.a, st_unaryunion(postgis_test_table.b)))), postgis_test_table.id
   Workers Planned: 1
   Single Copy: true
   ->  GroupAggregate  (cost=59.86..42568.73 rows=200 width=36)
         Output: st_collect(st_difference(postgis_test_table.a, st_unaryunion(postgis_test_table.b))), postgis_test_table.id
         Group Key: postgis_test_table.id
         ->  Sort  (cost=59.86..61.98 rows=850 width=68)
               Output: postgis_test_table.id, postgis_test_table.a, postgis_test_table.b
               Sort Key: postgis_test_table.id
               ->  Seq Scan on public.postgis_test_table  (cost=0.00..18.50 rows=850 width=68)
                     Output: postgis_test_table.id, postgis_test_table.a, postgis_test_table.b
(12 rows)

-- when constrained by OFFSET 0, the costly calculation is kept in the parallel workers
21:43:12 [gis] > explain verbose select ST_Collect(geom), id from (select ST_Difference(a,ST_UnaryUnion(b)) as geom, id from postgis_test_table offset 0) z group by id;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=13863.45..13872.33 rows=200 width=36)
   Output: st_collect(z.geom), z.id
   Group Key: z.id
   ->  Sort  (cost=13863.45..13865.58 rows=850 width=36)
         Output: z.id, z.geom
         Sort Key: z.id
         ->  Subquery Scan on z  (cost=100.00..13822.09 rows=850 width=36)
               Output: z.id, z.geom
               ->  Gather  (cost=100.00..13813.59 rows=850 width=36)
                     Output: (st_difference(postgis_test_table.a, st_unaryunion(postgis_test_table.b))), postgis_test_table.id
                     Workers Planned: 3
                     ->  Parallel Seq Scan on public.postgis_test_table  (cost=0.00..13712.74 rows=274 width=36)
                           Output: st_difference(postgis_test_table.a, st_unaryunion(postgis_test_table.b)), postgis_test_table.id
(13 rows)
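
-- for context when reading the two plans: a sketch of how the declared cost and
-- parallel-safety marking of the functions involved can be read from pg_proc.
-- The lowercase names are an assumption (stock PostGIS definitions, which may
-- have several overloads each); no output is shown because it depends on the
-- installed PostGIS version.
-- proparallel: s = safe, r = restricted, u = unsafe; prokind: f = function, a = aggregate
select p.oid::regprocedure as func, p.prokind, p.procost, p.proparallel
  from pg_proc p
 where p.proname in ('st_difference', 'st_unaryunion', 'st_collect')
 order by p.proname, p.oid;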

-- teardown
drop table postgis_test_table;



--
Darafei Praliaskouski
