Thread: Effect of the WindowAgg on the Nested Loop

Effect of the WindowAgg on the Nested Loop

From
Виктор Егоров
Date:
Greetings.

I've been playing with a small query that I've been asked to optimize
and noticed a strange (for me) effect.
Query uses this table:

                                   Table "clc06_tiles"
   Column   |         Type          |
Modifiers
------------+-----------------------+-----------------------------------------------------------
 geometry   | geometry              |
 code_06    | character varying(3)  |
 gid        | bigint                | not null default
nextval('clc06_tiles_gid_seq'::regclass)
Indexes:
    "clc06_tiles_pkey" PRIMARY KEY, btree (gid)
    "i_clc06_tiles_geometry" gist (geometry)
Check constraints:
    "enforce_dims_geom" CHECK (st_ndims(geometry) = 2)
    "enforce_geotype_geom" CHECK (geometrytype(geometry) =
'MULTIPOLYGON'::text OR geometrytype(geometry) = 'POLYGON'::text OR
geometry IS NULL)
    "enforce_srid_geom" CHECK (st_srid(geometry) = 3035)

and this function:
CREATE OR REPLACE FUNCTION my_trans(x1 float8, y1 float8, x2 float8,
y2 float8) RETURNS geometry AS $my_trans$
    SELECT st_Transform(
        st_GeomFromText('LINESTRING('||x1::text||' '||y1::text||
                         ', '||x2::text||' '||y2::text||')',4326),3035);
$my_trans$ LANGUAGE sql IMMUTABLE STRICT;

and these constants:
\set x1 4.56
\set y1 52.54
\set x2 5.08
\set y2 53.34


Original query looks like this ( http://explain.depesz.com/s/pzv ):

SELECT n, i.*, st_NumGeometries(i.geom)
  FROM (
    SELECT a.code_06 as code_06,
           st_Multi(st_Intersection(a.geometry,
my_trans(:x1,:y1,:x2,:y2))) as geom
      FROM clc06_tiles a
     WHERE st_Intersects(a.geometry, my_trans(:x1,:y1,:x2,:y2))) i
  JOIN generate_series(1,10) n ON n <= st_NumGeometries(i.geom);


After a while I added row_number() to the inner part (
http://explain.depesz.com/s/hfs ):

SELECT n, i.*, st_NumGeometries(i.geom)
  FROM (
    SELECT row_number() OVER () AS rn, a.code_06 as code_06,
           st_Multi(st_Intersection(a.geometry,
my_trans(:x1,:y1,:x2,:y2))) as geom
      FROM clc06_tiles a
     WHERE st_Intersects(a.geometry, my_trans(:x1,:y1,:x2,:y2))) i
  JOIN generate_series(1,10) n ON n <= st_NumGeometries(i.geom);


It was really surprising to see a "side" effect of 8x performance boost.
The only difference I can see is an extra WindowAgg step in the second variant.



Could you kindly explain how WindowAgg node affects the overall
performance, please?



PostgreSQL 9.2.1 on x86_64-unknown-linux-gnu, compiled by gcc
(Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1, 64-bit
 archive_command            | (disabled)             | configuration file
 bgwriter_delay             | 100ms                  | configuration file
 bgwriter_lru_maxpages      | 200                    | configuration file
 checkpoint_segments        | 30                     | configuration file
 default_text_search_config | pg_catalog.english     | configuration file
 effective_cache_size       | 3GB                    | configuration file
 listen_addresses           | *                      | configuration file
 log_checkpoints            | on                     | configuration file
 log_connections            | on                     | configuration file
 log_destination            | csvlog                 | configuration file
 log_disconnections         | on                     | configuration file
 log_lock_waits             | on                     | configuration file
 log_min_duration_statement | 100ms                  | configuration file
 log_rotation_age           | 1d                     | configuration file
 log_temp_files             | 20MB                   | configuration file
 log_timezone               | UTC                    | configuration file
 logging_collector          | on                     | configuration file
 maintenance_work_mem       | 64MB                   | configuration file
 max_connections            | 100                    | configuration file
 max_stack_depth            | 2MB                    | environment variable
 max_wal_senders            | 2                      | configuration file
 port                       | 5432                   | configuration file
 shared_buffers             | 768MB                  | configuration file
 temp_buffers               | 32MB                   | configuration file
 TimeZone                   | UTC                    | configuration file
 wal_level                  | hot_standby            | configuration file
 work_mem                   | 8MB                    | configuration file


--
Victor Y. Yegorov


Re: Effect of the WindowAgg on the Nested Loop

From
Robert Haas
Date:
On Tue, Jan 22, 2013 at 3:57 PM, Виктор Егоров <vyegorov@gmail.com> wrote:
> Greetings.
>
> I've been playing with a small query that I've been asked to optimize
> and noticed a strange (for me) effect.
> Query uses this table:
>
>                                    Table "clc06_tiles"
>    Column   |         Type          |
> Modifiers
> ------------+-----------------------+-----------------------------------------------------------
>  geometry   | geometry              |
>  code_06    | character varying(3)  |
>  gid        | bigint                | not null default
> nextval('clc06_tiles_gid_seq'::regclass)
> Indexes:
>     "clc06_tiles_pkey" PRIMARY KEY, btree (gid)
>     "i_clc06_tiles_geometry" gist (geometry)
> Check constraints:
>     "enforce_dims_geom" CHECK (st_ndims(geometry) = 2)
>     "enforce_geotype_geom" CHECK (geometrytype(geometry) =
> 'MULTIPOLYGON'::text OR geometrytype(geometry) = 'POLYGON'::text OR
> geometry IS NULL)
>     "enforce_srid_geom" CHECK (st_srid(geometry) = 3035)
>
> and this function:
> CREATE OR REPLACE FUNCTION my_trans(x1 float8, y1 float8, x2 float8,
> y2 float8) RETURNS geometry AS $my_trans$
>     SELECT st_Transform(
>         st_GeomFromText('LINESTRING('||x1::text||' '||y1::text||
>                          ', '||x2::text||' '||y2::text||')',4326),3035);
> $my_trans$ LANGUAGE sql IMMUTABLE STRICT;
>
> and these constants:
> \set x1 4.56
> \set y1 52.54
> \set x2 5.08
> \set y2 53.34
>
>
> Original query looks like this ( http://explain.depesz.com/s/pzv ):
>
> SELECT n, i.*, st_NumGeometries(i.geom)
>   FROM (
>     SELECT a.code_06 as code_06,
>            st_Multi(st_Intersection(a.geometry,
> my_trans(:x1,:y1,:x2,:y2))) as geom
>       FROM clc06_tiles a
>      WHERE st_Intersects(a.geometry, my_trans(:x1,:y1,:x2,:y2))) i
>   JOIN generate_series(1,10) n ON n <= st_NumGeometries(i.geom);
>
>
> After a while I added row_number() to the inner part (
> http://explain.depesz.com/s/hfs ):
>
> SELECT n, i.*, st_NumGeometries(i.geom)
>   FROM (
>     SELECT row_number() OVER () AS rn, a.code_06 as code_06,
>            st_Multi(st_Intersection(a.geometry,
> my_trans(:x1,:y1,:x2,:y2))) as geom
>       FROM clc06_tiles a
>      WHERE st_Intersects(a.geometry, my_trans(:x1,:y1,:x2,:y2))) i
>   JOIN generate_series(1,10) n ON n <= st_NumGeometries(i.geom);
>
>
> It was really surprising to see a "side" effect of 8x performance boost.
> The only difference I can see is an extra WindowAgg step in the second variant.
>
> Could you kindly explain how WindowAgg node affects the overall
> performance, please?

Apologies for resurrecting an old thread, but I just came across this
post while doing some research and I don't see any responses.

This seems like a mighty interesting example.  I'm not sure what's
going on here, but let me guess.  I think that the WindowAgg is
forcing some operation - detoasting, maybe? - to happen under the
materialize node.  As a result, it only gets done once.  But in the
other plan, the detoast happens at the nested loop level, above the
materialize node, and therefore it happens 10x instead of 1x.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Effect of the WindowAgg on the Nested Loop

From
Victor Yegorov
Date:
2013/5/15 Robert Haas <robertmhaas@gmail.com>
> Original query looks like this ( http://explain.depesz.com/s/pzv ):
>
> After a while I added row_number() to the inner part (
> http://explain.depesz.com/s/hfs ):
>
> It was really surprising to see a "side" effect of 8x performance boost.
> The only difference I can see is an extra WindowAgg step in the second variant.

Apologies for resurrecting an old thread, but I just came across this
post while doing some research and I don't see any responses.

This seems like a mighty interesting example.  I'm not sure what's
going on here, but let me guess.  I think that the WindowAgg is
forcing some operation - detoasting, maybe? - to happen under the
materialize node.  As a result, it only gets done once.  But in the
other plan, the detoast happens at the nested loop level, above the
materialize node, and therefore it happens 10x instead of 1x.

I was playing with the query a while ago and put it aside since then,
need time to come back to this thing.

I will try to put together a testcase for this example, I'd like to achieve
the same behavior on a non-GIS data set.

--
Victor Y. Yegorov