Thread: Unfortunate expansion of composite types in union

Unfortunate expansion of composite types in union

From
Jens-Wolfhard Schicke
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

I have the following query:

explain analyze
SELECT
  (cast((
    SELECT cast(row(1, o.id, NULL, NULL, NULL, NULL) as something)
      FROM ONLY object o WHERE o.id = l.e
    UNION ALL
      SELECT cast(row(2, l2.id, l2.s, l2.e, l2.intensity, NULL) as something)
        FROM ONLY link l2  WHERE l2.id = l.e
    UNION ALL
      SELECT cast(row(3, r.id, r.o, r.format, NULL, r.data) as something)
        FROM ONLY representation r WHERE r.id = l.e
  ) as something)).*,
  l.id, l.s, l.intensity
FROM link l
WHERE l.s = 8692
;


and the execution plan:

                                                                       QUERY PLAN
-
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on link l  (cost=6.33..27932.79 rows=178 width=32) (actual time=0.172..12.366 rows=149 loops=1)
   Recheck Cond: (s = 8692)
   ->  Bitmap Index Scan on link_s_idx  (cost=0.00..6.29 rows=178 width=0) (actual time=0.050..0.050 rows=149 loops=1)
         Index Cond: (s = 8692)
   SubPlan
     ->  Append  (cost=0.00..25.52 rows=3 width=34) (actual time=0.004..0.011 rows=1 loops=149)
           ->  Index Scan using object_id_idx on object o  (cost=0.00..8.30 rows=1 width=8) (actual time=0.004..0.004
rows=1loops=149) 
                 Index Cond: (id = $0)
           ->  Index Scan using link_id_idx on link l2  (cost=0.00..8.86 rows=1 width=32) (actual time=0.003..0.003
rows=0loops=149) 
                 Index Cond: (id = $0)
           ->  Index Scan using representation_id_idx on representation r  (cost=0.00..8.32 rows=1 width=34) (actual
time=0.003..0.003rows=0 loops=149) 
                 Index Cond: (id = $0)
     ->  Append  (cost=0.00..25.52 rows=3 width=34) (actual time=0.004..0.011 rows=1 loops=149)
           ->  Index Scan using object_id_idx on object o  (cost=0.00..8.30 rows=1 width=8) (actual time=0.004..0.004
rows=1loops=149) 
                 Index Cond: (id = $0)
           ->  Index Scan using link_id_idx on link l2  (cost=0.00..8.86 rows=1 width=32) (actual time=0.003..0.003
rows=0loops=149) 
                 Index Cond: (id = $0)
           ->  Index Scan using representation_id_idx on representation r  (cost=0.00..8.32 rows=1 width=34) (actual
time=0.003..0.003rows=0 loops=149) 
                 Index Cond: (id = $0)
     ->  Append  (cost=0.00..25.52 rows=3 width=34) (actual time=0.004..0.011 rows=1 loops=149)
           ->  Index Scan using object_id_idx on object o  (cost=0.00..8.30 rows=1 width=8) (actual time=0.004..0.004
rows=1loops=149) 
                 Index Cond: (id = $0)
           ->  Index Scan using link_id_idx on link l2  (cost=0.00..8.86 rows=1 width=32) (actual time=0.003..0.003
rows=0loops=149) 
                 Index Cond: (id = $0)
           ->  Index Scan using representation_id_idx on representation r  (cost=0.00..8.32 rows=1 width=34) (actual
time=0.002..0.002rows=0 loops=149) 
                 Index Cond: (id = $0)
     ->  Append  (cost=0.00..25.52 rows=3 width=34) (actual time=0.006..0.013 rows=1 loops=149)
           ->  Index Scan using object_id_idx on object o  (cost=0.00..8.30 rows=1 width=8) (actual time=0.006..0.006
rows=1loops=149) 
                 Index Cond: (id = $0)
           ->  Index Scan using link_id_idx on link l2  (cost=0.00..8.86 rows=1 width=32) (actual time=0.003..0.003
rows=0loops=149) 
                 Index Cond: (id = $0)
           ->  Index Scan using representation_id_idx on representation r  (cost=0.00..8.32 rows=1 width=34) (actual
time=0.003..0.003rows=0 loops=149) 
                 Index Cond: (id = $0)
     ->  Append  (cost=0.00..25.52 rows=3 width=34) (actual time=0.004..0.011 rows=1 loops=149)
           ->  Index Scan using object_id_idx on object o  (cost=0.00..8.30 rows=1 width=8) (actual time=0.004..0.004
rows=1loops=149) 
                 Index Cond: (id = $0)
           ->  Index Scan using link_id_idx on link l2  (cost=0.00..8.86 rows=1 width=32) (actual time=0.003..0.003
rows=0loops=149) 
                 Index Cond: (id = $0)
           ->  Index Scan using representation_id_idx on representation r  (cost=0.00..8.32 rows=1 width=34) (actual
time=0.003..0.003rows=0 loops=149) 
                 Index Cond: (id = $0)
     ->  Append  (cost=0.00..25.52 rows=3 width=34) (actual time=0.004..0.011 rows=1 loops=149)
           ->  Index Scan using object_id_idx on object o  (cost=0.00..8.30 rows=1 width=8) (actual time=0.004..0.004
rows=1loops=149) 
                 Index Cond: (id = $0)
           ->  Index Scan using link_id_idx on link l2  (cost=0.00..8.86 rows=1 width=32) (actual time=0.003..0.003
rows=0loops=149) 
                 Index Cond: (id = $0)
           ->  Index Scan using representation_id_idx on representation r  (cost=0.00..8.32 rows=1 width=34) (actual
time=0.002..0.002rows=0 loops=149) 
                 Index Cond: (id = $0)
     ->  Append  (cost=0.00..25.52 rows=3 width=34) (actual time=0.006..0.013 rows=1 loops=149)
           ->  Index Scan using object_id_idx on object o  (cost=0.00..8.30 rows=1 width=8) (actual time=0.006..0.006
rows=1loops=149) 
                 Index Cond: (id = $0)
           ->  Index Scan using link_id_idx on link l2  (cost=0.00..8.86 rows=1 width=32) (actual time=0.003..0.003
rows=0loops=149) 
                 Index Cond: (id = $0)
           ->  Index Scan using representation_id_idx on representation r  (cost=0.00..8.32 rows=1 width=34) (actual
time=0.003..0.003rows=0 loops=149) 
                 Index Cond: (id = $0)
     ->  Append  (cost=0.00..25.52 rows=3 width=34) (actual time=0.004..0.011 rows=1 loops=149)
           ->  Index Scan using object_id_idx on object o  (cost=0.00..8.30 rows=1 width=8) (actual time=0.004..0.004
rows=1loops=149) 
                 Index Cond: (id = $0)
           ->  Index Scan using link_id_idx on link l2  (cost=0.00..8.86 rows=1 width=32) (actual time=0.003..0.003
rows=0loops=149) 
                 Index Cond: (id = $0)
           ->  Index Scan using representation_id_idx on representation r  (cost=0.00..8.32 rows=1 width=34) (actual
time=0.003..0.003rows=0 loops=149) 
                 Index Cond: (id = $0)
     ->  Append  (cost=0.00..25.52 rows=3 width=34) (actual time=0.006..0.015 rows=1 loops=149)
           ->  Index Scan using object_id_idx on object o  (cost=0.00..8.30 rows=1 width=8) (actual time=0.006..0.006
rows=1loops=149) 
                 Index Cond: (id = $0)
           ->  Index Scan using link_id_idx on link l2  (cost=0.00..8.86 rows=1 width=32) (actual time=0.004..0.004
rows=0loops=149) 
                 Index Cond: (id = $0)
           ->  Index Scan using representation_id_idx on representation r  (cost=0.00..8.32 rows=1 width=34) (actual
time=0.004..0.004rows=0 loops=149) 
                 Index Cond: (id = $0)
 Total runtime: 12.594 ms
(48 rows)

(hope it wasn't mangled)...

The "something" type is:

create type something as (
  t integer,
  id bigint,
  ref1 bigint,
  ref2 bigint,
  intensity double precision,
  data bytea
);

Problem: It looks as if every column of "something" is fetched seperately. I'd think a plan which only did one
indexscanfor the row on each table and then returns 
the complete row at once better. (Especially if the dataset is in memory)

What I want to achive is to load a set of somethings of yet unknown type (any of link, object, representation) at once,
namelythose objects which reside 
at the ends of the links. Is there any better way?

My version:
                                                         version
-
-------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 8.3beta2 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.2.3 20071014 (prerelease) (Debian
4.2.2-3)
(1 row)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHKvAszhchXT4RR5ARAniBAJ9bk6noLG6tIb2NKmAS7bk6Fpig9QCeNEzF
YND1waoDKi46BjjNEKwFMF0=
=/h36
-----END PGP SIGNATURE-----

Re: Unfortunate expansion of composite types in union

From
"Pavel Stehule"
Date:
Hello

I am not sure, but this query will be evaluated efectivelly, because
all necessary data will be in cache.

PostgreSQL doesn't support Common Table Expressions - you can write
SRF function:

CREATE OR REPLACE FUNCTION c(integer)
RETURNS SETOF something AS $$
DECLARE r RECORD;
  o something;
BEGIN
  FOR r IN SELECT * FROM link WHERE link.s = $1 LOOP
    o := (1, r.id, NULL, NULL, NULL);
    RETURN NEXT o;
    o := (2, r.id, r.s, r.e, r.intensity, NULL);
    RETURN NEXT o;
    o := (3, r.id, r.o, r.format, NULL, r.data);
    RETURN NEXT o;
    RETURN;
  END LOOP;
END;
$$ LANGUAGE plpgsql;

SELECT * FROM c(8692);

Regards
Pavel Stehule

On 02/11/2007, Jens-Wolfhard Schicke <drahflow@gmx.de> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hello,
>
> I have the following query:
>
> explain analyze
> SELECT
>   (cast((
>     SELECT cast(row(1, o.id, NULL, NULL, NULL, NULL) as something)
>       FROM ONLY object o WHERE o.id = l.e
>     UNION ALL
>       SELECT cast(row(2, l2.id, l2.s, l2.e, l2.intensity, NULL) as something)
>         FROM ONLY link l2  WHERE l2.id = l.e
>     UNION ALL
>       SELECT cast(row(3, r.id, r.o, r.format, NULL, r.data) as something)
>         FROM ONLY representation r WHERE r.id = l.e
>   ) as something)).*,
>   l.id, l.s, l.intensity
> FROM link l
> WHERE l.s = 8692
> ;
>
>
> and the execution plan:
>
>                                                                        QUERY PLAN
> -
---------------------------------------------------------------------------------------------------------------------------------------------------------
>  Bitmap Heap Scan on link l  (cost=6.33..27932.79 rows=178 width=32) (actual time=0.172..12.366 rows=149 loops=1)
>    Recheck Cond: (s = 8692)
>    ->  Bitmap Index Scan on link_s_idx  (cost=0.00..6.29 rows=178 width=0) (actual time=0.050..0.050 rows=149
loops=1)
>          Index Cond: (s = 8692)
>    SubPlan
>      ->  Append  (cost=0.00..25.52 rows=3 width=34) (actual time=0.004..0.011 rows=1 loops=149)
>            ->  Index Scan using object_id_idx on object o  (cost=0.00..8.30 rows=1 width=8) (actual time=0.004..0.004
rows=1loops=149) 
>                  Index Cond: (id = $0)
>            ->  Index Scan using link_id_idx on link l2  (cost=0.00..8.86 rows=1 width=32) (actual time=0.003..0.003
rows=0loops=149) 
>                  Index Cond: (id = $0)
>            ->  Index Scan using representation_id_idx on representation r  (cost=0.00..8.32 rows=1 width=34) (actual
time=0.003..0.003rows=0 loops=149) 
>                  Index Cond: (id = $0)
>      ->  Append  (cost=0.00..25.52 rows=3 width=34) (actual time=0.004..0.011 rows=1 loops=149)
>            ->  Index Scan using object_id_idx on object o  (cost=0.00..8.30 rows=1 width=8) (actual time=0.004..0.004
rows=1loops=149) 
>                  Index Cond: (id = $0)
>            ->  Index Scan using link_id_idx on link l2  (cost=0.00..8.86 rows=1 width=32) (actual time=0.003..0.003
rows=0loops=149) 
>                  Index Cond: (id = $0)
>            ->  Index Scan using representation_id_idx on representation r  (cost=0.00..8.32 rows=1 width=34) (actual
time=0.003..0.003rows=0 loops=149) 
>                  Index Cond: (id = $0)
>      ->  Append  (cost=0.00..25.52 rows=3 width=34) (actual time=0.004..0.011 rows=1 loops=149)
>            ->  Index Scan using object_id_idx on object o  (cost=0.00..8.30 rows=1 width=8) (actual time=0.004..0.004
rows=1loops=149) 
>                  Index Cond: (id = $0)
>            ->  Index Scan using link_id_idx on link l2  (cost=0.00..8.86 rows=1 width=32) (actual time=0.003..0.003
rows=0loops=149) 
>                  Index Cond: (id = $0)
>            ->  Index Scan using representation_id_idx on representation r  (cost=0.00..8.32 rows=1 width=34) (actual
time=0.002..0.002rows=0 loops=149) 
>                  Index Cond: (id = $0)
>      ->  Append  (cost=0.00..25.52 rows=3 width=34) (actual time=0.006..0.013 rows=1 loops=149)
>            ->  Index Scan using object_id_idx on object o  (cost=0.00..8.30 rows=1 width=8) (actual time=0.006..0.006
rows=1loops=149) 
>                  Index Cond: (id = $0)
>            ->  Index Scan using link_id_idx on link l2  (cost=0.00..8.86 rows=1 width=32) (actual time=0.003..0.003
rows=0loops=149) 
>                  Index Cond: (id = $0)
>            ->  Index Scan using representation_id_idx on representation r  (cost=0.00..8.32 rows=1 width=34) (actual
time=0.003..0.003rows=0 loops=149) 
>                  Index Cond: (id = $0)
>      ->  Append  (cost=0.00..25.52 rows=3 width=34) (actual time=0.004..0.011 rows=1 loops=149)
>            ->  Index Scan using object_id_idx on object o  (cost=0.00..8.30 rows=1 width=8) (actual time=0.004..0.004
rows=1loops=149) 
>                  Index Cond: (id = $0)
>            ->  Index Scan using link_id_idx on link l2  (cost=0.00..8.86 rows=1 width=32) (actual time=0.003..0.003
rows=0loops=149) 
>                  Index Cond: (id = $0)
>            ->  Index Scan using representation_id_idx on representation r  (cost=0.00..8.32 rows=1 width=34) (actual
time=0.003..0.003rows=0 loops=149) 
>                  Index Cond: (id = $0)
>      ->  Append  (cost=0.00..25.52 rows=3 width=34) (actual time=0.004..0.011 rows=1 loops=149)
>            ->  Index Scan using object_id_idx on object o  (cost=0.00..8.30 rows=1 width=8) (actual time=0.004..0.004
rows=1loops=149) 
>                  Index Cond: (id = $0)
>            ->  Index Scan using link_id_idx on link l2  (cost=0.00..8.86 rows=1 width=32) (actual time=0.003..0.003
rows=0loops=149) 
>                  Index Cond: (id = $0)
>            ->  Index Scan using representation_id_idx on representation r  (cost=0.00..8.32 rows=1 width=34) (actual
time=0.002..0.002rows=0 loops=149) 
>                  Index Cond: (id = $0)
>      ->  Append  (cost=0.00..25.52 rows=3 width=34) (actual time=0.006..0.013 rows=1 loops=149)
>            ->  Index Scan using object_id_idx on object o  (cost=0.00..8.30 rows=1 width=8) (actual time=0.006..0.006
rows=1loops=149) 
>                  Index Cond: (id = $0)
>            ->  Index Scan using link_id_idx on link l2  (cost=0.00..8.86 rows=1 width=32) (actual time=0.003..0.003
rows=0loops=149) 
>                  Index Cond: (id = $0)
>            ->  Index Scan using representation_id_idx on representation r  (cost=0.00..8.32 rows=1 width=34) (actual
time=0.003..0.003rows=0 loops=149) 
>                  Index Cond: (id = $0)
>      ->  Append  (cost=0.00..25.52 rows=3 width=34) (actual time=0.004..0.011 rows=1 loops=149)
>            ->  Index Scan using object_id_idx on object o  (cost=0.00..8.30 rows=1 width=8) (actual time=0.004..0.004
rows=1loops=149) 
>                  Index Cond: (id = $0)
>            ->  Index Scan using link_id_idx on link l2  (cost=0.00..8.86 rows=1 width=32) (actual time=0.003..0.003
rows=0loops=149) 
>                  Index Cond: (id = $0)
>            ->  Index Scan using representation_id_idx on representation r  (cost=0.00..8.32 rows=1 width=34) (actual
time=0.003..0.003rows=0 loops=149) 
>                  Index Cond: (id = $0)
>      ->  Append  (cost=0.00..25.52 rows=3 width=34) (actual time=0.006..0.015 rows=1 loops=149)
>            ->  Index Scan using object_id_idx on object o  (cost=0.00..8.30 rows=1 width=8) (actual time=0.006..0.006
rows=1loops=149) 
>                  Index Cond: (id = $0)
>            ->  Index Scan using link_id_idx on link l2  (cost=0.00..8.86 rows=1 width=32) (actual time=0.004..0.004
rows=0loops=149) 
>                  Index Cond: (id = $0)
>            ->  Index Scan using representation_id_idx on representation r  (cost=0.00..8.32 rows=1 width=34) (actual
time=0.004..0.004rows=0 loops=149) 
>                  Index Cond: (id = $0)
>  Total runtime: 12.594 ms
> (48 rows)
>
> (hope it wasn't mangled)...
>
> The "something" type is:
>
> create type something as (
>   t integer,
>   id bigint,
>   ref1 bigint,
>   ref2 bigint,
>   intensity double precision,
>   data bytea
> );
>
> Problem: It looks as if every column of "something" is fetched seperately. I'd think a plan which only did one
indexscanfor the row on each table and then returns 
> the complete row at once better. (Especially if the dataset is in memory)
>
> What I want to achive is to load a set of somethings of yet unknown type (any of link, object, representation) at
once,namely those objects which reside 
> at the ends of the links. Is there any better way?
>
> My version:
>                                                          version
> -
-------------------------------------------------------------------------------------------------------------------------
>  PostgreSQL 8.3beta2 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.2.3 20071014 (prerelease) (Debian
4.2.2-3)
> (1 row)
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.6 (GNU/Linux)
>
> iD8DBQFHKvAszhchXT4RR5ARAniBAJ9bk6noLG6tIb2NKmAS7bk6Fpig9QCeNEzF
> YND1waoDKi46BjjNEKwFMF0=
> =/h36
> -----END PGP SIGNATURE-----
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>

Re: Unfortunate expansion of composite types in union

From
Jens-Wolfhard Schicke
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Pavel Stehule wrote:
> PostgreSQL doesn't support Common Table Expressions - you can write
> SRF function:
>
> CREATE OR REPLACE FUNCTION c(integer)
> RETURNS SETOF something AS $$
> DECLARE r RECORD;
>   o something;
> BEGIN
>   FOR r IN SELECT * FROM link WHERE link.s = $1 LOOP
>     o := (1, r.id, NULL, NULL, NULL);
>     RETURN NEXT o;
>     o := (2, r.id, r.s, r.e, r.intensity, NULL);
>     RETURN NEXT o;
>     o := (3, r.id, r.o, r.format, NULL, r.data);
>     RETURN NEXT o;
>     RETURN;
>   END LOOP;
> END;
> $$ LANGUAGE plpgsql;
>
> SELECT * FROM c(8692);
This is a completely different query from my one. (That is, the results are different.)

My problem is that I have a schema like
fastgraph=# \d object
                     Table "public.object"
 Column |  Type  |                  Modifiers
- --------+--------+----------------------------------------------
 id     | bigint | not null default nextval('id_seq'::regclass)
Indexes:
    "object_id_idx" UNIQUE, btree (id)

fastgraph=# \d link
                             Table "public.link"
  Column   |       Type       |                  Modifiers
- -----------+------------------+----------------------------------------------
 id        | bigint           | not null default nextval('id_seq'::regclass)
 s         | bigint           | not null
 e         | bigint           | not null
 intensity | double precision | not null
Indexes:
    "link_id_idx" UNIQUE, btree (id)
    "link_e_idx" btree (e)
    "link_s_idx" btree (s)
    "link_se_idx" btree (s, e)
Inherits: object

fastgraph=# \d representation
                 Table "public.representation"
 Column |  Type  |                  Modifiers
- --------+--------+----------------------------------------------
 id     | bigint | not null default nextval('id_seq'::regclass)
 o      | bigint | not null
 format | bigint | not null
 data   | bytea  | not null
Indexes:
    "representation_id_idx" UNIQUE, btree (id)
    "representation_o_idx" btree (o)
    "representation_text" hash (data) WHERE format = 1
Inherits: object

now I want those "objects" (with inheritance) which are connected to some other. So I tried the
query in the original post, and found the execution plan to be suboptimal. Today I tried to do it with OUTER JOINs
but failed utterly. So what is the best way to get the results? The original query is exactly what I need, only the
plan
is bad. Any Ideas?

Regards,
  Jens-Wolfhard Schicke

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHLfp+zhchXT4RR5ARAorgAKDr2grqWnxbvFMYOPiLJuHpjco30ACgswQB
9/qW9rz+ZngkBYdR0RLsils=
=LdBJ
-----END PGP SIGNATURE-----