Thread: full join in view

full join in view

From
"Tambet Matiisen"
Date:
I'm making a report in Crystal Reports, which makes use of full join. As Crystal Reports does not support full join
natively,I created a view which contains the join and based my report on that view. The report has also a parameter to
filteronly subset of rows from view. My problem is, that when selecting from this view, optimizer never uses indexes of
neitherof two tables.  

I understand, that optimizing the filter to the bottom of query tree may not always give the same result with full join
(althoughit should in my case). Alternative could be to use function returning a table, but I doubt I could use this
functionfrom Crystal Reports. What other options I have?  
 Tambet


Re: full join in view

From
Tomasz Myrta
Date:
Tambet Matiisen wrote:

> I'm making a report in Crystal Reports, which makes use of full join. 
> As Crystal Reports does not support full join natively, I created a 
> view which contains the join and based my report on that view. The 
> report has also a parameter to filter only subset of rows from view. 
> My problem is, that when selecting from this view, optimizer never 
> uses indexes of neither of two tables.
>
> I understand, that optimizing the filter to the bottom of query tree 
> may not always give the same result with full join (although it should 
> in my case). Alternative could be to use function returning a table, 
> but I doubt I could use this function from Crystal Reports. What other 
> options I have?
>
>   Tambet

Can you add some sql examples - table & index definition, view definition?
If your view doesn't contain other views or sub-selects, postgres should 
use indexes.
Tomasz Myrta



Re: full join in view

From
"Tambet Matiisen"
Date:
>
> Can you add some sql examples - table & index definition,
> view definition?
> If your view doesn't contain other views or sub-selects,
> postgres should
> use indexes.
> Tomasz Myrta
>

You are right. After disabling seq_scan, it uses indexes just as you described. Unfortunately my view happens to use
subquery:

CREATE OR REPLACE VIEW v_tegelikud_kulud AS
SELECT   *,   COALESCE(dor_kst_id,kdt_kst_id) AS kst_id,   COALESCE(dor_mat_id,kdt_mat_id) AS mat_id
FROM (   SELECT       dor.dor_kst_id,       dor.dor_mat_id,       sum(dor.kogus * koefitsent::numeric) AS kogus,
sum(dor.kokku)AS kokku   FROM dokumentide_read dor   JOIN dokumendid dok       ON dor.dor_dok_id = dok.dok_id AND
dok.tyyp= 30 AND dok.kinnitaja IS NOT NULL   GROUP BY       dor.dor_kst_id,       dor.dor_mat_id   ) dor 
FULL JOIN koostude_detailid kdt   ON dor.dor_mat_id = kdt.kdt_mat_id AND dor.dor_kst_id = kdt.kdt_kst_id;

The idea behind the view is to show supposed expenses (in table koostude_detailid) compared to actual expenses (in
tablesdokumendid and dokumentide_read). Both refer to materials (foreign keys kdt_mat_id and dor_mat_id) and belong to
anassembly unit (foreign keys kdt_kst_id and dor_kst_id). The report will show supposed and actual expenses side by
sidefor one assemby unit. So the view is queried like this: 

explain select count(1) from v_tegelikud_kulud where kdt_kst_id = 1125 or dor_kst_id = 1125;
                         QUERY PLAN 

------------------------------------------------------------------------------------------------------------------------Aggregate
(cost=42.31..42.31 rows=1 width=16)  ->  Merge Join  (cost=41.42..42.03 rows=113 width=16)        Merge Cond:
(("outer".dor_kst_id= "inner".kdt_kst_id) AND ("outer".dor_mat_id = "inner".kdt_mat_id))        Filter:
(("inner".kdt_kst_id= 1125) OR ("outer".dor_kst_id = 1125))        ->  Sort  (cost=34.44..34.46 rows=8 width=41)
     Sort Key: dor.dor_kst_id, dor.dor_mat_id              ->  Subquery Scan dor  (cost=33.25..34.31 rows=8 width=41)
                ->  Aggregate  (cost=33.25..34.31 rows=8 width=41)                          ->  Group
(cost=33.25..33.89rows=84 width=41)                                ->  Sort  (cost=33.25..33.47 rows=84 width=41)
                              Sort Key: dor.dor_kst_id, dor.dor_mat_id                                      ->  Hash
Join (cost=8.19..30.56 rows=84 width=41)                                            Hash Cond: ("outer".dor_dok_id =
"inner".dok_id)                                           ->  Seq Scan on dokumentide_read dor  (cost=0.00..15.61
rows=761width=37)                                            ->  Hash  (cost=8.10..8.10 rows=36 width=4)
                                 ->  Seq Scan on dokumendid dok  (cost=0.00..8.10 rows=36 width=4)
                                 Filter: ((tyyp = 30) AND (kinnitaja IS NOT NULL))        ->  Sort  (cost=6.98..7.27
rows=113width=8)              Sort Key: kdt.kdt_kst_id, kdt.kdt_mat_id              ->  Seq Scan on koostude_detailid
kdt (cost=0.00..3.13 rows=113 width=8) 
(20 rows)

When I disable seqscan (I don't have many rows in our development database), I get following result:

explain select count(1) from v_tegelikud_kulud where kdt_kst_id = 1125 or dor_kst_id = 1125;
                                   QUERY PLAN 

---------------------------------------------------------------------------------------------------------------------------------------------Aggregate
(cost=51.13..51.13 rows=1 width=16)  ->  Merge Join  (cost=40.39..50.85 rows=113 width=16)        Merge Cond:
(("outer".kdt_kst_id= "inner".dor_kst_id) AND ("outer".kdt_mat_id = "inner".dor_mat_id))        Filter:
(("outer".kdt_kst_id= 1125) OR ("inner".dor_kst_id = 1125))        ->  Index Scan using kdt_uk on koostude_detailid kdt
(cost=0.00..10.13 rows=113 width=8)        ->  Sort  (cost=40.39..40.41 rows=8 width=41)              Sort Key:
dor.dor_kst_id,dor.dor_mat_id              ->  Subquery Scan dor  (cost=39.20..40.26 rows=8 width=41)
->  Aggregate  (cost=39.20..40.26 rows=8 width=41)                          ->  Group  (cost=39.20..39.83 rows=84
width=41)                               ->  Sort  (cost=39.20..39.41 rows=84 width=41)
   Sort Key: dor.dor_kst_id, dor.dor_mat_id                                      ->  Merge Join  (cost=0.00..36.51
rows=84width=41)                                            Merge Cond: ("outer".dor_dok_id = "inner".dok_id)
                                ->  Index Scan using dor_dok_fk_i on dokumentide_read dor  (cost=0.00..20.91 rows=761
width=37)                                           ->  Index Scan using dok_pk on dokumendid dok  (cost=0.00..12.56
rows=36width=4)                                                  Filter: ((tyyp = 30) AND (kinnitaja IS NOT NULL)) 
(17 rows)

As you see, condition dor_kst_id = 1125 is used after the subquery has done it's job (which may take a while...). And
asI understand, currently I can do nothing about it? 
 Tambet

PS. Just to be complete, here are the descriptions of the tables used:
         Table "public.koostude_detailid"   Column     |          Type          | Modifiers
---------------+------------------------+-----------kdt_id        | integer                | not nullkdt_kst_id    |
integer               | not nullkdt_mat_id    | integer                | not nulldetaili_nr    | character varying(255)
|not nullarv           | numeric(5,0)           | not nullkulu          | numeric(16,6)          | not nullyhik
| character varying(10)  | not nullkoefitsent    | real                   | not nullerikaal       | numeric(16,6)
  | not nulleeldatav_hind | numeric(12,2)          | not nullmarkused      | character varying(255) | 
Indexes: kdt_pk primary key btree (kdt_id),        kdt_detaili_nr_uk unique btree (kdt_kst_id, detaili_nr),
kdt_ukunique btree (kdt_kst_id, kdt_mat_id),        kdt_kst_fk_i btree (kdt_kst_id),        kdt_mat_fk_i btree
(kdt_mat_id)
Check constraints: "kdt_arv_ck" (arv > 0::numeric)                  "kdt_koefitsent_ck" (koefitsent > 0::double
precision)
Foreign Key constraints: kdt_kst_fk FOREIGN KEY (kdt_kst_id) REFERENCES koostud(kst_id) ON UPDATE NO ACTION ON DELETE
CASCADE,                       kdt_mat_fk FOREIGN KEY (kdt_mat_id) REFERENCES materjalid(mat_id) ON UPDATE NO ACTION ON
DELETENO ACTION 
Triggers: kdt_summa_juurde_trg,         kdt_summa_maha_trg
        Table "public.dokumentide_read"  Column   |          Type          | Modifiers
------------+------------------------+-----------dor_id     | integer                | not nulldor_dok_id | integer
          | not nulldor_kst_id | integer                |dor_mat_id | integer                | not nullkogus      |
numeric(16,6)         |koefitsent | real                   | not nullyhik       | character varying(10)  | not
nullyhiku_hind| numeric(12,2)          |kokku      | numeric(12,2)          |markused   | character varying(255) | 
Indexes: dor_pk primary key btree (dor_id),        dor_dok_fk_i btree (dor_dok_id),        dor_kst_fk_i btree
(dor_kst_id),       dor_mat_fk_i btree (dor_mat_id) 
Check constraints: "dor_koefitsent_ck" (koefitsent > 0::double precision)
Foreign Key constraints: dor_dok_fk FOREIGN KEY (dor_dok_id) REFERENCES dokumendid(dok_id) ON UPDATE NO ACTION ON
DELETECASCADE,                        dor_mat_fk FOREIGN KEY (dor_mat_id) REFERENCES materjalid(mat_id) ON UPDATE NO
ACTIONON DELETE NO ACTION,                        dor_kst_fk FOREIGN KEY (dor_kst_id) REFERENCES koostud(kst_id) ON
UPDATENO ACTION ON DELETE NO ACTION 
Triggers: dor_summa_suurendamine_trg,         dor_summa_vahendamine_trg
                Table "public.dokumendid"     Column      |            Type             | Modifiers
------------------+-----------------------------+-----------dok_id           | integer                     | not
nulldok_prt_id      | integer                     |dok_tot_id       | integer                     |dok_dok_id       |
integer                    |tyyp             | smallint                    | not nulldokumendi_nr     | character
varying(255)     |alusdokumendi_nr | character varying(255)      |kuupaev          | date                        | not
nulltahtaeg         | date                        |taidetud         | date                        |summa            |
numeric(12,2)              | not nullmarkused         | character varying(255)      |kinnitaja        | character
varying(255)     |kinnitamise_aeg  | timestamp without time zone | 
Indexes: dok_pk primary key btree (dok_id),        dok_dok_fk_i btree (dok_dok_id),        dok_dokumendi_nr_i btree
(dokumendi_nr),       dok_kuupaev_i btree (kuupaev),        dok_prt_fk_i btree (dok_prt_id),        dok_tot_fk_i btree
(dok_tot_id)
Check constraints: "dok_tyyp_ck" ((((((tyyp = 10) AND (dok_prt_id IS NOT NULL)) OR ((tyyp = 20) AND (dok_prt_id IS NOT
NULL)))OR ((tyyp = 30 
) AND (((dok_prt_id IS NOT NULL) AND (dok_tot_id IS NULL)) OR ((dok_tot_id IS NOT NULL) AND (dok_prt_id IS NULL))))) OR
(tyyp= 40)) OR (tyyp= 50)) 
Foreign Key constraints: dok_dok_fk FOREIGN KEY (dok_dok_id) REFERENCES dokumendid(dok_id) ON UPDATE NO ACTION ON
DELETENO ACTION,                        dok_prt_fk FOREIGN KEY (dok_prt_id) REFERENCES partnerid(prt_id) ON UPDATE NO
ACTIONON DELETE NO ACTION,                        dok_tot_fk FOREIGN KEY (dok_tot_id) REFERENCES tootajad(tot_id) ON
UPDATENO ACTION ON DELETE NO ACTION 


Re: full join in view

From
Tomasz Myrta
Date:
Tambet Matiisen wrote:

> You are right. After disabling seq_scan, it uses indexes just as you 
> described. Unfortunately my view happens to use subquery:

Don't disable seq_scan - sometimes it is better than indexscan.
I had the same problem as you - find subject "sub-select with aggregate" 
on pgsql-sql mailing list dated on 2002-10-23.

In my case exposing fields from subquery solved my problem.

There is one more problem in your query - coalesce, which possibly 
disables any indexing in your view. Try to rewrite your view - subquery 
shouldn't return dor_kst_id and dor_mat_id null.

Is dor_kst_id the same as kdt_kst_id and as mat_id? After some database 
practicing I found, that using the same name in all tables is much more 
comfortably

For each material (materjalid) and koostud (koostud) you want to find 
some current value (koostude_detaild) and compare it to some sum 
(documentid...)?
I'm not sure if I understand well your view, but here is my version of 
this view - without subquery:

CREATE OR REPLACE VIEW v_tegelikud_kulud AS

SELECT koostud.kst_id, materjalid.mat_id, sum(dor.kogus * koefitsent::numeric) AS kogus, sum(dor.kokku) AS kokku,
<fields>
FROM koostud k , materjalid m left join koostude_detailid kdt     ON (m.mat_id = kdt.kdt_mat_id AND k.kst_id =
kdt.kdt_kst_id)left join dokumentide_read dor     ON (m.mat_id = dor.dor_mat_id AND k.kst_id = dor.dor_kst_id) left
JOINdokumendid dok     ON (dor.dor_dok_id = dok.dok_id AND dok.tyyp = 30 AND 
 
dok.kinnitaja IS NOT NULL)
group by koostud.kst_id, materjalid.mat_id, <fields>;


One more hint - create two-fields-indexes on koostude_detailid and 
dokuemntide_read (kdt_mat_id,kdt_kst_id)

Regards,
Tomasz Myrta



Re: full join in view

From
"Tambet Matiisen"
Date:
First I would like to say, that I'm quite amazed. You even guessed table names right! :) I did not expect such an
in-depthanalysis in such a short time. Thanks, Tomasz! 

> -----Original Message-----
> From: Tomasz Myrta [mailto:jasiek@klaster.net]
> Sent: Tuesday, January 14, 2003 11:51 AM
> To: Tambet Matiisen
> Cc: pgsql-sql@postgresql.org
> Subject: Re: [SQL] full join in view
>
>
> Tambet Matiisen wrote:
>
> > You are right. After disabling seq_scan, it uses indexes
> just as you
> > described. Unfortunately my view happens to use subquery:
>
> Don't disable seq_scan - sometimes it is better than indexscan.

I just did it for testing and for one session. I never disable it on production server.

...
>
> There is one more problem in your query - coalesce, which possibly
> disables any indexing in your view. Try to rewrite your view
> - subquery
> shouldn't return dor_kst_id and dor_mat_id null.
>
Coalesce did not pose any problems. Unless I tried to filter using one of the coalesce-fields, which does not use
indecesof course. 

> Is dor_kst_id the same as kdt_kst_id and as mat_id? After
> some database
> practicing I found, that using the same name in all tables is
> much more
> comfortably
>
This way I can refer most columns without prefixing them with table alias. But it's anyway good habit to use table
aliases,so this is not that important. I think in next project I try it in your way. 

> For each material (materjalid) and koostud (koostud) you want to find
> some current value (koostude_detaild) and compare it to some sum
> (documentid...)?
> I'm not sure if I understand well your view, but here is my
> version of
> this view - without subquery:
>

I tried to save few scans by not including "koostud" and "materjalid" in my original query. Based on yours, I created a
newversion: 

CREATE OR REPLACE VIEW v_tegelikud_kulud2 AS
SELECT   kst.kst_id,   mat.mat_id,   max(kdt.detaili_nr) AS detaili_nr,   max(kdt.arv) AS arv,   max(kdt.kulu) AS kulu,
 max(kdt.yhik) AS yhik,   max(kdt.koefitsent) AS koefitsent,   max(kdt.eeldatav_hind) AS eeldatav_hind,   sum(dor.kogus
*dor.koefitsent::numeric) AS kogus,   sum(dor.kokku) AS kokku 
FROM koostud kst
CROSS JOIN materjalid mat
LEFT JOIN koostude_detailid kdt   ON kst.kst_id = kdt.kdt_kst_id AND mat.mat_id = kdt.kdt_mat_id
LEFT JOIN dokumentide_read dor   ON kst.kst_id = dor.dor_kst_id AND mat.mat_id = dor.dor_mat_id AND EXISTS       (
SELECT 1       FROM dokumendid dok        WHERE dor.dor_dok_id = dok.dok_id AND dok.tyyp = 30 AND dok.kinnitaja IS NOT
NULL      ) 
WHERE kdt.kdt_id IS NOT NULL AND dor.dor_id IS NOT NULL
GROUP BY kst.kst_id, mat.mat_id;

I do a cross join between "koostud" and "materjalid", because "FROM koostud kst, materjalid mat" gave me syntax errors.
AlsoI had to move "dokumendid" table to EXISTS subquery, to get equivalent results with the original query. Just LEFT
JOIN-ingit is not enough and subquery troubled optimizer. There is also a WHERE condition to show only those materials,
thatappear in one of the tables "koostude_detailid" or "dokumentide_read". Here is the execution plan: 

explain select count(1) from v_tegelikud_kulud2 where kst_id = 1125;
    QUERY PLAN 

-------------------------------------------------------------------------------------------------------------------------------Aggregate
(cost=92.60..92.60 rows=1 width=107)  ->  Subquery Scan v_tegelikud_kulud2  (cost=69.58..92.58 rows=8 width=107)
-> Aggregate  (cost=69.58..92.58 rows=8 width=107)              ->  Group  (cost=69.58..91.00 rows=79 width=107)
           ->  Merge Join  (cost=69.58..90.61 rows=79 width=107)                          Merge Cond: (("outer".kst_id
="inner".dor_kst_id) AND ("outer".mat_id = "inner".dor_mat_id))                          Join Filter: (subplan)
                Filter: ("inner".dor_id IS NOT NULL)                          ->  Merge Join  (cost=17.54..18.52
rows=79width=66)                                Merge Cond: (("outer".kst_id = "inner".kdt_kst_id) AND ("outer".mat_id
="inner".kdt_mat_id))                                Filter: ("inner".kdt_id IS NOT NULL)
->  Sort  (cost=10.56..10.76 rows=79 width=8)                                      Sort Key: kst.kst_id, mat.mat_id
                                ->  Nested Loop  (cost=0.00..8.07 rows=79 width=8)
     ->  Index Scan using kst_pk on koostud kst  (cost=0.00..4.49 rows=1 width=4)
          Index Cond: (kst_id = 1125)                                            ->  Seq Scan on materjalid mat
(cost=0.00..2.79rows=79 width=4)                                ->  Sort  (cost=6.98..7.27 rows=113 width=58)
                          Sort Key: kdt.kdt_kst_id, kdt.kdt_mat_id                                      ->  Seq Scan on
koostude_detailidkdt  (cost=0.00..3.13 rows=113 width=58)                          ->  Sort  (cost=52.03..53.93
rows=761width=41)                                Sort Key: dor.dor_kst_id, dor.dor_mat_id
->  Seq Scan on dokumentide_read dor  (cost=0.00..15.61 rows=761 width=41)                          SubPlan
              ->  Index Scan using dok_pk on dokumendid dok  (cost=0.00..3.47 rows=1 width=0)
      Index Cond: ($0 = dok_id)                                  Filter: ((tyyp = 30) AND (kinnitaja IS NOT NULL)) 
(27 rows)

But there are still few things that worry me:
1. Cross join between koostud and materjalid. Table "materjalid" may have up to 10000 rows and only 20-30 of them are
actuallyneeded. 
2. Indeces on "koostude_detailid" and "dokumentide_read" are not used. Probably my tables do not contain enough rows.
MaybeI should generate more test data first. 
3. The cost of this query is twice as big, as my original query. It seems to me, that SubPlan is causing this. I tried
tomove it to subquery, but then the optimizer chose a totally different execution plan and seemingly was not able to
useindeces of "dokumentide_read" table. The version with subquery: 

CREATE OR REPLACE VIEW v_tegelikud_kulud3 AS
SELECT   kst.kst_id,   mat.mat_id,   max(kdt.detaili_nr) AS detaili_nr,   max(kdt.arv) AS arv,   max(kdt.kulu) AS kulu,
 max(kdt.yhik) AS yhik,   max(kdt.koefitsent) AS koefitsent,   max(kdt.eeldatav_hind) AS eeldatav_hind,   sum(dor.kogus
*dor.koefitsent::numeric) AS kogus,   sum(dor.kokku) AS kokku 
FROM koostud kst
CROSS JOIN materjalid mat
LEFT JOIN koostude_detailid kdt   ON kst.kst_id = kdt.kdt_kst_id AND mat.mat_id = kdt.kdt_mat_id
LEFT JOIN    (   SELECT dor.*   FROM dokumentide_read dor   JOIN dokumendid dok ON dor.dor_dok_id = dok.dok_id    WHERE
dok.tyyp= 30 AND dok.kinnitaja IS NOT NULL   ) dor   ON kst.kst_id = dor.dor_kst_id AND mat.mat_id = dor.dor_mat_id 
WHERE kdt.kdt_id IS NOT NULL AND dor.dor_id IS NOT NULL
GROUP BY kst.kst_id, mat.mat_id;

And execution plan:

hekotek=# explain select count(1) from v_tegelikud_kulud3 where kst_id = 1125;
                  QUERY PLAN 

---------------------------------------------------------------------------------------------------------------------------------------Aggregate
(cost=56.52..56.52 rows=1 width=111)  ->  Subquery Scan v_tegelikud_kulud3  (cost=53.31..56.50 rows=8 width=111)
-> Aggregate  (cost=53.31..56.50 rows=8 width=111)              ->  Group  (cost=53.31..54.92 rows=79 width=111)
           ->  Merge Join  (cost=53.31..54.53 rows=79 width=111)                          Merge Cond: (("outer".kst_id
="inner".dor_kst_id) AND ("outer".mat_id = "inner".dor_mat_id))                          Filter: ("inner".dor_id IS NOT
NULL)                         ->  Merge Join  (cost=17.53..18.50 rows=79 width=66)                                Merge
Cond:(("outer".kst_id = "inner".kdt_kst_id) AND ("outer".mat_id = "inner".kdt_mat_id))
Filter:("inner".kdt_id IS NOT NULL)                                ->  Sort  (cost=10.54..10.74 rows=79 width=8)
                             Sort Key: kst.kst_id, mat.mat_id                                      ->  Nested Loop
(cost=0.00..8.05rows=79 width=8)                                            ->  Index Scan using kst_pk on koostud kst
(cost=0.00..4.47rows=1 width=4)                                                  Index Cond: (kst_id = 1125)
                               ->  Seq Scan on materjalid mat  (cost=0.00..2.79 rows=79 width=4)
       ->  Sort  (cost=6.98..7.27 rows=113 width=58)                                      Sort Key: kdt.kdt_kst_id,
kdt.kdt_mat_id                                     ->  Seq Scan on koostude_detailid kdt  (cost=0.00..3.13 rows=113
width=58)                         ->  Sort  (cost=35.78..35.99 rows=84 width=45)                                Sort
Key:dor.dor_kst_id, dor.dor_mat_id                                ->  Merge Join  (cost=9.04..33.09 rows=84 width=45)
                                  Merge Cond: ("outer".dor_dok_id = "inner".dok_id)
->  Index Scan using dor_dok_fk_i on dokumentide_read dor  (cost=0.00..20.91 rows=761 width=41)
            ->  Sort  (cost=9.04..9.13 rows=36 width=4)                                            Sort Key: dok.dok_id
                                          ->  Seq Scan on dokumendid dok  (cost=0.00..8.10 rows=36 width=4)
                                    Filter: ((tyyp = 30) AND (kinnitaja IS NOT NULL)) 
(28 rows)

Uh.. Mails are getting very lengthy.
 Tambet


Re: full join in view

From
jasiek@klaster.net
Date:
On Tue, Jan 14, 2003 at 04:27:22PM +0200, Tambet Matiisen wrote:
> 
> First I would like to say, that I'm quite amazed. You even guessed table names right! :) I did not expect such an
in-depthanalysis in such a short time. Thanks, Tomasz!
 
It wasn't difficult - these names where in foreign keys definition.
> 
> > 
> > 
> > Tambet Matiisen wrote:
> > 
> > 
> > Is dor_kst_id the same as kdt_kst_id and as mat_id? After 
> > some database 
> > practicing I found, that using the same name in all tables is 
> > much more 
> > comfortably
> > 
> This way I can refer most columns without prefixing them with table alias. But it's anyway good habit to use table
aliases,so this is not that important. I think in next project I try it in your way.
 

If you have joins like this:
table1 join table2 using (field1)
duplicates of field1 disappears and you don't need table name.
> 
> > For each material (materjalid) and koostud (koostud) you want to find 
> > some current value (koostude_detaild) and compare it to some sum 
> > (documentid...)?
> > I'm not sure if I understand well your view, but here is my 
> > version of 
> > this view - without subquery:
> > 
> 
> I tried to save few scans by not including "koostud" and "materjalid" in my original query. Based on yours, I created
anew version:
 

If you need only not null-kdt_id and dor_id, then just change your joins
into inner joins.

If you are sure, that you will get only not-null results, you don't need
to include koostud and marerjalid.

The result is:

CREATE OR REPLACE VIEW v_tegelikud_kulud2 AS
SELECT   kdt.kdt_kst_id as kst_id,   kdt.kdt_mat_id as mat_id,   max(kdt.detaili_nr) AS detaili_nr,   max(kdt.arv) AS
arv,  max(kdt.kulu) AS kulu,   max(kdt.yhik) AS yhik,   max(kdt.koefitsent) AS koefitsent,   max(kdt.eeldatav_hind) AS
eeldatav_hind,  sum(dor.kogus * dor.koefitsent::numeric) AS kogus,   sum(dor.kokku) AS kokku
 
FROM
koostude_detailid kdt
JOIN dokumentide_read dor   ON kdt.kdt_kst_id = dor.dor_kst_id AND kdt.kdt_mat_id = dor.dor_mat_id AND EXISTS       (
   SELECT 1       FROM dokumendid dok        WHERE dor.dor_dok_id = dok.dok_id AND dok.tyyp = 30 AND dok.kinnitaja IS
NOTNULL       )
 
GROUP BY kst.kst_id, mat.mat_id;

> But there are still few things that worry me:
> 1. Cross join between koostud and materjalid. Table "materjalid" may have up to 10000 rows and only 20-30 of them are
actuallyneeded.
 
You don't need it anymore. Anyway I thought, that you have in your query
"mat_id=.. and kst_id=.."
> 2. Indeces on "koostude_detailid" and "dokumentide_read" are not used. Probably my tables do not contain enough rows.
MaybeI should generate more test data first.
 
> 3. The cost of this query is twice as big, as my original query. It seems to me, that SubPlan is causing this. I
triedto move it to subquery, but then the optimizer chose a totally different execution plan and seemingly was not able
touse indeces of "dokumentide_read" table. The version with subquery:
 
>
Now it should work better.

Tomasz Myrta


Re: full join in view

From
"Tambet Matiisen"
Date:

>
> If you need only not null-kdt_id and dor_id, then just change
> your joins
> into inner joins.
>

The whole idea is to show expected expenses (in "koostude_detailid") even if they were actually not spent (the same
materialis not listed in "dokumentide_read"). And also show actual expenses (in "dokumentide_read"), even if we did not
expectthem (the same material is not listed in "koostude_detailid"). Uh, my english is bit rough, but hope you get the
idea. 

Anyway, that's why full join seemed exactly the right thing. As I understand now, the reason why my original query does
notuse indexes, is because of sub-query, not full join. And I think I understood the problem of exposing the right
fieldin sub-query, but my query doesn't seem to have the same problem. 

> If you are sure, that you will get only not-null results, you
> don't need
> to include koostud and marerjalid.
>

The problem is, that both kdt_kst_id and dor_kst_id can be null, but they never are at the same time. It's not correct
toexpose either of them as kst_id. That's why my original query used coalesce to get kst_id, which is always not null.
Butusing coalesce field for filtering of course disabled indexes. Including koostud table in query was good idea,
becausenow I have kst_id, which is always not null.  

I was not able to eliminate "materjalid" from my query, because that would have forced me to use full join between
"koostude_detailid"and "dokumentide_read" again. Which is not automatically bad thing, but this forces me to write
queryfrom "dokumentide_read" as sub-query (whether row in "dokumentide_read" is active or not depends if corresponding
rowin "dokumendid" is approved or not (kinnitaja is not null)). And this sub-query does not use indexes. And cross join
isbad. 

I think I have to experiment bit more. Does anyone know a good tool (preferably free) to generate test data? I've got
intohabit disabling seqscan to see what indexes get used. More data would give more adequate execution plans. 
 Tambet