Re: MERGE: performance advices - Mailing list pgsql-general

From Steve Clark
Subject Re: MERGE: performance advices
Date
Msg-id 48BD6E25.4050507@netwolves.com
Whole thread Raw
In response to Re: MERGE: performance advices  ("Richard Broersma" <richard.broersma@gmail.com>)
Responses Re: MERGE: performance advices
List pgsql-general
Richard Broersma wrote:
> On Tue, Sep 2, 2008 at 8:10 AM, Steve Clark <sclark@netwolves.com> wrote:
>
>
>>Is there a way to do something similar with the following? I am an SQL noob
>>and the
>>following takes longer to run than is reasonable, on the order of hours.
>>
>>insert into myevents select * from t_unit_event_log a where exists
>>      (select b.event_log_no from myevents b
>>       where a.event_status = 1 and a.event_ref_log_no IS NOT NULL
>>              and a.event_ref_log_no = b.event_log_no and a.event_log_no not
>>in
>>                      (select event_log_no from myevents)
>>      )
>
>
>
> To start off with, this SQL statement can be refined a bit.  Many of
> the sub-query WHERE clause constraints have nothing to do with the
> Correlated sub-query.  The refinement would look like so:
>
> INSERT INTO Myevents
>      SELECT *
>        FROM T_unit_event_log AS A
>       WHERE A.event_status = 1
>         AND A.event_ref_log_no IS NOT NULL
>         AND A.event_log_no NOT IN ( SELECT event_log_no
>                                       FROM Myevents)
>         AND EXISTS ( SELECT B.event_log_no
>                        FROM Myevents AS B
>                       WHERE A.event_ref_log_no = B.event_log_no );
>
>
> The next step would be to rework the NOT IN sub-query into a LEFT JOIN
> WHERE IS NULL;
>
> INSERT INTO Myevents
>      SELECT *
>        FROM T_unit_event_log AS A
>   LEFT JOIN Myevents AS C
>          ON A.event_log_no = C.event_log_no
>       WHERE A.event_status = 1
>         AND A.event_ref_log_no IS NOT NULL
>         AND C.event_log_no IS NULL
>         AND EXISTS ( SELECT B.event_log_no
>                        FROM Myevents AS B
>                       WHERE A.event_ref_log_no = B.event_log_no );
>
> There is one possible alteration that may or many not improve
> performance.  This would be to replace the EXISTS with a LEFT JOIN
> WHERE IS NOT NULL;
>
> INSERT INTO Myevents
>      SELECT *
>        FROM T_unit_event_log AS A
>   LEFT JOIN Myevents AS C
>          ON A.event_log_no = C.event_log_no
>   LEFT JOIN Myevents AS B
>          ON A.event_ref_log_no = B.event_log_no
>       WHERE C.event_log_no IS NULL
>         AND B.event_log_no IS NOT NULL
>         AND A.event_status = 1
>         AND A.event_ref_log_no IS NOT NULL;
>
>

Hi Richard and thanks for the response. When I try the last two queries i get and error. I have listed
the results of explain on all three.

srm2=# explain
srm2-# INSERT INTO Myevents
srm2-#      SELECT *
srm2-#        FROM T_unit_event_log AS A
srm2-#       WHERE A.event_status = 1
srm2-#         AND A.event_ref_log_no IS NOT NULL
srm2-#         AND A.event_log_no NOT IN ( SELECT event_log_no
srm2(#                                       FROM Myevents)
srm2-#         AND EXISTS ( SELECT B.event_log_no
srm2(#                        FROM Myevents AS B
srm2(#                       WHERE A.event_ref_log_no = B.event_log_no );
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Index Scan using indx_tuel_usn_ec_es on t_unit_event_log a  (cost=31711.73..3607445990.61 rows=51844 width=146)
   Index Cond: (event_status = 1)
   Filter: ((event_ref_log_no IS NOT NULL) AND (subplan) AND (NOT (subplan)))
   SubPlan
     ->  Materialize  (cost=31711.73..42857.85 rows=830612 width=4)
           ->  Seq Scan on myevents  (cost=0.00..28041.12 rows=830612 width=4)
     ->  Index Scan using indx1myevents on myevents b  (cost=0.00..8.37 rows=1 width=4)
           Index Cond: ($1 = event_log_no)
(8 rows)

srm2=#
srm2=#
srm2=# explain
srm2-# INSERT INTO Myevents
srm2-#      SELECT *
srm2-#        FROM T_unit_event_log AS A
srm2-#   LEFT JOIN Myevents AS C
srm2-#          ON A.event_log_no = C.event_log_no
srm2-#       WHERE A.event_status = 1
srm2-#         AND A.event_ref_log_no IS NOT NULL
srm2-#         AND C.event_log_no IS NULL
srm2-#         AND EXISTS ( SELECT B.event_log_no
srm2(#                        FROM Myevents AS B
srm2(#                       WHERE A.event_ref_log_no = B.event_log_no );
ERROR:  INSERT has more expressions than target columns
srm2=#
srm2=# explain
srm2-# INSERT INTO Myevents
srm2-#      SELECT *
srm2-#        FROM T_unit_event_log AS A
srm2-#   LEFT JOIN Myevents AS C
srm2-#          ON A.event_log_no = C.event_log_no
srm2-#   LEFT JOIN Myevents AS B
srm2-#          ON A.event_ref_log_no = B.event_log_no
srm2-#       WHERE C.event_log_no IS NULL
srm2-#         AND B.event_log_no IS NOT NULL
srm2-#         AND A.event_status = 1
srm2-#         AND A.event_ref_log_no IS NOT NULL;
ERROR:  INSERT has more expressions than target columns

I really appreciate your help.

Steve

pgsql-general by date:

Previous
From: David Fetter
Date:
Subject: Re: pg_catalog forward compatibility
Next
From: "Richard Broersma"
Date:
Subject: Re: MERGE: performance advices