Re: Problems with an update-from statement and pg-8.1.4 - Mailing list pgsql-performance

From Rafael Martinez
Subject Re: Problems with an update-from statement and pg-8.1.4
Date
Msg-id 1165436967.9606.32.camel@bbking.linux
Whole thread Raw
In response to Re: Problems with an update-from statement and pg-8.1.4  (Erik Jones <erik@myemma.com>)
Responses Re: Problems with an update-from statement and pg-8.1.4
List pgsql-performance
On Wed, 2006-12-06 at 14:19 -0600, Erik Jones wrote:
> Rafael Martinez wrote:
> > On Wed, 2006-12-06 at 14:55 -0500, Ted Allen wrote:
> >
> >> Stephan Szabo wrote:
> >>
> >>> On Wed, 6 Dec 2006, Rafael Martinez wrote:
> >>>
> >>>> mailstats=# EXPLAIN update mail SET spamscore = '-5.026'  FROM mail m,
> >>>> mail_received mr where mr.mail_id = m.mail_id AND mr.queue_id =
> >>>> '1GrxLs-0004N9-I1' and mr.mailhost = '129.240.10.47';
> >>>>
> >>>>
> >>> I don't think this statement does what you expect. You're ending up with
> >>> two copies of mail in the above one as "mail" and one as "m". You probably
> >>> want to remove the mail m in FROM and use mail rather than m in the
> >>> where clause.
> >>>
> >>>
> >>>
> >> Worse yet I think your setting "spamcore" for EVERY row in mail to
> >> '-5.026'.  The above solution should fix it though.
> >>
> >> -- Ted
> >>
> >>
> >
> > Thanks for the answers. I think the 'problem' is explain in the
> > documentation:
> >
> > "fromlist
> >
> > A list of table expressions, allowing columns from other tables to
> > appear in the WHERE condition and the update expressions. This is
> > similar to the list of tables that can be specified in the FROMClause of
> > a SELECT statement. Note that the target table must not appear in the
> > fromlist, unless you intend a self-join (in which case it must appear
> > with an alias in the fromlist)".
> >
> > And as you said, we can not have 'mail m' in the FROM clause. I have
> > contacted the developers and they will change the statement. I gave then
> > these 2 examples:
> >
> > -------------------------------------------------------------------------------
> > mailstats=# EXPLAIN update mail SET spamscore = '-5.026'  FROM
> > mail_received mr where mr.mail_id = mail.mail_id AND mr.queue_id =
> > '1GrxLs-0004N9-I1' and mr.mailhost = '129.240.10.47';
> >                                               QUERY PLAN
> > ------------------------------------------------------------------------------------------------------
> >  Nested Loop  (cost=0.00..6.54 rows=1 width=57)
> >    ->  Index Scan using received_queue_id_index on mail_received mr
> > (cost=0.00..3.20 rows=1 width=4)
> >          Index Cond: ((queue_id)::text = '1GrxLs-0004N9-I1'::text)
> >          Filter: (mailhost = '129.240.10.47'::inet)
> >    ->  Index Scan using mail_pkey on mail  (cost=0.00..3.32 rows=1
> > width=57)
> >          Index Cond: ("outer".mail_id = mail.mail_id)
> > (6 rows)
> >
> > mailstats=# explain  update mail SET spamscore = '-5.026' where mail_id
> > = (select mail_id from mail_received where queue_id = '1GrxLs-0004N9-I1'
> > and mailhost = '129.240.10.47');
> >                                              QUERY PLAN
> > -----------------------------------------------------------------------------------------------------
> >  Index Scan using mail_pkey on mail  (cost=3.20..6.52 rows=1 width=57)
> >    Index Cond: (mail_id = $0)
> >    InitPlan
> >      ->  Index Scan using received_queue_id_index on mail_received
> > (cost=0.00..3.20 rows=1 width=4)
> >            Index Cond: ((queue_id)::text = '1GrxLs-0004N9-I1'::text)
> >            Filter: (mailhost = '129.240.10.47'::inet)
> > (6 rows)
> > -------------------------------------------------------------------------------
> >
> Look again at the estimated costs of those two query plans.  You haven't
> gained anything there.  Try this out:
>
> EXPLAIN UPDATE mail
> SET spamscore = '-5.026'
> FROM mail_received mr
> WHERE mail.mail_id = mr.mail_id AND mr.queue_id = '1GrxLs-0004N9-I1' ;
>

Haven't we?

* In the statement with problems we got this:
Nested Loop  (cost=0.00..932360.78 rows=7184312 width=57)

* In the ones I sent:
Nested Loop  (cost=0.00..6.54 rows=1 width=57)
Index Scan using mail_pkey on mail  (cost=3.20..6.52 rows=1 width=57)

* And in the last one you sent me:
------------------------------------------------------
Nested Loop  (cost=0.00..6.53 rows=1 width=57)
   ->  Index Scan using received_queue_id_index on mail_received mr
(cost=0.00..3.20 rows=1 width=4)
         Index Cond: ((queue_id)::text = '1GrxLs-0004N9-I1'::text)
   ->  Index Scan using mail_pkey on mail  (cost=0.00..3.32 rows=1
width=57)
         Index Cond: (mail.mail_id = "outer".mail_id)
(5 rows)
------------------------------------------------------

I can not see the different.

regards,
--
Rafael Martinez, <r.m.guerrero@usit.uio.no>
Center for Information Technology Services
University of Oslo, Norway

PGP Public Key: http://folk.uio.no/rafael/



pgsql-performance by date:

Previous
From: Brian Hurt
Date:
Subject: Re: File Systems Compared
Next
From: Bruno Wolff III
Date:
Subject: Re: File Systems Compared