Re: Problems with an update-from statement and pg-8.1.4 - Mailing list pgsql-performance

From Erik Jones
Subject Re: Problems with an update-from statement and pg-8.1.4
Date
Msg-id 4577288A.8080800@myemma.com
Whole thread Raw
In response to Re: Problems with an update-from statement and pg-8.1.4  (Rafael Martinez <r.m.guerrero@usit.uio.no>)
List pgsql-performance
Rafael Martinez wrote:
> On Wed, 2006-12-06 at 14:19 -0600, Erik Jones wrote:
>
>> Rafael Martinez wrote:
>>
>>> On Wed, 2006-12-06 at 14:55 -0500, Ted Allen wrote:
>>>
>>>
>>>> Stephan Szabo wrote:
>>>>
>>>>
>>>>> On Wed, 6 Dec 2006, Rafael Martinez wrote:
>>>>>
>>>>>
>>>>>> mailstats=# EXPLAIN update mail SET spamscore = '-5.026'  FROM mail m,
>>>>>> mail_received mr where mr.mail_id = m.mail_id AND mr.queue_id =
>>>>>> '1GrxLs-0004N9-I1' and mr.mailhost = '129.240.10.47';
>>>>>>
>>>>>>
>>>>>>
>>>>> I don't think this statement does what you expect. You're ending up with
>>>>> two copies of mail in the above one as "mail" and one as "m". You probably
>>>>> want to remove the mail m in FROM and use mail rather than m in the
>>>>> where clause.
>>>>>
>>>>>
>>>>>
>>>>>
>>>> Worse yet I think your setting "spamcore" for EVERY row in mail to
>>>> '-5.026'.  The above solution should fix it though.
>>>>
>>>> -- Ted
>>>>
>>>>
>>>>
>>> Thanks for the answers. I think the 'problem' is explain in the
>>> documentation:
>>>
>>> "fromlist
>>>
>>> A list of table expressions, allowing columns from other tables to
>>> appear in the WHERE condition and the update expressions. This is
>>> similar to the list of tables that can be specified in the FROMClause of
>>> a SELECT statement. Note that the target table must not appear in the
>>> fromlist, unless you intend a self-join (in which case it must appear
>>> with an alias in the fromlist)".
>>>
>>> And as you said, we can not have 'mail m' in the FROM clause. I have
>>> contacted the developers and they will change the statement. I gave then
>>> these 2 examples:
>>>
>>> -------------------------------------------------------------------------------
>>> mailstats=# EXPLAIN update mail SET spamscore = '-5.026'  FROM
>>> mail_received mr where mr.mail_id = mail.mail_id AND mr.queue_id =
>>> '1GrxLs-0004N9-I1' and mr.mailhost = '129.240.10.47';
>>>                                               QUERY PLAN
>>> ------------------------------------------------------------------------------------------------------
>>>  Nested Loop  (cost=0.00..6.54 rows=1 width=57)
>>>    ->  Index Scan using received_queue_id_index on mail_received mr
>>> (cost=0.00..3.20 rows=1 width=4)
>>>          Index Cond: ((queue_id)::text = '1GrxLs-0004N9-I1'::text)
>>>          Filter: (mailhost = '129.240.10.47'::inet)
>>>    ->  Index Scan using mail_pkey on mail  (cost=0.00..3.32 rows=1
>>> width=57)
>>>          Index Cond: ("outer".mail_id = mail.mail_id)
>>> (6 rows)
>>>
>>> mailstats=# explain  update mail SET spamscore = '-5.026' where mail_id
>>> = (select mail_id from mail_received where queue_id = '1GrxLs-0004N9-I1'
>>> and mailhost = '129.240.10.47');
>>>                                              QUERY PLAN
>>> -----------------------------------------------------------------------------------------------------
>>>  Index Scan using mail_pkey on mail  (cost=3.20..6.52 rows=1 width=57)
>>>    Index Cond: (mail_id = $0)
>>>    InitPlan
>>>      ->  Index Scan using received_queue_id_index on mail_received
>>> (cost=0.00..3.20 rows=1 width=4)
>>>            Index Cond: ((queue_id)::text = '1GrxLs-0004N9-I1'::text)
>>>            Filter: (mailhost = '129.240.10.47'::inet)
>>> (6 rows)
>>> -------------------------------------------------------------------------------
>>>
>>>
>> Look again at the estimated costs of those two query plans.  You haven't
>> gained anything there.  Try this out:
>>
>> EXPLAIN UPDATE mail
>> SET spamscore = '-5.026'
>> FROM mail_received mr
>> WHERE mail.mail_id = mr.mail_id AND mr.queue_id = '1GrxLs-0004N9-I1' ;
>>
>>
>
> Haven't we?
>
> * In the statement with problems we got this:
> Nested Loop  (cost=0.00..932360.78 rows=7184312 width=57)
>
> * In the ones I sent:
> Nested Loop  (cost=0.00..6.54 rows=1 width=57)
> Index Scan using mail_pkey on mail  (cost=3.20..6.52 rows=1 width=57)
>
> * And in the last one you sent me:
> ------------------------------------------------------
> Nested Loop  (cost=0.00..6.53 rows=1 width=57)
>    ->  Index Scan using received_queue_id_index on mail_received mr
> (cost=0.00..3.20 rows=1 width=4)
>          Index Cond: ((queue_id)::text = '1GrxLs-0004N9-I1'::text)
>    ->  Index Scan using mail_pkey on mail  (cost=0.00..3.32 rows=1
> width=57)
>          Index Cond: (mail.mail_id = "outer".mail_id)
> (5 rows)
> ------------------------------------------------------
>
> I can not see the different.
>
> regards,
>
Ah, sorry, I was just looking at the two that you sent in your last
message thinking that they were 'old' and 'new',  not both 'new'.  My bad...

--
erik jones <erik@myemma.com>
software development
emma(r)


pgsql-performance by date:

Previous
From: Bruno Wolff III
Date:
Subject: Re: File Systems Compared
Next
From: Matthew Schumacher
Date:
Subject: Disk storage and san questions (was File Systems Compared)