Re: Switching from MySQL: ON DUPLICATE KEY UPDATE, plpgsql function - Mailing list pgsql-general

From Greg Stark
Subject Re: Switching from MySQL: ON DUPLICATE KEY UPDATE, plpgsql function
Date
Msg-id 407d949e0906272002m3a208defhf374b9b3612fee26@mail.gmail.com
In response to Re: Switching from MySQL: ON DUPLICATE KEY UPDATE, plpgsql function  (justin <justin@emproshunts.com>)
List pgsql-general
On Sun, Jun 28, 2009 at 2:13 AM, justin <justin@emproshunts.com> wrote:
>
> if you want to do something like this either do a test first to see if the
> key is present in the table, update or do an insert like this
> There is no reason to do a loop in the function waiting for a lock to
> clear.   Postgresql Locks do not work like MySQL.

The latter is exactly what he posted; you *do* have to loop, because
two sessions can each try the update, find no records, and then both
try to insert, with one of them failing on the unique key.
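
For reference, that loop is essentially the merge_db example from the
PostgreSQL docs; adapted to this schema it would look something like
this (untested; the function name mark_read is made up):

CREATE FUNCTION mark_read(_userid integer, _threadid integer,
                          _lastpost timestamp) RETURNS void AS $$
BEGIN
    LOOP
        -- First try to update an existing row.
        UPDATE forums_readposts
           SET lastpostread = _lastpost
         WHERE userid = _userid AND threadid = _threadid;
        IF found THEN
            RETURN;
        END IF;
        -- Not there, so try to insert. If another session inserts
        -- the same key concurrently we get a unique-key failure.
        BEGIN
            INSERT INTO forums_readposts (userid, threadid, lastpostread)
            VALUES (_userid, _threadid, _lastpost);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- Do nothing; loop back and try the UPDATE again.
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;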


The problem is that the example posted is for a single update/insert.
The problem you're solving is for merging in a whole set of changes.
That's a bit more painful.

I think you're going to want something like:

UPDATE forums_readposts
   SET lastpostread = (SELECT lastpost
                         FROM forums_topics
                        WHERE id = threadid)
 WHERE userid = _userid;

INSERT INTO forums_readposts (userid, threadid, lastpostread)
SELECT _userid, id, lastpost
  FROM forums_topics
 WHERE id NOT IN (SELECT threadid
                    FROM forums_readposts existing
                   WHERE existing.userid = _userid);


(You might want to experiment with writing that as a NOT EXISTS; there
are still cases where one is optimized better than the other because
of the NULL behaviour the standard requires for NOT IN.)
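
The NOT EXISTS form would look something like this (untested):

INSERT INTO forums_readposts (userid, threadid, lastpostread)
SELECT _userid, t.id, t.lastpost
  FROM forums_topics t
 WHERE NOT EXISTS (SELECT 1
                     FROM forums_readposts existing
                    WHERE existing.userid = _userid
                      AND existing.threadid = t.id);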

You have a few options here. You could decide that concurrency for
this operation really isn't important and serialize it. For example,
you could lock the user record with an explicit SELECT ... FOR UPDATE
and then commit immediately afterward.
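
A sketch of that, assuming there's a users table keyed by id (run as
plain SQL from the client, since plpgsql can't commit):

BEGIN;
-- Serialize all catch-up operations for this user on the user row;
-- the lock is held until the transaction commits.
SELECT 1 FROM users WHERE id = _userid FOR UPDATE;
-- ... the UPDATE and INSERT from above ...
COMMIT;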

Or you could catch the exception around the insert and assume that if
it fires you don't have to bother retrying, because the other
transaction you collided with is presumably doing the same thing. That
would break if a user hit "catch up" and simultaneously clicked on a
new thread he hadn't read before in another window.
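
That version would be a sketch like this, inside a plpgsql function
(table names as above, untested):

BEGIN
    INSERT INTO forums_readposts (userid, threadid, lastpostread)
    SELECT _userid, id, lastpost
      FROM forums_topics
     WHERE id NOT IN (SELECT threadid
                        FROM forums_readposts existing
                       WHERE existing.userid = _userid);
EXCEPTION WHEN unique_violation THEN
    -- Assume the colliding transaction marked the same threads read.
    NULL;
END;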

Or you could do the same kind of loop around this, just always doing
the insert, since it should insert 0 records if there are no missing
threads.
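
That variant is just the earlier retry loop wrapped around the
set-based pair, roughly (untested):

LOOP
    UPDATE forums_readposts
       SET lastpostread = (SELECT lastpost FROM forums_topics
                            WHERE id = threadid)
     WHERE userid = _userid;
    BEGIN
        INSERT INTO forums_readposts (userid, threadid, lastpostread)
        SELECT _userid, id, lastpost
          FROM forums_topics
         WHERE id NOT IN (SELECT threadid
                            FROM forums_readposts existing
                           WHERE existing.userid = _userid);
        EXIT;  -- no collision; a zero-row insert is fine
    EXCEPTION WHEN unique_violation THEN
        -- Another session inserted one of the rows first; retry.
    END;
END LOOP;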

You could skip the insert entirely if the number of records updated
matches the number of threads and you have that number handy. That
would be an especially good idea if you catch the exception around the
insert, since exceptions are moderately expensive: they create a
subtransaction. That's probably not a factor for an operation like
this, which isn't dominating the workload.
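
In plpgsql the updated count is available via GET DIAGNOSTICS; a rough
sketch of the function body (variable names made up):

DECLARE
    updated bigint;
    total   bigint;
BEGIN
    UPDATE forums_readposts
       SET lastpostread = (SELECT lastpost FROM forums_topics
                            WHERE id = threadid)
     WHERE userid = _userid;
    GET DIAGNOSTICS updated = ROW_COUNT;

    SELECT count(*) INTO total FROM forums_topics;
    -- Only bother with the insert if some threads had no row yet.
    IF updated < total THEN
        INSERT INTO forums_readposts (userid, threadid, lastpostread)
        SELECT _userid, id, lastpost
          FROM forums_topics
         WHERE id NOT IN (SELECT threadid
                            FROM forums_readposts existing
                           WHERE existing.userid = _userid);
    END IF;
END;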

--
greg
http://mit.edu/~gsstark/resume.pdf
