Thread: Do I need serializable for this query?

Do I need serializable for this query?

From
William Garrison
Date:
I have a table that keeps running totals.  It is possible that this
would get called twice simultaneously for the same UserID.  Do I need to
put this in a serializable transaction?  Since this gets called often,
could it be a performance problem if I make it serializable?

CREATE FUNCTION UpdateTotals(IN UserID int,
                              IN AddToCount int,
                              IN AddToSize bigint)
RETURNS void AS $$
BEGIN
     UPDATE
         Totals
     SET
         TotalSize = TotalSize + $2,
         TotalCount = TotalCount + $3
     WHERE
         UserID = $1;
     END IF;
END
$$ LANGUAGE 'plpgsql' VOLATILE;

Re: Do I need serializable for this query?

From
"Florian G. Pflug"
Date:
William Garrison wrote:
> I have a table that keeps running totals.  It is possible that this
> would get called twice simultaneously for the same UserID.  Do I need to
> put this in a serializable transaction?  Since this gets called often,
> could it be a performance problem if I make it serializable?
>
> CREATE FUNCTION UpdateTotals(IN UserID int,
>                              IN AddToCount int,
>                              IN AddToSize bigint)
> RETURNS void AS $$
> BEGIN
>     UPDATE
>         Totals
>     SET
>         TotalSize = TotalSize + $2,
>         TotalCount = TotalCount + $3
>     WHERE
>         UserID = $1;
>     END IF;
> END
> $$ LANGUAGE 'plpgsql' VOLATILE;

Where is that stray "END IF;" comming from?
Anyway, this should be safe, and work even more
reliably in read-committed mode than in serializable
mode. In serializeable mode, if the Total of the same
user is updated by two transactions simultanously,
you'll get a SerializationError. Read-Committed mode
OTOH will make sure that it uses the latest version of
the tuple for calculating the new values.

Usually, you need serializable mode if you do updates
based on the results of previous selects.

greetings, Florian Pflug

Re: Do I need serializable for this query?

From
Jeff Davis
Date:
On Tue, 2007-04-10 at 14:45 -0400, William Garrison wrote:
> I have a table that keeps running totals.  It is possible that this
> would get called twice simultaneously for the same UserID.  Do I need to
> put this in a serializable transaction?  Since this gets called often,
> could it be a performance problem if I make it serializable?
>
> CREATE FUNCTION UpdateTotals(IN UserID int,
>                               IN AddToCount int,
>                               IN AddToSize bigint)
> RETURNS void AS $$
> BEGIN
>      UPDATE
>          Totals
>      SET
>          TotalSize = TotalSize + $2,
>          TotalCount = TotalCount + $3
>      WHERE
>          UserID = $1;
>      END IF;
> END
> $$ LANGUAGE 'plpgsql' VOLATILE;
>

You only have one command to execute; you don't need to do anything
special at all. If two UPDATEs happen to the same tuple at the same
time, one will block waiting for the other to commit.

Regards,
    Jeff Davis



Re: Do I need serializable for this query?

From
Peter Eisentraut
Date:
William Garrison wrote:
> I have a table that keeps running totals.  It is possible that this
> would get called twice simultaneously for the same UserID.  Do I need
> to put this in a serializable transaction?

Transaction isolation is only a question of interest if you have more
than one statement in a transaction.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: Do I need serializable for this query?

From
Michael Fuhr
Date:
On Tue, Apr 10, 2007 at 10:52:11PM +0200, Peter Eisentraut wrote:
> William Garrison wrote:
> > I have a table that keeps running totals.  It is possible that this
> > would get called twice simultaneously for the same UserID.  Do I need
> > to put this in a serializable transaction?
>
> Transaction isolation is only a question of interest if you have more
> than one statement in a transaction.

Not true; the isolation level is also relevant if you're doing
concurrent updates or deletes (although, as the documentation
mentions, PostgreSQL's serializable isolation level doesn't guarantee
true serializability).

create table test (id integer primary key, t text);
insert into test (id, t) values (1, 'a');

T1: begin isolation level serializable;
T2: begin isolation level serializable;
T1: update test set t = 'b' where id = 1;
T2: update test set t = 'c' where id = 1;  -- blocks
T1: commit;
T2: ERROR:  could not serialize access due to concurrent update

--
Michael Fuhr

Do I need serializable for this query? -- Part 2

From
William Garrison
Date:
This is part 2 of my "serializable" question from earlier.  Thanks to
everyone who answered the first part.  (Hopefully there will be no part 3)

When a user adds a record, I update a table with totals.  But when a
user deletes data, I subtract the totals.  That presents a more
interesting situation since it is theoretically possible to add a record
and delete one at the same time:

For simplicity sake, the two tables are something like:
   item(itemid int, data...)
   history(itemid int, versionid int, data...)

-- This deletes the item and all the history records for the item
CREATE FUNCTION DeleteItem(IN _UserID int,
                            IN _ItemID int)
RETURNS void AS $$
DECLARE
    mycount int;
    total bigint;
BEGIN
    -- Cotal and total of data to be deleted
    SELECT
        COUNT(*), COALESCE(SUM(clientsize),0)
    INTO
        mycount,total
    FROM
        history
    WHERE
        userid= $1 AND itemid=$2;

    -- Potential problem point #1

    -- Call the stored proc to update the count and total
    -- This is the one from my last email
    PERFORM sbp_UpdateTotals($1,-mycount,-total);

    -- Potential problem point #2

    -- Delete the records
    DELETE FROM history WHERE userid = $1 AND itemid=$2;
    DELETE FROM items WHERE userid = $1 AND itemid=$2;
END
$$ LANGUAGE 'plpgsql' VOLATILE;

Suppose the user adds another record to the history table with the same
itemid as the one being deleted.  If they do this at one of the two
"problem points" then that history record will be deleted, but it will
not be subtracted from the totals.  Am I understanding this properly?

In my system, this is highly unlikely, and probably not even something
we care about very much.  But I still want to know what to do.  It seems
to me I could a few things to fix this:

1) Make this serializable
- I'm not even sure this will help, since adding a new history record
won't change any records that this touches.  Would PostgreSQL even
realize that adding a history record would have changed the results of
the select?
2) Table lock
3) Make the first select statement store the PK(itemid,versionid) of the
history records and then only delete those records at the end.
4) Somehow calculate the count and total during the delete statement
5) Compare the # of records deleted with the # of records selected and
throw/rollback if they mismatch.

I'm curious which one I should do, and if my solutions are all valid.

Also, I'm under the impression that I don't need to do any sort of
begin/end transaction stuff in here, since PostgreSql does that
implicitly with any stored procedure.  Is that correct?

Thanks to anyone who can assist.

Florian G. Pflug wrote:
> William Garrison wrote:
>> I have a table that keeps running totals.  It is possible that this
>> would get called twice simultaneously for the same UserID.  Do I need
>> to put this in a serializable transaction?  Since this gets called
>> often, could it be a performance problem if I make it serializable?
>>
>> CREATE FUNCTION UpdateTotals(IN UserID int,
>>                              IN AddToCount int,
>>                              IN AddToSize bigint)
>> RETURNS void AS $$
>> BEGIN
>>     UPDATE
>>         Totals
>>     SET
>>         TotalSize = TotalSize + $2,
>>         TotalCount = TotalCount + $3
>>     WHERE
>>         UserID = $1;
>>     END IF;
>> END
>> $$ LANGUAGE 'plpgsql' VOLATILE;
>
> Where is that stray "END IF;" comming from?
> Anyway, this should be safe, and work even more
> reliably in read-committed mode than in serializable
> mode. In serializeable mode, if the Total of the same
> user is updated by two transactions simultanously,
> you'll get a SerializationError. Read-Committed mode
> OTOH will make sure that it uses the latest version of
> the tuple for calculating the new values.
>
> Usually, you need serializable mode if you do updates
> based on the results of previous selects.
>
> greetings, Florian Pflug
>


Re: Do I need serializable for this query? -- Part 2

From
William Garrison
Date:
Should I just use a trigger to update these totals?  That way concurrent
deletes/updates would be guaranteed to update the totals.  The only down
side is that deleting 10 records would result in 10 updates to the
totals.  But deleting is rare enough that it probably isn't a problem.

William Garrison wrote:
> This is part 2 of my "serializable" question from earlier.  Thanks to
> everyone who answered the first part.  (Hopefully there will be no part 3)
>
> When a user adds a record, I update a table with totals.  But when a
> user deletes data, I subtract the totals.  That presents a more
> interesting situation since it is theoretically possible to add a record
> and delete one at the same time:
>
> For simplicity sake, the two tables are something like:
>   item(itemid int, data...)
>   history(itemid int, versionid int, data...)
>
> -- This deletes the item and all the history records for the item
> CREATE FUNCTION DeleteItem(IN _UserID int,
>                            IN _ItemID int)
> RETURNS void AS $$
> DECLARE
>     mycount int;
>     total bigint;
> BEGIN
>     -- Cotal and total of data to be deleted
>     SELECT
>         COUNT(*), COALESCE(SUM(clientsize),0)
>     INTO
>         mycount,total
>     FROM
>         history
>     WHERE
>         userid= $1 AND itemid=$2;
>
>     -- Potential problem point #1
>
>     -- Call the stored proc to update the count and total
>     -- This is the one from my last email
>     PERFORM sbp_UpdateTotals($1,-mycount,-total);
>
>     -- Potential problem point #2
>
>     -- Delete the records
>     DELETE FROM history WHERE userid = $1 AND itemid=$2;
>     DELETE FROM items WHERE userid = $1 AND itemid=$2;
> END
> $$ LANGUAGE 'plpgsql' VOLATILE;
>
> Suppose the user adds another record to the history table with the same
> itemid as the one being deleted.  If they do this at one of the two
> "problem points" then that history record will be deleted, but it will
> not be subtracted from the totals.  Am I understanding this properly?
>
> In my system, this is highly unlikely, and probably not even something
> we care about very much.  But I still want to know what to do.  It seems
> to me I could a few things to fix this:
>
> 1) Make this serializable
> - I'm not even sure this will help, since adding a new history record
> won't change any records that this touches.  Would PostgreSQL even
> realize that adding a history record would have changed the results of
> the select?
> 2) Table lock
> 3) Make the first select statement store the PK(itemid,versionid) of the
> history records and then only delete those records at the end.
> 4) Somehow calculate the count and total during the delete statement
> 5) Compare the # of records deleted with the # of records selected and
> throw/rollback if they mismatch.
>
> I'm curious which one I should do, and if my solutions are all valid.
>
> Also, I'm under the impression that I don't need to do any sort of
> begin/end transaction stuff in here, since PostgreSql does that
> implicitly with any stored procedure.  Is that correct?
>
> Thanks to anyone who can assist.
>
> Florian G. Pflug wrote:
>> William Garrison wrote:
>>> I have a table that keeps running totals.  It is possible that this
>>> would get called twice simultaneously for the same UserID.  Do I need
>>> to put this in a serializable transaction?  Since this gets called
>>> often, could it be a performance problem if I make it serializable?
>>>
>>> CREATE FUNCTION UpdateTotals(IN UserID int,
>>>                              IN AddToCount int,
>>>                              IN AddToSize bigint)
>>> RETURNS void AS $$
>>> BEGIN
>>>     UPDATE
>>>         Totals
>>>     SET
>>>         TotalSize = TotalSize + $2,
>>>         TotalCount = TotalCount + $3
>>>     WHERE
>>>         UserID = $1;
>>>     END IF;
>>> END
>>> $$ LANGUAGE 'plpgsql' VOLATILE;
>>
>> Where is that stray "END IF;" comming from?
>> Anyway, this should be safe, and work even more
>> reliably in read-committed mode than in serializable
>> mode. In serializeable mode, if the Total of the same
>> user is updated by two transactions simultanously,
>> you'll get a SerializationError. Read-Committed mode
>> OTOH will make sure that it uses the latest version of
>> the tuple for calculating the new values.
>>
>> Usually, you need serializable mode if you do updates
>> based on the results of previous selects.
>>
>> greetings, Florian Pflug
>>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>       choose an index scan if your joining column's datatypes do not
>       match
>