Thread: Row locking during UPDATE

Row locking during UPDATE

From: "David F. Skoll"
Hi,

I have a weird problem and want to know if I understand what's happening.
I have a table like this:

create table statistics (
        date            DATE DEFAULT current_date,
        key             TEXT,
        value           INTEGER DEFAULT 0,
        UNIQUE(date, key)
);

and I have a bunch of clients running.  Often, a client does
something like this:

UPDATE statistics SET value = value + 1
    WHERE key = 'KEY' AND date = current_date;

What appears to be happening is this:

1) Sometimes, a whole bunch of clients try updating the same row.  I
see several postgresql processes marked "UPDATE waiting"

2) The client processes must complete rather quickly, so a supervisor
process kills any that take too long.

3) The postgresql processes in an "UPDATE waiting" state seem to be
blocked waiting for a semaphore, and they do not notice the closing of
the connection when the client is killed.

4) New client processes are spawned and eventually try to update the row.

As a result, we end up with more and more postgresql processes until the
connection limit is reached, because processes in the "UPDATE waiting"
state don't notice the connection has been broken.

Questions:

1) Am I on the right track?  Is this how PostgreSQL works?

2) I plan on solving it by making a low-contention table like this:

create table low_contention_stats (
        date            DATE DEFAULT current_date,
        key             TEXT,
        randomizer      INTEGER,
        value           INTEGER DEFAULT 0,
        summarized      BOOLEAN DEFAULT FALSE,
        UNIQUE(date, key, randomizer, summarized)
);

Each client will pick a random "randomizer" value, so the odds of two
clients trying to update the same row at the same time are low.  Once
a day, I summarize the low_contention_stats table and put the summary
in statistics.

A statistics update looks like this:

UPDATE low_contention_stats SET value = value + 1
    WHERE date = current_date AND key = 'KEY'
    AND randomizer = my_random_val AND not summarized;
(If the UPDATE matches 0 rows, we do an INSERT instead.)
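That update-or-insert logic can be sketched like this -- a minimal Python
sketch using sqlite3 as a stand-in for PostgreSQL; the helper name
bump_stat is hypothetical, and a real PostgreSQL version would also need
a retry loop around the INSERT in case two clients race to create the
same row:

```python
import sqlite3
import datetime

def bump_stat(conn, key, randomizer):
    """Hypothetical helper: add 1 to today's counter for (key, randomizer),
    inserting the row if no unsummarized row exists yet."""
    today = datetime.date.today().isoformat()
    cur = conn.execute(
        "UPDATE low_contention_stats SET value = value + 1 "
        "WHERE date = ? AND key = ? AND randomizer = ? AND NOT summarized",
        (today, key, randomizer))
    if cur.rowcount == 0:
        # No live row yet -- create one.  Under real concurrency two
        # clients can race here, so with PostgreSQL this INSERT should
        # be retried on a unique-constraint violation.
        conn.execute(
            "INSERT INTO low_contention_stats "
            "(date, key, randomizer, value, summarized) "
            "VALUES (?, ?, ?, 1, 0)",
            (today, key, randomizer))
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE low_contention_stats (
    date TEXT, key TEXT, randomizer INTEGER,
    value INTEGER DEFAULT 0, summarized INTEGER DEFAULT 0,
    UNIQUE(date, key, randomizer, summarized))""")
bump_stat(conn, "virus", 3)   # first call inserts the row with value 1
bump_stat(conn, "virus", 3)   # second call updates it to 2
total = conn.execute(
    "SELECT sum(value) FROM low_contention_stats").fetchone()[0]
print(total)  # prints 2
```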

The summary process that copies into statistics looks like this:

# "Freeze" existing rows -- any updates during the summary process
# will have to insert new rows where summarized is false
UPDATE low_contention_stats SET summarized = 't';

# Summarize stats
SELECT date, key, sum(value) FROM low_contention_stats WHERE summarized
    GROUP BY date, key;

# Then we iterate over the results, updating "statistics"
DELETE FROM low_contention_stats WHERE summarized;
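The whole freeze/summarize/delete cycle might look like the following --
again a sketch with sqlite3 standing in for PostgreSQL and made-up sample
rows; in the real thing all three steps would run in one transaction
against the production tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE low_contention_stats (
    date TEXT, key TEXT, randomizer INTEGER,
    value INTEGER DEFAULT 0, summarized INTEGER DEFAULT 0);
CREATE TABLE statistics (
    date TEXT, key TEXT, value INTEGER DEFAULT 0,
    UNIQUE(date, key));
INSERT INTO low_contention_stats VALUES
    ('2003-09-04', 'virus', 1, 5, 0),
    ('2003-09-04', 'virus', 2, 7, 0);
""")

# 1. Freeze: concurrent updaters now miss these rows (their WHERE clause
#    says "AND NOT summarized") and insert fresh ones instead.
conn.execute("UPDATE low_contention_stats SET summarized = 1")

# 2. Summarize the frozen rows into the statistics table.
frozen = conn.execute(
    "SELECT date, key, sum(value) FROM low_contention_stats "
    "WHERE summarized GROUP BY date, key").fetchall()
for date, key, total in frozen:
    cur = conn.execute(
        "UPDATE statistics SET value = value + ? WHERE date = ? AND key = ?",
        (total, date, key))
    if cur.rowcount == 0:
        conn.execute("INSERT INTO statistics VALUES (?, ?, ?)",
                     (date, key, total))

# 3. Throw the frozen rows away.
conn.execute("DELETE FROM low_contention_stats WHERE summarized")
conn.commit()

virus_total = conn.execute(
    "SELECT value FROM statistics WHERE key = 'virus'").fetchone()[0]
print(virus_total)  # prints 12
```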

Will this help?

I can't easily test it, because I only see the problem under high load,
and the only high-load environment I have access to is a production one. :-(

Regards,

David.

Re: Row locking during UPDATE

From: Tom Lane
"David F. Skoll" <dfs@roaringpenguin.com> writes:
> What appears to be happening is this:

> 1) Sometimes, a whole bunch of clients try updating the same row.  I
> see several postgresql processes marked "UPDATE waiting"

Any process that arrives at the row and finds it already modified by
some concurrent transaction will wait for that concurrent transaction
to complete.

> 2) Because the client processes must complete rather quickly, a supervisor
> process kills them if they don't complete soon.

Zapping clients that are in the middle of database operations is bad
design IMHO.

> 3) The postgresql processes in an "UPDATE waiting" state seem to be
> blocked waiting for a semaphore, and they do not notice the closing of
> the connection when the client is killed.

That's correct, a backend will generally not notice client disconnect
until it next waits for a client command.  It's not totally clear why
you've got so many processes waiting to update the same row, though.
Which process does have the row lock, and why isn't it completing its
transaction?

            regards, tom lane

Re: Row locking during UPDATE

From: "David F. Skoll"
On Thu, 4 Sep 2003, Tom Lane wrote:

> Any process that arrives at the row and finds it already modified by
> some concurrent transaction will wait for that concurrent transaction
> to complete.

Right.  And it waits on a semaphore, right?  So there's no way to
use select() to wait for EITHER the semaphore OR the loss of the connection?
I hate SysV IPC. :-(

> Zapping clients that are in the middle of database operations is bad
> design IMHO.

It's required.  The clients are e-mail filters and they must reply
quickly, before the end of the SMTP transaction.  If they take too long,
they must be killed so the SMTP transaction can be tempfailed.  If they
are not killed, the SMTP sessions pile up and eventually kill the machine.

> That's correct, a backend will generally not notice client disconnect
> until it next waits for a client command.  It's not totally clear why
> you've got so many processes waiting to update the same row, though.

It's on a high-volume mail server that receives around 500K
messages/day.  About 180,000 of those are viruses, so we often have
multiple processes trying to update the virus statistics row.

> Which process does have the row lock, and why isn't it completing its
> transaction?

I don't know the details of PostgreSQL's implementation, but it seems
that when lots of processes are waiting to update the same row, it
gets incredibly slow.

--
David.

Re: Row locking during UPDATE

From: Tom Lane
"David F. Skoll" <dfs@roaringpenguin.com> writes:
>> Which process does have the row lock, and why isn't it completing its
>> transaction?

> I don't know the details of PostgreSQL's implementation, but it seems
> that when lots of processes are waiting to update the same row, it
> gets incredibly slow.

Hmm.  That might represent a fixable bug.  How many is "lots"?  Can you
give a self-contained test case?

            regards, tom lane

Re: Row locking during UPDATE

From: Sam Barnett-Cormack
On Thu, 4 Sep 2003, David F. Skoll wrote:

> > Zapping clients that are in the middle of database operations is bad
> > design IMHO.
>
> It's required.  The clients are e-mail filters and they must reply
> quickly, before the end of the SMTP transaction.  If they take too long,
> they must be killed so the SMTP transaction can be tempfailed.  If they
> are not killed, the SMTP sessions pile up and eventually kill the machine.

It might be worth racking your brains to think of other ways. Query
timeouts?

> > That's correct, a backend will generally not notice client disconnect
> > until it next waits for a client command.  It's not totally clear why
> > you've got so many processes waiting to update the same row, though.
>
> It's on a high-volume mail server that receives around 500K
> messages/day.  About 180,000 of those are viruses, so we often have
> multiple processes trying to update the virus statistics row.
>
> > Which process does have the row lock, and why isn't it completing its
> > transaction?
>
> I don't know the details of PostgreSQL's implementation, but it seems
> that when lots of processes are waiting to update the same row, it
> gets incredibly slow.

Having everything try to update the same row seems like a bad idea in
general. Instead, why not insert a new record for each event, and have a
cronjob update the statistics from those once a day? It will be more
efficient overall. It could even be run hourly.

--

Sam Barnett-Cormack
Software Developer                           |  Student of Physics & Maths
UK Mirror Service (http://www.mirror.ac.uk)  |  Lancaster University

Re: Row locking during UPDATE

From: "David F. Skoll"
On Thu, 4 Sep 2003, Sam Barnett-Cormack wrote:

> It might be worth racking your brains to think of other ways. Query
> timeouts?

Either way, if the back-end is waiting on a semaphore, will it time
out the query and terminate?  The problem is lots of waiting back-end
processes.

> why not make it store a new record for each instance, and have a cronjob
> each day update the statistics from that. It will be more efficient,
> overall. It can be done hourly, even.

I posted a compromise solution: a "low-contention" table that reduces
the likelihood of contention.  Adding a new record per event could
result in pretty large tables that need to be summarised.  I'll play
around with the low-contention table first.

Regards,

David.

Re: Row locking during UPDATE

From: Andrew Sullivan
I didn't see any response to this.  Sorry if this is already stale
for you.

On Thu, Sep 04, 2003 at 08:21:17AM -0400, David F. Skoll wrote:
> Questions:
>
> 1) Am I on the right track?  Is this how PostgreSQL works?

More or less, yes.  The significant part here is that the postmaster
won't notice that the client is gone until it returns from the work
it was trying to do.  It'll eventually come back, but it'll take some
time.  How low does your contention need to be?

> A statistics update looks like this:
>
> UPDATE low_contention_stats SET value = value + 1
>     WHERE date = current_date AND key = 'KEY'
>     AND randomizer = my_random_val AND not summarized;
> (If the update updates 0 rows, we do an INSERT)
>
> The summary process when we copy to statistics looks like this:

If you're going to summarise anyway, why not just always insert into
a "holding" table, and then periodically (infrequently, though)
select out of there and summarise at that point.  Note that if you do
this very frequently, and you have also to select the summary data,
it won't work (as I have learned from painful experience) because
the holding table will gradually build up a lot of dead tuples.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: Row locking during UPDATE

From: "David F. Skoll"
On Fri, 12 Sep 2003, Andrew Sullivan wrote:

> More or less, yes.  The significant part here is that the postmaster
> won't notice that the client is gone until it returns from the work
> it was trying to do.  It'll eventually come back, but it'll take some
> time.  How low does your contention need to be?

Low, low, low... near real-time response is required.

> If you're going to summarise anyway, why not just always insert into
> a "holding" table, and then periodically (infrequently, though)
> select out of there and summarise at that point.

This is the solution I picked (thanks to Tom Lane) and it seems to work
well.

> Note that if you do
> this very frequently, and you have also to select the summary data,
> it won't work (as I have learned from painful experience) because
> the holding table will gradually build up a lot of dead tuples.

That doesn't seem to be a problem; after the summary, I do a
VACUUM and the holding table seems to shrink down nicely.

Regards,

David.

Re: Row locking during UPDATE

From: Andrew Sullivan
On Fri, Sep 12, 2003 at 09:53:47AM -0400, David F. Skoll wrote:
> > Note that if you do
> > this very frequently, and you have also to select the summary data,
> > it won't work (as I have learned from painful experience) because
> > the holding table will gradually build up a lot of dead tuples.
>
> That doesn't seem to be a problem; after the summary, I do a
> VACUUM and the holding table seems to shrink down nicely.

You apparently don't have the contention that we did.  Are your
transactions short?  We had a problem with this sort of design, but I
think it was because we had some transactions which ran long and
depended on the summarised results.  But if it works for you, great.

A
