Thread: update functions locking tables

update functions locking tables

From
Clodoaldo Pinto
Date:
I have almost completed one web site migration from mysql to pgsql. It
is already running totally pgsql.

But there is one last conversion problem. Most of the queries use
tables populated every 3 hours.

In mysql, for the site to not be unavailable during updating i was
creating temporary tables, populating them, dropping the main tables
and then renaming the temp tables to the main tables. The updating is
not trivial (for me) and needs some coding effort to be done.

Since pgsql has MVCC I wanted to eliminate the table rotation step and
use a transaction to update the tables. But what is happening is that
the plpgsql update functions are locking the tables and this is what
the web clients are getting (from ps ax):

26873 ? S 0:18 postgres: user database 127.0.0.1(38946) SELECT waiting
23973 ? S 0:22 postgres: user database 127.0.0.1(43396) SELECT waiting
31212 ? S 0:02 postgres: user database 127.0.0.1(60619) SELECT waiting
21601 ? S 0:00 postgres: user database 127.0.0.1(46770) SELECT waiting

I've been trying for 3 days to figure out what is happening to no
avail. What am i missing about transactions and MVCC? What could make
a plpgsql update function lock a table? The indexes are default btree.
Otherwise the functions are behaving exactly as expected.

Regards, Clodoaldo Pinto

Re: update functions locking tables

From
Tom Lane
Date:
Clodoaldo Pinto <clodoaldo.pinto@gmail.com> writes:
> I've been trying for 3 days to figure out what is happening to no
> avail. What am i missing about transactions and MVCC? What could make
> a plpgsql update function lock a table?

What is the function doing to the table, exactly?  DDL changes generally
take exclusive locks ...

            regards, tom lane

Re: update functions locking tables

From
Michael Fuhr
Date:
On Mon, Aug 29, 2005 at 09:41:21PM -0300, Clodoaldo Pinto wrote:
>
> 26873 ? S 0:18 postgres: user database 127.0.0.1(38946) SELECT waiting
> 23973 ? S 0:22 postgres: user database 127.0.0.1(43396) SELECT waiting
> 31212 ? S 0:02 postgres: user database 127.0.0.1(60619) SELECT waiting
> 21601 ? S 0:00 postgres: user database 127.0.0.1(46770) SELECT waiting
>
> I've been trying for 3 days to figure out what is happening to no
> avail. What am i missing about transactions and MVCC? What could make
> a plpgsql update function lock a table? The indexes are default btree.

In general, writers shouldn't block readers.  Have you examined
pg_locks?  Do you know exactly what the blocked queries are, or can
you find out from pg_stat_activity (stats_command_string must be
enabled)?  Are you doing any explicit locking (LOCK statement)?

--
Michael Fuhr

Re: update functions locking tables

From
Clodoaldo Pinto
Date:
2005/8/29, Michael Fuhr <mike@fuhr.org>:
>
> In general, writers shouldn't block readers.  Have you examined
> pg_locks?  Do you know exactly what the blocked queries are, or can
> you find out from pg_stat_activity (stats_command_string must be
> enabled)?  Are you doing any explicit locking (LOCK statement)?
>

This is one of the blocked queries:
select count (*) from times_producao where pontos_0 - pontos_7 > 0;

These selects were done during the updating:

 select * from pg_locks;
 relation | database | transaction |  pid  |        mode         | granted
----------+----------+-------------+-------+---------------------+---------
  1813938 |  1813868 |             |  7040 | AccessShareLock     | t
  1813938 |  1813868 |             |  7040 | RowExclusiveLock    | t
  1813938 |  1813868 |             |  7040 | ShareLock           | t
  1813938 |  1813868 |             |  7040 | AccessExclusiveLock | t
  1813939 |  1813868 |             |  7040 | AccessShareLock     | t
  1813939 |  1813868 |             |  7040 | RowExclusiveLock    | t
  1813939 |  1813868 |             |  7040 | ShareLock           | t
  1813939 |  1813868 |             |  7040 | AccessExclusiveLock | t
  1813914 |  1813868 |             | 24012 | AccessShareLock     | f
  1813892 |  1813868 |             |  7040 | AccessShareLock     | t
  1813892 |  1813868 |             |  7040 | RowExclusiveLock    | t
  1813914 |  1813868 |             |  7040 | AccessShareLock     | t
  1813914 |  1813868 |             |  7040 | RowExclusiveLock    | t
  1813914 |  1813868 |             |  7040 | ShareLock           | t
  1813914 |  1813868 |             |  7040 | AccessExclusiveLock | t
  1813896 |  1813868 |             |  7040 | AccessShareLock     | t
    16839 |  1813868 |             | 12751 | AccessShareLock     | t
  2314110 |  1813868 |             | 26871 | AccessShareLock     | f
  1813914 |  1813868 |             | 26844 | AccessShareLock     | f
          |          |      288553 | 26844 | ExclusiveLock       | t
          |          |      288561 | 24012 | ExclusiveLock       | t
          |          |      288548 |  7040 | ExclusiveLock       | t
          |          |      288558 | 26871 | ExclusiveLock       | t
  1813914 |  1813868 |             | 31212 | AccessShareLock     | f
  2314110 |  1813868 |             |  7040 | AccessShareLock     | t
  2314110 |  1813868 |             |  7040 | RowExclusiveLock    | t
  2314110 |  1813868 |             |  7040 | ShareLock           | t
  2314110 |  1813868 |             |  7040 | AccessExclusiveLock | t
          |          |      288556 | 31212 | ExclusiveLock       | t
          |          |      288562 | 12751 | ExclusiveLock       | t
  1813887 |  1813868 |             |  7040 | AccessShareLock     | t
  2314112 |  1813868 |             |  7040 | ShareLock           | t
  2314112 |  1813868 |             |  7040 | AccessExclusiveLock | t
  1813907 |  1813868 |             |  7040 | AccessShareLock     | t
  1813911 |  1813868 |             |  7040 | AccessShareLock     | t
(35 rows)

select *
from pg_stat_user_tables as a
inner join pg_locks as b
on a.relid = b.relation
;

  relid  | schemaname |      relname      | seq_scan | seq_tup_read |
idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del |
relation | database | transaction |  pid  |        mode         |
granted

---------+------------+-------------------+----------+--------------+----------+---------------+-----------+-----------+-----------+----------+----------+-------------+-------+---------------------+---------
 1813914 | public     | usuarios_producao |        0 |            0 |
      0 |             0 |         0 |         0 |         0 |  1813914
|  1813868 |        | 24012 | AccessShareLock     | f
 1813892 | public     | last_date         |        0 |            0 |
        |               |         0 |         0 |         0 |  1813892
|  1813868 |        |  7040 | AccessShareLock     | t
 1813892 | public     | last_date         |        0 |            0 |
        |               |         0 |         0 |         0 |  1813892
|  1813868 |        |  7040 | RowExclusiveLock    | t
 1813914 | public     | usuarios_producao |        0 |            0 |
      0 |             0 |         0 |         0 |         0 |  1813914
|  1813868 |        |  7040 | AccessShareLock     | t
 1813914 | public     | usuarios_producao |        0 |            0 |
      0 |             0 |         0 |         0 |         0 |  1813914
|  1813868 |        |  7040 | RowExclusiveLock    | t
 1813914 | public     | usuarios_producao |        0 |            0 |
      0 |             0 |         0 |         0 |         0 |  1813914
|  1813868 |        |  7040 | ShareLock           | t
 1813914 | public     | usuarios_producao |        0 |            0 |
      0 |             0 |         0 |         0 |         0 |  1813914
|  1813868 |        |  7040 | AccessExclusiveLock | t
 1813896 | public     | times             |        0 |            0 |
      0 |             0 |         0 |         0 |         0 |  1813896
|  1813868 |        |  7040 | AccessShareLock     | t
 2314110 | public     | times_producao    |        0 |            0 |
      0 |             0 |         0 |         0 |         0 |  2314110
|  1813868 |        | 31383 | AccessShareLock     | f
 1813914 | public     | usuarios_producao |        0 |            0 |
      0 |             0 |         0 |         0 |         0 |  1813914
|  1813868 |        | 12351 | AccessShareLock     | f
 2314110 | public     | times_producao    |        0 |            0 |
      0 |             0 |         0 |         0 |         0 |  2314110
|  1813868 |        | 26871 | AccessShareLock     | f
 1813914 | public     | usuarios_producao |        0 |            0 |
      0 |             0 |         0 |         0 |         0 |  1813914
|  1813868 |        | 26844 | AccessShareLock     | f
 2314110 | public     | times_producao    |        0 |            0 |
      0 |             0 |         0 |         0 |         0 |  2314110
|  1813868 |        | 24021 | AccessShareLock     | f
 1813914 | public     | usuarios_producao |        0 |            0 |
      0 |             0 |         0 |         0 |         0 |  1813914
|  1813868 |        | 31212 | AccessShareLock     | f
 2314110 | public     | times_producao    |        0 |            0 |
      0 |             0 |         0 |         0 |         0 |  2314110
|  1813868 |        |  7040 | AccessShareLock     | t
 2314110 | public     | times_producao    |        0 |            0 |
      0 |             0 |         0 |         0 |         0 |  2314110
|  1813868 |        |  7040 | RowExclusiveLock    | t
 2314110 | public     | times_producao    |        0 |            0 |
      0 |             0 |         0 |         0 |         0 |  2314110
|  1813868 |        |  7040 | ShareLock           | t
 2314110 | public     | times_producao    |        0 |            0 |
      0 |             0 |         0 |         0 |         0 |  2314110
|  1813868 |        |  7040 | AccessExclusiveLock | t
 1813914 | public     | usuarios_producao |        0 |            0 |
      0 |             0 |         0 |         0 |         0 |  1813914
|  1813868 |        | 26872 | AccessShareLock     | f
 1813887 | public     | datas             |        0 |            0 |
      0 |             0 |         0 |         0 |         0 |  1813887
|  1813868 |        |  7040 | AccessShareLock     | t
 1813914 | public     | usuarios_producao |        0 |            0 |
      0 |             0 |         0 |         0 |         0 |  1813914
|  1813868 |        |  8875 | AccessShareLock     | f
 1813907 | public     | usuarios          |        0 |            0 |
      0 |             0 |         0 |         0 |         0 |  1813907
|  1813868 |        |  7040 | AccessShareLock     | t
 1813911 | public     | usuarios_indice   |        0 |            0 |
      0 |             0 |         0 |         0 |         0 |  1813911
|  1813868 |        |  7040 | AccessShareLock     | t
(23 rows)


I had just enabled stats_command_string and in about 15 hours i will
be able to post pg_stat_activity.

Regards, Clodoaldo Pinto

Re: update functions locking tables

From
Clodoaldo Pinto
Date:
2005/8/29, Tom Lane <tgl@sss.pgh.pa.us>:
>
> What is the function doing to the table, exactly?  DDL changes generally
> take exclusive locks ...

This is the transaction:

begin;
select update_last_date();
truncate times_producao;
select kstime(), insert_times_producao(), kstime();
select kstime(), update_ranking_times(), kstime();
truncate usuarios_producao;
select kstime(), insert_usuarios_producao(), kstime();
analyze usuarios_producao;
select kstime(), update_ranking_usuarios(), kstime();
select kstime(), update_ranking_usuarios_time(), kstime();
select kstime(), update_team_active_members(), kstime();
commit;

This is one of the functions:

CREATE OR REPLACE FUNCTION update_ranking_usuarios()
  RETURNS void AS
$BODY$declare
  linha record;
  rank integer;
begin
rank := 0;
for linha in
  select usuario
    from usuarios_producao
    order by pontos_0 desc, pontos_7 desc, pontos_24 desc
loop
  rank := rank + 1;
  update usuarios_producao
    set rank_0 = rank
    where usuario = linha.usuario
    ;
end loop;
-- ----------------------------------------------------------
rank := 0;
for linha in
  select usuario
    from usuarios_producao
    order by pontos_0 + (pontos_7 / 7) desc, pontos_0 desc
loop
  rank := rank + 1;
  update usuarios_producao
    set rank_24 = rank
    where usuario = linha.usuario
    ;
end loop;
-- ----------------------------------------------------------
rank := 0;
for linha in
  select usuario
    from usuarios_producao
    order by pontos_0 + pontos_7 desc, pontos_0 desc
loop
  rank := rank + 1;
  update usuarios_producao
    set rank_7 = rank
    where usuario = linha.usuario
    ;
end loop;
-- ----------------------------------------------------------
rank := 0;
for linha in
  select usuario
    from usuarios_producao
    order by pontos_0 + (pontos_7 * 30 / 7) desc, pontos_0 desc
loop
  rank := rank + 1;
  update usuarios_producao
    set rank_30 = rank
    where usuario = linha.usuario
    ;
end loop;
return;
end;$BODY$
  LANGUAGE 'plpgsql' VOLATILE STRICT;

There is no DDL inside the functions.

Regards, Clodoaldo Pinto

Re: update functions locking tables

From
Michael Fuhr
Date:
On Tue, Aug 30, 2005 at 08:39:52AM -0300, Clodoaldo Pinto wrote:
>
> begin;
> select update_last_date();
> truncate times_producao;

TRUNCATE acquires an AccessExclusiveLock, which conflicts with all
other lock types.  Locks are held until the transaction completes,
so once this lock is acquired no other transactions will be able
to access the table until this transaction commits or rolls back.

DELETE is slower than TRUNCATE but it won't block readers in other
transactions.

--
Michael Fuhr

Re: update functions locking tables

From
Clodoaldo Pinto
Date:
2005/8/30, Michael Fuhr <mike@fuhr.org>:
>
> TRUNCATE acquires an AccessExclusiveLock, which conflicts with all
> other lock types.  Locks are held until the transaction completes,
> so once this lock is acquired no other transactions will be able
> to access the table until this transaction commits or rolls back.
>
> DELETE is slower than TRUNCATE but it won't block readers in other
> transactions.
>

I think it is of great help. I will change it and let you know what happened.

Regards, Clodoaldo Pinto

Re: update functions locking tables

From
Greg Stark
Date:
Clodoaldo Pinto <clodoaldo.pinto@gmail.com> writes:

> 2005/8/29, Tom Lane <tgl@sss.pgh.pa.us>:
> >
> > What is the function doing to the table, exactly?  DDL changes generally
> > take exclusive locks ...
>
> This is the transaction:
>
> begin;
> select update_last_date();
> truncate times_producao;

I think truncate takes a table lock.
Just change it to "delete from times_producao".

Also, if consider doing a "vacuum full" or "cluster" after the batch job to
clear up the free space (not in a large transaction). That will still take a
table lock but it may be a small enough downtime to be worth the speed
increase the rest of the day.

--
greg

Re: update functions locking tables

From
Clodoaldo Pinto
Date:
30 Aug 2005 09:10:51 -0400, Greg Stark <gsstark@mit.edu>:
>
> I think truncate takes a table lock.
> Just change it to "delete from times_producao".

Thanks, i will try it.

>
> Also, if consider doing a "vacuum full" or "cluster" after the batch job to
> clear up the free space (not in a large transaction). That will still take a
> table lock but it may be a small enough downtime to be worth the speed
> increase the rest of the day.
>

I'm already doing a vacuum (not full) once a day.

A vacuum full or a cluster is totally out of reach since each take
about one hour. The biggest table is 170 million rows long.

Regards, Clodoaldo Pinto

Re: update functions locking tables

From
Alvaro Herrera
Date:
On Tue, Aug 30, 2005 at 10:18:20AM -0300, Clodoaldo Pinto wrote:
> 30 Aug 2005 09:10:51 -0400, Greg Stark <gsstark@mit.edu>:

> > Also, if consider doing a "vacuum full" or "cluster" after the batch job to
> > clear up the free space (not in a large transaction). That will still take a
> > table lock but it may be a small enough downtime to be worth the speed
> > increase the rest of the day.
>
> I'm already doing a vacuum (not full) once a day.
>
> A vacuum full or a cluster is totally out of reach since each take
> about one hour.

Even if you cluster/vacuum only the just-loaded table?

> The biggest table is 170 million rows long.

I hope this is not the one you are loading daily ...

--
Alvaro Herrera <alvherre[]alvh.no-ip.org>      Architect, www.EnterpriseDB.com
"El destino baraja y nosotros jugamos" (A. Schopenhauer)

Re: update functions locking tables

From
Greg Stark
Date:
Clodoaldo Pinto <clodoaldo.pinto@gmail.com> writes:

> I'm already doing a vacuum (not full) once a day.
>
> A vacuum full or a cluster is totally out of reach since each take
> about one hour. The biggest table is 170 million rows long.

Well a regular vacuum will mark the free space for reuse. If you insert or
update any records the new ones will go into those spots. Make sure you set
the fsm_* parameters high enough to cover all the updates and inserts for the
entire day (or repeat the vacuum periodically even if there are no deletes or
updates going on to create more free space).

You should realize that what's going on here is that the old records are still
in your table, marked as deleted. So any sequential scan will take twice as
long as otherwise. I think even index scans could take twice as long too
depending on the distribution of values.

I'm not saying that's untenable. If all your queries are fast enough then
you're set and it's just a cost of having no downtime.

--
greg

Re: update functions locking tables

From
Clodoaldo Pinto
Date:
2005/8/30, Alvaro Herrera <alvherre@alvh.no-ip.org>:
> On Tue, Aug 30, 2005 at 10:18:20AM -0300, Clodoaldo Pinto wrote:
> > A vacuum full or a cluster is totally out of reach since each take
> > about one hour.
>
> Even if you cluster/vacuum only the just-loaded table?
>

No, that would  be much faster. The biggest just updated is about 600
thousand rows. I will consider it.

> > The biggest table is 170 million rows long.
>
> I hope this is not the one you are loading daily ...
>
I load daily 8 times 700+ thousand rows.

Re: update functions locking tables

From
Clodoaldo Pinto
Date:
30 Aug 2005 10:35:31 -0400, Greg Stark <gsstark@mit.edu>:
>
> Well a regular vacuum will mark the free space for reuse. If you insert or
> update any records the new ones will go into those spots. Make sure you set
> the fsm_* parameters high enough to cover all the updates and inserts for the
> entire day (or repeat the vacuum periodically even if there are no deletes or
> updates going on to create more free space).

I will check those fsm_* parameters.

>
> You should realize that what's going on here is that the old records are still
> in your table, marked as deleted. So any sequential scan will take twice as
> long as otherwise. I think even index scans could take twice as long too
> depending on the distribution of values.
>
> I'm not saying that's untenable. If all your queries are fast enough then
> you're set and it's just a cost of having no downtime.
>
> --
> greg
>
>

Re: update functions locking tables

From
"Jim C. Nasby"
Date:
On Tue, Aug 30, 2005 at 08:13:15AM -0300, Clodoaldo Pinto wrote:
> 2005/8/29, Michael Fuhr <mike@fuhr.org>:
> >
> > In general, writers shouldn't block readers.  Have you examined
> > pg_locks?  Do you know exactly what the blocked queries are, or can
> > you find out from pg_stat_activity (stats_command_string must be
> > enabled)?  Are you doing any explicit locking (LOCK statement)?
> >
>
> This is one of the blocked queries:
> select count (*) from times_producao where pontos_0 - pontos_7 > 0;

FWIW, that where clause might be more efficient as
WHERE pontos_0 > pontos_7. Some databases would be able to use indexes
to answer that (not sure if PostgreSQL could), plus it removes an
operator. It also seems to be cleaner code to me. :)
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software        http://pervasive.com        512-569-9461

Re: update functions locking tables

From
Clodoaldo Pinto
Date:
2005/8/30, Jim C. Nasby <jnasby@pervasive.com>:
>
> FWIW, that where clause might be more efficient as
> WHERE pontos_0 > pontos_7. Some databases would be able to use indexes
> to answer that (not sure if PostgreSQL could), plus it removes an
> operator. It also seems to be cleaner code to me. :)
> --

Done, thanks.

Regards, Clodoaldo Pinto