row reuse while UPDATE and vacuum analyze problem - Mailing list pgsql-hackers

From Oleg Bartunov
Subject row reuse while UPDATE and vacuum analyze problem
Date
Msg-id Pine.GSO.3.96.SK.990728141058.27569K-100000@ra
Whole thread Raw
Responses Re: [HACKERS] row reuse while UPDATE and vacuum analyze problem
List pgsql-hackers
Hi,

While testing DBI logging to postgres I've run into a serious problem
with performance degradation during updates.
In my case I get 15-16 req/sec for the first 1000 updates, which drops down
to 1-2 req/sec after 20000 updates. This is quite unusable even for a
medium Web site. As Tom Lane noticed, an update is just an insert, so if
a table has only one row which is updated by several processes, its size
will grow until 'vacuum analyze'. Indeed 'vacuum analyze' helps a lot,
but the index file isn't affected; it remains big!

After 20190 updates and several 'vacuum analyze':

-rw-------   1 postgres users     1810432 Jul 28 14:22 hits
-rw-------   1 postgres users     1368064 Jul 28 14:22 hits_pkey
om:/usr/local/pgsql/data/base/discovery$ psql discovery -c 'select count(*) from hits'
count
-----
10000
(1 row)

om:/usr/local/pgsql/data/base/discovery$ psql discovery -c 'select sum(count) from hits'
 sum
-----
20190
(1 row)

I inserted 10,000 rows into the table hits just to test how the number of
rows could affect performance while 2 rows are updated. I didn't notice
any difference.

After 'vacuum analyze':
om:/usr/local/pgsql/data/base/discovery$ l hits*
-rw-------   1 postgres users      606208 Jul 28 14:27 hits
-rw-------   1 postgres users     1368064 Jul 28 14:27 hits_pkey
om:/usr/local/pgsql/data/base/discovery$ 

The index file isn't touched; actually, its modification date changed, but its
size remains big.
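For what it's worth, the index can be shrunk by rebuilding it by hand. This is only a sketch of a workaround, assuming a quiet moment when the index can be dropped briefly (hits_pkey is the index backing the primary key, and recreating it as a unique index should preserve its behavior):

```sql
-- VACUUM ANALYZE compacts the table (heap) but leaves the btree
-- index file at its high-water size.
VACUUM ANALYZE hits;

-- Rebuilding the index by hand reclaims its space.
DROP INDEX hits_pkey;
CREATE UNIQUE INDEX hits_pkey ON hits (msg_id);
```

Of course this blocks access to the table while it runs, so it is only a stopgap until vacuum itself handles indexes.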

Could update performance be increased if:

1. 'vacuum analyze' would also shrink the index file
2. rows were reused instead of inserted

I found in TODO only

* Allow row re-use without vacuum (Vadim)

My site isn't in production yet, so I'd like to know whether there is some
chance the update problem will be solved. I think this is a rather general
problem, and many Web developers would appreciate a solution, just as Jan's
feature patch for LIMIT inspired many people to use postgres in real
applications, as did the great new MVCC feature.
Regards,
    Oleg

PS.

For those who are interested, here is my handler for logging accumulated hits
into postgres:


In httpd.conf:

PerlModule Apache::HitsDBI0

<Location /db/pubs.html>
  PerlCleanupHandler Apache::HitsDBI0
</Location>

Table scheme:

create table hits (
    msg_id       int4 not null primary key,
    count        int4 not null,
    first_access datetime default now(),
    last_access  datetime
);
-- grant information

GRANT SELECT ON hits to PUBLIC;
GRANT INSERT,UPDATE ON hits to httpd;



package Apache::HitsDBI0;

use strict;

# preloaded in startup.pl
use Apache::Constants qw(:common);
#use DBI ();

sub handler {
    my $orig = shift;
    if ( $orig->args() =~ /msg_id=(\d+)/ ) {
        my $dbh = DBI->connect("dbi:Pg:dbname=discovery") || die DBI->errstr;
        $dbh->{AutoCommit} = 0;
        my $sth = $dbh->do("LOCK TABLE hits IN SHARE ROW EXCLUSIVE MODE")
            || die $dbh->errstr;
        my $rows_affected = $dbh->do(
            "update hits set count=count+1, last_access=now() where msg_id=$1"
        ) || die $dbh->errstr;

        ## postgres specific !!!
        $sth = $dbh->do("Insert Into hits (msg_id,count) values ($1, 1)")
            if ($rows_affected eq '0E0');
        my $rc = $dbh->commit || die $dbh->errstr;
    }
    return OK;
}

1;
__END__
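The handler above boils down to a lock-then-update-or-insert pattern. Stripped of the Perl, the per-request session looks roughly like this (msg_id 42 is just an example value):

```sql
BEGIN;
-- Serialize concurrent upserts for the same new msg_id: without the lock,
-- two sessions could both see zero rows updated and both try the INSERT.
LOCK TABLE hits IN SHARE ROW EXCLUSIVE MODE;
UPDATE hits SET count = count + 1, last_access = now() WHERE msg_id = 42;
-- Only if the UPDATE touched no rows (the id is new):
INSERT INTO hits (msg_id, count) VALUES (42, 1);
COMMIT;
```

DBI reports the no-rows case as the string '0E0' from $dbh->do (true as a value, zero as a number), which is what the `eq '0E0'` test in the handler checks.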


_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83




