Thread: row reuse while UPDATE and vacuum analyze problem
Hi,

While testing DBI logging to Postgres I've run into a serious problem with
performance degradation during updates. In my case I get 15-16 req/sec for
the first 1000 updates, which drops to 1-2 req/sec after 20000 updates. This
is quite unusable even for a medium Web site.

As Tom Lane noticed, an update is just an insert, so if a table has only one
row which is updated by several processes, its size will grow until 'vacuum
analyze'. Indeed 'vacuum analyze' helps a lot, but the index file isn't
affected; it stays big! After 20190 updates and several 'vacuum analyze' runs:

    -rw-------   1 postgres  users  1810432 Jul 28 14:22 hits
    -rw-------   1 postgres  users  1368064 Jul 28 14:22 hits_pkey

    om:/usr/local/pgsql/data/base/discovery$ psql discovery -c 'select count(*) from hits'
    count
    -----
    10000
    (1 row)

    om:/usr/local/pgsql/data/base/discovery$ psql discovery -c 'select sum(count) from hits'
      sum
    -----
    20190
    (1 row)

I inserted 10,000 rows into table hits just to test how the number of rows
affects performance while 2 rows are updated. I didn't notice any difference.
After 'vacuum analyze':

    om:/usr/local/pgsql/data/base/discovery$ l hits*
    -rw-------   1 postgres  users   606208 Jul 28 14:27 hits
    -rw-------   1 postgres  users  1368064 Jul 28 14:27 hits_pkey
    om:/usr/local/pgsql/data/base/discovery$

The index file isn't touched; actually the modification date changes, but the
size remains big.

How could update performance be increased?
1. 'vacuum analyze' should also shrink the index file
2. rows should be reused instead of always inserting new ones

I found in the TODO only:

    * Allow row re-use without vacuum (Vadim)

My site isn't in production yet, so I'd like to know whether there is some
chance the update problem will be solved. I think this is a rather general
problem, and many Web developers would appreciate a solution, just as Jan's
feature patch for LIMIT and the great new MVCC feature inspired many people
to use Postgres in real applications.

	Regards,
		Oleg

PS. For those who are interested, here is my handler for logging accumulated
hits into Postgres.

In httpd.conf:

    PerlModule Apache::HitsDBI0
    <Location /db/pubs.html>
      PerlCleanupHandler Apache::HitsDBI0
    </Location>

Table schema:

    create table hits (
      msg_id       int4 not null primary key,
      count        int4 not null,
      first_access datetime default now(),
      last_access  datetime
    );
    -- grant information
    GRANT SELECT ON hits to PUBLIC;
    GRANT INSERT,UPDATE ON hits to httpd;

The handler:

    package Apache::HitsDBI0;

    use strict;
    # preloaded in startup.pl
    use Apache::Constants qw(:common);
    #use DBI ();

    sub handler {
      my $orig = shift;
      if ( $orig->args() =~ /msg_id=(\d+)/ ) {
        my $msg_id = $1;   # capture before another call can clobber $1
        my $dbh = DBI->connect("dbi:Pg:dbname=discovery") || die DBI->errstr;
        $dbh->{AutoCommit} = 0;
        $dbh->do("LOCK TABLE hits IN SHARE ROW EXCLUSIVE MODE")
          || die $dbh->errstr;
        my $rows_affected = $dbh->do(
          "update hits set count=count+1, last_access=now() where msg_id=$msg_id"
        ) || die $dbh->errstr;
        ## postgres specific !!!
        $dbh->do("Insert Into hits (msg_id,count) values ($msg_id, 1)")
          if ($rows_affected eq '0E0');
        my $rc = $dbh->commit || die $dbh->errstr;
      }
      return OK;
    }

    1;
    __END__

_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
On Wed, 28 Jul 1999, Oleg Bartunov wrote:

> How could update performance be increased?
> 1. 'vacuum analyze' should also shrink the index file
> 2. rows should be reused instead of always inserting new ones

Just to clarify, 'row reuse' won't replace inserting (to the best of my
knowledge), only reduce space wastage between vacuums. Especially, again
TTBOMK, with MVCC, where each "instance" of a row is serialized.

Actually, there is a thought... if I understand the concept of MVCC, how is
reusing a row going to work? My understanding is that I can "physically" have
two copies of a row in a table, one newer than the other. So if someone is
running a SELECT while I'm doing an UPDATE, their SELECT will take the older
version of the row (the row as it was when their SELECT started). Depending
on how busy that table is, there will have to be some sort of mechanism for
determining how 'stale' a row is, no?

I.e. on a *very* large table, with multiple SELECTs/UPDATEs happening?

Marc G. Fournier                  ICQ#7615664             IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org          secondary: scrappy@{freebsd|postgresql}.org
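To make the question above concrete, here is a minimal, self-contained C
sketch of the idea (the struct names and the grossly simplified "already
committed" test are invented for illustration; this is not PostgreSQL's
actual visibility code). Each physical version of a row records which
transaction created it and which one replaced it, and a reader's snapshot
picks whichever version was current when the snapshot was taken:

    /* visibility_sketch.c -- toy illustration, not PostgreSQL code */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    typedef uint32_t Xid;                 /* transaction id */

    /* Hypothetical per-version header, loosely modelled on xmin/xmax. */
    typedef struct {
        Xid xmin;                         /* transaction that created it   */
        Xid xmax;                         /* transaction that replaced it, */
    } TupleVersion;                       /* 0 = still the current version */

    /* Grossly simplified snapshot: assume every xid below 'xid_horizon'
     * had already committed when the snapshot was taken. */
    typedef struct {
        Xid xid_horizon;
    } Snapshot;

    static bool committed_for(Xid xid, const Snapshot *s)
    {
        return xid != 0 && xid < s->xid_horizon;   /* simplification */
    }

    /* A version is visible if its creator committed before the snapshot
     * was taken and its deleter (if any) did not. */
    static bool visible(const TupleVersion *t, const Snapshot *s)
    {
        return committed_for(t->xmin, s) && !committed_for(t->xmax, s);
    }

    int main(void)
    {
        /* Two physical copies of the same logical row: xid 200 updated it. */
        TupleVersion old_version = { .xmin = 100, .xmax = 200 };
        TupleVersion new_version = { .xmin = 200, .xmax = 0   };

        Snapshot before_update = { .xid_horizon = 150 };  /* early SELECT */
        Snapshot after_update  = { .xid_horizon = 300 };  /* later SELECT */

        printf("old reader sees old=%d new=%d\n",
               visible(&old_version, &before_update),
               visible(&new_version, &before_update));    /* prints 1 0 */
        printf("new reader sees old=%d new=%d\n",
               visible(&old_version, &after_update),
               visible(&new_version, &after_update));     /* prints 0 1 */
        return 0;
    }

The point relevant to row reuse is that the old version cannot simply be
overwritten in place while any snapshot old enough to need it is still
running.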
At 09:00 28/07/99 -0300, you wrote:
>Actually, there is a thought... if I understand the concept of MVCC, how is
>reusing a row going to work? My understanding is that I can "physically"
>have two copies of a row in a table, one newer than the other. So if
>someone is running a SELECT while I'm doing an UPDATE, their SELECT will
>take the older version of the row (the row as it was when their SELECT
>started). Depending on how busy that table is, there will have to be some
>sort of mechanism for determining how 'stale' a row is, no?
>
>I.e. on a *very* large table, with multiple SELECTs/UPDATEs happening?

I presume that's part of MVCC: if PGSQL has a transaction id, then you only
need to keep copies of rows back to the earliest started active transaction.
In fact, you only really need to keep enough versions to satisfy all
currently active transactions (i.e. you don't need *all* intermediate
versions). You must also keep a copy of a row prior to the current writer's
update (until they commit).

There was talk a while back about doing a 'background vacuum' -- did that
talk go anywhere? Reusing space is probably the only way to solve the
infinite-growth problem.

----------------------------------------------------------------
Philip Warner
Albatross Consulting Pty. Ltd. (A.C.N. 008 659 498)
Tel: +61-03-5367 7422    Fax: +61-03-5367 7430
http://www.rhyme.com.au
PGP key available upon request, and from pgp5.ai.mit.edu:11371
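Philip's "keep copies back to the earliest started active transaction" rule
can be sketched as computing a cutoff: the oldest snapshot horizon among the
currently active transactions. The names below (ActiveTxns, oldest_horizon)
are hypothetical, purely for illustration, and the bookkeeping is reduced to
a fixed-size array:

    /* oldest_xmin_sketch.c -- toy illustration, not PostgreSQL code.
     * A dead row version only needs to survive until no active
     * transaction could still want to see it; one way to decide that is
     * to find the oldest snapshot horizon among active transactions. */
    #include <stdint.h>
    #include <stdio.h>

    typedef uint32_t Xid;

    /* Hypothetical registry of active transactions and the horizon each
     * of them recorded when its snapshot was taken. */
    typedef struct {
        Xid horizons[64];
        int count;
    } ActiveTxns;

    static Xid oldest_horizon(const ActiveTxns *a, Xid next_xid)
    {
        Xid oldest = next_xid;     /* nobody active: keep nothing extra */
        for (int i = 0; i < a->count; i++)
            if (a->horizons[i] < oldest)
                oldest = a->horizons[i];
        return oldest;
    }

    int main(void)
    {
        ActiveTxns a = { .horizons = { 480, 455, 472 }, .count = 3 };
        Xid cutoff = oldest_horizon(&a, 500);

        /* A version whose deleting transaction committed before 'cutoff'
         * is invisible to every running and future transaction. */
        printf("versions deleted before xid %u are reclaimable\n",
               (unsigned) cutoff);
        return 0;
    }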
On Wed, 28 Jul 1999, The Hermit Hacker wrote:

> Actually, there is a thought... if I understand the concept of MVCC, how is
> reusing a row going to work? My understanding is that I can "physically"
> have two copies of a row in a table, one newer than the other. So if
> someone is running a SELECT while I'm doing an UPDATE, their SELECT will
> take the older version of the row (the row as it was when their SELECT
> started). Depending on how busy that table is, there will have to be some
> sort of mechanism for determining how 'stale' a row is, no?
>
> I.e. on a *very* large table, with multiple SELECTs/UPDATEs happening?

This is what I noticed when I started my testing about a week ago: I got
duplicates, because multiple concurrent processes were trying inserts and
updates. After LOCK TABLE hits IN SHARE ROW EXCLUSIVE MODE all those problems
were gone, and performance only slowed down a little. But after many updates
performance degrades very badly, IMO because the table and index files grow
even when I update the same row, and even 'vacuum analyze' doesn't reduce the
size of the index file. In principle I could live with a cron job running
vacuumdb every hour, but vacuum doesn't touch indices.

I hope I'll meet Vadim in Moscow and we'll discuss MVCC aspects in our native
Russian :-)

	Regards,
		Oleg

_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
After thinking about the row reuse problem for a while, this is the best I
could come up with. I don't think it is well enough thought out to be worth
implementing yet, but maybe the ideas are a step in the right direction.

A partial solution to row reuse is to allow writing commands (commands with
write locks on a buffer) to purge expendable tuples as they find them. When a
transaction starts, it could look at the snapshots of concurrent transactions
and compute a tide-mark equal to the oldest transaction ID in all the
concurrent snapshots. Any deleted tuple in a heap relation with a committed
tmax value that is smaller than the tide-mark is invisible to any of the
concurrent or future transactions, and is expendable. The write command that
discovers the expendable tuple can then delete it and reclaim the tuple space.

Unfortunately, this solves only half the problem. Deleting the tuple means
that other scans no longer have to read and then discard it, but the space
that the tuple used is not really reclaimed, because insertions always add to
the end of the heap. If all the tuples in a table are of fixed size, then one
solution would be to keep a list of empty tuple slots in the heap and insert
new tuples into these slots first. This would allow inserts to keep the table
well packed.

In the case of tables with variable-length tuples, the problem seems harder.
Since the actual tuples in a table are always accessed through ItemIds, it is
possible for a process with a write lock on a buffer to rearrange the tuples
in the page to remove free space without affecting concurrent processes'
views of the data within the buffer. After freeing the tuple and compacting
the space in the page, the process would have to update the free space list
by removing any previous pointer to space on the page, and then re-inserting
a pointer to the new space on the page. The free space list becomes quite a
bit more complicated in this case, as it has to keep track of the sizes of
the free space segments, and it needs to be indexed both by the block in
which the free space resides and by the size of the space available. This
would seem to indicate the need for both a tree structure and a hash
structure associated with the free space list.

Comments anyone?

Bernie
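As a rough illustration of the free-space list Bernard describes for
variable-length tuples, here is a toy C sketch (the structure and function
names are invented, and a linear scan stands in for the tree and hash indexes
he suggests) showing the two lookups such a list has to support: by block,
when a writer frees space on a page, and by size, when an inserter needs a
page with enough room:

    /* freelist_sketch.c -- toy model of the proposed free-space list,
     * not actual PostgreSQL code.  Each entry records how many free
     * bytes a heap page would have after compaction. */
    #include <stdint.h>
    #include <stdio.h>

    #define MAX_PAGES 8
    #define NO_PAGE   (-1)

    typedef struct {
        uint32_t block;        /* heap block number */
        uint32_t free_bytes;   /* free space after compacting the page */
    } FreeEntry;

    typedef struct {
        FreeEntry entries[MAX_PAGES];
        int       count;
    } FreeList;

    /* "Indexed by block": find the entry for a given page. */
    static int find_by_block(const FreeList *fl, uint32_t block)
    {
        for (int i = 0; i < fl->count; i++)
            if (fl->entries[i].block == block)
                return i;
        return NO_PAGE;
    }

    /* "Indexed by size": find a page with at least 'need' free bytes. */
    static int find_by_size(const FreeList *fl, uint32_t need)
    {
        for (int i = 0; i < fl->count; i++)
            if (fl->entries[i].free_bytes >= need)
                return i;
        return NO_PAGE;
    }

    /* A writer freed 'gained' bytes on 'block': update or add its entry. */
    static void note_freed(FreeList *fl, uint32_t block, uint32_t gained)
    {
        int i = find_by_block(fl, block);
        if (i == NO_PAGE && fl->count < MAX_PAGES) {
            i = fl->count++;
            fl->entries[i].block = block;
            fl->entries[i].free_bytes = 0;
        }
        if (i != NO_PAGE)
            fl->entries[i].free_bytes += gained;
    }

    int main(void)
    {
        FreeList fl = { .count = 0 };

        note_freed(&fl, 7, 120);   /* an UPDATE purged a dead version */
        note_freed(&fl, 3, 40);

        int target = find_by_size(&fl, 100);  /* room for a 100-byte tuple? */
        if (target != NO_PAGE) {
            printf("insert into block %u (%u bytes free)\n",
                   (unsigned) fl.entries[target].block,
                   (unsigned) fl.entries[target].free_bytes);
            fl.entries[target].free_bytes -= 100;
        }
        return 0;
    }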
> Actually, there is a thought... if I understand the concept of MVCC, how is
> reusing a row going to work? My understanding is that I can "physically"
> have two copies of a row in a table, one newer than the other. So if
> someone is running a SELECT while I'm doing an UPDATE, their SELECT will
> take the older version of the row (the row as it was when their SELECT
> started). Depending on how busy that table is, there will have to be some
> sort of mechanism for determining how 'stale' a row is, no?
>
> I.e. on a *very* large table, with multiple SELECTs/UPDATEs happening?

You would have to leave referenced rows alone. I think Vadim has this
covered.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
At 15:54 28/07/99 +0000, Bernard Frankpitt wrote:
>A partial solution to row reuse is to allow writing commands (commands
>with write locks on a buffer) to purge expendable tuples as they find
>them. When a transaction starts, it could look at the snapshots of
>concurrent transactions and compute a tide-mark equal to the oldest
>transaction ID in all the concurrent snapshots. Any deleted tuple in a
>heap relation with a committed tmax value that is smaller than the
>tide-mark is invisible to any of the concurrent or future transactions,
>and is expendable. The write command that discovers the expendable tuple
>can then delete it and reclaim the tuple space.

Is there any way that the space can be freed as soon as it is no longer
needed? I'm not sure how the MVCC stuff works, but I assume that when a R/O
TX starts, a lock is taken out on the tables (and/or rows) that are read. If
a R/W TX updates a record, then a new version is written with a new TID. Can
the writer also mark the old version of the row for deletion, so that when
the last reader commits, the row is deleted? I have no idea if this would be
more or less efficient than making the writers do it, but it would have the
advantage of distributing the load across more TXs.

>Unfortunately, this solves only half the problem. Deleting the tuple
>means that other scans no longer have to read and then discard it, but
>the space that the tuple used is not really reclaimed, because insertions
>always add to the end of the heap. If all the tuples in a table are of
>fixed size, then one solution would be to keep a list of empty tuple
>slots in the heap and insert new tuples into these slots first. This
>would allow inserts to keep the table well packed.

This makes sense, but I would guess that fixed-length tuples are not common.

>In the case of tables with variable-length tuples, the problem seems
>harder. [...] This would seem to indicate the need for both a tree
>structure and a hash structure associated with the free space list.

I'm not sure I follow this, and I also don't know anything about the
internals of the data storage code, but... Using my favorite database as a
model, an alternative might be to:

(1) Record the free space on a page-by-page (or any kind of
    chunk-by-chunk) basis.
(2) Don't bother doing any rearrangement when a record is deleted; just
    mark the record as free and update the bytes-free count for the page.
(3) When you want to store a record, look for any page that has enough
    bytes free, then do any space rearrangement necessary to store the
    record.

AFAICT, most of the above steps (with the exception of page-fullness
tracking) must already be performed. This will of course fail totally if
records are allowed to freely cross page boundaries.
Please forgive me if this is way off the mark... and (if you have the
patience) explain what I've missed.

----------------------------------------------------------------
Philip Warner
Albatross Consulting Pty. Ltd. (A.C.N. 008 659 498)
Tel: +61-03-5367 7422    Fax: +61-03-5367 7430
http://www.rhyme.com.au
PGP key available upon request, and from pgp5.ai.mit.edu:11371
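For comparison with the free-space-list sketch earlier in the thread, here is
a toy sketch of the page-by-page accounting Philip proposes in the message
above (again with invented names, not based on PostgreSQL's storage code):
deletes only increment a per-page free-byte counter, and compaction happens
lazily when an insert actually needs contiguous space.

    /* page_fullness_sketch.c -- toy model of page-by-page free-space
     * tracking with lazy compaction; illustration only. */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define NPAGES 4

    typedef struct {
        uint32_t bytes_free;        /* total reclaimable bytes, holes included */
        uint32_t contiguous_free;   /* free bytes at the end of the page */
    } PageInfo;

    static PageInfo pages[NPAGES];

    /* Step 2: deleting a record just marks its space free on its page. */
    static void delete_record(uint32_t page, uint32_t len)
    {
        pages[page].bytes_free += len;      /* no rearrangement here */
    }

    /* Step 3: to store a record, find any page with enough free bytes,
     * compacting it first if the free space is fragmented. */
    static bool store_record(uint32_t len, uint32_t *page_out)
    {
        for (uint32_t p = 0; p < NPAGES; p++) {
            if (pages[p].bytes_free < len)
                continue;
            if (pages[p].contiguous_free < len) {
                /* simulate sliding the live tuples together */
                pages[p].contiguous_free = pages[p].bytes_free;
            }
            pages[p].bytes_free      -= len;
            pages[p].contiguous_free -= len;
            *page_out = p;
            return true;
        }
        return false;                        /* would extend the heap instead */
    }

    int main(void)
    {
        for (uint32_t p = 0; p < NPAGES; p++)
            pages[p] = (PageInfo){ .bytes_free = 0, .contiguous_free = 0 };

        pages[2] = (PageInfo){ .bytes_free = 60, .contiguous_free = 60 };
        delete_record(2, 90);                /* old row version purged */

        uint32_t where;
        if (store_record(120, &where))       /* forces compaction of page 2 */
            printf("stored 120 bytes on page %u\n", (unsigned) where);
        return 0;
    }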