Thread: MVCC cons

MVCC cons

From
RPK
Date:
I want to know whether MVCC has cons also. Is it heavy on resources? How
PGSQL MVCC relates with SQL Server 2005 new Snapshot Isolation.
--
View this message in context: http://www.nabble.com/MVCC-cons-tf4268841.html#a12149505
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: MVCC cons

From
Kenneth Downs
Date:
RPK wrote:
> I want to know whether MVCC has cons also. Is it heavy on resources? How
> PGSQL MVCC relates with SQL Server 2005 new Snapshot Isolation.
>

Speaking as an end-user, I can give only one I've ever seen, which is
performance.  Because of MVCC, Postgres's write performance (insert and
update) appears on my systems to be almost exactly linear to row size.
Inserting 1000 rows into a table with row size 100 characters takes
twice as long as inserting 1000 rows into a table with row size 50
characters.

This tends to be more of an issue for me because my system materializes
derived information in columns, so my tables are more fat to begin with,
and so this hits me harder.

--
Kenneth Downs
Secure Data Software, Inc.
www.secdat.com    www.andromeda-project.org
631-689-7200   Fax: 631-689-0527
cell: 631-379-0010


Re: MVCC cons

From
"Scott Marlowe"
Date:
On 8/14/07, RPK <rpk.general@gmail.com> wrote:
>
> I want to know whether MVCC has cons also. Is it heavy on resources? How
> PGSQL MVCC relates with SQL Server 2005 new Snapshot Isolation.

Of course it does.  There ain't no such thing as a free lunch, after all.

PostgreSQL's mvcc implementation means that a row that gets updated a
lot may have many dead rows in the database, and if you don't run
vacuum often enough, or have enough space allocated in your free space
map, your tables can become bloated.

In a worst case scenario, a highly updated table may get so big that
normal vacuuming cannot salvage it and you would have to either
reindex or perform a vacuum full on it.

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

               http://archives.postgresql.org/

Re: MVCC cons

From
Tom Lane
Date:
Kenneth Downs <ken@secdat.com> writes:
> Speaking as an end-user, I can give only one I've ever seen, which is
> performance.  Because of MVCC, Postgres's write performance (insert and
> update) appears on my systems to be almost exactly linear to row size.
> Inserting 1000 rows into a table with row size 100 characters takes
> twice as long as inserting 1000 rows into a table with row size 50
> characters.

Not sure why you'd think that's specific to MVCC.  It sounds like purely
an issue of disk write bandwidth.

            regards, tom lane

Re: MVCC cons

From
Kenneth Downs
Date:
Tom Lane wrote:
Kenneth Downs <ken@secdat.com> writes: 
Speaking as an end-user, I can give only one I've ever seen, which is 
performance.  Because of MVCC, Postgres's write performance (insert and 
update) appears on my systems to be almost exactly linear to row size.  
Inserting 1000 rows into a table with row size 100 characters takes 
twice as long as inserting 1000 rows into a table with row size 50 
characters.   
Not sure why you'd think that's specific to MVCC.  It sounds like purely
an issue of disk write bandwidth.
		regards, tom lane 

I did not see this in MS SQL Server.

-- 
Kenneth Downs
Secure Data Software, Inc.
www.secdat.com    www.andromeda-project.org
631-689-7200   Fax: 631-689-0527
cell: 631-379-0010

Re: MVCC cons

From
Ron Johnson
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08/14/07 14:34, Kenneth Downs wrote:
> Tom Lane wrote:
>> Kenneth Downs <ken@secdat.com> writes:
>>
>>> Speaking as an end-user, I can give only one I've ever seen, which is
>>> performance.  Because of MVCC, Postgres's write performance (insert
>>> and update) appears on my systems to be almost exactly linear to row
>>> size.  Inserting 1000 rows into a table with row size 100 characters
>>> takes twice as long as inserting 1000 rows into a table with row size
>>> 50 characters.
>>>
>>
>> Not sure why you'd think that's specific to MVCC.  It sounds like purely
>> an issue of disk write bandwidth.
>>
>>             regards, tom lane
>>
>
> I did not see this in MS SQL Server.

It is only logical that it will take 2x as long to insert 2x as much
data.

Maybe SQL Server is compressing out white space?  Or (shudder)
heavily caching writes?

- --
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGwhXTS9HxQb37XmcRAmdTAJ4rpK60hNtcvT82gCD4RG4EPtcC2wCeNR/C
poURsgchjku2UC0y476KOfM=
=KVNY
-----END PGP SIGNATURE-----

Re: MVCC cons

From
"paul rivers"
Date:
> On 08/14/07 14:34, Kenneth Downs wrote:
> > Tom Lane wrote:
> >> Kenneth Downs <ken@secdat.com> writes:
> >>
> >>> Speaking as an end-user, I can give only one I've ever seen, which is
> >>> performance.  Because of MVCC, Postgres's write performance (insert
> >>> and update) appears on my systems to be almost exactly linear to row
> >>> size.  Inserting 1000 rows into a table with row size 100 characters
> >>> takes twice as long as inserting 1000 rows into a table with row size
> >>> 50 characters.
> >>>
> >>
> >> Not sure why you'd think that's specific to MVCC.  It sounds like
> purely
> >> an issue of disk write bandwidth.
> >>
> >>             regards, tom lane
> >>
> >
> > I did not see this in MS SQL Server.
>
> It is only logical that it will take 2x as long to insert 2x as much
> data.
>
> Maybe SQL Server is compressing out white space?  Or (shudder)
> heavily caching writes?


There's no SQL Server magic.  It doesn't compress whitespace or cache writes
in any scary way.  Doubling with row width does double the insert time.


On SQL Server 2000 sp4:

Setup via:

create database test_db
use test_db
create table t50 ( f1 char(50) )
create table t100 ( f1 char(100) )

Test 1:

declare @start datetime
select @start = getdate()
begin transaction
insert into t50 ( f1 ) values (
'01234567890123456789012345678901234567890123456789' );
-- Repeat above insert 1000 times
commit transaction
select datediff(ms, @start, getdate())

Test 2:
declare @start datetime
select @start = getdate()
begin transaction
insert into t50 ( f1 ) values (
'012345678901234567890123456789012345678901234567890123456789012345678901234
5678901234567890123456789' );
-- Repeat above insert 1000 times
commit transaction
select datediff(ms, @start, getdate())


On my system, test one averages around 16ms over 100 tests.  Test 2 averages
around 33ms over 100 tests.

I would wager my week's coffee change the same outcome on SQL 2005 sp2.

Paul






Re: MVCC cons

From
Gregory Stark
Date:
"Kenneth Downs" <ken@secdat.com> writes:

> RPK wrote:
>> I want to know whether MVCC has cons also. Is it heavy on resources? How
>> PGSQL MVCC relates with SQL Server 2005 new Snapshot Isolation.

Well the fundamental con of MVCC versus serializing everything using locks is
that you have to keep more data around which takes space and i/o resources.

> Speaking as an end-user, I can give only one I've ever seen, which is
> performance.  Because of MVCC, Postgres's write performance (insert and update)
> appears on my systems to be almost exactly linear to row size.  Inserting 1000
> rows into a table with row size 100 characters takes twice as long as inserting
> 1000 rows into a table with row size 50 characters.

Well, that's going to be true (assuming i/o is your limiting factor at all)
regardless of your architecture. There's no magic here, more bytes take more
blocks of space which take more time to write or read.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

Re: MVCC cons

From
"Merlin Moncure"
Date:
On 8/14/07, Kenneth Downs <ken@secdat.com> wrote:
> RPK wrote:
> > I want to know whether MVCC has cons also. Is it heavy on resources? How
> > PGSQL MVCC relates with SQL Server 2005 new Snapshot Isolation.
> >
>
> Speaking as an end-user, I can give only one I've ever seen, which is
> performance.  Because of MVCC, Postgres's write performance (insert and
> update) appears on my systems to be almost exactly linear to row size.
> Inserting 1000 rows into a table with row size 100 characters takes
> twice as long as inserting 1000 rows into a table with row size 50
> characters.

You were half right.  Inserts in PostgreSQL perform similar to other
databases (or at least, use similar mechanisms).  It's the updates
that suffer, because this translates to delete + insert essentially.
Databases that use simple locking strategies can simply update the
record in place.

PostgreSQL wins in terms of better concurrency (especially in long
transactions or transactions that touch a lot of records), cheap
rollbacks, and all the advantages of a sophisticated locking engine
(transactional ddl for example).

merlin

Re: MVCC cons

From
Ron Johnson
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08/14/07 20:36, Merlin Moncure wrote:
[snip]
>
> PostgreSQL wins in terms of better concurrency (especially in long
> transactions or transactions that touch a lot of records), cheap
> rollbacks, and all the advantages of a sophisticated locking engine
> (transactional ddl for example).

Although MVCC is not a /sine qua non/ of transactional ddl.

- --
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGwl4GS9HxQb37XmcRAhm5AJ9pSZR08Vj5aOYtMnQdQaPsjtPMOQCghP+v
k81CxBZyH/42wHjPeT0Azvc=
=WRv0
-----END PGP SIGNATURE-----

Re: MVCC cons

From
Jeff Davis
Date:
On Wed, 2007-08-15 at 07:06 +0530, Merlin Moncure wrote:
> You were half right.  Inserts in PostgreSQL perform similar to other
> databases (or at least, use similar mechanisms).  It's the updates
> that suffer, because this translates to delete + insert essentially.
> Databases that use simple locking strategies can simply update the
> record in place.

I think in some databases that use locking, an INSERT can actually block
a SELECT, and vice-versa. So wouldn't that mean PostgreSQL MVCC is
better for INSERT performance?

Regards,
    Jeff Davis


Re: MVCC cons

From
"Scott Marlowe"
Date:
On 8/15/07, Jeff Davis <pgsql@j-davis.com> wrote:
> On Wed, 2007-08-15 at 07:06 +0530, Merlin Moncure wrote:
> > You were half right.  Inserts in PostgreSQL perform similar to other
> > databases (or at least, use similar mechanisms).  It's the updates
> > that suffer, because this translates to delete + insert essentially.
> > Databases that use simple locking strategies can simply update the
> > record in place.
>
> I think in some databases that use locking, an INSERT can actually block
> a SELECT, and vice-versa. So wouldn't that mean PostgreSQL MVCC is
> better for INSERT performance?

For certain values of better, yes.

For inserting 10,000,000 rows, it's probably not as fast as some other
databases.  But, you can insert those 10,000,000 rows while 100 users
run select statements and none of them will block.  So, even if it
takes twice as long as a table locking db, that's still probably ok,
because the readers can keep right on reading.

Re: MVCC cons

From
"Scott Marlowe"
Date:
On 8/14/07, RPK <rpk.general@gmail.com> wrote:
>
> I want to know whether MVCC has cons also. Is it heavy on resources? How
> PGSQL MVCC relates with SQL Server 2005 new Snapshot Isolation.

Of course it does.  There ain't no such thing as a free lunch, after all.

PostgreSQL's mvcc implementation means that a row that gets updated a
lot may have many dead rows in the database, and if you don't run
vacuum often enough, or have enough space allocated in your free space
map, your tables can become bloated.

In a worst case scenario, a highly updated table may get so big that
normal vacuuming cannot salvage it and you would have to either
reindex or perform a vacuum full on it.

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

               http://archives.postgresql.org/