Thread: MVCC cons
I want to know whether MVCC has cons also. Is it heavy on resources? How PGSQL MVCC relates with SQL Server 2005 new Snapshot Isolation. -- View this message in context: http://www.nabble.com/MVCC-cons-tf4268841.html#a12149505 Sent from the PostgreSQL - general mailing list archive at Nabble.com.
RPK wrote: > I want to know whether MVCC has cons also. Is it heavy on resources? How > PGSQL MVCC relates with SQL Server 2005 new Snapshot Isolation. > Speaking as an end-user, I can give only one I've ever seen, which is performance. Because of MVCC, Postgres's write performance (insert and update) appears on my systems to be almost exactly linear to row size. Inserting 1000 rows into a table with row size 100 characters takes twice as long as inserting 1000 rows into a table with row size 50 characters. This tends to be more of an issue for me because my system materializes derived information in columns, so my tables are more fat to begin with, and so this hits me harder. -- Kenneth Downs Secure Data Software, Inc. www.secdat.com www.andromeda-project.org 631-689-7200 Fax: 631-689-0527 cell: 631-379-0010
On 8/14/07, RPK <rpk.general@gmail.com> wrote: > > I want to know whether MVCC has cons also. Is it heavy on resources? How > PGSQL MVCC relates with SQL Server 2005 new Snapshot Isolation. Of course it does. There ain't no such thing as a free lunch, after all. PostgreSQL's mvcc implementation means that a row that gets updated a lot may have many dead rows in the database, and if you don't run vacuum often enough, or have enough space allocated in your free space map, your tables can become bloated. In a worst case scenario, a highly updated table may get so big that normal vacuuming cannot salvage it and you would have to either reindex or perform a vacuum full on it. ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Kenneth Downs <ken@secdat.com> writes: > Speaking as an end-user, I can give only one I've ever seen, which is > performance. Because of MVCC, Postgres's write performance (insert and > update) appears on my systems to be almost exactly linear to row size. > Inserting 1000 rows into a table with row size 100 characters takes > twice as long as inserting 1000 rows into a table with row size 50 > characters. Not sure why you'd think that's specific to MVCC. It sounds like purely an issue of disk write bandwidth. regards, tom lane
Tom Lane wrote:
I did not see this in MS SQL Server.
Kenneth Downs <ken@secdat.com> writes:Speaking as an end-user, I can give only one I've ever seen, which is performance. Because of MVCC, Postgres's write performance (insert and update) appears on my systems to be almost exactly linear to row size. Inserting 1000 rows into a table with row size 100 characters takes twice as long as inserting 1000 rows into a table with row size 50 characters.Not sure why you'd think that's specific to MVCC. It sounds like purely an issue of disk write bandwidth. regards, tom lane
I did not see this in MS SQL Server.
-- Kenneth Downs Secure Data Software, Inc. www.secdat.com www.andromeda-project.org 631-689-7200 Fax: 631-689-0527 cell: 631-379-0010
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 08/14/07 14:34, Kenneth Downs wrote: > Tom Lane wrote: >> Kenneth Downs <ken@secdat.com> writes: >> >>> Speaking as an end-user, I can give only one I've ever seen, which is >>> performance. Because of MVCC, Postgres's write performance (insert >>> and update) appears on my systems to be almost exactly linear to row >>> size. Inserting 1000 rows into a table with row size 100 characters >>> takes twice as long as inserting 1000 rows into a table with row size >>> 50 characters. >>> >> >> Not sure why you'd think that's specific to MVCC. It sounds like purely >> an issue of disk write bandwidth. >> >> regards, tom lane >> > > I did not see this in MS SQL Server. It is only logical that it will take 2x as long to insert 2x as much data. Maybe SQL Server is compressing out white space? Or (shudder) heavily caching writes? - -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFGwhXTS9HxQb37XmcRAmdTAJ4rpK60hNtcvT82gCD4RG4EPtcC2wCeNR/C poURsgchjku2UC0y476KOfM= =KVNY -----END PGP SIGNATURE-----
> On 08/14/07 14:34, Kenneth Downs wrote: > > Tom Lane wrote: > >> Kenneth Downs <ken@secdat.com> writes: > >> > >>> Speaking as an end-user, I can give only one I've ever seen, which is > >>> performance. Because of MVCC, Postgres's write performance (insert > >>> and update) appears on my systems to be almost exactly linear to row > >>> size. Inserting 1000 rows into a table with row size 100 characters > >>> takes twice as long as inserting 1000 rows into a table with row size > >>> 50 characters. > >>> > >> > >> Not sure why you'd think that's specific to MVCC. It sounds like > purely > >> an issue of disk write bandwidth. > >> > >> regards, tom lane > >> > > > > I did not see this in MS SQL Server. > > It is only logical that it will take 2x as long to insert 2x as much > data. > > Maybe SQL Server is compressing out white space? Or (shudder) > heavily caching writes? There's no SQL Server magic. It doesn't compress whitespace or cache writes in any scary way. Doubling with row width does double the insert time. On SQL Server 2000 sp4: Setup via: create database test_db use test_db create table t50 ( f1 char(50) ) create table t100 ( f1 char(100) ) Test 1: declare @start datetime select @start = getdate() begin transaction insert into t50 ( f1 ) values ( '01234567890123456789012345678901234567890123456789' ); -- Repeat above insert 1000 times commit transaction select datediff(ms, @start, getdate()) Test 2: declare @start datetime select @start = getdate() begin transaction insert into t50 ( f1 ) values ( '012345678901234567890123456789012345678901234567890123456789012345678901234 5678901234567890123456789' ); -- Repeat above insert 1000 times commit transaction select datediff(ms, @start, getdate()) On my system, test one averages around 16ms over 100 tests. Test 2 averages around 33ms over 100 tests. I would wager my week's coffee change the same outcome on SQL 2005 sp2. Paul
"Kenneth Downs" <ken@secdat.com> writes: > RPK wrote: >> I want to know whether MVCC has cons also. Is it heavy on resources? How >> PGSQL MVCC relates with SQL Server 2005 new Snapshot Isolation. Well the fundamental con of MVCC versus serializing everything using locks is that you have to keep more data around which takes space and i/o resources. > Speaking as an end-user, I can give only one I've ever seen, which is > performance. Because of MVCC, Postgres's write performance (insert and update) > appears on my systems to be almost exactly linear to row size. Inserting 1000 > rows into a table with row size 100 characters takes twice as long as inserting > 1000 rows into a table with row size 50 characters. Well, that's going to be true (assuming i/o is your limiting factor at all) regardless of your architecture. There's no magic here, more bytes take more blocks of space which take more time to write or read. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
On 8/14/07, Kenneth Downs <ken@secdat.com> wrote: > RPK wrote: > > I want to know whether MVCC has cons also. Is it heavy on resources? How > > PGSQL MVCC relates with SQL Server 2005 new Snapshot Isolation. > > > > Speaking as an end-user, I can give only one I've ever seen, which is > performance. Because of MVCC, Postgres's write performance (insert and > update) appears on my systems to be almost exactly linear to row size. > Inserting 1000 rows into a table with row size 100 characters takes > twice as long as inserting 1000 rows into a table with row size 50 > characters. You were half right. Inserts in PostgreSQL perform similar to other databases (or at least, use similar mechanisms). It's the updates that suffer, because this translates to delete + insert essentially. Databases that use simple locking strategies can simply update the record in place. PostgreSQL wins in terms of better concurrency (especially in long transactions or transactions that touch a lot of records), cheap rollbacks, and all the advantages of a sophisticated locking engine (transactional ddl for example). merlin
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 08/14/07 20:36, Merlin Moncure wrote: [snip] > > PostgreSQL wins in terms of better concurrency (especially in long > transactions or transactions that touch a lot of records), cheap > rollbacks, and all the advantages of a sophisticated locking engine > (transactional ddl for example). Although MVCC is not a /sine qua non/ of transactional ddl. - -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFGwl4GS9HxQb37XmcRAhm5AJ9pSZR08Vj5aOYtMnQdQaPsjtPMOQCghP+v k81CxBZyH/42wHjPeT0Azvc= =WRv0 -----END PGP SIGNATURE-----
On Wed, 2007-08-15 at 07:06 +0530, Merlin Moncure wrote: > You were half right. Inserts in PostgreSQL perform similar to other > databases (or at least, use similar mechanisms). It's the updates > that suffer, because this translates to delete + insert essentially. > Databases that use simple locking strategies can simply update the > record in place. I think in some databases that use locking, an INSERT can actually block a SELECT, and vice-versa. So wouldn't that mean PostgreSQL MVCC is better for INSERT performance? Regards, Jeff Davis
On 8/15/07, Jeff Davis <pgsql@j-davis.com> wrote: > On Wed, 2007-08-15 at 07:06 +0530, Merlin Moncure wrote: > > You were half right. Inserts in PostgreSQL perform similar to other > > databases (or at least, use similar mechanisms). It's the updates > > that suffer, because this translates to delete + insert essentially. > > Databases that use simple locking strategies can simply update the > > record in place. > > I think in some databases that use locking, an INSERT can actually block > a SELECT, and vice-versa. So wouldn't that mean PostgreSQL MVCC is > better for INSERT performance? For certain values of better, yes. For inserting 10,000,000 rows, it's probably not as fast as some other databases. But, you can insert those 10,000,000 rows while 100 users run select statements and none of them will block. So, even if it takes twice as long as a table locking db, that's still probably ok, because the readers can keep right on reading.
On 8/14/07, RPK <rpk.general@gmail.com> wrote: > > I want to know whether MVCC has cons also. Is it heavy on resources? How > PGSQL MVCC relates with SQL Server 2005 new Snapshot Isolation. Of course it does. There ain't no such thing as a free lunch, after all. PostgreSQL's mvcc implementation means that a row that gets updated a lot may have many dead rows in the database, and if you don't run vacuum often enough, or have enough space allocated in your free space map, your tables can become bloated. In a worst case scenario, a highly updated table may get so big that normal vacuuming cannot salvage it and you would have to either reindex or perform a vacuum full on it. ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org/