Re: MVCC performance issue - Mailing list pgsql-performance

From Thom Brown
Subject Re: MVCC performance issue
Msg-id AANLkTinhnXGo4SxdHN1b5twy2rguV+1An-XAeDtbnJFi@mail.gmail.com
In response to MVCC performance issue  ("Kyriacos Kyriacou" <kyriacosk@prime-tel.com>)
List pgsql-performance
On 12 November 2010 13:47, Kyriacos Kyriacou <kyriacosk@prime-tel.com> wrote:
This is my first post to this mailing list, and I would like to raise an
issue that, in my opinion, is causing performance problems for PostgreSQL,
especially in a transaction-processing environment. In my company we have
been using PostgreSQL for the last 8 years for our in-house developed
billing system (telecom). In the last few months we have started
considering moving to another RDBMS solely because of this issue.

After all these years, I believe that the biggest improvement that could
be made, and one that would boost overall performance especially for
enterprise applications, would be to improve the Multiversion Concurrency
Control (MVCC) mechanism. In theory MVCC improves performance for SELECT
queries, but on tables with very intensive and frequent updates even that
is not fully true, because of the fragmentation of data that MVCC causes.
I have seen cases where a SELECT COUNT(*) on an empty (!!!) table (used
as a buffer) took more than 40 minutes to return a result! VACUUM is not
a solution in my opinion, even though the situation has improved
considerably since the introduction of the autovacuum daemon.
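
To make the bloat claim concrete: a nominally "empty" table can still be
full of dead row versions, which is easy to confirm from the statistics
views. A minimal sketch, with a hypothetical table name:

    -- Compare live vs. dead tuples for the "empty" buffer table:
    SELECT n_live_tup, n_dead_tup
      FROM pg_stat_user_tables
     WHERE relname = 'call_buffer';

    -- Its physical size on disk, dead rows included:
    SELECT pg_size_pretty(pg_relation_size('call_buffer'));

    -- Reclaim the dead-row space and report what was done:
    VACUUM VERBOSE call_buffer;

If n_dead_tup is large while n_live_tup is zero, the slow COUNT(*) is
scanning pages full of dead row versions rather than live data.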

PROBLEM DESCRIPTION
-------------------
By definition of MVCC, when an UPDATE is performed, PostgreSQL creates a
new copy of the row in a new location. Any SELECT queries within the
same session access the new version of the row, while all queries from
other users still access the old version. When the transaction COMMITs,
PostgreSQL marks the new version of the row as the "active" one and
expires the old row, which remains "dead"; it is then up to the VACUUM
procedure to recover the space of the "dead" rows and make it available
to the database engine again. If the transaction is rolled back, the
space reserved for the new version of the row is released. The result is
huge fragmentation of the table space, unnecessary updates to all
affected indexes, unnecessary costly I/O operations, poor performance on
SELECTs that retrieve big record sets (i.e. reports etc.), and slower
updates. As an example, consider updating the "live" balance of a
customer on each phone call: the entire customer record has to be
duplicated again and again upon each call, just to modify a single
numeric value!
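
The row movement is easy to observe through the hidden ctid column,
which exposes a row's physical location. A minimal sketch, assuming a
simple customers table (all names hypothetical):

    -- Hypothetical schema for illustration:
    CREATE TABLE customers (id integer PRIMARY KEY, name text, balance numeric);
    INSERT INTO customers VALUES (42, 'ACME', 100.00);

    SELECT ctid FROM customers WHERE id = 42;  -- e.g. (0,1)

    -- Changing one numeric column still writes a whole new row version:
    UPDATE customers SET balance = balance - 0.05 WHERE id = 42;

    SELECT ctid FROM customers WHERE id = 42;  -- e.g. (0,2): a new physical
                                               -- location; (0,1) is now dead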

SUGGESTION
----------
1) When a row UPDATE is performed, store all "new row versions" either
in a separate temporary table space or in a reserved space at the end of
each table (which can be allocated dynamically), etc.
2) Any SELECT queries within the same session will again access the new
version of the row
3) Any SELECT queries from other users will still access the old
version
4) When the UPDATE transaction is rolled back, just release the space
used in the new temporary location
5) When the UPDATE transaction COMMITs, try to LOCK the old version
and overwrite it at the same physical location (NO FRAGMENTATION); see
the sketch after this list for how existing features approximate this
6) A similar mechanism can be applied to INSERTs and DELETEs
7) If the transaction COMMITted, the temporary location can be either
released or archived/cleaned on a pre-scheduled basis. This could
possibly allow the introduction of a TRANSACTION LOG backup mechanism as
a next step.
8) After that, VACUUM would only have to deal with deletions!
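
Two workarounds available in-core today come close to point 5, at least
for the common case; a sketch with hypothetical names, not a substitute
for the proposal above:

    -- Leave 30% of each page free so a new row version can often stay
    -- on the same page; on 8.3+ such heap-only tuple (HOT) updates also
    -- skip index maintenance when no indexed column changes:
    ALTER TABLE customers SET (fillfactor = 70);

    -- Or keep the frequently updated balance in a narrow side table, so
    -- each call re-writes a tiny row instead of the whole customer record:
    CREATE TABLE customer_balances (
        customer_id integer PRIMARY KEY REFERENCES customers(id),
        balance     numeric NOT NULL
    );
    UPDATE customer_balances
       SET balance = balance - 0.05
     WHERE customer_id = 42;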


I understand that my suggestion may seem too simplified, and that there
are many implementation details and difficulties that I am not aware of.

I strongly believe that the outcome of a discussion regarding this issue
will be helpful.


Which version of PostgreSQL are you basing this on?

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935
