Re: questions about PG update performance - Mailing list pgsql-hackers

From José Luis Tallón
Subject Re: questions about PG update performance
Date
Msg-id 562E158C.8050201@adv-solutions.net
In response to Re: questions about PG update performance  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On 10/26/2015 05:49 AM, Amit Kapila wrote:
On Mon, Oct 26, 2015 at 9:03 AM, Любен Каравелов <karavelov@mail.bg> wrote:
>
>
> ----- Quote from Kisung Kim (kskim@bitnine.co.kr), on 26.10.2015 at 04:36 -----
>
> > However, what I want to know is about the update performance difference
> > between PG and Oracle, if there is any.
> > The case I described is for exaggerating the difference between PG and
> > Oracle.
> >
> > I want to explain for our clients that PG's update performance is
> > comparable to Oracle's.
> >
>
> Oracle is also using MVCC but copies the old row in the rollback segment and
> rewrites the values in-place.

I think Oracle just copies the changed part of the old row to the rollback segment.
Also, in the Redo logs, it just writes the changed column value (both old and
new).  So for the case we are discussing in this thread (one changed
column out of 200 columns), Oracle will just write the old value of that
column in Redo and then in the rollback segment, and write the new value
in Redo and then do the in-place update on the heap row.

IMV, where Oracle is heavily optimized for "most DML transactions will commit successfully" and "no long-running transactions shall ever exist" / "not many transactions will have to read previous snapshots" (based on prior images, PIs), Postgres does not actually make any such assumptions.

    Hence, for long-running transactions / massive concurrency with many clients reading and writing older snapshots, Postgres will be faster (less work to do compared to re-constructing rows from PIs).

    Plus, for updates where the size of the NEW row is bigger than the previous one (think appending text), the overhead is actually greater for Oracle (and they don't compress variable-length values by default / there is no TOAST)... so here Postgres would be faster.
    For text-intensive workloads, Postgres is measurably faster than Oracle mostly due to this fact (plus it is much more efficient in its use of storage/RAM...).
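
To make the compression point concrete, here is a minimal sketch (table and column names invented purely for illustration) showing that a large, repetitive text value is stored far smaller than its logical length thanks to TOAST/pglz compression:

    CREATE TABLE toast_demo (id int, body text);

    INSERT INTO toast_demo
    SELECT 1, repeat('PostgreSQL compresses this. ', 10000);

    -- octet_length() reports the logical size of the value;
    -- pg_column_size() reports the stored (compressed/TOASTed) size.
    SELECT octet_length(body) AS logical_bytes,
           pg_column_size(body) AS stored_bytes
    FROM toast_demo;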


In PostgreSQL, the whole new row is written to the heap, and a diff tuple (the
difference between the old and new tuples; this optimization was done in 9.4,
commit a3115f0d) is written to WAL.  I think we can try to optimize and use the
same technique for the heap as is used for WAL to make PostgreSQL more efficient
for such scenarios; however, as of today, my guess is that PostgreSQL's update
would be lagging in this area.

Yup. But see above for a potential reason why it might not be that bad, especially after the optimization you mention.
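
For what it's worth, a rough way to observe this from psql on a 9.4 server (table and column names are made up, and the figure also includes any concurrent WAL activity, so treat it only as an approximation):

    CREATE TABLE wide_row (id int PRIMARY KEY, c1 text, c2 text, c3 text);
    INSERT INTO wide_row
    VALUES (1, repeat('a', 500), repeat('b', 500), repeat('c', 500));

    -- Warm-up update so a post-checkpoint full-page image does not
    -- dominate the measurement below.
    UPDATE wide_row SET c1 = c1 WHERE id = 1;

    SELECT pg_current_xlog_insert_location() AS before_lsn \gset
    UPDATE wide_row SET c2 = repeat('z', 500) WHERE id = 1;
    SELECT pg_current_xlog_insert_location() AS after_lsn \gset

    -- Approximate bytes of WAL generated by the single-column UPDATE.
    SELECT pg_xlog_location_diff(:'after_lsn', :'before_lsn') AS wal_bytes;

(If I read the 9.4 change correctly, the WAL delta is only used when the old and new tuple versions end up on the same heap page.)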

> It is still 2 writes as in Postgres.

The difference is in the amount of data written per write.

Yes, but compressed (for varlena-based data), batched (group commit) and therefore mostly sequential, and not duplicated (a single WAL vs Oracle's REDO+UNDO).

So I guess the difference is quite small nowadays, and differences will be heavily influenced by actual workload.
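
As an aside on the group-commit point: the batching behaviour can be tuned. A hypothetical sketch (values purely illustrative, not a recommendation):

    -- commit_delay makes a committing backend wait (in microseconds) so
    -- that other concurrent commits can share a single WAL flush; the wait
    -- only kicks in when at least commit_siblings transactions are open.
    ALTER SYSTEM SET commit_delay = 1000;
    ALTER SYSTEM SET commit_siblings = 5;
    SELECT pg_reload_conf();

    SHOW commit_delay;
    SHOW commit_siblings;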


Just my 2 (euro-) cents.


    / J.L.
