Re: A not so good comparison of MVCC implementations - Mailing list pgsql-advocacy

From Stephen Frost
Subject Re: A not so good comparison of MVCC implementations
Msg-id 20180126122251.GE2416@tamriel.snowman.net
In response to A not so good comparison of MVCC implementations  (Thomas Kellerer <spam_eater@gmx.net>)
Responses Re: A not so good comparison of MVCC implementations  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-advocacy
Greetings,

* Thomas Kellerer (spam_eater@gmx.net) wrote:
> https://dzone.com/articles/database-design-decisions-for-multi-version-concur
>
> That doesn't make Postgres look particular well

While interesting, if I'm following the paper correctly, they didn't
actually test *Postgres*; they tested their own implementation of how PG
works using "Peloton".  They also, apparently, discounted latency pretty
heavily, given that their graph shows their "PG" implementation having
the lowest latency of all of the options.  If my reading is correct and
they didn't actually test these systems but just their own
implementations, then it strikes me that this paper and those graphs are
particularly disingenuous and throw around these product names
specifically to try to garner attention.  The findings in the paper may
still be useful, of course, but it's unclear how much real-world
implication they have for users of the different products, or whether
one product would work better for a given user or workload than another.

One thing mentioned is the idea, again, of having indexes which include
the primary key of the table (a logical ID instead of the physical tuple
location), which has been discussed before and for which patches have
been proposed.  That seemed to be combined with the idea of flipping HOT
chains to have the latest version first instead of last in the chain.
Using logical IDs instead of physical ones can reduce the updates
required for indexes on tables which have more indexes than just the
primary key and where the primary key only rarely changes, since
changing the primary key ends up becoming more expensive.  Of course,
that also means that double lookups are required when using those
non-primary-key indexes, which may explain the higher latency seen in
the approaches tested which use that.
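
To make that trade-off a bit more concrete, here is a toy sketch in
Python.  It is only an illustration under simplifying assumptions: plain
dicts stand in for the heap and the indexes, the class and column names
are made up, HOT is ignored entirely, and none of this reflects actual
PostgreSQL or Peloton code.  It just counts index writes and index
lookups for the two pointer styles.

```python
# Toy model only: dicts stand in for the heap and for B-tree indexes.
# All names here are hypothetical and do not match any real system.

class _ToyTable:
    def __init__(self):
        self.heap = {}          # physical location -> row version
        self.pk_index = {}      # id -> physical location
        self.email_index = {}   # email -> location OR id (see subclasses)
        self.index_writes = 0   # number of index entries written
        self.index_lookups = 0  # number of index probes on reads

    def _store(self, row):
        # Append a new row version; old versions stay in place (MVCC-style).
        loc = len(self.heap)
        self.heap[loc] = row
        return loc


class PhysicalPointerTable(_ToyTable):
    """Every index entry points at a physical tuple location."""

    def insert(self, row):
        loc = self._store(row)
        self.pk_index[row["id"]] = loc
        self.email_index[row["email"]] = loc
        self.index_writes += 2

    def update_name(self, pk, name):
        old = self.heap[self.pk_index[pk]]
        loc = self._store({**old, "name": name})
        # The new version lives elsewhere, so every index is repointed.
        self.pk_index[pk] = loc
        self.email_index[old["email"]] = loc
        self.index_writes += 2

    def find_by_email(self, email):
        self.index_lookups += 1          # one probe, straight to the heap
        return self.heap[self.email_index[email]]


class LogicalPointerTable(_ToyTable):
    """Secondary index entries hold the primary key, not a location."""

    def insert(self, row):
        loc = self._store(row)
        self.pk_index[row["id"]] = loc
        self.email_index[row["email"]] = row["id"]
        self.index_writes += 2

    def update_name(self, pk, name):
        old = self.heap[self.pk_index[pk]]
        # Only the primary key index needs to learn the new location.
        self.pk_index[pk] = self._store({**old, "name": name})
        self.index_writes += 1

    def find_by_email(self, email):
        pk = self.email_index[email]     # probe 1: secondary index -> id
        loc = self.pk_index[pk]          # probe 2: primary index -> location
        self.index_lookups += 2
        return self.heap[loc]


def demo(cls):
    t = cls()
    t.insert({"id": 1, "email": "a@example.com", "name": "old"})
    t.update_name(1, "new")
    row = t.find_by_email("a@example.com")
    return t.index_writes, t.index_lookups, row["name"]


if __name__ == "__main__":
    print(demo(PhysicalPointerTable))  # (4, 1, 'new')
    print(demo(LogicalPointerTable))   # (3, 2, 'new')
```

With one secondary index the saving is a single index write per update;
it grows with the number of secondary indexes, while every read through
a secondary index pays the extra probe.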

To be clear, this was just a quick review of the paper and article, but
it doesn't strike me as particularly concerning.  Unsurprisingly, there
are lots of trade-offs to be made, and we continue to look at ways to
make PostgreSQL more flexible to allow users to choose which trade-offs
work best for their workload.

Thanks!

Stephen
