Thread: A not so good comparison of MVCC implementations
https://dzone.com/articles/database-design-decisions-for-multi-version-concur

That doesn't make Postgres look particularly good
Greetings,

* Thomas Kellerer (spam_eater@gmx.net) wrote:
> https://dzone.com/articles/database-design-decisions-for-multi-version-concur
>
> That doesn't make Postgres look particularly good

While interesting, if I'm following the paper correctly, they didn't actually test *Postgres*, they tested their own implementation of how PG works using "Peloton". They also, apparently, discounted latency pretty heavily given that their graph shows their "PG" implementation having the lowest latency of all of the options.

If my reading is correct and they didn't actually test these systems but just their own implementation then it strikes me that this paper and those graphs are particularly disingenuous and throw around these product names specifically to try and garner attention.

The findings in the paper may still be useful, of course, but it's unclear how much real-world implication they have for users of the different products and if one product would work better for a given user or workload than another.

One thing mentioned is the idea, again, of having indexes which include the primary key of the table (a logical ID instead of the physical tuple location), which has been discussed before and for which patches have been proposed. That seemed to be combined with the idea of flipping HOT chains to have the latest version first instead of last in the chain. Using logical IDs instead of physical ones can reduce the updates required for indexes on tables which have more indexes than just the primary key and where the primary key only rarely changes, since changing the primary key then becomes more expensive. Of course, that also means that double-lookups are required when using those non-primary-key indexes, which may explain the higher latency seen in the approaches tested which use that.

This was just a quick review of the paper and article, just to be clear, but it doesn't strike me as particularly concerning. Unsurprisingly, there are lots of trade-offs to be made and we continue to look at ways to make PostgreSQL more flexible to allow users to choose which trade-offs work best for their workload.

Thanks!

Stephen
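The index trade-off described above (physical tuple locations versus logical IDs in secondary indexes) can be illustrated with a toy model. This is only a sketch under stated assumptions, not PostgreSQL or Peloton code: the class names, the `index_writes` counter, and the probe counts are all hypothetical, every indexed column is assumed to hold unique values, and HOT, visibility checks, and vacuum are ignored entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Heap:
    """Append-only store of row versions; list position stands in for physical location."""
    versions: list = field(default_factory=list)

    def append(self, row: dict) -> int:
        self.versions.append(row)
        return len(self.versions) - 1


class PhysicalPointerTable:
    """Toy model: every index entry stores the physical location of a row version."""

    def __init__(self, indexed_columns):
        self.heap = Heap()
        self.indexes = {col: {} for col in indexed_columns}
        self.index_writes = 0

    def write(self, row):
        loc = self.heap.append(row)
        # The new version has a new location, so every index gets a new entry
        # (this sketch deliberately ignores the HOT optimization).
        for col, index in self.indexes.items():
            index[row[col]] = loc
            self.index_writes += 1

    def lookup(self, col, value):
        loc = self.indexes[col][value]  # single index probe
        return self.heap.versions[loc], 1


class LogicalPointerTable:
    """Toy model: secondary indexes store the primary key, not a location."""

    def __init__(self, pk, secondary_columns):
        self.heap = Heap()
        self.pk = pk
        self.primary = {}  # primary key -> location of latest version
        self.secondary = {col: {} for col in secondary_columns}
        self.index_writes = 0

    def write(self, row):
        key = row[self.pk]
        old_loc = self.primary.get(key)
        old = self.heap.versions[old_loc] if old_loc is not None else None
        self.primary[key] = self.heap.append(row)
        self.index_writes += 1
        # A secondary index is touched only if its own column changed.
        for col, index in self.secondary.items():
            if old is None or old[col] != row[col]:
                if old is not None:
                    index.pop(old[col], None)
                index[row[col]] = key
                self.index_writes += 1

    def lookup(self, col, value):
        key = self.secondary[col][value]  # probe 1: secondary index -> primary key
        loc = self.primary[key]           # probe 2: primary index -> location
        return self.heap.versions[loc], 2


def demo():
    """Insert one row, update a non-indexed column, then look up by email."""
    v1 = {"id": 1, "email": "a@example.com", "name": "Ann", "balance": 10}
    v2 = {**v1, "balance": 20}

    physical = PhysicalPointerTable(["id", "email", "name"])
    logical = LogicalPointerTable("id", ["email", "name"])
    for table in (physical, logical):
        table.write(v1)
        table.write(v2)

    _, physical_probes = physical.lookup("email", "a@example.com")
    _, logical_probes = logical.lookup("email", "a@example.com")
    return physical.index_writes, logical.index_writes, physical_probes, logical_probes
```

In this model, updating a non-indexed column costs the physical-pointer table one write per index, while the logical-pointer table only rewrites its primary index entry. The price is the second probe on every secondary-index lookup, which is the double-lookup mentioned above.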
On Fri, Jan 26, 2018 at 7:22 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Thomas Kellerer (spam_eater@gmx.net) wrote:
>> https://dzone.com/articles/database-design-decisions-for-multi-version-concur
>>
>> That doesn't make Postgres look particularly good
>
> While interesting, if I'm following the paper correctly, they didn't
> actually test *Postgres*, they tested their own implementation of how PG
> works using "Peloton".

Yeah, that's really deceptive.

> They also, apparently, discounted latency pretty
> heavily given that their graph shows their "PG" implementation having
> the lowest latency of all of the options.

Also, they seem to be comparing against PostgreSQL with SSI running (transaction isolation level serializable), which is not actually the way that people typically configure PostgreSQL.

The point of the article seems to be to say that NuoDB made some good design decisions, rather than to objectively compare existing systems.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
> On Jan 26, 2018, at 10:08, Robert Haas <robertmhaas@gmail.com> wrote:
>
> The point of the article seems to be to say that NuoDB made some good
> design decisions, rather than to objectively compare existing systems.

It does remind me a bit of the Uber paper, in that they started with a technical decision they had already made, and worked backwards from there.
Hi,

> On Jan 26, 2018, at 1:08 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Fri, Jan 26, 2018 at 7:22 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> * Thomas Kellerer (spam_eater@gmx.net) wrote:
>>> https://dzone.com/articles/database-design-decisions-for-multi-version-concur
>>>
>>> That doesn't make Postgres look particularly good
>>
>> While interesting, if I'm following the paper correctly, they didn't
>> actually test *Postgres*, they tested their own implementation of how PG
>> works using "Peloton".
>
> Yeah, that's really deceptive.

Skimming the paper, it also does not mention which versions of the software are being used. It would also be great to see how the DBs were configured on the hardware, but that may be asking too much.

>> They also, apparently, discounted latency pretty
>> heavily given that their graph shows their "PG" implementation having
>> the lowest latency of all of the options.
>
> Also, they seem to be comparing against PostgreSQL with SSI running
> (transaction isolation level serializable) which is not actually the
> way that people typically configure PostgreSQL.
>
> The point of the article seems to be to say that NuoDB made some good
> design decisions, rather than to objectively compare existing systems.

So the question is if and how we respond. From a scan of the Twittersphere I do not see much talk about the paper, so I would not give it that much thought at this point and would not advocate for proactively addressing it.

However, if anyone wants to independently benchmark it and provide some fair comparisons, that is something that we’ve certainly promoted through Planet PostgreSQL. Additionally, if anyone wants to respond to others who are referencing that paper, e.g. on Twitter, there are enough sound points in this thread alone to help make the case for Postgres even without additional data.

> Unsurprisingly,
> there are lots of trade-offs to be made and we continue to look at
> ways to make PostgreSQL more flexible to allow users to choose which
> trade-offs work best for their workload.

+1

Jonathan
On Fri, Jan 26, 2018 at 2:07 PM, Jonathan S. Katz <jkatz@postgresql.org> wrote:
>> Yeah, that's really deceptive.
>
> Skimming the paper, it also does not mention which versions of the software
> are being used. Ideally how the DBs were configured on the hardware
> would be great to see too, but that may be asking too much.

That's because they didn't use *any* version of PostgreSQL. They tested something that they claim works *like* PostgreSQL but is actually not the PostgreSQL code.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi Robert,

> On Jan 26, 2018, at 7:03 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Fri, Jan 26, 2018 at 2:07 PM, Jonathan S. Katz <jkatz@postgresql.org> wrote:
>>> Yeah, that's really deceptive.
>>
>> Skimming the paper, it also does not mention which versions of the software
>> are being used. Ideally how the DBs were configured on the hardware
>> would be great to see too, but that may be asking too much.
>
> That's because they didn't use *any* version of PostgreSQL. They
> tested something that they claim works *like* PostgreSQL but is
> actually not the PostgreSQL code.

To clarify, that comment was based on all the databases they were using, not just PostgreSQL.

Thanks,

Jonathan
On Sun, Jan 28, 2018 at 05:07:43PM -0500, Jonathan S. Katz wrote:
>> On Jan 26, 2018, at 7:03 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>
>> On Fri, Jan 26, 2018 at 2:07 PM, Jonathan S. Katz <jkatz@postgresql.org> wrote:
>>>> Yeah, that's really deceptive.
>>>
>>> Skimming the paper, it also does not mention which versions of the software
>>> are being used. Ideally how the DBs were configured on the hardware
>>> would be great to see too, but that may be asking too much.
>>
>> That's because they didn't use *any* version of PostgreSQL. They
>> tested something that they claim works *like* PostgreSQL but is
>> actually not the PostgreSQL code.
>
> To clarify, that comment was based on all the databases they were using,
> not just PostgreSQL.

Their article never uses the words "configuration" or "configure", and makes no mention of what kind of tuning they did for any of the systems.

--
Michael