Re: New feature request: FlashBack Query - Mailing list pgsql-hackers

From:           Hannu Krosing
Subject:        Re: New feature request: FlashBack Query
Msg-id:         1171987361.3596.40.camel@localhost.localdomain
In response to: Re: New feature request: FlashBack Query ("Jonah H. Harris" <jonah.harris@gmail.com>)
Responses:      Re: New feature request: FlashBack Query
List:           pgsql-hackers
On Tuesday, 2007-02-20 at 10:20, Jonah H. Harris wrote:
> On 2/20/07, Gregory Stark <stark@enterprisedb.com> wrote:
> > I used to say that too but I've since realized it's not really true.
>
> Heh, take a joke man... I was following up on Drake's email :)
>
> But, since you want to discuss your view of the systems openly... I'll
> gladly reply :)
>
> > It's more like Oracle is optimized for data that's committed
> > long in the past and we're optimized for data that's
> > been recently updated.
>
> Wrong. When Oracle says it's committed, it's committed. No
> difference between when, where, and how. In Oracle, the committed
> version is *always* the first presented to the user... it takes time
> to go back and look at older versions;

Older versions are also committed :) He probably meant longer
transactions and several versions visible to different backends.

> but why shouldn't that be a bit slower, it isn't common practice anyway.

Not for pure OLAP, at least when you have fairly fast transactions.
But it can slow things down when you have some hotspot tables.

> Same with rollbacks... why
> should they optimize for them when 97% of transactions commit?

Or the other way around: you should write code where most transactions
commit ;)

> > In Oracle the data that's been committed long in the past requires no
> > transactional overhead but the data that's been recently updated
> > requires lots of work to fetch the right version.
>
> Wrong. The same transactional overhead applies to *all* data in
> Oracle no matter when it was committed. Similarly, the only
> overhead required occurs when someone is querying in serializable
> isolation or on read-committed data before or during a commit. On
> short OLTP-type transactions, Oracle has the most optimized solution.
>
> > In Postgres it's the other way around. Data that's been committed deleted
> > long ago requires extra work to clean up, but data that's been recently
> > changed requires little additional work to see the correct version.
>
> PostgreSQL has little additional work? Like, checking the validity of
> every tuple? Oracle checks visibility at the block level, so there's
> *much* less overhead.

Hmm. How can it check visibility at block level and at the same time do
in-place updates on single tuples?

> Take most of the benchmarks which can hold ~200
> tuples per block. Tables in those benchmarks are 100+ million rows.
> On a sequential scan, Oracle would perform 500K checks, PostgreSQL
> would perform *all* 100M checks (not counting dead versions due to row
> updates and the like).

My proposal of keeping visibility info in a separate heap would help to
get similar results, that is mostly 1 check per page. That would also
cover much of the index lookup cases below.
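For concreteness, the arithmetic behind those check counts, as a throwaway
Python sketch (the 100M-row / ~200-tuples-per-block figures are just the
assumptions quoted above, not measurements, and this is not PostgreSQL code):

    # Rough illustration of the sequential-scan check counts quoted above.
    rows = 100_000_000          # assumed table size from the example
    tuples_per_block = 200      # assumed tuples per block from the example

    per_tuple_checks = rows                      # one visibility check per tuple
    per_block_checks = rows // tuples_per_block  # one check per block when the
                                                 # whole block is known visible

    print(f"per-tuple checks: {per_tuple_checks:,}")   # 100,000,000
    print(f"per-block checks: {per_block_checks:,}")   # 500,000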
> On an index scan, Oracle not only has a
> smaller index and less to check, but also knows the tuple will be
> committed and will, in most cases, not have to perform additional
> physical I/O to find the latest version of a row.

It is also the reason why you can forget about doing simultaneous data
loading and queries on the same table. If you know to avoid doing that,
then it "will, in most cases, not have to perform additional physical
I/O to find the latest version of a row" ;)

> Of course, Oracle's design is much more complicated in its ability to
> build read-committed versions of the blocks at runtime; something the
> simplicity of PostgreSQL's MVCC design eliminates.
>
> > In a sense then it's the opposite of what we usually say. Oracle is
> > optimized for mostly static data. Postgres is optimized for changing
> > data.
>
> Care to share an example to prove it?
>
> Like always, there are pros and cons with both designs, but denying
> facts gets us nowhere. We're off-topic now... so we should either
> move this off line or to another thread. I personally don't see much
> of a reason to continue discussing MVCC designs anymore as Oracle's is
> patented and PostgreSQL's is highly unlikely to change drastically.

I don't think we will ever move to rollback segments, but for some use
cases moving visibility to a separate heap could make sense.

And if we want to bring back time travel (see another thread about
"Flashback Queries"), then we may end up implementing the original
PostgreSQL design spec and make VACUUM siphon dead tuples over to
archive relations, which already starts looking a little like rollback
segments, only for other purposes :)
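To make the archive-relation idea concrete, here is a toy Python sketch
(invented names throughout, nothing to do with the real on-disk format or
the original design spec): dead row versions get "vacuumed" into an archive,
and a flashback-style read falls back to the archive for versions that are
no longer current.

    live = {}      # key -> (xmin, value); the current version of each row
    archive = []   # (key, xmin, xmax, value); versions retired by "vacuum"

    def update(key, value, xid):
        """Overwrite a key; the old version is moved into the archive."""
        if key in live:
            old_xmin, old_value = live[key]
            archive.append((key, old_xmin, xid, old_value))  # xmax = updating xid
        live[key] = (xid, value)

    def read_as_of(key, xid):
        """Return the value of key as it was visible to transaction xid."""
        xmin, value = live.get(key, (None, None))
        if xmin is not None and xmin <= xid:
            return value
        for k, a_xmin, a_xmax, a_value in archive:
            if k == key and a_xmin <= xid < a_xmax:
                return a_value
        return None

    update("acct", 100, xid=1)
    update("acct", 250, xid=5)
    print(read_as_of("acct", 3))   # 100 (the version that was current before xid 5)
    print(read_as_of("acct", 7))   # 250 (the current version)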
> As always, I'd suggest discussing improvements, not the status quo.
> Likewise, discussing Oracle's design, drawbacks, and limitations
> without having used it extensively is quite obvious to anyone familiar
> with Oracle.

Using a system extensively can also create blind spots about some of the
system's (mis)features. One learns to avoid doing some things without
consciously knowing about it.

> Don't get me wrong, it's fine to prefer one design to
> another, but pushing discussion items comparing Oracle to PostgreSQL
> because of things you've heard or read somewhere isn't the same as
> understanding them because you've used them.

To get a really meaningful discussion we should involve someone who has
*designed* them, not merely used them.

-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com