Re: Anyone working on better transaction locking? - Mailing list pgsql-hackers

From cbbrowne@cbbrowne.com
Subject Re: Anyone working on better transaction locking?
Date
Msg-id 20030412150051.A7DC559A20@cbbrowne.com
In response to Re: Anyone working on better transaction locking?  ("scott.marlowe" <scott.marlowe@ihs.com>)
Responses Re: Anyone working on better transaction locking?  (Kevin Brown <kevin@sysexperts.com>)
List pgsql-hackers
Scott Marlowe wrote:
> On Wed, 9 Apr 2003, Ron Peacetree wrote:
> 
> > "Andrew Sullivan" <andrew@libertyrms.info> wrote in message
> > news:20030409170926.GH2255@libertyrms.info...
> > > On Wed, Apr 09, 2003 at 05:41:06AM +0000, Ron Peacetree wrote:
> > > Nonsense.  You explicitly made the MVCC comparison with Oracle, and
> > > are asking for a "better" locking mechanism without providing any
> > > evidence that PostgreSQL's is bad.
> > >
> > Just because someone else's is "better" does not mean PostgreSQL's is
> > "bad", and I've never said such.  As I've said, I'll get back to Tom
> > and the list on this.
> 
> But you didn't identify HOW it was better.  I think that's the point 
> being made.

Oh, but he presented such detailed statistics to prove his case, didn't you 
see it?  :-)

> > > > Please see my posts with regards to ...
> > >
> > > I think your other posts were similar to the one which started this
> > > thread: full of mighty big pronouncements which turned out to depend
> > > on a bunch of not-so-tenable assumptions.
> > >
> > Hmmm.  Well, I don't think of algorithm analysis by the likes of
> > Knuth, Sedgewick, Gonnet, and Baeza-Yates as being "not so tenable
> > assumptions", but YMMV.  As for "mighty pronouncements", that also
> > seems a bit misleading since we are talking about quantifiable
> > programming and computer science issues, not unquantifiable things
> > like politics.
> 
> But the real truth is revealed when the rubber hits the pavement.
> Remember that Linus Torvalds was roundly criticized for his choice of a
> monolithic development model for his kernel, and was literally told that
> his choice would restrict it to "toy" status and that no commercial OS could
> scale with a monolithic kernel.

Indeed.  I have the books from all of the above (when I studied databases
under Gonnet, Baeza-Yates was his TA...).  And I have seen enough cases where
a combination of several algorithms did not behave the way a blind read of
their books might suggest that I refuse to blindly assume that things are so
simple.

In the /real/ world, the need to flush buffers to disk for robustness,
combined with having enough memory to virtually eliminate read I/O, can
substantially change the results predicted by some simplistic O(f(n)) analysis.

Which is NOT to say that computational complexity is unimportant; what it
indicates is that theoretical results are merely theoretical, and may
represent only a small part of what happens in practice.  The nonsense about
radix sorts was a wonderful example; it would likely only be useful with
PostgreSQL if you had some fantastical amount of memory that could not
actually be built within the confines of our solar system.
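
To put a rough number on that, here is a back-of-the-envelope sketch in
Python.  It assumes the proposal amounts to a single-pass, direct-address
distribution sort with one counter per possible key value; that reading, the
8-byte counter, and the function name are my assumptions for illustration,
not anything taken from PostgreSQL's sort code.

```python
def direct_address_table_bytes(key_bits: int, counter_bytes: int = 8) -> int:
    """Bytes needed to hold one counter for every possible key value.

    Hypothetical helper: models a single-pass distribution sort that
    indexes a table directly by the full key.
    """
    return (1 << key_bits) * counter_bytes


if __name__ == "__main__":
    # Key widths chosen for illustration only.
    examples = [
        ("16-bit key", 16),
        ("32-bit integer", 32),
        ("64-bit integer", 64),
        ("32-byte text key", 256),
    ]
    for label, bits in examples:
        size = direct_address_table_bytes(bits)
        print(f"{label:>18}: about {size:.3e} bytes")
```

Under those assumptions a 64-bit key already wants about 1.5e20 bytes of
counters, and a 32-byte text key wants about 9.3e77, which is well beyond the
commonly quoted order-of-magnitude estimate of 1e57 atoms in the solar system.
A multi-pass radix sort avoids the giant table, but then it is no longer a
single pass over the data.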

> There's no shortage of people with good ideas, just people with the skills 
> to implement those good ideas.  If you've got a patch to apply that's been 
> tested to show something is faster EVERYONE here wants to see it.
> 
> If you've got a theory, no matter how well backed up by academic research, 
> it's still just a theory.  Until someone writes the code to implement it,
> the gains are theoretical, and many things that MIGHT help don't because 
> of the real world issues underlying your database, like I/O bandwidth or 
> CPU <-> memory bandwidth.

An unfortunate thing (to my mind) is that *genuinely novel* operating system 
research has pretty much disappeared.  All we see, these days, are rehashes of 
VMS, MVS, and Unix, along with some reimplementations of P-Code under monikers 
like "JVM", ".NET" or "Parrot."

There's good reason for it; if you build something that is much less than 95%
indistinguishable from Unix, then you'll be left with the *enormous* projects
of creating completely new infrastructure for compilers, data persistence
("novel" would mean, to my mind, concepts different from files), program
editors, and such.  But if it's 95% the same as Unix, then Emacs, GCC, CVS,
PostgreSQL, and all sorts of other "tool chain" pieces are available to you.

It would be nice to try out some things that are Very Different.
Unfortunately, it might take five years of slogging through recreating
compilers and editors in order to get in about 6 months of "solid novel work."

Of course, if you don't plan to lift your finger to help make any of it 
happen, it's easy enough to "armchair quarterback" and suggest that someone 
else do all sorts of would-be "neat things."

> > > I'm sorry to be so cranky about this, but I get tired of having to
> > > defend one of my employer's core technologies from accusations based
> > > on half-truths and "everybody knows" assumptions.  For instance,
> > >
> > Again, "accusations" is a bit strong.  I thought the discussion was
> > about the technical merits and costs of various features and various
> > ways to implement them, particularly when this product must compete
> > for installed base with other solutions.  Being coldly realistic about
> > what a product's strengths and weaknesses are is, again, just good
> > business.  Sun Tzu's comment about knowing the enemy and yourself
> > seems appropriate here...

> No, you're wrong.  Postgresql doesn't have to compete.  It doesn't have to
> win.  It doesn't need a marketing department.  All those things are nice,
> and I'm glad if it does them, but it doesn't HAVE TO.  Postgresql has to
> work.  It does that well.

Having a bit more of a "marketing department" might be a nice thing; it could 
make it easier for people that would like to deploy PG to get the idea past 
the higher-ups that have a hard time listening to things that *don't* come 
from that department.

> > > > I'll mention thread support in passing,
> > >
> > > there's actually a FAQ item about thread support, because in the
> > > opinion of those who have looked at it, the cost is just not worth
> > > the benefit.  If you have evidence to the contrary (specific
> > > evidence, please, for this application), and have already read all the
> > > previous discussion of the topic, perhaps people would be interested in
> > > opening that debate again (though I have my doubts).
> > >
> > Zeus had a performance ceiling roughly 3x that of Apache when Zeus
> > supported threading as well as pre-forking and Apache only supported
> > pre forking.  The Apache folks now support both.  DB2, Oracle, and SQL
> > Server all use threads.  Etc, etc.
> 
> Yes, and if you configured your apache server to have 20 or 30 spare 
> servers, in the real world, it was nearly neck and neck to Zeus, but since 
> Zeus cost like $3,000 a copy, it is still cheaper to just overwhelm it 
> with more servers running apache than to use zeus.

All quite entertaining.  Andrew was perhaps trolling just a little bit there; 
our resident "algorithm expert" was certainly easily sucked into leaping down 
the path-too-much-trod.  Just as with choices of sorting algorithms, it's easy 
enough for there to be more to things than whatever the latest academic 
propaganda about threading is.

The VITAL point to be made about threading is that there is a tradeoff, and 
it's not the one that "armchair-quarterbacks-that-don't-write-code" likely 
think of.

--> Hand #1:  Implementing a threaded model would require a lot of work, and 
the *ACTUAL* expected benefits are unknown.

--> Hand #2:  So far, other *easier* optimizations have been providing 
significant speedups, requiring much less effort.

At some point in time, "doing threading" might become the strategy expected
to reap the most rewards for the least amount of programmer effort.  Until
that time, it's not worth worrying about it.

> > That's an awful lot of very bright programmers and some serious $$
> > voting that threads are worth it.  
> 
> For THAT application.  For what a web server does, threads can be very
> useful, even useful enough to put up with the problems created by running 
> threads on multiple threading libs on different OSes.  
> 
> Let me ask you, if Zeus scrams and crashes out, and it's installed
> properly so it just comes right back up, how much data can you lose?
> 
> If Postgresql scrams and crashes out, how much data can you lose?

There's another possibility, namely that the "voting" may not have anything to 
do with threading being "best."  Instead, it may be a road to allow the 
largest software houses, that can afford to have enough programmers that can 
"do threading," to crush smaller competitors.  After all, threading offers 
daunting new opportunities for deadlocks, data overruns, and crashes; if only 
those with the most, best thread programmers can compete, that discourages 
others from even /trying/ to compete.
--
output = ("cbbrowne" "@ntlug.org")
http://www3.sympatico.ca/cbbrowne/sgml.html
"I visited  a company  that was doing  programming in BASIC  in Panama
City and I asked them if they resented that the BASIC keywords were in
English.   The answer  was:  ``Do  you resent  that  the keywords  for
control of actions in music are in Italian?''"  -- Kent M Pitman


