Thread: Anyone know why PostgreSQL doesn't support 2 phase execution?

Anyone know why PostgreSQL doesn't support 2 phase execution?

From
"Ron Peacetree"
Date:
...and what kind of performance hit we take (and under what
circumstances) for not having it?



Re: Anyone know why PostgreSQL doesn't support 2 phase execution?

From
Rod Taylor
Date:
On Sat, 2003-04-05 at 11:35, Ron Peacetree wrote:
> ...and what kind of performance hit we take (and under what
> circumstances) for not having it?

Do you mean 2-phase commits?  If so, how do you take a performance hit
from *not* having it?  PostgreSQL doesn't have it (prepare & forget
phases)) simply because nobody has completed and submitted an
implementation.  Satoshi is working on the problem.

If not, what do you mean by 2 phase execution?

--
Rod Taylor <rbt@rbt.ca>

PGP Key: http://www.rbt.ca/rbtpub.asc

Re: Anyone know why PostgreSQL doesn't support 2 phase execution?

From
cbbrowne@cbbrowne.com
Date:
Ron Peacetree wrote
> ...and what kind of performance hit we take (and under what
> circumstances) for not having it?

Are you thinking of "two phase commit"?

There is no "performance hit for not having it."

And it does not currently apply to PostgreSQL.  Two phase commit is only
needed when updates need to be applied simultaneously on multiple
databases.

That is, you'd have something like:

CREATE DISTRIBUTED TRANSACTION X1; insert into table TBL1 in database ('db1', 'id1', 'auth1') values (1, 2, 100.00,
now());insert into table TBL2 in database ('db2', 'id2', 'auth2') values (1, 3, -100.00, now());
 
COMMIT DISTRIBUTED TRANSACTION X1;

where the "in database ('db1', 'id1', 'auth1')" part indicates some form
of connection parameters for the database.

There certainly is merit to having two phase commit; it allows
coordinating updates to multiple databases.

The "degradation of performance" that results from not having this is
that you can't have distributed transactions.  That's not a "performance
hit;" that's a case of "you can't do distributed transactions."  

And distributed transactions are probably /more/ expensive than
nondistributed ones, so it is more readily argued that by not supporting
them, you don't have the problem of performance degrading due to making
use of distributed transactions.
--
output = reverse("gro.gultn@" "enworbbc")
http://www3.sympatico.ca/cbbrowne/nonrdbms.html
Rules of the Evil Overlord #195. "I will not use hostages as bait in a
trap.  Unless  you're going  to use them  for negotiation or  as human
shields, there's no point in taking them."
<http://www.eviloverlord.com/>



Re: Anyone know why PostgreSQL doesn't support 2 phase execution?

From
"Ron Peacetree"
Date:
"Rod Taylor" <rbt@rbt.ca> wrote in message
news:1049735504.40144.35.camel@jester...
> --=-v98gp7DTtZFJa7ee6YMf
> Content-Type: text/plain
> Content-Transfer-Encoding: quoted-printable
>
> On Sat, 2003-04-05 at 11:35, Ron Peacetree wrote:
> > ...and what kind of performance hit we take (and under what
> > circumstances) for not having it?
>
> Do you mean 2-phase commits?  If so, how do you take a performance
hit
> from *not* having it?  PostgreSQL doesn't have it (prepare & forget
> phases)) simply because nobody has completed and submitted an
> implementation.  Satoshi is working on the problem.
>
> If not, what do you mean by 2 phase execution?
>
The performance hit as in "in comparison to DB's that =do= have two
phase execution (and commit for that matter), just how much slower is
PostgreSQL?"

Two phase execution and two phase commit are two different concepts.
Two phase execution splits the execution of queries explicitly into a
"do all the book keeping and setup stuff before execution" phase and
an actual execution phase.  The benefit is that if you are going to
say, step through a largish table in chunks, doing the same query on
each chunk, two phase execution allows the DB to do everything (syntax
checking, query planning, blah, blah) except the actual execution
=once= and reuse it for each subsequent chunk.  Think of it as a close
cousin to loop unrolling.  It also helps parallel performance since
you can hand the "blessed" set up query plan to multiple processes and
those processes can focus on just getting work done.

The lack of two phase =commit= is a also a potential performance hit
in comparison to DB products that have it, but the more important
issue there is that there are SMP/distributed apps that really can't
work acceptably unless a DB product has two phase commit.

The three "biggies" in DB land, SQL Server, Oracle, and DB2, have both
features.  I suspect that PostgreSQL will need to as well...



Re: Anyone know why PostgreSQL doesn't support 2 phase

From
Neil Conway
Date:
On Mon, 2003-04-07 at 14:59, Ron Peacetree wrote:
> Two phase execution and two phase commit are two different concepts.
> Two phase execution splits the execution of queries explicitly into a
> "do all the book keeping and setup stuff before execution" phase and
> an actual execution phase.  The benefit is that if you are going to
> say, step through a largish table in chunks, doing the same query on
> each chunk, two phase execution allows the DB to do everything (syntax
> checking, query planning, blah, blah) except the actual execution
> =once= and reuse it for each subsequent chunk.

If "stepping through a largish table in chunks" is implemented as a
single SQL query, PostgreSQL already does this internally (the parsing,
planning, rewriting, and execution phases are distinct operations inside
the backend).

If the stepping is done as a bunch of similar queries, you can use
prepared queries (as of PostgreSQL 7.3) to do the parsing, planning and
rewriting only once, and then reuse the query plan multiple times.

> It also helps parallel performance since
> you can hand the "blessed" set up query plan to multiple processes and
> those processes can focus on just getting work done.

Prepared queries are per-backend as of PostgreSQL 7.3, so this can't be
done (and I'm a little skeptical that it would be very useful...)

Cheers,

Neil



Re: Anyone know why PostgreSQL doesn't support 2 phase

From
Jan Wieck
Date:
Ron Peacetree wrote:
> [...]
> The lack of two phase =commit= is a also a potential performance hit
> in comparison to DB products that have it, but the more important
> issue there is that there are SMP/distributed apps that really can't
> work acceptably unless a DB product has two phase commit.
> 
> The three "biggies" in DB land, SQL Server, Oracle, and DB2, have both
> features.  I suspect that PostgreSQL will need to as well...

Ron, do you actually have some ideas how to do 2 phase commits?
Especially things like how to re-lock during startup after a crash and
the like? Or is your knowledge in reality only buzzwords collected from
high glossy tradeshow flyers?


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



Re: Anyone know why PostgreSQL doesn't support 2 phase

From
"Ron Peacetree"
Date:
"Jan Wieck" <JanWieck@Yahoo.com> wrote in message
news:3E942A99.A6102F2D@Yahoo.com...
> Ron Peacetree wrote:
> > [...]
> > The lack of two phase =commit= is a also a potential performance
hit
> > in comparison to DB products that have it, but the more important
> > issue there is that there are SMP/distributed apps that really
can't
> > work acceptably unless a DB product has two phase commit.
> >
> > The three "biggies" in DB land, SQL Server, Oracle, and DB2, have
both
> > features.  I suspect that PostgreSQL will need to as well...
>
> Ron, do you actually have some ideas how to do 2 phase commits?
> Especially things like how to re-lock during startup after a crash
and
> the like? Or is your knowledge in reality only buzzwords collected
from
> high glossy tradeshow flyers?
>
If "some ideas" means "do I know how to code it into PostgreSQL right
now", the answer is no.  If "some ideas" means "do I understand the
general problem at a technical level well enough to be thinking about
the algorithms and datastructures needed to support the functionality"
the answer is yes.

So I'd say a fair response to your questions is that my knowledge is
in between the two extremes you've described, but probably closer to
the first than the second ;-).  We can have a private email discussion
on the topic if you wish.