Thread: Re: Are you actively hacking on 2PC?

Re: Are you actively hacking on 2PC?

From
Alvaro Herrera
Date:
Just to keep you guys posted, here is the work-in-progress 2PC patch.
I've done some cleanup, but I still want to do some more.

Tom, please let me know if you want to hack on it, if that's the case
I'll stop my cleanup and start to work on supporting the things that
need further coding, which are AFAICS MultiXactId, large objects and
notifies.

Regarding cleanup I want to do the following:

- remove the gidStash from xact.c, make twophase.c deal with it
  internally
- rework the iterator functions for procarray.c (previously known as sinval.c)
- rework the callback infrastructure

I haven't kept a detailed TODO list, I guess it's time to start.

Note that I haven't updated the regression test expected file.
Moreover, the current test exposes a bug that I think it's in sinval
management, not sure where.

This patch creates the following new files:

doc/src/sgml/ref/commit_prepared.sgml
doc/src/sgml/ref/prepare_transaction.sgml
doc/src/sgml/ref/rollback_prepared.sgml
src/backend/access/transam/twophase.c
src/backend/access/transam/twophase_rmgr.c
src/include/access/twophase.h
src/include/access/twophase_rmgr.h
src/include/storage/lockrmgr.h
src/test/regress/expected/prepared_xacts.out
src/test/regress/sql/prepared_xacts.sql

Thanks,

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"Uno puede defenderse de los ataques; contra los elogios se esta indefenso"

Attachment

Re: Are you actively hacking on 2PC?

From
Tom Lane
Date:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> Tom, please let me know if you want to hack on it, if that's the case
> I'll stop my cleanup and start to work on supporting the things that
> need further coding, which are AFAICS MultiXactId, large objects and
> notifies.

Well, I am just finishing up the multixact XLOG rewrite, so I was about
to pester you as to where you are on this.

I would like to start doing something with it --- how can we best avoid
stepping on each others' toes?

            regards, tom lane

Re: Are you actively hacking on 2PC?

From
Alvaro Herrera
Date:
On Tue, Jun 07, 2005 at 07:00:48PM -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> > Tom, please let me know if you want to hack on it, if that's the case
> > I'll stop my cleanup and start to work on supporting the things that
> > need further coding, which are AFAICS MultiXactId, large objects and
> > notifies.
>
> Well, I am just finishing up the multixact XLOG rewrite, so I was about
> to pester you as to where you are on this.
>
> I would like to start doing something with it --- how can we best avoid
> stepping on each others' toes?

Please use the patch I posted yesterday as starting point -- I'll work
on missing features.  Just don't touch notify.c nor large object support
files...

I reworked the iterator routines for procarray.c and disaster ensued --
but I'm not sure if that uncovered previous bugs or just introduced some
semantic change I'm not seeing.  If you rewrite the whole stuff my
changes will prove useless, but I don't want to slow you down if it's a
bug of mine.

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"Those who use electric razors are infidels destined to burn in hell while
we drink from rivers of beer, download free vids and mingle with naked
well shaved babes." (http://slashdot.org/comments.pl?sid=44793&cid=4647152)

Re: Are you actively hacking on 2PC?

From
Alvaro Herrera
Date:
Ok, I made some more changes and the current patch is here (the same new
files list applies).  If you already started looking at the other patch
I can give you the patch between both, if it makes your life easier.

Additionally I collected some to-do entries, maybe you find it useful.

I'm sure you'll rewrite lots of things, so in order not to collide I'll
go back to my shared dependencies patch :-)  (regarding the two things I
said I'd work on, notifes are already supported; and large object
support seems more hassle than worth to implement and we could punt for
this release ... )


* Verify that TransactionIdIsInProgress is correct w.r.t.
  TransactionIdIsPrepared.

* MultiXactId needs to be supported.

* Clean up the callback support.

* Make sure we don't lose a file descriptor after BasicOpenFile if we
  ereport() after opening it.  Maybe use FileNameOpenFile instead.

* Clean up the use of transaction identifiers.  X/Open has a Transaction
  Identifier standard definition, look that up.

* Is PREPARE TRANSACTION the best interface?  What about marking a transaction
  for preparing when it begins?  Apparently the XA spec requires/prefers this.

* Fix the erroneous sinval interactions reported by the regression test.

* Make the relcache-file-delete logic separate from particular sinval messages.

* Check whether the TwoPhaseStateLock usage per GetPreparedTransactionXidList
  is safe.  (In particular in CheckPointTwoPhase)

* Rethink about using both WAL and a state file.  (If we declared the
  intent to persist a transaction when it begins, maybe we don't need
  the state file at all.)

* Rework the XLog entries for prepare, commit prepared, abort prepared.
  They seem messy.

* Verify the order of calls in PrepareTransaction.

* Why is the new code in GetOldestXmin dependent on allDbs?  That seems bogus.

* Add a databaseId field in GlobalTransactionData, and check it where
  appropiate.

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"The first of April is the day we remember what we are
the other 364 days of the year"  (Mark Twain)

Attachment

Re: Are you actively hacking on 2PC?

From
Heikki Linnakangas
Date:
On Wed, 8 Jun 2005, Alvaro Herrera wrote:

> Additionally I collected some to-do entries, maybe you find it useful.

Good list overall. Comments to some entries below:

> * Clean up the callback support.

Yeah. Possibly remove the callback abstraction alltogether and wire
calls to different functions directly in a switch statement etc. I
expected there to be more functions in the callback tables but it didn't
turn out that way.

> * Is PREPARE TRANSACTION the best interface?  What about marking a transaction
>  for preparing when it begins?  Apparently the XA spec requires/prefers this.

I'd prefer not to mark the transaction on begin. I might not know it's
going to be a two-phase transaction until later on. At the very least it
must still be possible to do a regular commit even if you mark the
transaction as two-phase in the begin.

We'll have to see what the XA spec says about it.

BTW: while googling for the XA spec, I bumped into RFC2371 - Transaction
Internet Protocol, which seems to be a protocol for coordinating a
two-phase commit over TCP. It looks quite flexible, it should work either
way. I don't know how widely adopted it is, it's first time I hear about
it.

> * Check whether the TwoPhaseStateLock usage per GetPreparedTransactionXidList
>  is safe.  (In particular in CheckPointTwoPhase)

I don't see a problem, no other locks are held at the same time. Can you
elaborate?

> * Rethink about using both WAL and a state file.  (If we declared the
>  intent to persist a transaction when it begins, maybe we don't need
>  the state file at all.)

The purpose of the state file is two-fold. First, the original backend
uses it to send a message, in a sense, to the backend that's later going
to finish the 2nd phase commit. It contains the instructions to do the
finishing.

Secondly, the pg_twophase directory and the set of valid state files in it
allows prepared transactions to be picked up by recovery after crash.

The first need could be covered with a shared memory structure. But
shared memory is fixed-size.

I don't know any good alternatives for the second need. The information
need to be retained over checkpoints, so the WAL alone is not enough.

I wouldn't worry about the overhead of writing the same data twice, the
state files are typically around 300 bytes in the current implementation.
And it could probably be cut down even more with some thought, making the
gid variable size for example.

I'm a bit bothered with the fact that we have to create and delete a file
for each transaction, though. That could be expensive on some file
systems.

I don't see how it would help to declare the intent to persist
on transaction at begin.

> * Why is the new code in GetOldestXmin dependent on allDbs?  That seems bogus.

It's a bit accidental. GetOldestXmin is used in three places:

1. It determines the subtrans files safe truncation point
2. It determines the vacuum cut-off point, tuples older than that can be
removed.
3. Similar to 2, it's used in index building to determine which tuples
are dead.

Case 1 needs to take prepared transactions into account. Cases 2 & 3
don't, because a prepared transaction can't read any tuples anymore.

Case 1 happens to call GetOldestXmin with allDbs set to true, so I just
wired the logic to that parameter. It should be cleaned up.

- Heikki