Thread: Replication Bundled with Main Source.

From

Unihost Web Hosting

Date:

08 October 2003, 07:58:56

Hi All,

Firstly I've gotta say that I think that PostgreSQL is one of the finest
OSS projects out there and full credit to all of those involved.

After talking to a couple of other consultants who use Pg, and who
assure their enterprise clients that Pg is a perfectly viable solution
for a variety of scenarios, the question seems to crop up quite often:
"What About Replication?".  Whilst I understand that the eRServer
project is a fine project, more than capable, and rapidly reaching
the point of having minimal bugginess, I have to wonder why there is no
talk of including replication capability within the main source tree.
After all, in today's RDBMS arena CTOs expect replication to be a
feature of the server, and shipping it separately makes it seem like an
afterthought.  Almost like having a transactional engine bolted on after
the fact.

Are there likely to be any plans to integrate a replication engine into
the main code, switchable at compile time with something like
'--with-replication'?  I believe that this would encourage
acceptance within the corporate environment and lead to a more
well-rounded offering.

Just my 2 cents (or tuppence-ha'penny for those in blighty)

Regards

Tony

Re: Replication Bundled with Main Source.

From

"Joshua D. Drake"

Date:

08 October 2003, 18:08:02

Hello,

   It is not that we don't want to include replication in the base
project; it is that ERserver does not meet the requirements of what can
be included in the base project. Specifically (I believe) the
requirement of Java.

Sincerely,

Joshua Drake



--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-222-2783 - jd@commandprompt.com - http://www.commandprompt.com
Editor-N-Chief - PostgreSQl.Org - http://www.postgresql.org

Re: Replication Bundled with Main Source.

From

Bruce Momjian

Date:

08 October 2003, 18:15:58

Joshua D. Drake wrote:
> Hello,
>
>    It is not that we don't want to include replication in the base
> project it is that ERserver does not meet the requirements of what can
> be included in the base project. Specifically (I believe) the
> requirement of Java.

Maybe they will move to C someday.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Replication Bundled with Main Source.

From

Tom Lane

Date:

08 October 2003, 21:29:50

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Joshua D. Drake wrote:
>> It is not that we don't want to include replication in the base
>> project it is that ERserver does not meet the requirements of what can
>> be included in the base project. Specifically (I believe) the
>> requirement of Java.

> Maybe they will move to C someday.

Well, JDBC requires Java, and it's still in the main distro.

I think the real answer is that until recently, ERserver wasn't open
source and we didn't have the option to include it.  Now that it is
open source, we could think about it.  Having looked at the code, I
think it's definitely not ready for prime time, but it could get there
with some work.  When it's of comparable solidity to the base project
I'd be in favor of adding it to the base distro.

            regards, tom lane

Re: Replication Bundled with Main Source.

From

Andrew Sullivan

Date:

09 October 2003, 11:15:53

On Wed, Oct 08, 2003 at 08:28:57PM -0400, Tom Lane wrote:

> open source, we could think about it.  Having looked at the code, I
> think it's definitely not ready for prime time, but it could get there
> with some work.  When it's of comparable solidity to the base project

I agree completely.  You can get a long way with this code -- we've
used a version of it in production for 2 years now -- but it's a long
way from "turn it on and forget it" right now.

I also wonder why there's a push to put "replication" in the main
distribution, though.  I know, I know, the argument is that if you
have to get it separately, it's not the same as the "main" code.  But
it's just not true that Oracle (or whoever you like) "includes"
replication.  You have to buy the right licenses to get the functions
you want, and there are different kinds of subsystems depending on
what needs you have.  (e.g. RAC is gonna be a lousy choice across a
frame-relay VPN.  I shudder to think.)  I suppose someone could
package a "kitchen sink" Postgres which included all kinds of stuff
from gborg.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Afilias Canada                        Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110

Re: Replication Bundled with Main Source.

From

Jan Wieck

Date:

09 October 2003, 23:08:13

Tom Lane wrote:

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> Joshua D. Drake wrote:
>>> It is not that we don't want to include replication in the base
>>> project it is that ERserver does not meet the requirements of what can
>>> be included in the base project. Specifically (I believe) the
>>> requirement of Java.
>
>> Maybe they will move to C someday.
>
> Well, JDBC requires Java, and it's still in the main distro.
>
> I think the real answer is that until recently, ERserver wasn't open
> source and we didn't have the option to include it.  Now that it is
> open source, we could think about it.  Having looked at the code, I
> think it's definitely not ready for prime time, but it could get there
> with some work.  When it's of comparable solidity to the base project
> I'd be in favor of adding it to the base distro.

Unfortunately I don't think it'll get there ever. There is a fundamental
design flaw in the system that is not fixable (there are multiple, but
this is one of the biggies). eRServer only remembers that a row has been
modified, but not what changed, in what order, or even how often.

The problem is really easy to demonstrate. With a UNIQUE constraint on a
column, you swap the values of two rows via a temporary value C, like

     A->C
     B->A
     C->B

If these 3 changes fall into one "snapshot", you have no chance to
replicate that. eRServer tries to do

     A->B
     B->A

and whatever order it tries, you'd need a deferred UNIQUE constraint to
get it done, and I don't have the slightest clue how to ever get _that_
implemented.
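
To make the failure concrete, here is a minimal Python sketch (not
eRServer code; the UniqueColumn class, the replay() helper and the row
ids 1 and 2 are made up for illustration). It models one column with an
immediately checked UNIQUE constraint, then replays only the net
per-row changes in both possible orders:

```python
from itertools import permutations


class UniqueViolation(Exception):
    """Raised when an update would create a duplicate value."""


class UniqueColumn:
    """Toy model of one column with an immediately checked UNIQUE constraint."""

    def __init__(self, rows):
        self.rows = dict(rows)  # row id -> value

    def update(self, row_id, new_value):
        # The check runs against every other row at the moment of the update.
        for other_id, value in self.rows.items():
            if other_id != row_id and value == new_value:
                raise UniqueViolation(f"duplicate value {new_value!r}")
        self.rows[row_id] = new_value


def replay(initial, changes):
    """Apply (row_id, new_value) pairs in order; return None on a violation."""
    col = UniqueColumn(initial)
    try:
        for row_id, new_value in changes:
            col.update(row_id, new_value)
    except UniqueViolation:
        return None
    return col.rows


initial = {1: "A", 2: "B"}

# What the master did: swap the two values via the temporary value C.
master_log = [(1, "C"), (2, "A"), (1, "B")]

# What a "this row was modified" log can reconstruct: the final value per row.
net_changes = [(1, "B"), (2, "A")]

master_result = replay(initial, master_log)
replica_results = [replay(initial, order) for order in permutations(net_changes)]
```

The full sequence goes through on the master, while both orderings of
the two net changes hit the constraint on the replica, so every entry
in replica_results is None.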

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

Re: Replication Bundled with Main Source.

From

Bruce Momjian

Date:

10 October 2003, 14:29:31

Jan Wieck wrote:
> > I think the real answer is that until recently, ERserver wasn't open
> > source and we didn't have the option to include it.  Now that it is
> > open source, we could think about it.  Having looked at the code, I
> > think it's definitely not ready for prime time, but it could get there
> > with some work.  When it's of comparable solidity to the base project
> > I'd be in favor of adding it to the base distro.
>
> Unfortunately I don't think it'll get there ever. There is a fundamental
> design flaw in the system that is not fixable (there are multiple, but
> this is one of the biggies). That is that eRServer only remembers that a
> row has been modified, but not what, in what order, not even how often.
>
> The problem is really easy to demonstrate. With a UNIQUE constraint on a
> column, you change the values of two rows like
>
>      A->C
>      B->A
>      C->B
>
> If these 3 changes fall into one "snapshot", you have no chance to
> replicate that. eRServer tries to do
>
>      A->B
>      B->A
>
> and whatever order it tries, you'd need a deferred UNIQUE constraint to
> get it done, and I don't have the slightest clue how to ever get _that_
> implemented.

I was wondering about this.  It seems to be part of our existing problem
with handling unique constraints during the query, rather than at query
end or transaction end:

    test=> CREATE TABLE test (x INT);
    CREATE TABLE
    test=> INSERT INTO test VALUES (1);
    INSERT 17144 1
    test=> INSERT INTO test VALUES (2);
    INSERT 17145 1
    test=> UPDATE test SET x = x + 1;
    UPDATE 2
    test=> CREATE UNIQUE INDEX test_i ON test (x);
    CREATE INDEX
    test=> UPDATE test SET x = x + 1;
    ERROR:  duplicate key violates unique constraint "test_i"

We have pretty complex handling of foreign key constraints, allowing
them to fire at the end of the transaction, but we have nothing like
that for UNIQUE constraints.  I assume we do this because it is more
efficient to check
the unique index during insert/update of each row, but perhaps we need a
queue, as you suggest.

Another thing you might need is the ability to _not_ see changes made by
your transaction, so when you go to change B to A, you see the original
B but not the A->B you just changed.

Another idea would be to only queue up the unique constraint failures,
and re-check on transaction commit --- that way, you only have a queue
when you have a possible unique constraint violation, and you re-check
at the end.
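
A rough Python sketch of that last idea, purely as an illustration and
not backend code (the Transaction class and its pending set are
hypothetical names): an update never fails on the spot, a collision
only gets its key value queued, and commit re-checks just the queued
values.

```python
class UniqueViolation(Exception):
    """Raised at commit when a queued key value is still duplicated."""


class Transaction:
    """Toy model: unique check deferred to commit, queue only on conflict."""

    def __init__(self, rows):
        self.rows = dict(rows)  # row id -> value
        self.pending = set()    # key values that collided at update time

    def update(self, row_id, new_value):
        collides = any(
            value == new_value
            for other_id, value in self.rows.items()
            if other_id != row_id
        )
        if collides:
            # Remember the possible violation instead of raising right away.
            self.pending.add(new_value)
        self.rows[row_id] = new_value

    def commit(self):
        values = list(self.rows.values())
        for value in self.pending:
            if values.count(value) > 1:
                raise UniqueViolation(f"duplicate value {value!r}")
        self.pending.clear()
        return self.rows


# A swap collides midway but is consistent again by commit time.
swap = Transaction({1: "A", 2: "B"})
swap.update(1, "B")  # collides with row 2, so "B" is queued
swap.update(2, "A")  # no collision, row 1 no longer holds "A"
swapped = swap.commit()

# A real duplicate is still caught, only later.
dup = Transaction({1: "A", 2: "B"})
dup.update(1, "B")
try:
    dup.commit()
    dup_rejected = False
except UniqueViolation:
    dup_rejected = True
```

When nothing collides the queue stays empty, so the common case pays
nothing extra at commit.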

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Replication Bundled with Main Source.

From

Jan Wieck

Date:

10 October 2003, 20:16:59

Bruce Momjian wrote:
> Jan Wieck wrote:
>> Unfortunately I don't think it'll get there ever. There is a fundamental
>> design flaw in the system that is not fixable (there are multiple, but
>
> I was wondering about this.  It seems to be part of our existing problem
> with handling unique constraints during the query, rather than at query
> end or transaction end:
>
> [...]
>
> Another idea would be to only queue up the unique constraint failures,
> and re-check on transaction commit --- that way, you only have a queue
> when you have a possible unique constraint violation, and you re-check
> at the end.
>

_That_ actually is _the_ idea I was missing!

During index insert I think we know everything that needs to be known.
We know the index in question, which definitely leads to the relation in
question. And we know the CTID of the new heap tuple containing the key
values in conflict. IIRC that is enough to schedule some sort of
[DEFERRED] AFTER INSERT trigger ... one more of these generic C monsters.

Some sort of, because its call interface might be a bit different. We
won't have a pg_trigger row for it anywhere. But since it'd be a generic
function for all index dupkey checks, I wouldn't mind much to hardwire
it into the trigger queue.

The trigger actually only needs to do a

     SELECT 1 FROM <rel> WHERE <full qualification>

Run that through SPI_execp() with a tupcount of 2 and that's it. Maybe it
needs to do it FOR UPDATE to have the correct visibility and locking,
but that's a minor implementation detail.
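
The reason a tupcount of 2 is enough can be shown with a small Python
sketch. This is an assumption-laden stand-in, not the trigger itself:
it uses SQLite through Python's sqlite3 module instead of SPI, LIMIT 2
plays the role of the tupcount, the table "test" is borrowed from
Bruce's example, and has_duplicate() is a made-up helper. It says
nothing about the FOR UPDATE visibility and locking question.

```python
import sqlite3


def has_duplicate(conn, value):
    """Deferred recheck: fetch at most two rows carrying the key value."""
    # One matching row is the new tuple itself; a second one is a real
    # duplicate, so there is never a need to look at more than two rows.
    rows = conn.execute(
        "SELECT 1 FROM test WHERE x = ? LIMIT 2", (value,)
    ).fetchall()
    return len(rows) > 1


conn = sqlite3.connect(":memory:")
# No unique index here on purpose: the check is done by has_duplicate().
conn.execute("CREATE TABLE test (x INT)")
conn.executemany("INSERT INTO test VALUES (?)", [(1,), (2,), (2,)])

dup_for_1 = has_duplicate(conn, 1)  # only one row holds 1
dup_for_2 = has_duplicate(conn, 2)  # two rows hold 2
```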

Cool!

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #