Re: protocol change in 7.4 - Mailing list pgsql-hackers

From Satoshi Nagayasu
Subject Re: protocol change in 7.4
Date
Msg-id 20021107135314.4733be17.pgsql@snaga.org
In response to Re: protocol change in 7.4  (Satoshi Nagayasu <pgsql@snaga.org>)
List pgsql-hackers

"Ross J. Reedstrom" <reedstrm@rice.edu> wrote:
> > Because the postgres backend must detect the type of incoming connection
> > (from the client app or from the master).
> > 
> > If it is coming from the client, the backend relays the queries to the
> > slaves (acts as the master).
> > 
> > But if it is coming from the master server, the backend must act as a
> > slave, and must not relay the queries.
> 
> So, your replication is an all-or-nothing type of thing? You can't
> replicate some tables and not others? If only some tables are replicated,
> then you can't decide if this is a distributed transaction until it's
> been parsed.

Yes. My current replication implementation is 'query based' replication,
so every query sent to the database (except SELECT commands) is replicated.
The database is replicated completely, not partially.
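
To make the relay rule concrete, here is a minimal sketch of the decision
described above. It is only an illustration, not the actual implementation:
the function names, the `from_master` flag and the list of slave names are
hypothetical, and a statement is classified by its first keyword only.

```python
def is_replicated(query: str) -> bool:
    """Return True for statements that are relayed (everything except SELECT)."""
    words = query.lstrip().split(None, 1)
    return bool(words) and words[0].upper() != "SELECT"


def relay_targets(query: str, from_master: bool, slaves: list) -> list:
    """Return the slaves a query is relayed to (hypothetical helper).

    A connection from the master means this backend acts as a slave,
    so nothing is relayed any further.
    """
    if from_master:
        return []
    if is_replicated(query):
        return list(slaves)
    return []
```

With this rule a SELECT is answered locally, while any other statement that
arrives from a client is sent to every slave.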

I know this 'query based' design can't be used for distributed
transactions.  I think more internal communication between the distributed
servers is required.  We need 'partial transfer of tables', 'bulk
transfer of an index' or something like that for distributed
transactions. I'm working on it now.

As I said, several connection types (a client application connection, an
internal transfer connection, or a recovery connection) will be required
for replication and distributed transactions in the near future.  Embedding
the connection type in the startup packet is a good way to decide how the
backend should behave. It is simple and extensible, isn't it?

If the backend can't understand the incoming connection type, it
only needs to answer "I can't understand." and disconnect.
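
A rough sketch of the idea follows. The packet layout here (a 4-byte
protocol version followed by a 1-byte connection type) and the type codes
are assumptions made up for illustration; this is not the real PostgreSQL
startup packet format.

```python
import struct
from enum import IntEnum


class ConnType(IntEnum):
    # Hypothetical codes for the three connection types named above.
    CLIENT = 0
    INTERNAL_TRANSFER = 1
    RECOVERY = 2


# Assumed layout: protocol version (uint32) + connection type (uint8),
# both in network byte order.
HEADER = struct.Struct("!IB")


def build_startup(version: int, conn_type: int) -> bytes:
    """Build the illustrative startup header."""
    return HEADER.pack(version, conn_type)


def accept_startup(packet: bytes):
    """Return (accepted, message) for an incoming startup header."""
    if len(packet) < HEADER.size:
        return False, "malformed startup packet"
    _version, raw_type = HEADER.unpack_from(packet)
    try:
        conn_type = ConnType(raw_type)
    except ValueError:
        # Unknown type: report it; the caller then closes the connection.
        return False, "unknown connection type %d" % raw_type
    return True, conn_type.name
```

A new connection type then only needs a new code: older backends reject it
cleanly instead of misinterpreting the connection.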

> 
> Also, if we want to cascade, then one server can be both master and slave,
> as it were. For full-on-2PC, I'm not sure cascading is a good idea, but
> it's something to consider, especially if there's provisions for partial
> replication, or 'optional' slaves.

Yes. There are several implementation designs for replication.  Sync or
async, pre- or post-, full or partial, query-level or I/O-level or
journal-level. I think there is no "best way" for replication, because
applications have different requirements in different situations.

So the protocol should be more extensible.


> I think Hannu is suggesting that COMMIT could occur from either of two
> states in the transaction state diagram: from an open transaction, or
> from PRECOMMIT. There's no need to determine before that moment if
> this particular transaction is part of a 2PC or not, is there? So, no
> you don't _require_ PRECOMMIT/COMMIT because it's clustered: if a 
> 'bare' COMMIT shows up, do what you currently do: hide the details.
> If a PRECOMMIT shows up, report status back to the 'client'.

After status is returned, what does the 'client' do?
Should the client speak the 2PC protocol?

For example, if the database is replicated across 8 servers,
does the client application keep 8 connections, one for each server?

Is this good?
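
For reference, the state diagram from the quoted paragraph can be sketched
roughly as below. The class, state names and the "READY" status string are
hypothetical; the sketch only shows that COMMIT is accepted from either an
open transaction or from PRECOMMIT, and that PRECOMMIT reports a status back.

```python
class Txn:
    """Toy transaction state machine (illustrative only)."""

    def __init__(self):
        self.state = "OPEN"

    def precommit(self) -> str:
        # First phase of 2PC: only valid from an open transaction.
        if self.state != "OPEN":
            raise RuntimeError("PRECOMMIT not allowed in state " + self.state)
        self.state = "PRECOMMIT"
        return "READY"  # status reported back to the 'client'

    def commit(self) -> None:
        # A 'bare' COMMIT (from OPEN) and a 2PC COMMIT (from PRECOMMIT)
        # both end in the same final state.
        if self.state not in ("OPEN", "PRECOMMIT"):
            raise RuntimeError("COMMIT not allowed in state " + self.state)
        self.state = "COMMITTED"
```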

-- 
NAGAYASU Satoshi <snaga@snaga.org>


