Re: Global Sequences - Mailing list pgsql-hackers

From Markus Wanner
Subject Re: Global Sequences
Msg-id 507E67DE.6020007@bluegap.ch
In response to Re: Global Sequences  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: Global Sequences  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
Simon,

On 10/16/2012 02:36 PM, Simon Riggs wrote:
> Where else would you put the hook? The hook's location as described
> won't change whether you decide you want 1, 2 or 3.

You assume we want an API that supports all three options. In that case,
yes, the hooks need to be very general.

Given that option 3 got by far the most support, I question whether we
need such a highly general API. I envision an API that keeps the
bookkeeping and cache lookup functionality within Postgres. So we have a
single, combined-effort, known working implementation for that.

What remains to be done within the plugin effectively is the consensus
problem: it all boils down to the question of which node gets the next
chunk of N sequence numbers, where N can be 1 (the default CACHE setting in
Postgres) or any higher number for better performance (this reduces the total
communication overhead by a factor of N, or at least pretty close to
that, if you take into account "lost" chunks due to node failures).
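
To make the factor-of-N claim concrete, here is a minimal Python sketch. It is
not Postgres code, and the function name consensus_rounds is made up for
illustration. It counts how many chunk requests (and thus consensus rounds) a
node needs to obtain a given number of sequence values, ignoring chunks lost
to node failures:

```python
def consensus_rounds(values_needed: int, chunk_size: int) -> int:
    """Return how many chunk requests are needed to cover values_needed.

    Hypothetical helper: each request is assumed to cost one consensus
    round and to grant exactly chunk_size consecutive sequence numbers.
    """
    if values_needed < 0:
        raise ValueError("values_needed must be non-negative")
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    # Ceiling division using integers only.
    return -(-values_needed // chunk_size)


print(consensus_rounds(1000, 1))    # 1000 rounds with N = 1
print(consensus_rounds(1000, 100))  # 10 rounds with N = 100
print(consensus_rounds(1001, 100))  # 11 rounds: the last chunk is mostly unused
```

The third call hints at why the saving is only close to a factor of N: a
partially used chunk still costs a full round, and the same holds for a chunk
that is lost when its node fails.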

A plugin providing that has to offer a method to request a global
ordering and would have to trigger a callback upon reaching consensus
with the other nodes on who gets the next chunk of sequence numbers. That
works for all N >= 1 and properly implements option 3 (but doesn't
allow implementations of options 1 or 2, which I claim we don't need
anyway).
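
As a rough illustration of that split, here is a Python sketch. All names in
it (InProcessAllocator, NodeSequence, request_chunk) are hypothetical and not
an existing Postgres API. The allocator stands in for the plugin and only
decides who gets the next chunk; the per-node class stands in for the common
bookkeeping and cache code. The toy allocator "reaches consensus" by
serializing requests through one shared counter, where a real plugin would
talk to the other nodes:

```python
from typing import Callable, Dict

# Callback arguments: (first_value, count) of the chunk that was granted.
ChunkCallback = Callable[[int, int], None]


class InProcessAllocator:
    """Toy stand-in for the plugin (assumption: no real network consensus)."""

    def __init__(self) -> None:
        self._next_start: Dict[str, int] = {}

    def request_chunk(self, seq_name: str, count: int,
                      on_granted: ChunkCallback) -> None:
        # Decide the global order of chunks, then notify the requester.
        start = self._next_start.get(seq_name, 1)
        self._next_start[seq_name] = start + count
        on_granted(start, count)


class NodeSequence:
    """Common bookkeeping and cache for one sequence on one node."""

    def __init__(self, name: str, allocator: InProcessAllocator,
                 chunk_size: int = 1) -> None:
        if chunk_size < 1:
            raise ValueError("chunk_size must be at least 1")
        self.name = name
        self.allocator = allocator
        self.chunk_size = chunk_size
        self._next = 0
        self._remaining = 0

    def _on_granted(self, start: int, count: int) -> None:
        # Callback fired by the plugin once the chunk is ours.
        self._next = start
        self._remaining = count

    def nextval(self) -> int:
        if self._remaining == 0:
            self.allocator.request_chunk(self.name, self.chunk_size,
                                         self._on_granted)
        if self._remaining == 0:
            raise RuntimeError("no chunk was granted")
        value = self._next
        self._next += 1
        self._remaining -= 1
        return value


allocator = InProcessAllocator()
node_a = NodeSequence("s", allocator, chunk_size=3)
node_b = NodeSequence("s", allocator, chunk_size=3)
print(node_a.nextval())  # 1, node A is granted chunk 1..3
print(node_b.nextval())  # 4, node B is granted chunk 4..6
print(node_a.nextval())  # 2, served from node A's cache, no new request
```

With this shape, only request_chunk would differ between plugins; the cache
handling in NodeSequence stays the same for every transport.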

> Implementations will be similar, differing mostly in the topology and
> transport layer

I understand that different users have different needs WRT transport
layers - moving the hooks as outlined above still allows flexibility in
that regard.

What different topologies do you have in mind? I'd broadly categorize
this all as multi-master. Do you need finer grained differentiation? Or
do you somehow include slaves (i.e. read-only transactions) in this process?

As you yourself say, implementations will differ only in that way, so
let's keep the common code the same and not require plugins to
duplicate it. (As another benefit, this also allows us to use the system
catalogs for bookkeeping.)

> which means its not going to be possible to provide
> such a thing initially without slowing it down to the point we don't
> actually get it at all.

Sorry, I don't quite understand what you are trying to say here.

Overall, thanks for bringing this up. I'm glad to see something
happening in this area, after all.

Regards

Markus Wanner


