Thread: PgCon: who will be there?
Hi! A few people from the Cluster meeting have been discussing what things we might continue conversations about, or maybe hack on during PgCon (May 18-21, 2010) this year. So my questions to this list: Are you planning on attending PgCon? Is there any particular subject related to clustering that you're most eager to discuss or hear an update about during the conference? If we scheduled a cluster-hackers specific meetup one evening, would you be interested in attending? Thanks! -selena -- http://chesnok.com/daily - me http://endpoint.com - work
Hi, > Hi! > > A few people from the Cluster meeting have been discussing what things > we might continue conversations about, or maybe hack on during PgCon > (May 18-21, 2010) this year. > > So my questions to this list: > > Are you planning on attending PgCon? Yes, I am. > Is there any particular subject related to clustering that you're most > eager to discuss or hear an update about during the conference? At least I would like to hear what is happening in other clustering projects, i.e. progress, problems and new findings. > If we scheduled a cluster-hackers specific meetup one evening, would > you be interested in attending? Yes! -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
Selena Deckelmann wrote: > Are you planning on attending PgCon? > Is there any particular subject related to clustering that you're most > eager to discuss or hear an update about during the conference? > If we scheduled a cluster-hackers specific meetup one evening, would > you be interested in attending? > I'll be there and attend any such event. The main thing I'd like to see discussed a little more is just how the items described at http://wiki.postgresql.org/wiki/ClusterFeatures fit together. I see a lot of small details there, but it's not really clear in each case why these particular features are important, what dependencies exist, and what "prior art" is floating around. I'd be glad to organize a discussion reviewing that list and promise to document the result. If I can get someone to explain some of these to me a little better, I'll make sure that insight makes it into written form for others too. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support greg@2ndQuadrant.com www.2ndQuadrant.us
Hi, Greg Smith wrote: > I'll be there and attend any such event. The main thing I'd like to see > discussed a little more is just how the items described at > http://wiki.postgresql.org/wiki/ClusterFeatures fit together. I see a > lot of small details there, but it's not really clear in each case why > these particular features are important, what dependencies exist, and > what "prior art" is floating around. I'd be glad to organize a > discussion reviewing that list and promise to document the result. +1 > If I > can get someone to explain some of these to me a little better, I'll > make sure that insight makes it into written form for others too. Well, I think the devil is in the details. Meaning you'll rather get two, three or even more different descriptions (and wishes) than just one. Fleshing out what's usable for most of them (and doesn't hinder the others) is the hard part. And it's not like we did lots of that back in Tokyo. However, we can start discussing various features of that list right now on this mailing list. That would help a lot in preparation for such an event. I'll start with one that I'm interested in. Regards Markus Wanner
Hello Selena, Selena Deckelmann wrote: > Are you planning on attending PgCon? No, I'm not, sorry. Regards Markus
Hi, Greg Smith wrote: > Sure, which is why I was volunteering to help on this part. Put me in a > room full of developers with different opinions and let me listen to the > argument, and I can usually sort through the whole mess to figure out a > reasonable summary at the end anyway. Cool, so you sure seem to be the right person for summarizing the PgCon 2010 clustering meeting. > Basically I'd like to see *short* answers to each of the following > questions for every item listed there: Uhm.. not sure if you'll get *short* answers, but well... :-) let's try anyway. [skipped a nice list of crisp and clear questions] > ..but not enough dependency tracking > that leads to a clear roadmap for how everything is going to come > together in the end. Dependency tracking? I'm not sure what you mean here. I think there's a lot of code duplication in all of the clustering projects (and people continue to add even more). IMO the real purpose behind that list is to combine some of that code into a common component. Only then the individual projects can adapt to depend on that component (instead of their inferior custom duplicate variant). > Not like this list is overflowing with > traffic anyway. Hehe.. as with all replication mailing lists so far. (And the change-the-name one as well). Creation of mailing lists by itself obviously (and sadly) doesn't solve problems. Regards Markus Wanner
Markus Wanner wrote: > Well, I think the devil is in the details. Meaning you'll rather get > two, three or even more different descriptions (and wishes) than just > one. Fleshing out what's usable for most of them (and doesn't hinder > the others) is the hard part. And it's not like we did lots of that > back in Tokyo. Sure, which is why I was volunteering to help on this part. Put me in a room full of developers with different opinions and let me listen to the argument, and I can usually sort through the whole mess to figure out a reasonable summary at the end anyway. > However, we can start discussing various features of that list right > now on this mailing list. That would help a lot in preparation for > such an event. I'll start with one that I'm interested in. That's a pretty good idea; if we could cover some of this background before PGCon it would make the whole thing run smoother. Basically I'd like to see *short* answers to each of the following questions for every item listed there: 1) What feature does this help add from a user perspective? 2) Which replication projects would be expected to see an improvement from this addition? 3) What makes it difficult to implement? 4) Are there any other items on the list this depends on, or that it is expected to have a significant positive/negative interaction with? 5) What replication projects include a feature like this already, or a prototype of a similar one, that might be used as a proof of concept or example implementation? 6) Who is already working on it/planning to work on it/needs it for their related project? If we picked one item from there a week and brainstormed answers to those questions for them all, that could wrap up in time for the convention and we'd be starting with an improved basis for discussion (as well as input from people like yourself who won't be there). 
Much like some other PostgreSQL projects (improved table partitioning comes to mind), there seems to be lots of code and demos floating around for a lot of these clustering/replication features, but not enough dependency tracking that leads to a clear roadmap for how everything is going to come together in the end. I encourage you to pick a feature you've got interest in and start filling in the details for it. Not like this list is overflowing with traffic anyway. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support greg@2ndQuadrant.com www.2ndQuadrant.us
Greg, > I'll be there and attend any such event. The main thing I'd like to see > discussed a little more is just how the items described at > http://wiki.postgresql.org/wiki/ClusterFeatures fit together. I see a > lot of small details there, but it's not really clear in each case why > these particular features are important, what dependencies exist, and > what "prior art" is floating around. I'd be glad to organize a > discussion reviewing that list and promise to document the result. If I > can get someone to explain some of these to me a little better, I'll > make sure that insight makes it into written form for others too. We need to work out more of the details on this list and the wiki first. However, I've put "Clustering Features" on the agenda for the Developer Meeting, and you're more than welcome to actually lead it. --Josh Berkus
Hi, Selena, 2010/2/6 Selena Deckelmann <selenamarie@gmail.com>: > Hi! > > A few people from the Cluster meeting have been discussing what things > we might continue conversations about, or maybe hack on during PgCon > (May 18-21, 2010) this year. > > So my questions to this list: > > Are you planning on attending PgCon? Yes I am. > > Is there any particular subject related to clustering that you're most > eager to discuss or hear an update about during the conference? Yes, I'm planning to make Postgres-2 open to the public by the end of March. I may be able to discuss snapshot import and XID feed. > > If we scheduled a cluster-hackers specific meetup one evening, would > you be interested in attending? Yes, it must be very exciting. > > Thanks! > -selena > > -- > http://chesnok.com/daily - me > http://endpoint.com - work > > -- > Sent via pgsql-cluster-hackers mailing list (pgsql-cluster-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-cluster-hackers >
> Are you planning on attending PgCon? Yes > Is there any particular subject related to clustering that you're most > eager to discuss or hear an update about during the conference? * Getting truncate MVCC safe * Getting system tables MVCC safe * Allowing triggers on system tables (or other mechanism) > If we scheduled a cluster-hackers specific meetup one evening, would > you be interested in attending? Perhaps, but there seems to be very little overlap in the things I work on (async replication) versus some of the other solutions (esp. built-in and synchronous ones) -- Greg Sabino Mullane greg@endpoint.com End Point Corporation PGP Key: 0x14964AC8
Hi, Greg Sabino Mullane wrote: > * Getting truncate MVCC safe > * Getting system tables MVCC safe Can you elaborate on how these two relate to clustering? > * Allowing triggers on system tables (or other mechanism) I understand the need for that one. > Perhaps, but there seems to be very little overlap in the things I > work on (async replication) versus some of the other solutions > (esp. built-in and synchronous ones) Spotting the common issues/modules is one of the reasons why we get (or got) together. For example, I'm interested in the conflict resolution logic of Bucardo. I plan to add something akin for Postgres-R to reduce the amount of transactions that need to be aborted. Regards Markus Wanner
On Sat, Feb 6, 2010 at 6:45 AM, Selena Deckelmann <selenamarie@gmail.com> wrote: > Are you planning on attending PgCon? No. Sorry. > Is there any particular subject related to clustering that you're most > eager to discuss or hear an update about during the conference? Though I cannot attend it, I'm looking forward to the result of the discussion about "Export snapshots to other sessions" feature ;) And it has already begun being discussed on -hackers. http://archives.postgresql.org/pgsql-hackers/2010-01/msg00916.php Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
This patch is to share the snapshot among backends in a single node. In the cluster environment, we need a means to maintain snapshot consistency among backends in multiple nodes. ---------- Koichi Suzuki 2010/2/12 Fujii Masao <masao.fujii@gmail.com>: > On Sat, Feb 6, 2010 at 6:45 AM, Selena Deckelmann <selenamarie@gmail.com> wrote: >> Are you planning on attending PgCon? > > No. Sorry. > >> Is there any particular subject related to clustering that you're most >> eager to discuss or hear an update about during the conference? > > Though I cannot attend it, I'm looking forward to the result of the > discussion about "Export snapshots to other sessions" feature ;) > And it has already begun being discussed on -hackers. > http://archives.postgresql.org/pgsql-hackers/2010-01/msg00916.php > > Regards, > > -- > Fujii Masao > NIPPON TELEGRAPH AND TELEPHONE CORPORATION > NTT Open Source Software Center
On Fri, Feb 12, 2010 at 3:05 PM, Koichi Suzuki <koichi.szk@gmail.com> wrote: > This patch is to share the snapshot among backends in a single > node. In the cluster environment, we need a means to maintain > snapshot consistency among backends in multiple nodes. But it's the first step or the basis of that feature. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Hi, On Fri, 12 Feb 2010 14:50:59 +0900, Fujii Masao <masao.fujii@gmail.com> wrote: > Though I cannot attend it, I'm looking forward to the result of the > discussion about "Export snapshots to other sessions" feature ;) Glad to hear that. I'm currently thinking about possible implementations, but I doubt somewhat that I'll have the time to do the actual implementation :-( > And it has already begun being discussed on -hackers. > http://archives.postgresql.org/pgsql-hackers/2010-01/msg00916.php Hm.. did you notice my review and comments on *this* mailing list [1]? I don't think that approach works for anything we need for clustering. It's designed for parallel pg_dump. Regards Markus Wanner [1]: Exporting Snapshots http://archives.postgresql.org/pgsql-cluster-hackers/2010-02/msg00009.php
On Fri, Feb 12, 2010 at 8:41 PM, Markus Wanner <markus@bluegap.ch> wrote: > Glad to hear that. I'm currently thinking about possible implementations, > but I doubt somewhat that I'll have the time to do the actual > implementation :-( Probably me too ;) >> And it has already begun being discussed on -hackers. >> http://archives.postgresql.org/pgsql-hackers/2010-01/msg00916.php > > Hm.. did you notice my review and comments on *this* mailing list [1]? I > don't think that approach works for anything we need for clustering. It's > designed for parallel pg_dump. Umm.. do you think that the "Export snapshots to other sessions" feature is completely independent of the snapshot management for the parallel pg_dump? If not, we should make the patch more extensible for the feature we want? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Hi, ---------- Koichi Suzuki 2010/2/12 Fujii Masao <masao.fujii@gmail.com>: > On Fri, Feb 12, 2010 at 8:41 PM, Markus Wanner <markus@bluegap.ch> wrote: >> Glad to hear that. I'm currently thinking about possible implementations, >> but I doubt somewhat that I'll have the time to do the actual >> implementation :-( > > Probably me too ;) > >>> And it has already begun being discussed on -hackers. >>> http://archives.postgresql.org/pgsql-hackers/2010-01/msg00916.php >> >> Hm.. did you notice my review and comments on *this* mailing list [1]? I >> don't think that approach works for anything we need for clustering. It's >> designed for parallel pg_dump. > > Umm.. do you think that the "Export snapshots to other sessions" feature is > completely independent of the snapshot management for the parallel pg_dump? > If not, we should make the patch more extensible for the feature we want? I don't think so. But in the cluster, from my experience of PG-2, it is essential to keep snapshots consistent among all the nodes involved, and we should be very careful that other local tasks, for example autovacuum and analyze, do not disturb other transactions, because they can become global long transactions very easily. That doesn't happen in parallel pg_restore, parallel pg_dump or other local parallelism. So I think their patch may be useful in a cluster environment too. > > Regards, > > -- > Fujii Masao > NIPPON TELEGRAPH AND TELEPHONE CORPORATION > NTT Open Source Software Center
Hi, On Fri, 12 Feb 2010 22:21:23 +0900, Koichi Suzuki <koichi.szk@gmail.com> wrote: > 2010/2/12 Fujii Masao <masao.fujii@gmail.com>: >> Umm.. do you think that the "Export snapshots to other sessions" feature >> is >> completely independent of the snapshot management for the parallel >> pg_dump? >> If not, we should make the patch more extensible for the feature we want? I think a common, more general implementation is possible, but certainly requires more work than a plain parallel pg_dump solution. > I don't think so. But in the cluster, from my experience of PG-2, it > is essential to manage snapshot consistent among all the node involved > and we should be very careful to make other local tasks, for example, > autovacuum and analyze, not disturb other transactions because they > can become global long transactions very easily. Well, I fear that's heavily dependent on the clustering solution. So I think the first step will be to have a general purpose snapshot exporting / cloning feature for a single node (i.e. usable for parallel querying). Extending that to allow for distributed parallel querying is the next step. Regards Markus Wanner
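To make the single-node "snapshot exporting / cloning" idea a bit more concrete, here is a toy model in Python. It is only a sketch under loud assumptions: the `Snapshot` and `Node` classes, the `export_snapshot` / `import_snapshot` names and the token format are all made up for illustration, every finished transaction is assumed to have committed, and none of this reflects the patch discussed on -hackers or any real PostgreSQL interface. It just shows that a second session that imports an exported snapshot sees exactly the rows the exporting session sees, even after later commits.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class Snapshot:
    """Simplified snapshot: which transaction IDs count as finished."""
    xmin: int                     # every xid below this had finished
    xmax: int                     # first xid not yet assigned at snapshot time
    in_progress: FrozenSet[int]   # xids in [xmin, xmax) still running

    def sees(self, xid: int) -> bool:
        # Toy rule: a finished transaction is assumed to have committed.
        if xid >= self.xmax:
            return False
        return xid not in self.in_progress


class Node:
    """Hypothetical single node holding rows tagged with their creating xid."""

    def __init__(self) -> None:
        self.next_xid = 1
        self.running: Set[int] = set()
        self.rows: List[Tuple[int, str]] = []
        self.exported: Dict[str, Snapshot] = {}

    def begin(self) -> int:
        xid = self.next_xid
        self.next_xid += 1
        self.running.add(xid)
        return xid

    def insert(self, xid: int, value: str) -> None:
        self.rows.append((xid, value))

    def commit(self, xid: int) -> None:
        self.running.discard(xid)

    def take_snapshot(self) -> Snapshot:
        xmin = min(self.running, default=self.next_xid)
        return Snapshot(xmin, self.next_xid, frozenset(self.running))

    def export_snapshot(self, snap: Snapshot) -> str:
        # Hand out an opaque token another session can use to adopt the snapshot.
        token = f"snap-{len(self.exported) + 1}"
        self.exported[token] = snap
        return token

    def import_snapshot(self, token: str) -> Snapshot:
        return self.exported[token]

    def read(self, snap: Snapshot) -> List[str]:
        return [value for xid, value in self.rows if snap.sees(xid)]


node = Node()
t1 = node.begin()
node.insert(t1, "a")
node.commit(t1)

token = node.export_snapshot(node.take_snapshot())  # session A exports

t2 = node.begin()
node.insert(t2, "b")
node.commit(t2)

print(node.read(node.import_snapshot(token)))  # session B, using A's snapshot
print(node.read(node.take_snapshot()))         # a fresh snapshot taken now
```

Extending this to several nodes would additionally need the nodes to agree on transaction IDs or on some mapping between them, which this toy model does not attempt.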
Hi, ---------- Koichi Suzuki 2010/2/12 Markus Wanner <markus@bluegap.ch>: > > Hi, > > On Fri, 12 Feb 2010 22:21:23 +0900, Koichi Suzuki <koichi.szk@gmail.com> > wrote: >> 2010/2/12 Fujii Masao <masao.fujii@gmail.com>: >>> Umm.. do you think that the "Export snapshots to other sessions" feature >>> is completely independent of the snapshot management for the parallel >>> pg_dump? >>> If not, we should make the patch more extensible for the feature we want? > > I think a common, more general implementation is possible, but certainly > requires more work than a plain parallel pg_dump solution. > >> I don't think so. But in the cluster, from my experience of PG-2, it >> is essential to manage snapshot consistent among all the node involved >> and we should be very careful to make other local tasks, for example, >> autovacuum and analyze, not disturb other transactions because they >> can become global long transactions very easily. > > Well, I fear that's heavily dependent on the clustering solution. So I > think the first step will be to have a general purpose snapshot exporting / > cloning feature for a single node (i.e. usable for parallel querying). > Extending that to allow for distributed parallel querying is the next step. Yes, I agree that there's a common feature for the local and cluster use cases, and that the cluster use case needs some more. But as you mentioned, I believe a cluster can use the general purpose feature with some extension. > > Regards > > Markus Wanner > >
Hello > A few people from the Cluster meeting have been discussing what things > we might continue conversations about, or maybe hack on during PgCon > (May 18-21, 2010) this year. > > So my questions to this list: > > Are you planning on attending PgCon? Yes > Is there any particular subject related to clustering that you're most > eager to discuss or hear an update about during the conference? Techniques and solutions for High Availability and Scalability in mission critical environments. > If we scheduled a cluster-hackers specific meetup one evening, would > you be interested in attending? Yes, I'm sure I would be there. Flavio Henrique A. Gurgel tel. 55-11-2125.4786 cel. 55-11-8389.7635 www.4linux.com.br FREE SOFTWARE SOLUTIONS
> >* Getting truncate MVCC safe > >* Getting system tables MVCC safe > > Can you elaborate on how these two relate to clustering? Well, the second is not so vital anymore with the introduction of session_replication_role, but making updates in a safe manner on older versions of Postgres requires mucking around with the system catalogs, which (unless you actually lock the tables, a non-starter), leads to problems. Getting truncate MVCC safe would make life much easier, as truncate is so much faster than delete when doing a bulk update of a replicated table. > >Perhaps, but there seems to be very little overlap in the things I > >work on (async replication) versus some of the other solutions > >(esp. built-in and synchronous ones) > > Spotting the common issues/modules is one of the reasons why we get > (or got) together. > > For example, I'm interested in the conflict resolution logic of > Bucardo. I plan to add something akin for Postgres-R to reduce the > amount of transactions that need to be aborted. Well, it's all open source for the looking. :) Here's a quick summary. Basically, Bucardo supports two types of conflict resolution: standard and custom. The standard includes a handful of default rules about which side (let's call them A and B) should "win" in a conflict. Options include 'source' (A), 'target' (B), 'latest' (who made the most recent change), and 'random' (call it a really weird form of load balancing :). The custom handlers are much more interesting, as they can incorporate business logic. Basically, you write a Perl subroutine that gets fed information about the current conflict and returns a code indicating which side should win. The passed in information also includes handles to both sides, so that you can query other tables, and even update the rows in conflict directly. I don't know if any of that is really applicable to something like Postgres-R, but maybe that's one of the things we can talk about.
-- Greg Sabino Mullane greg@endpoint.com End Point Corporation PGP Key: 0x14964AC8
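The standard and custom rules summarized above could be sketched roughly as follows. This is a minimal Python illustration, not Bucardo's actual interface: real custom handlers are Perl subroutines that also receive database handles, and the `RowVersion` class, the `resolve` function, its return convention and the `prefer_paid` handler are all hypothetical names invented for this sketch.

```python
import random
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Optional


@dataclass
class RowVersion:
    """One side's copy of the conflicting row (hypothetical structure)."""
    data: dict
    modified_at: datetime


# A custom handler looks at both copies and names the winner:
# "source" (side A) or "target" (side B).
Handler = Callable[[RowVersion, RowVersion], str]


def resolve(strategy: str, source: RowVersion, target: RowVersion,
            custom: Optional[Handler] = None,
            rng: Optional[random.Random] = None) -> RowVersion:
    """Return the winning copy of the row for the given strategy name."""
    if strategy in ("source", "target"):
        winner = strategy
    elif strategy == "latest":
        # Most recent change wins; ties go to the source in this sketch.
        newer = source.modified_at >= target.modified_at
        winner = "source" if newer else "target"
    elif strategy == "random":
        winner = (rng or random).choice(["source", "target"])
    elif strategy == "custom":
        if custom is None:
            raise ValueError("the custom strategy needs a handler")
        winner = custom(source, target)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    if winner not in ("source", "target"):
        raise ValueError(f"handler returned an unknown side: {winner}")
    return source if winner == "source" else target


def prefer_paid(source: RowVersion, target: RowVersion) -> str:
    """Example business rule: a copy marked as paid always wins."""
    if target.data.get("status") == "paid":
        return "target"
    return "source"


a = RowVersion({"id": 1, "status": "new"}, datetime(2010, 3, 1))
b = RowVersion({"id": 1, "status": "paid"}, datetime(2010, 3, 2))
print(resolve("latest", a, b).data)
print(resolve("custom", a, b, custom=prefer_paid).data)
```

A synchronous system would presumably call such a handler before deciding which transaction to abort rather than after both sides have committed, so the shape of the inputs might differ.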
Hi, On Wed, 3 Mar 2010 10:08:01 -0500, Greg Sabino Mullane <greg@endpoint.com> wrote: > Well, it's all open source for the looking. :) Here's a quick summary. > Basically, Bucardo supports two types of conflict resolution: standard > and custom. The standard includes a handful of default rules about which > side (let's call them A and B) should "win" in a conflict. Options > include 'source' (A), 'target' (B), 'latest' (who made the most recent > change), and 'random' (call it a really weird form of load balancing :). > The custom handlers are much more interesting, as they can incorporate > business logic. Basically, you write a Perl subroutine that gets fed > information about the current conflict and returns a code indicating > which side should win. The passed in information also includes handles > to both sides, so that you can query other tables, and even update > the rows in conflict directly. I don't know if any of that is really > applicable to something like Postgres-R, but maybe that's one of the > things we can talk about. Thank you for this quick summary (it's vastly more efficient than having to study source code ;-) ). For Postgres-R, I have something akin to your custom handlers in mind. Maybe we can even come up with a compatible interface? I'll certainly have a look at Bucardo's custom (conflict resolution) handler interface. Regards Markus Wanner