Thread: PostgreSQL <-> Babelfish integration
I would like to share my thoughts with the list about the potential PostgreSQL <-> Babelfish integration. There is already a thread about protocol hooks [1], but I'd like to offer my PoV from a higher-level perspective and keep that thread for the technical aspects of the protocol hooks. This is also a follow-up to a public blog post I recently published [2], and to the feedback I received to bring the topic to the ML. As I stated in the mentioned post, I believe Babelfish is a very welcome addition to the PostgreSQL ecosystem. It allows PostgreSQL to reach other users, other use cases, other markets; something which in my opinion PostgreSQL really needs to extend its reach, to become a more relevant player in the database market. The potential is there, especially given all the extensibility points that PostgreSQL already has, which are unparalleled in the industry. I believe we should engage in a conversation, with AWS included, about how we can possibly benefit from this integration. It must be symbiotic: both "parties" should win with it, otherwise it won't work. But I believe it can definitely be a win-win situation. There have been some concerns that this may be for Amazon's own benefit, and would entail an increased maintenance burden for the PostgreSQL Community. I believe that analysis does not include the many benefits that such compatibility would bring to PostgreSQL on many fronts. And possibly, the changes required to core are beneficial for other areas of PostgreSQL. Several people have already pointed out in the extensibility hooks thread that this could allow for new protocols in PostgreSQL, including the much desired v4 or an HTTP one. I can only strongly second that, and we should also analyze it from this perspective. There is also a risk factor that I believe needs to be factored into the analysis: the risk of not doing anything.
From my understanding, it is very clear that AWS wants to treat Babelfish as a kind of development branch, waiting for inclusion into mainline. But I also believe that if this branch sits unmerged forever, at some point it may be at risk of taking on a life of its own and becoming a fork. And if it does, it may become our "MariaDB". I would not like this to happen. I'm happy to contribute what I can to this discussion: if we want Babelfish to be integrated, how to do it, analyzing pros and cons, etc. I see this as an incredible gift that, if managed properly, will not only make PostgreSQL much better in use cases that it cannot reach now, but may also boost PostgreSQL's extensibility even further, and maybe even spark development of some projects (like a v4 or HTTP protocol) that have long been dismissed because there were (logically) too many requirements for any v3 replacement, which made replacing it extremely hard. But of course, these are just the humble 2 cents of a casual -hackers reader. Álvaro [1] https://www.postgresql.org/message-id/CAGBW59d5SjLyJLt-jwNv%2BoP6esbD8SCB%3D%3D%3D11WVe5%3DdOHLQ5wQ%40mail.gmail.com [2] https://postgresql.fund/blog/babelfish-the-elephant-in-the-room/ -- Alvaro Hernandez ----------- OnGres
On Fri, Feb 12, 2021 at 10:26 AM Álvaro Hernández <aht@ongres.com> wrote: > As I stated in the mentioned post, I believe Babelfish is a very > welcomed addition to the PostgreSQL ecosystem. It allows PostgreSQL to > reach other users, other use cases, other markets; something which in my > opinion PostgreSQL really needs to extend its reach, to become a more > relevant player in the database market. The potential is there, > specially given all the extensibility points that PostgreSQL already > has, which are unparalleled in the industry. Let's assume for the sake of argument that your analysis of the benefits is 100% correct -- let's take it for granted that Babelfish is manna from heaven. It's still not clear that it's worth embracing Babelfish in the way that you have advocated. We simply don't know what the costs are. Because there is no source code available. Maybe that will change tomorrow or next week, but as of this moment there is simply nothing substantive to evaluate. -- Peter Geoghegan
On 12/2/21 19:44, Peter Geoghegan wrote: > On Fri, Feb 12, 2021 at 10:26 AM Álvaro Hernández <aht@ongres.com> wrote: >> As I stated in the mentioned post, I believe Babelfish is a very >> welcomed addition to the PostgreSQL ecosystem. It allows PostgreSQL to >> reach other users, other use cases, other markets; something which in my >> opinion PostgreSQL really needs to extend its reach, to become a more >> relevant player in the database market. The potential is there, >> specially given all the extensibility points that PostgreSQL already >> has, which are unparalleled in the industry. > Let's assume for the sake of argument that your analysis of the > benefits is 100% correct -- let's take it for granted that Babelfish > is manna from heaven. It's still not clear that it's worth embracing > Babelfish in the way that you have advocated. > > We simply don't know what the costs are. Because there is no source > code available. Maybe that will change tomorrow or next week, but as > of this moment there is simply nothing substantive to evaluate. I'm sure if we embrace an open and honest conversation, we will be able to figure out what the integration costs are even before the source code gets published. As I said, this goes beyond the very technical detail of source code integration. Waiting until the source code is published is a bit chicken-and-egg (as I presume the source will morph towards convergence if there's work that may be started, even if it is just for example for protocol extensibility). I'm sure this can be also discussed at an architectural level, getting an analysis of what parts of PostgreSQL would be changed, what extension mechanisms are required, what is the volume of the project, and many others. Álvaro -- Alvaro Hernandez ----------- OnGres
On Fri, 12 Feb 2021 at 19:44, Peter Geoghegan <pg@bowt.ie> wrote: > > On Fri, Feb 12, 2021 at 10:26 AM Álvaro Hernández <aht@ongres.com> wrote: > > As I stated in the mentioned post, I believe Babelfish is a very > > welcomed addition to the PostgreSQL ecosystem. It allows PostgreSQL to > > reach other users, other use cases, other markets; something which in my > > opinion PostgreSQL really needs to extend its reach, to become a more > > relevant player in the database market. The potential is there, > > specially given all the extensibility points that PostgreSQL already > > has, which are unparalleled in the industry. > > Let's assume for the sake of argument that your analysis of the > benefits is 100% correct -- let's take it for granted that Babelfish > is manna from heaven. It's still not clear that it's worth embracing > Babelfish in the way that you have advocated. > > We simply don't know what the costs are. Because there is no source > code available. Maybe that will change tomorrow or next week, but as > of this moment there is simply nothing substantive to evaluate. I agree. I believe that Babelfish's efforts can be compared with the zedstore and zheap efforts: they require work in core before they can be integrated or added as an extension that could replace the normal heap tableam, and while core is being prepared we can discover what can and cannot be prepared in core for this new feature. But as long as there is no information about what structural updates in core would be required, no commitment can be made for inclusion. And although I would agree that an extension system for custom protocols and parsers would be interesting, I think it would be putting the cart before the horse if you want to force a decision 4 years ahead of time [0], without ever having seen the code or even a design document. 
In general, I think postgres could indeed benefit from a pluggable protocol and dialect frontend, but as long as there are no public and open projects that demonstrate the benefits or would provide a guide for implementing such a frontend, I see no reason for the postgres project to put work into such a feature. With regards, Matthias van de Meent [0] I believe this is an optimistic guess, based on the changes that were (and still are) required for the zedstore and/or zheap tableam, but I am happy to be proven wrong.
On Fri, Feb 12, 2021 at 11:04 AM Álvaro Hernández <aht@ongres.com> wrote: > I'm sure if we embrace an open and honest conversation, we will be > able to figure out what the integration costs are even before the source > code gets published. As I said, this goes beyond the very technical > detail of source code integration. Perhaps that's true in one sense, but if the cost of integrating Babelfish is prohibitive then it's still not going to go anywhere. If it did happen then that would certainly involve at least one or two senior community members that personally adopt it. That's our model for everything, to some degree because there is no other way that it could work. It's very bottom-up. For better or worse, very high level discussion like this has always followed from code, not the other way around. We don't really have the ability or experience to do it any other way IMO. > Waiting until the source code is > published is a bit chicken-and-egg (as I presume the source will morph > towards convergence if there's work that may be started, even if it is > just for example for protocol extensibility). Well, the priorities of Postgres development are not set in any fixed way (except to the limited extent that you're on the hook for anything you integrate that breaks). I myself am not convinced that this is worth spending any time on right now, especially given the lack of code to evaluate. -- Peter Geoghegan
On Fri, Feb 12, 2021 at 11:13 AM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote: > I agree. I believe that Babelfish's efforts can be compared with the > zedstore and zheap efforts: they require work in core before they can > be integrated or added as an extension that could replace the normal > heap tableam, and while core is being prepared we can discover what > can and cannot be prepared in core for this new feature. I see what you mean, but even that seems generous to me, since, as I said, we don't have any Babelfish code to evaluate today. Whereas Zedstore and zheap can actually be downloaded and tested. -- Peter Geoghegan
Just wanted to link to the discussion on this from HN for anyone interested: https://news.ycombinator.com/item?id=26114281
We are applying the Babelfish commits to the REL_12_STABLE branch now, and the plan is to merge them into the REL_13_STABLE and master branch ASAP after that. There should be a publicly downloadable git repository before very long. On 2/12/21, 2:35 PM, "Peter Geoghegan" <pg@bowt.ie> wrote: On Fri, Feb 12, 2021 at 11:13 AM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote: > I agree. I believe that Babelfish's efforts can be compared with the > zedstore and zheap efforts: they require work in core before they can > be integrated or added as an extension that could replace the normal > heap tableam, and while core is being prepared we can discover what > can and cannot be prepared in core for this new feature. I see what you mean, but even that seems generous to me, since, as I said, we don't have any Babelfish code to evaluate today. Whereas Zedstore and zheap can actually be downloaded and tested. -- Peter Geoghegan
On Mon, 15 Feb 2021 at 17:01, Finnerty, Jim <jfinnert@amazon.com> wrote: > > We are applying the Babelfish commits to the REL_12_STABLE branch now, and the plan is to merge them into the REL_13_STABLE and master branch ASAP after that. There should be a publicly downloadable git repository before very long. Hi, Out of curiosity, are you able to share the status of the publication of this repository? I mainly ask this because I haven't seen any announcements from Amazon / AWS regarding the publication of the Babelfish project since the start of this thread, and the relevant websites [0][1][2] also do not appear to have seen an update. The last mention of Babelfish in a thread here on -hackers also seems to date back only to late March. Kind regards, Matthias van de Meent [0] https://babelfish-for-postgresql.github.io/babelfish-for-postgresql/ [1] https://aws.amazon.com/rds/aurora/babelfish/ [2] https://github.com/babelfish-for-postgresql/