Thread: Proposal: include PL/Proxy into core
PL/Proxy is a small PL whose goal is to allow creating "proxy functions" that call the actual functions in a remote database.

Basic design is:

Function body describes how to deduce the final database. It is either

    CONNECT 'connstr';   -- connect to exactly this db

or, when partitioning is used:

    -- partitions are described under that name
    CLUSTER 'namestr';

    -- calculate int4 based on function parameters
    -- and use that to pick a partition
    RUN ON hashtext(username);

The actual function call info (arguments, result fields) is deduced by looking at the proxy function's own signature, so a function "foo(int4, text) returns setof text" will result in the query "select * from foo($1::int4, $2::text)" being executed remotely.

Announcement with more examples:

    http://archives.postgresql.org/pgsql-announce/2007-03/msg00005.php

Documentation:

    https://developer.skype.com/SkypeGarage/DbProjects/PlProxy

Patch:

    http://plproxy.projects.postgresql.org/plproxy_core.diff.gz

Now, why put it into core?

1) Much simpler replacement for various other clustering solutions
   that try to cluster regular SQL.

2) Nicer replacement for dblink.

3) PLs need much more intimate knowledge of the PostgreSQL core than
   regular modules; the API for PLs has changed in every major revision
   of PostgreSQL.

4) It promotes the db-access-through-functions design, which has proven
   to be a killer feature of PostgreSQL. In a sense it uses PostgreSQL
   as an appserver that provides a fixed API via functions for external
   users but hides the internal layout from them, so the layout can be
   changed invisibly to the external users.

5) The language is ready feature-wise - there is no need for it to grow
   into a "Remote PL/pgSQL", because all the logic can be put into the
   remote function.

Some objections that may come up:

1) It is not a universal solves-everything tool for remote
   access/clustering.

   But those solves-everything tools have a very hard time maturing,
   and they will not be exactly simple. It is much better to have a
   simple tool that works well.

2) You can't use it for everything you can use dblink for.

   PL/Proxy is easier to use for simple result fetching; for complicated
   access, using a full-blown PL (plperl, plpython) is better. From that
   point of view dblink is replaced.

3) It is possible for the PL to live outside core; the pain is not that big.

   Sure, it's possible. We just feel that its usefulness-to-lines-of-code
   ratio is very high, so it is worthy of being built into the PostgreSQL
   core, which would also give PostgreSQL the opportunity to boast of
   being clusterable out of the box.

4) What about all the existing apps that don't access the database
   through functions?

   Those are the target for a "solves-everything" tool...

5) It is too new a product.

   We think this is offset by the small scope of the task it takes on,
   and it already works well in that scope.

-- marko
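To make the calling convention above concrete, here is a minimal sketch of what a proxy function and its remote counterpart could look like. The function name, table, cluster name and connection string are made up for illustration; the CONNECT / CLUSTER / RUN ON statements are the ones described in the proposal.

    -- On the proxy database: the body only says where to run;
    -- the remote query is generated from the function's signature.
    CREATE FUNCTION get_user_email(username text)
    RETURNS SETOF text AS $$
        CLUSTER 'userdb';            -- hypothetical cluster name
        RUN ON hashtext(username);   -- pick partition by hashing an argument
    $$ LANGUAGE plproxy;

    -- Single-database variant: always go to one fixed db.
    CREATE FUNCTION get_user_email_single(username text)
    RETURNS SETOF text AS $$
        CONNECT 'dbname=userdb host=127.0.0.1';   -- hypothetical connstr
    $$ LANGUAGE plproxy;

    -- On the remote/partition database: an ordinary function with a
    -- matching signature does the real work (hypothetical table).
    CREATE FUNCTION get_user_email(username text)
    RETURNS SETOF text AS $$
        SELECT email FROM users WHERE users.username = $1;
    $$ LANGUAGE sql;

Running "SELECT * FROM get_user_email('bob')" against the proxy database then executes "SELECT * FROM get_user_email($1::text)" on whichever partition hashtext('bob') maps to.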
On Friday 30 March 2007 12:36, Marko Kreen wrote:
> Patch:
>
>   http://plproxy.projects.postgresql.org/plproxy_core.diff.gz

Note what is perhaps an oversight in your makefile:

+ #REGRESS_OPTS = --dbname=$(PL_TESTDB) --load-language=plpgsql --load-language=plproxy
+ REGRESS_OPTS = --dbname=regression --load-language=plpgsql --load-language=plproxy
On 3/30/07, Cédric Villemain <cedric.villemain@dalibo.com> wrote:
> On Friday 30 March 2007 12:36, Marko Kreen wrote:
> > Patch:
> >
> >   http://plproxy.projects.postgresql.org/plproxy_core.diff.gz
>
> Note what is perhaps an oversight in your makefile:
>
> + #REGRESS_OPTS = --dbname=$(PL_TESTDB) --load-language=plpgsql --load-language=plproxy
> + REGRESS_OPTS = --dbname=regression --load-language=plpgsql --load-language=plproxy

Heh. The problem is that I had 'regression' hardwired into the regression tests, so I could not use $(PL_TESTDB).

If the proposal is accepted and we want to always run the PL/Proxy regression tests, there should be some dynamic way of passing the main dbname, and also the connstrings for the partitions, into the regression tests. For the moment I thought it could stay as-is.

(Actually, I forgot about that change after I had done it :)

-- marko
On Fri, 2007-03-30 at 13:36, Marko Kreen wrote:
> PL/Proxy is a small PL whose goal is to allow creating "proxy functions"
> that call the actual functions in a remote database.
>
> Basic design is:
>
> Function body describes how to deduce the final database. It is either
>
>     CONNECT 'connstr';   -- connect to exactly this db
>
> or, when partitioning is used:
>
>     -- partitions are described under that name
>     CLUSTER 'namestr';
>
>     -- calculate int4 based on function parameters
>     -- and use that to pick a partition
>     RUN ON hashtext(username);
>
> The actual function call info (arguments, result fields) is deduced by
> looking at the proxy function's own signature, so a function
> "foo(int4, text) returns setof text" will result in the query
> "select * from foo($1::int4, $2::text)" being executed remotely.
>
> Announcement with more examples:
>
>     http://archives.postgresql.org/pgsql-announce/2007-03/msg00005.php
>
> Documentation:
>
>     https://developer.skype.com/SkypeGarage/DbProjects/PlProxy
>
> Patch:
>
>     http://plproxy.projects.postgresql.org/plproxy_core.diff.gz
>
> Now, why put it into core?
>
> 1) Much simpler replacement for various other clustering solutions
>    that try to cluster regular SQL.
>
> 2) Nicer replacement for dblink.
>
> 3) PLs need much more intimate knowledge of the PostgreSQL core than
>    regular modules; the API for PLs has changed in every major revision
>    of PostgreSQL.
>
> 4) It promotes the db-access-through-functions design, which has proven
>    to be a killer feature of PostgreSQL. In a sense it uses PostgreSQL
>    as an appserver that provides a fixed API via functions for external
>    users but hides the internal layout from them, so the layout can be
>    changed invisibly to the external users.
>
> 5) The language is ready feature-wise - there is no need for it to grow
>    into a "Remote PL/pgSQL", because all the logic can be put into the
>    remote function.
>
> Some objections that may come up:
>
> 1) It is not a universal solves-everything tool for remote
>    access/clustering.
>
>    But those solves-everything tools have a very hard time maturing,
>    and they will not be exactly simple. It is much better to have a
>    simple tool that works well.

The current pl/proxy proposed here for inclusion is already an almost complete redesign and rewrite, based on our experience of using the initial version in production databases, so you can expect version-2.x robustness, maintainability and code cleanliness from it.

> 5) It is too new a product.
>
>    We think this is offset by the small scope of the task it takes on,
>    and it already works well in that scope.

Also, it is actively used, serving thousands of requests per second in a 24/7 live environment, which means it should be reasonably well tested.

Together with our lightweight connection pooler, PgBouncer (https://developer.skype.com/SkypeGarage/DbProjects/PgBouncer), pl/proxy can be used to implement the vision of building a "DB-bus" over a database farm of diverse PostgreSQL servers, as shown in slide 3 of https://developer.skype.com/SkypeGarage/DbProjects/SkypePostgresqlWhitepaper .

The connection pooler is not strictly needed and can be left out for smaller configurations, say with fewer than about 10 databases and/or concurrent db connections. (Btw, the connection pooler's name, PgBouncer, comes from its initial focus on "bouncing around" single-transaction db calls.)

-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com
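For readers trying to picture how such a partitioned "DB-bus" setup is wired together, here is a minimal sketch of the cluster-configuration side, loosely following the configuration functions described in the PL/Proxy documentation linked above. The cluster name, partition connstrings and host addresses are placeholders, and the exact set of configuration functions and settings may differ in the version under discussion. With PgBouncer in front, the connstrings would simply point at the pooler's listening ports instead of directly at the partition databases.

    -- Hypothetical configuration for a 4-partition 'userdb' cluster.
    -- PL/Proxy looks these functions up in the plproxy schema.
    CREATE SCHEMA plproxy;

    -- The version number lets PL/Proxy notice when the partition list changes.
    CREATE FUNCTION plproxy.get_cluster_version(cluster_name text)
    RETURNS int4 AS $$
        SELECT 1;
    $$ LANGUAGE sql;

    -- Partition list; per the docs, the partition count should be a power of two.
    -- (The docs also describe a get_cluster_config() function for connection
    -- settings such as timeouts, omitted here.)
    CREATE FUNCTION plproxy.get_cluster_partitions(cluster_name text)
    RETURNS SETOF text AS $$
                  SELECT 'dbname=userdb_p0 host=10.0.0.1'::text
        UNION ALL SELECT 'dbname=userdb_p1 host=10.0.0.2'
        UNION ALL SELECT 'dbname=userdb_p2 host=10.0.0.3'
        UNION ALL SELECT 'dbname=userdb_p3 host=10.0.0.4';
    $$ LANGUAGE sql;

A proxy function declared with CLUSTER 'userdb' then fans its calls out over these four connstrings according to its RUN ON expression.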
"Marko Kreen" <markokr@gmail.com> writes: > Now, why put it into core? I don't think you have made a sufficient case for that. I think it should stay as an outside project for awhile and see what sort of userbase it attracts. If it becomes sufficiently popular I'd be willing to consider adding it to core, but that remains to be seen. We can barely keep up maintaining what's in core now --- we need to be very strict about adding stuff that doesn't really have to be in core, and this evidently doesn't, since you've got it working ... regards, tom lane
Hannu, Marko,

I, personally, think that it's worth talking about integrating these. However, the old versions were definitely NOT ready for integration, and the new versions went up on the internet like a week ago. Heck, I haven't even downloaded them yet.

Can we address these on the 8.4 timeline? That will give the rest of us in the community time to download, try, and debug the new SkyTools. I know I'm planning on testing them and will know a lot more about your code/performance in a few months. Is there a reason why getting PL/Proxy into 8.3 is critical?

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco
On 3/30/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Marko Kreen" <markokr@gmail.com> writes:
> > Now, why put it into core?
>
> I don't think you have made a sufficient case for that. I think it
> should stay as an outside project for awhile and see what sort of
> userbase it attracts. If it becomes sufficiently popular I'd be
> willing to consider adding it to core, but that remains to be seen.

Fair enough.

-- marko
On 3/30/07, Josh Berkus <josh@agliodbs.com> wrote:
> I, personally, think that it's worth talking about integrating these.
> However, the old versions were definitely NOT ready for integration, and
> the new versions went up on the internet like a week ago. Heck, I haven't
> even downloaded them yet.

Yeah, the old version was a bit too complicated. That's why we did a rewrite, and it turned out nice and simple.

> Can we address these on the 8.4 timeline? That will give the rest of us
> in the community time to download, try, and debug the new SkyTools. I
> know I'm planning on testing them and will know a lot more about your
> code/performance in a few months. Is there a reason why getting PL/Proxy
> into 8.3 is critical?

No hurry. It's just that the timing was too good... Also, if there are design/API issues that may hinder merging, we'd like to solve them immediately.

-- marko