Thread: Federated Postgresql architecture ?
Hi List; Anyone have any experiences to share per setting up a federated architecture with PostgreSQL ? I wonder if the dblink contrib works well in a federated scenario, specifically in the setup of the federated views which equate to a select * from the same table on each federated server ? Thanks in advance... /Kevin
On Thu, Jun 26, 2008 at 4:33 PM, kevin kempter <kevin@kevinkempterllc.com> wrote: > Anyone have any experiences to share per setting up a federated architecture > with PostgreSQL ? I wonder if the dblink contrib works well in a federated > scenario, specifically in the setup of the federated views which equate to a > select * from the same table on each federated server ? Because Postgres currently lacks the ability to push down predicates to individual nodes over a database link, you have to spend a good amount of time writing PL set-returning functions capable of adding appropriate WHERE clauses to queries sent over the link. There are other things you can do, but it's mostly hackery at this point in time. IIRC, David Fetter is trying to get some of the required predicate information exposed for use in DBI-Link. Not to self-plug, but if you require it, EnterpriseDB includes Oracle-style database links (SELECT col FROM table@node) which support predicate push-down. -- Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324 EnterpriseDB Corporation | fax: 732.331.1301 499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com Edison, NJ 08837 | http://www.enterprisedb.com/
Kevin, > Not to self-plug, but if you require it, EnterpriseDB includes > Oracle-style database links (SELECT col FROM table@node) which support > predicate push-down. Also check out Skytools: http://skytools.projects.postgresql.org/doc/ -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco
On Thu, Jun 26, 2008 at 5:41 PM, Josh Berkus <josh@agliodbs.com> wrote: >> Not to self-plug, but if you require it, EnterpriseDB includes >> Oracle-style database links (SELECT col FROM table@node) which support >> predicate push-down. > > Also check out Skytools: http://skytools.projects.postgresql.org/doc/ Hmm, I didn't think the Skype tools could really provide federated database functionality without a good amount of custom work. Or, am I mistaken? -- Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324 EnterpriseDB Corporation | fax: 732.331.1301 499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com Edison, NJ 08837 | http://www.enterprisedb.com/
Jonah, > Hmm, I didn't think the Skype tools could really provide federated > database functionality without a good amount of custom work. Or, am I > mistaken? Sure, what do you think pl/proxy is for? -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco
On Thu, Jun 26, 2008 at 6:31 PM, Josh Berkus <josh@agliodbs.com> wrote: > Sure, what do you think pl/proxy is for? Well, considering that an application must be written specifically to make use of it, and for very specific scenarios, I wouldn't consider it as making PostgreSQL a federated database. The pl/proxy architecture certainly doesn't resemble federated in the sense of the other database vendors. -- Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324 EnterpriseDB Corporation | fax: 732.331.1301 499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com Edison, NJ 08837 | http://www.enterprisedb.com/
josh@agliodbs.com (Josh Berkus) writes: > Jonah, > >> Hmm, I didn't think the Skype tools could really provide federated >> database functionality without a good amount of custom work. Or, am I >> mistaken? > > Sure, what do you think pl/proxy is for? Ah, but the thing is, it changes the model from a relational one, where you can have fairly arbitrary "where clauses," to one where parameterization of queries must be predetermined. The "hard part" of federated database functionality at this point is the [parenthesized portion] of... select * from table@node [where criterion = x]; What we'd like to be able to do is to ascertain that [where criterion = x] portion, and run it on the remote DBMS, so that only the relevant tuples would come back. Consider... What if table@node is a remote table with 200 million tuples, and [where criterion = x] restricts the result set to 200 of those. If you *cannot* push the "where clause" down to the remote node, then you're stuck with pulling all 200 million tuples, and filtering out, on the "local" node, the 200 tuples that need to be kept. To do better, with pl/proxy, requires having a predetermined function that would do that filtering, and if it's missing, you're stuck pulling 200M tuples, and throwing out nearly all of them. In contrast, with the work David Fetter's looking at, the [where criterion = x] clause would get pushed to the node which the data is being drawn from, and so the query, when running on "table@node," could use indices, and return only the 200 tuples that are of interest. It's a really big win, if it works. -- select 'cbbrowne' || '@' || 'cbbrowne.com'; http://cbbrowne.com/info/lisp.html "The avalanche has started, it is too late for the pebbles to vote" -- Kosh, Vorlon Ambassador to Babylon 5
On 6/27/08, Chris Browne <cbbrowne@acm.org> wrote: > josh@agliodbs.com (Josh Berkus) writes: > > Jonah, > > > >> Hmm, I didn't think the Skype tools could really provide federated > >> database functionality without a good amount of custom work. Or, am I > >> mistaken? > > > > Sure, what do you think pl/proxy is for? > > > Ah, but the thing is, it changes the model from a relational one, > where you can have fairly arbitrary "where clauses," to one where > parameterization of queries must be predetermined. > > The "hard part" of federated database functionality at this point is > the [parenthesized portion] of... > > select * from table@node [where criterion = x]; > > What we'd like to be able to do is to ascertain that [where criterion > = x] portion, and run it on the remote DBMS, so that only the relevant > tuples would come back. > > Consider... > > What if table@node is a remote table with 200 million tuples, and > [where criterion = x] restricts the result set to 200 of those. > > If you *cannot* push the "where clause" down to the remote node, then > you're stuck with pulling all 200 million tuples, and filtering out, > on the "local" node, the 200 tuples that need to be kept. > > To do better, with pl/proxy, requires having a predetermined function > that would do that filtering, and if it's missing, you're stuck > pulling 200M tuples, and throwing out nearly all of them. > > In contrast, with the work David Fetter's looking at, the [where > criterion = x] clause would get pushed to the node which the data is > being drawn from, and so the query, when running on "table@node," > could use indices, and return only the 200 tuples that are of > interest. > > It's a really big win, if it works. I agree that for doing free-form queries on remote database, the PL/Proxy is not the right answer. (Although the recent patch to support dynamic records with AS clause at least makes them work.) But I want to clarify it's goal - it is not to run "pre-determined queries." It is to run "pre-determined complex transactions." And to make those work in a "federated database" takes huge amount of complexity that PL/Proxy simply sidesteps. At the price of requiring function-based API. But as the function-based API has other advantages even without PL/Proxy, it seems fine tradeoff. -- marko
On Mon, Jun 30, 2008 at 9:16 AM, Marko Kreen <markokr@gmail.com> wrote: > But I want to clarify it's goal - it is not to run "pre-determined > queries." It is to run "pre-determined complex transactions." Yes. > And to make those work in a "federated database" takes huge amount > of complexity that PL/Proxy simply sidesteps. At the price of > requiring function-based API. But as the function-based API has > other advantages even without PL/Proxy, it seems fine tradeoff. Agreed. PL/Proxy has its own set of advantages. As usual, it really just depends on the application and its requirements. -- Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324 EnterpriseDB Corporation | fax: 732.331.1301 499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com Edison, NJ 08837 | http://www.enterprisedb.com/