Thread: Federated Postgresql architecture ?

Federated Postgresql architecture ?

From
kevin kempter
Date:
Hi List;

Anyone have any experiences to share per setting up a federated
architecture with PostgreSQL ? I wonder if the dblink contrib works
well in a federated scenario, specifically in the setup of the
federated views which equate to a select * from the same table on each
federated server ?

Thanks in advance...


/Kevin

Re: Federated Postgresql architecture ?

From
"Jonah H. Harris"
Date:
On Thu, Jun 26, 2008 at 4:33 PM, kevin kempter
<kevin@kevinkempterllc.com> wrote:
> Anyone have any experiences to share per setting up a federated architecture
> with PostgreSQL ? I wonder if the dblink contrib works well in a federated
> scenario, specifically in the setup of the federated views which equate to a
> select * from the same table on each federated server ?

Because Postgres currently lacks the ability to push down predicates
to individual nodes over a database link, you have to spend a good
amount of time writing PL set-returning functions capable of adding
appropriate WHERE clauses to queries sent over the link.  There are
other things you can do, but it's mostly hackery at this point in
time.  IIRC, David Fetter is trying to get some of the required
predicate information exposed for use in DBI-Link.

Not to self-plug, but if you require it, EnterpriseDB includes
Oracle-style database links (SELECT col FROM table@node) which support
predicate push-down.

--
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com
Edison, NJ 08837 | http://www.enterprisedb.com/

Re: Federated Postgresql architecture ?

From
Josh Berkus
Date:
Kevin,

> Not to self-plug, but if you require it, EnterpriseDB includes
> Oracle-style database links (SELECT col FROM table@node) which support
> predicate push-down.

Also check out Skytools:  http://skytools.projects.postgresql.org/doc/

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

Re: Federated Postgresql architecture ?

From
"Jonah H. Harris"
Date:
On Thu, Jun 26, 2008 at 5:41 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> Not to self-plug, but if you require it, EnterpriseDB includes
>> Oracle-style database links (SELECT col FROM table@node) which support
>> predicate push-down.
>
> Also check out Skytools:  http://skytools.projects.postgresql.org/doc/

Hmm, I didn't think the Skype tools could really provide federated
database functionality without a good amount of custom work.  Or, am I
mistaken?

--
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com
Edison, NJ 08837 | http://www.enterprisedb.com/

Re: Federated Postgresql architecture ?

From
Josh Berkus
Date:
Jonah,

> Hmm, I didn't think the Skype tools could really provide federated
> database functionality without a good amount of custom work.  Or, am I
> mistaken?

Sure, what do you think pl/proxy is for?

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

Re: Federated Postgresql architecture ?

From
"Jonah H. Harris"
Date:
On Thu, Jun 26, 2008 at 6:31 PM, Josh Berkus <josh@agliodbs.com> wrote:
> Sure, what do you think pl/proxy is for?

Well, considering that an application must be written specifically to
make use of it, and for very specific scenarios, I wouldn't consider
it as making PostgreSQL a federated database.  The pl/proxy
architecture certainly doesn't resemble federated in the sense of the
other database vendors.

--
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com
Edison, NJ 08837 | http://www.enterprisedb.com/

Re: Federated Postgresql architecture ?

From
Chris Browne
Date:
josh@agliodbs.com (Josh Berkus) writes:
> Jonah,
>
>> Hmm, I didn't think the Skype tools could really provide federated
>> database functionality without a good amount of custom work.  Or, am I
>> mistaken?
>
> Sure, what do you think pl/proxy is for?

Ah, but the thing is, it changes the model from a relational one,
where you can have fairly arbitrary "where clauses," to one where
parameterization of queries must be predetermined.

The "hard part" of federated database functionality at this point is
the [parenthesized portion] of...

  select * from table@node [where criterion = x];

What we'd like to be able to do is to ascertain that [where criterion
= x] portion, and run it on the remote DBMS, so that only the relevant
tuples would come back.

Consider...

What if table@node is a remote table with 200 million tuples, and
[where criterion = x] restricts the result set to 200 of those.

If you *cannot* push the "where clause" down to the remote node, then
you're stuck with pulling all 200 million tuples, and filtering out,
on the "local" node, the 200 tuples that need to be kept.

To do better, with pl/proxy, requires having a predetermined function
that would do that filtering, and if it's missing, you're stuck
pulling 200M tuples, and throwing out nearly all of them.

In contrast, with the work David Fetter's looking at, the [where
criterion = x] clause would get pushed to the node which the data is
being drawn from, and so the query, when running on "table@node,"
could use indices, and return only the 200 tuples that are of
interest.

It's a really big win, if it works.
--
select 'cbbrowne' || '@' || 'cbbrowne.com';
http://cbbrowne.com/info/lisp.html
"The avalanche has started, it is too late for the pebbles to vote"
-- Kosh, Vorlon Ambassador to Babylon 5

Re: Federated Postgresql architecture ?

From
"Marko Kreen"
Date:
On 6/27/08, Chris Browne <cbbrowne@acm.org> wrote:
> josh@agliodbs.com (Josh Berkus) writes:
>  > Jonah,
>  >
>  >> Hmm, I didn't think the Skype tools could really provide federated
>  >> database functionality without a good amount of custom work.  Or, am I
>  >> mistaken?
>  >
>  > Sure, what do you think pl/proxy is for?
>
>
> Ah, but the thing is, it changes the model from a relational one,
>  where you can have fairly arbitrary "where clauses," to one where
>  parameterization of queries must be predetermined.
>
>  The "hard part" of federated database functionality at this point is
>  the [parenthesized portion] of...
>
>   select * from table@node [where criterion = x];
>
>  What we'd like to be able to do is to ascertain that [where criterion
>  = x] portion, and run it on the remote DBMS, so that only the relevant
>  tuples would come back.
>
>  Consider...
>
>  What if table@node is a remote table with 200 million tuples, and
>  [where criterion = x] restricts the result set to 200 of those.
>
>  If you *cannot* push the "where clause" down to the remote node, then
>  you're stuck with pulling all 200 million tuples, and filtering out,
>  on the "local" node, the 200 tuples that need to be kept.
>
>  To do better, with pl/proxy, requires having a predetermined function
>  that would do that filtering, and if it's missing, you're stuck
>  pulling 200M tuples, and throwing out nearly all of them.
>
>  In contrast, with the work David Fetter's looking at, the [where
>  criterion = x] clause would get pushed to the node which the data is
>  being drawn from, and so the query, when running on "table@node,"
>  could use indices, and return only the 200 tuples that are of
>  interest.
>
>  It's a really big win, if it works.

I agree that for doing free-form queries on remote database,
the PL/Proxy is not the right answer.  (Although the recent patch
to support dynamic records with AS clause at least makes them work.)

But I want to clarify it's goal - it is not to run "pre-determined
queries."  It is to run "pre-determined complex transactions."

And to make those work in a "federated database" takes huge amount
of complexity that PL/Proxy simply sidesteps.  At the price of
requiring function-based API.  But as the function-based API has
other advantages even without PL/Proxy, it seems fine tradeoff.

--
marko

Re: Federated Postgresql architecture ?

From
"Jonah H. Harris"
Date:
On Mon, Jun 30, 2008 at 9:16 AM, Marko Kreen <markokr@gmail.com> wrote:
> But I want to clarify it's goal - it is not to run "pre-determined
> queries."  It is to run "pre-determined complex transactions."

Yes.

> And to make those work in a "federated database" takes huge amount
> of complexity that PL/Proxy simply sidesteps.  At the price of
> requiring function-based API.  But as the function-based API has
> other advantages even without PL/Proxy, it seems fine tradeoff.

Agreed.  PL/Proxy has its own set of advantages.

As usual, it really just depends on the application and its requirements.

--
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com
Edison, NJ 08837 | http://www.enterprisedb.com/