Federated PG servers -- Was: Re: [GENERAL] "Hash index" vs. "b-tree index" (PostgreSQL - Mailing list pgsql-performance

From Mischa Sandberg
Subject Federated PG servers -- Was: Re: [GENERAL] "Hash index" vs. "b-tree index" (PostgreSQL
Date
Msg-id 1115846482.428277525b64f@webmail.telus.net
Whole thread Raw
In response to Re: [GENERAL] "Hash index" vs. "b-tree index" (PostgreSQL  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Was curious why you pointed out SQL-MED as a SQL-standard approach to
federated servers. Always thought of it as covering access to non-SQL
data, the way the lo_* interface works; as opposed to meshing compatible
(to say nothing of identical) SQL servers. Just checked Jim Melton's
last word on that, to make sure, too. Is there something beyond that,
that I'm missing?

The approach that made first best sense to me (perhaps from having gone
there before) is to leave the SQL syntactically unchanged, and to manage
federated relations via pg_ tables and probably procedures. MSSQL and
Sybase went that route. It won't preclude moving to a system embedded in
the SQL language.

The hurdles for federated SQL service are:
- basic syntax (how to refer to a remote object)
- connection management and delegated security
- timeouts and temporary connection failures
- efficient distributed queries with >1 remote table
- distributed transactions
- interserver integrity constraints

Sometimes the lines get weird because of opportunistic implementations.
For example, for the longest time, MSSQL supported server.db.user.object
references WITHIN STORED PROCEDURES, since the proc engine could hide
some primitive connection management.

PG struck me as such a natural for cross-server queries, because
it keeps everything out in the open, including statistics.
PG is also well set-up to handle heterogeneous table types,
and has functions that return rowsets. Nothing needs to be bent out of
shape syntactically, or in the cross-server interface, to get over the
hurdles above.

The fact that queries hence transactions can't span multiple databases
tells me, PG has a way to go before it can handle dependency on a
distributed transaction monitor.

My 2c.


pgsql-performance by date:

Previous
From: "Jim C. Nasby"
Date:
Subject: Re: Sort and index
Next
From: Mischa Sandberg
Date:
Subject: Re: Bad plan after vacuum analyze