Re: dblink: add polymorphic functions. - Mailing list pgsql-hackers
From | Corey Huinker |
---|---|
Subject | Re: dblink: add polymorphic functions. |
Msg-id | CADkLM=c6Wip4-g5jnRxhhXtq0XCC4tNwO0oRJBMoYKzbzKMOWw@mail.gmail.com |
In response to | Re: dblink: add polymorphic functions. (Merlin Moncure <mmoncure@gmail.com>) |
List | pgsql-hackers |
On Mon, Jul 6, 2015 at 3:52 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Mon, Jul 6, 2015 at 11:13 AM, Corey Huinker <corey.huinker@gmail.com> wrote:
> On Mon, Jul 6, 2015 at 11:33 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>>
>> On Mon, Jul 6, 2015 at 9:52 AM, Joe Conway <mail@joeconway.com> wrote:
>> >
>> > On 07/06/2015 07:37 AM, Merlin Moncure wrote:
>> >> yup, and at least one case now fails where previously it ran
>> >> through: postgres=# select * from dblink('a', 'b', 'c'); ERROR:
>> >> function dblink(unknown, unknown, unknown) is not unique
>> >
>> > Hmm, that is an issue, possibly a fatal one.
>> >
>> > When Corey first mentioned this to me over a year ago we discussed some
>> > other, arguably better and more generic solutions. One was to build
>> > support for:
>> >
>> > SELECT * FROM srf() AS TYPE OF(foo);
>> >
>> > The second idea I think is actually SQL standard if I recall correctly:
>> >
>> > SELECT * FROM CAST(srf() AS foo) x;
>> >
>> > Currently this works:
>> >
>> > 8<--------------------
>> > select *
>> > from cast (row(11,'l','{a11,b11,c11}') as foo);
>> > f1 | f2 | f3
>> > ----+----+---------------
>> > 11 | l | {a11,b11,c11}
>> > (1 row)
>> > 8<--------------------
>> >
>> > But this fails:
>> >
>> > 8<--------------------
>> > select *
>> > from cast
>> > (dblink('dbname=contrib_regression','select * from foo') as foo);
>> > ERROR: cannot cast type record to foo
>> > 8<--------------------
>> >
>> > Comments in the source have this to say:
>> >
>> > 8<--------------------
>> > /*
>> > * coerce_record_to_complex
>> > * Coerce a RECORD to a specific composite type.
>> > *
>> > * Currently we only support this for inputs that are RowExprs or
>> > * whole-row Vars.
>> > */
>> > 8<--------------------
>> >
>> > That explains why the first example works while the second does not.
>> > I'm not sure how hard it would be to fix that, but it appears that
>> > that is where we should focus.
>>
>> Yeah. FWIW, here's my 0.02$: I use dblink all the time, for all
>> kinds of reasons, vastly preferring to have control over the query
>> string (vs. FDW type solutions). I have two basic gripes with it. #1
>> is that remote queries are not cancelled over all call sites when
>> cancelled locally (off-topic for this thread) and #2 is that the SRF
>> record describing mechanics are not abstractable because of using
>> syntax to describe the record. Corey's proposal, overloading issues
>> aside, appears to neatly deal with this problem because anyelement can
>> be passed down through a wrapping API.
>>
>> IOW, I'd like to do:
>> CREATE FUNCTION remote_call(...) RETURNS ... AS
>> $$
>> SELECT dblink(...) AS r(...)
>> $$ language sql;
>>
>> ...which can't be done (discounting dynamic sql acrobatics) because of
>> the syntax based expression of the 'r' record. So I like Corey's
>> idea...I just think the functions need to be named differently (maybe
>> to 'dblink_any' and 'dblink_get_result_any'?). TBH, to do better
>> than that you'd need SQL level support for handling the return type in
>> the vein of NEW/OLD. For fun, let's call it 'OUT'...then you could:
>>
>> SELECT * FROM remote_call(...) RETURNS SETOF foo;
>>
>> Inside remote_call, you'd see something like:
>>
>> SELECT dblink(...) AS OUT;
>>
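For illustration, here is a rough sketch of the "dynamic sql acrobatics" referred to above (the wrapper name and parameters are hypothetical): the column definition list is only known as a string, so it has to be spliced into a dynamically EXECUTEd query and cannot be hidden behind the wrapper's signature.

8<--------------------
-- Hypothetical wrapper; a sketch of the workaround, not a real API.
CREATE OR REPLACE FUNCTION remote_call(_conn text, _query text, _coldefs text)
RETURNS SETOF record AS
$$
BEGIN
    -- The record shape arrives as text, so the dblink call must be
    -- assembled and executed dynamically.
    RETURN QUERY EXECUTE format(
        'SELECT * FROM dblink(%L, %L) AS r(%s)',
        _conn, _query, _coldefs);
END;
$$ LANGUAGE plpgsql;

-- The caller still has to spell the column list out twice:
SELECT *
FROM remote_call('dbname=contrib_regression',
                 'SELECT * FROM foo',
                 'f1 int, f2 text, f3 text[]')
     AS t(f1 int, f2 text, f3 text[]);
8<--------------------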
>> As to the proposed syntax, I would vote to support the SQL standard
>> variant if it could be handled during parse. I don't see what AS TYPE
>> OF really buys you because FWICT it does not support wrapping.
>>
>> merlin
>
>
> Your experiences with dblink are very similar to mine.
>
> The whole issue arose for me as an outcropping of my Poor Man's Parallel
> Processing extension (still not released but currently working for us in
> production internally).
>
> At some point I had to do dblink_get_result(...) as t(...) and not only did
> I have to render the structure as a string, I was going to have to execute
> that SQL dynamically (because plpgsql lacks a PREPARE statement) or I was
> going to have to re-code in C or plv8. Overall those calls aren't terribly
> expensive (it's working in production - for us - without this dblink
> modification), but a cleaner solution would be better.
Another option is to handle the intermediate result in json if you're
willing to sacrifice a little bit of performance. For example,
suppose you wanted to log every dblink call through a wrapper without
giving up the ability to handle arbitrary result sets:
CREATE OR REPLACE FUNCTION dblink_log(_query TEXT) RETURNS SETOF JSON AS
$$
BEGIN
RAISE WARNING 'got dblink %', _query;
RETURN QUERY SELECT * FROM dblink(
'host=localhost',
format('SELECT row_to_json(q) from (%s) q', _query))
AS R(j json);
END;
$$ LANGUAGE PLPGSQL;
postgres=# select * from dblink_log('select 0 as value');
WARNING: got dblink select 0 as value
dblink_log
─────────────
{"value":0}
(1 row)
With json, you have a number of good options to convert back to a
record. For example,
postgres=# CREATE TYPE foo AS (value int);
CREATE TYPE
SELECT p.*
FROM dblink_log('SELECT s AS value FROM generate_series(1,3) s') j
CROSS JOIN LATERAL json_populate_record(null::foo, j) p;
WARNING: got dblink select s as value from generate_series(1,3) s
value
───────
1
2
3
(3 rows)
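Another option in the same vein (a sketch; json_to_record requires 9.4 or later) skips declaring a named composite type and supplies the column list inline instead, yielding the same three rows:

SELECT p.*
FROM dblink_log('SELECT s AS value FROM generate_series(1,3) s') j
CROSS JOIN LATERAL json_to_record(j) AS p(value int);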
Note, I still support the case behind your patch, but, if it uh,
doesn't make it through, it's good to have options :-).
merlin
In the earlier iterations of PMPP, I had it simply wrap all queries sent to the remote server inside a 'SELECT * from row_to_json(...)'.
(again, more backstory to enhance others' understanding of the issue)
The serialization/deserialization was a performance hit, offset slightly by having the RETURN QUERY SELECT json_field from dblink_get_result('c1') as t(json_field json) be a static (reusable) query.
The ugliness of decomposing the fields from json was no fun, and Merlin's suggestion of json_populate_record (I don't remember whether that function existed at the time...) would get around that, albeit with another performance hit.
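To make that concrete, here is a minimal sketch of the pattern (the connection name, local type, and query below are illustrative, not the actual PMPP code): the remote query is wrapped in row_to_json() so the local column definition list is always the same single json column, and the json is decomposed locally afterwards.

8<--------------------
-- Illustrative names only.
SELECT dblink_connect('c1', 'dbname=contrib_regression');

-- Whatever the real query is, the remote side always ships back one json column.
SELECT dblink_send_query('c1',
    'SELECT row_to_json(q) FROM (SELECT 1 AS value, ''x'' AS label) q');

CREATE TYPE pair AS (value int, label text);   -- assumed local target type

-- This part is static and reusable, regardless of the remote result shape.
SELECT p.*
FROM dblink_get_result('c1') AS t(json_field json)
CROSS JOIN LATERAL json_populate_record(null::pair, t.json_field) p;

SELECT dblink_disconnect('c1');
8<--------------------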