Re: pgsql_fdw, FDW for PostgreSQL server - Mailing list pgsql-hackers

From Albe Laurenz
Subject Re: pgsql_fdw, FDW for PostgreSQL server
Date
Msg-id D960CB61B694CF459DCFB4B0128514C207B2CBB7@exadv11.host.magwien.gv.at
Whole thread Raw
In response to Re: pgsql_fdw, FDW for PostgreSQL server  (Shigeru HANADA <shigeru.hanada@gmail.com>)
Responses Re: pgsql_fdw, FDW for PostgreSQL server  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Shigeru HANADA wrote:
>> During a foreign scan, type input functions are used to convert
>> the text representation of values.  If a foreign table is
misconfigured,
>> you can get error messages from these functions, like:
>>
>> ERROR:  invalid input syntax for type double precision: "etwas"
>> or
>> ERROR:  value too long for type character varying(3)
>>
>> It might me nice for finding problems if the message were
>> something like:
>>
>> ERROR:  cannot convert data in foreign scan of "tablename", column
"col"
>> in row 42
>> DETAIL:  ERROR:  value too long for type character varying(3)
>
> Agreed.  How about showing context information with errcontext() in
> addition to main error message?  Of course, identifiers are quoted if
> necessary.  This way doesn't need additional PG_TRY block, so overhead
> would be relatively cheap.
>
> postgres=# SELECT * FROM ft1 WHERE c1 = 1;  -- ERROR
> ERROR:  invalid input syntax for integer: "1970-01-02 17:00:00+09"
> CONTEXT:  column c4 of foreign table ft1
>
> Showing index of the row seems overkill, because most cause of this
kind
> of error is wrong configuration, as you say, and users would be able
to
> address the issue without knowing which record caused the error.

Agreed.  I think that is a better approach than what I suggested.

>> As stated previously, I don't think that using local stats on
>> foreign tables is a win.  The other patches work fine for me, and
>> I'd be happy if that could go into 9.2.
>
> I have opposite opinion on this issue because we need to do some of
> filtering on local side.  We can leave cost/rows estimation to remote
> side about WHERE expressions which are pushed down, but we need
> selectivity of extra filtering done on local side.  For such purpose,
> having local stats of foreign data seems reasonable and useful.
>
> Of course, it has downside that we need to execute explicit ANALYZE
for
> foreign tables which would cause full sequential scan on remote
tables,
> in addition to ANALYZE for remote tables done on remote side as usual
> maintenance work.

This approach is much better and does not suffer from the
limitations the original analyze patch had.

I think that the price of a remote table scan is something
we should be willing to pay for good local statistics.
And there is always the option not to analyze the foreign
table if you are not willing to pay that price.

Maybe the FDW API could be extended so that foreign data wrappers
can provide a random sample to avoid a full table scan.

> Attached patch contains changes below:
>
> pgsql_fdw_v19.patch
>   - show context of data conversion error
>   - move codes for fetch_count FDW option to option.c
>     (refactoring)
> pgsql_fdw_pushdown_v12.patch
>   - make deparseExpr function static (refactoring)
>
> I also attached pgsql_fdw_analyze for only testing the effect of local
> statistics.  It contains both backend's ANALYZE command support and
> pgsql_fdw's ANALYZE support.

I think the idea is promising.

I'll mark the patch as "ready for committer".

Yours,
Laurenz Albe


pgsql-hackers by date:

Previous
From: Dobes Vandermeer
Date:
Subject: Re: HTTP Frontend? (and a brief thought on materialized views)
Next
From: Boszormenyi Zoltan
Date:
Subject: Re: [PATCH] lock_timeout and common SIGALRM framework