Thread: FDW for PostgreSQL
Hi all, I'd like to propose FDW for PostgreSQL as a contrib module again. Attached patch is updated version of the patch proposed in 9.2 development cycle. For ease of review, I summarized what the patch tries to achieve. Abstract ======== This patch provides FDW for PostgreSQL which allows users to access external data stored in remote PostgreSQL via foreign tables. Of course external instance can be beyond network. And I think that this FDW could be an example of other RDBMS-based FDW, and it would be useful for proof-of-concept of FDW-related features. Note that the name has been changed from "pgsql_fdw" which was used in last proposal, since I got a comment which says that most of existing FDWs have name "${PRODUCT_NAME}_fdw" so "postgresql_fdw" or "postgres_fdw" would be better. For this issue, I posted another patch which moves existing postgresql_fdw_validator into contrib/dblink with renaming in order to reserve the name "postgresql_fdw" for this FDW. Please note that the attached patch requires dblink_fdw_validator.patch to be applied first. http://archives.postgresql.org/pgsql-hackers/2012-09/msg00454.php Query deparser ============== Now postgresql_fdw has its own SQL query deparser inside, so it's free from backend's ruleutils module. This deparser maps object names when generic options below were set. nspname of foreign table: used as namespace (schema) of relation relname of foreign table: used as relation name colname of foreign column: used as column name This mapping allows flexible schema design. SELECT optimization =================== postgresql_fdw always retrieves as much columns as foreign table from remote to avoid overhead of column mapping. However, often some of them (or sometimes all of them) are not used on local side, so postgresql_fdw uses NULL literal as such unused columns in SELECT clause of remote query. For example, let's assume one of pgbench workloads: SELECT abalance FROM pgbench_accounts WHERE aid = 1; This query generates a remote query below. In addition to bid and filler, aid is replaced with NULL because it's already evaluated on remote side. SELECT NULL, NULL, abalance, NULL FROM pgbench_accounts WHERE (aid OPERATOR(pg_catalog.=) 1); This trick would improve performance notably by reducing amount of data to be transferred. One more example. Let's assume counting rows. SELCT count(*) FROM pgbench_accounts; This query requires only existence of row, so no actual column reference is in SELECT clause. SELECT NULL, NULL, NULL, NULL FROM pgbench_accounts; WHERE push down =============== postgresql_fdw pushes down some of restrictions (IOW, top level elements in WHERE clause which are connected with AND) which can be evaluated on remote side safely. Currently the criteria "safe" is declared as whether an expression contains only: - column reference - constant of bult-in type (scalar and array) - external parameter of EXECUTE statement - built-in operator which uses built-in immutable function (operator cannot be collative unless it's "=" or "<>") - built-in immutable function Some other elements might be also safe to be pushed down, but criteria above seems enough for basic use cases. Although it might seem odd, but operators are deparsed into OPERATOR notation to avoid search_path problem. E.g. local query : WHERE col = 1 remote query: WHERE (col OPERATOR(pg_catalog.=) 1) Connection management ===================== postgresql_fdw has its own connection manager. Connection is established when first foreign scan on a server is planned, and it's pooled in the backend. If another foreign scan on same server is invoked, same connection will be used. Connection pool is per-backend. This means that different backends never share connection. postgresql_fdw_connections view shows active connections, and postgresql_fdw_disconnect() allows users to discard particular connection at arbitrary timing. Transaction management ====================== If multiple foreign tables on same foreign server is used in a local query, postgresql_fdw uses same connection to retrieve results in a transaction to make results consistent. Currently remote transaction is closed at the end of local query, so following local query might produce inconsistent result. Costs estimation ================ To estimate costs and result rows of a foreign scan, postgresql_fdw executes EXPLAIN statement on remote side, and retrieves costs and rows values from the result. For cost estimation, cost of connection establishment and data transfer are added to the base costs. Currently these two factors is hard-coded, but making them configurable is not so difficult. Executing EXPLAIN is not cheap, but remote query itself is usually very expensive, so such additional cost would be acceptable. ANALYZE support =============== postgresql_fdw supports ANALYZE to improve selectivity estimation of filtering done on local side (WHERE clauses which could not been pushed down. The sampler function retrieves all rows from remote table and skip some of them so that result fits requested size. As same as file_fdw, postgresql_fdw doesn't care order of result, because it's important for only correlation, and correlation is important for only index scans, which is not supported for this FDW. Fetching Data ============= postgresql_fdw uses single-row mode of libpq so that memory usage is kept in low level even if the result is huge. To cope with difference of encoding, postgresql_fdw automatically sets client_encoding to server encoding of local database. Future improvement ================== I have some ideas for improvement: - Provide sorted result path (requires index information?) - Provide parameterized path - Transaction mapping between local and remotes (2PC) - binary transfer (only against servers with same PG major version?) - JOIN push-down (requires support by core) Any comments and questions are welcome. -- Shigeru HANADA
Attachment
Hanada-san, I tried to check this patch. Because we also had some discussion on this extension through the last two commit fests, I have no fundamental design arguments. So, let me drop in the implementation detail of this patch. At the postgresql_fdw/deparse.c, * Even though deparseVar() is never invoked with need_prefix=true, I doubt why Var reference needs to be qualified with relationalias. It seems to me relation alias is never used in the remote query, so isn't it a possible bug? * deparseFuncExpr() has case handling depending on funcformat of FuncExpr. I think all the cases can be deparsed using explicitfunction call, and it can avoid a trouble when remote host has inconsistent cast configuration. At the postgresql_fdw/connection.c, * I'm worry about the condition for invocation of begin_remote_tx(). + if (use_tx && entry->refs == 1) + begin_remote_tx(entry->conn);+ entry->use_tx = use_tx; My preference is: if (use_tx && !entry->use_tx), instead. Even thoughhere is no code path to make a problem obvious, it may cause possible difficult-to-find bug, in case when caller triedto get a connection being already acquired by someone but no transaction needed. At the postgresql_fdw/postgresql_fdw.c, * When pgsqlGetForeignPaths() add SQL statement into fdw_private, it is implemented as: + /* Construct list of SQL statementsand bind it with the path. */ + fdw_private = lappend(NIL, makeString(fpstate->sql.data)); Could you use list_make1()instead? * At the bottom half of query_row_processor(), I found these mysterious two lines. MemoryContextSwitchTo(festate->scan_cxt); MemoryContextSwitchTo(festate->temp_cxt); Why not switch temp_cxt directly? At the sgml/postgresql-fdw.sgml, * Please add this version does not support sub-transaction handling. Especially, all we can do is abort top-level transactionin case when an error is occurred at the remote side within sub-transaction. I hope to take over this patch for committer soon. Thanks, 2012/9/14 Shigeru HANADA <shigeru.hanada@gmail.com>: > Hi all, > > I'd like to propose FDW for PostgreSQL as a contrib module again. > Attached patch is updated version of the patch proposed in 9.2 > development cycle. > > For ease of review, I summarized what the patch tries to achieve. > > Abstract > ======== > This patch provides FDW for PostgreSQL which allows users to access > external data stored in remote PostgreSQL via foreign tables. Of course > external instance can be beyond network. And I think that this FDW > could be an example of other RDBMS-based FDW, and it would be useful for > proof-of-concept of FDW-related features. > > Note that the name has been changed from "pgsql_fdw" which was used in > last proposal, since I got a comment which says that most of existing > FDWs have name "${PRODUCT_NAME}_fdw" so "postgresql_fdw" or > "postgres_fdw" would be better. For this issue, I posted another patch > which moves existing postgresql_fdw_validator into contrib/dblink with > renaming in order to reserve the name "postgresql_fdw" for this FDW. > Please note that the attached patch requires dblink_fdw_validator.patch > to be applied first. > http://archives.postgresql.org/pgsql-hackers/2012-09/msg00454.php > > Query deparser > ============== > Now postgresql_fdw has its own SQL query deparser inside, so it's free > from backend's ruleutils module. > > This deparser maps object names when generic options below were set. > > nspname of foreign table: used as namespace (schema) of relation > relname of foreign table: used as relation name > colname of foreign column: used as column name > > This mapping allows flexible schema design. > > SELECT optimization > =================== > postgresql_fdw always retrieves as much columns as foreign table from > remote to avoid overhead of column mapping. However, often some of them > (or sometimes all of them) are not used on local side, so postgresql_fdw > uses NULL literal as such unused columns in SELECT clause of remote > query. For example, let's assume one of pgbench workloads: > > SELECT abalance FROM pgbench_accounts WHERE aid = 1; > > This query generates a remote query below. In addition to bid and > filler, aid is replaced with NULL because it's already evaluated on > remote side. > > SELECT NULL, NULL, abalance, NULL FROM pgbench_accounts > WHERE (aid OPERATOR(pg_catalog.=) 1); > > This trick would improve performance notably by reducing amount of data > to be transferred. > > One more example. Let's assume counting rows. > > SELCT count(*) FROM pgbench_accounts; > > This query requires only existence of row, so no actual column reference > is in SELECT clause. > > SELECT NULL, NULL, NULL, NULL FROM pgbench_accounts; > > WHERE push down > =============== > postgresql_fdw pushes down some of restrictions (IOW, top level elements > in WHERE clause which are connected with AND) which can be evaluated on > remote side safely. Currently the criteria "safe" is declared as > whether an expression contains only: > - column reference > - constant of bult-in type (scalar and array) > - external parameter of EXECUTE statement > - built-in operator which uses built-in immutable function > (operator cannot be collative unless it's "=" or "<>") > - built-in immutable function > > Some other elements might be also safe to be pushed down, but criteria > above seems enough for basic use cases. > > Although it might seem odd, but operators are deparsed into OPERATOR > notation to avoid search_path problem. > E.g. > local query : WHERE col = 1 > remote query: WHERE (col OPERATOR(pg_catalog.=) 1) > > Connection management > ===================== > postgresql_fdw has its own connection manager. Connection is > established when first foreign scan on a server is planned, and it's > pooled in the backend. If another foreign scan on same server is > invoked, same connection will be used. Connection pool is per-backend. > This means that different backends never share connection. > > postgresql_fdw_connections view shows active connections, and > postgresql_fdw_disconnect() allows users to discard particular > connection at arbitrary timing. > > Transaction management > ====================== > If multiple foreign tables on same foreign server is used in a local > query, postgresql_fdw uses same connection to retrieve results in a > transaction to make results consistent. Currently remote transaction is > closed at the end of local query, so following local query might produce > inconsistent result. > > Costs estimation > ================ > To estimate costs and result rows of a foreign scan, postgresql_fdw > executes EXPLAIN statement on remote side, and retrieves costs and rows > values from the result. For cost estimation, cost of connection > establishment and data transfer are added to the base costs. Currently > these two factors is hard-coded, but making them configurable is not so > difficult. > > Executing EXPLAIN is not cheap, but remote query itself is usually very > expensive, so such additional cost would be acceptable. > > ANALYZE support > =============== > postgresql_fdw supports ANALYZE to improve selectivity estimation of > filtering done on local side (WHERE clauses which could not been pushed > down. The sampler function retrieves all rows from remote table and > skip some of them so that result fits requested size. As same as > file_fdw, postgresql_fdw doesn't care order of result, because it's > important for only correlation, and correlation is important for only > index scans, which is not supported for this FDW. > > Fetching Data > ============= > postgresql_fdw uses single-row mode of libpq so that memory usage is > kept in low level even if the result is huge. > > To cope with difference of encoding, postgresql_fdw automatically sets > client_encoding to server encoding of local database. > > Future improvement > ================== > I have some ideas for improvement: > - Provide sorted result path (requires index information?) > - Provide parameterized path > - Transaction mapping between local and remotes (2PC) > - binary transfer (only against servers with same PG major version?) > - JOIN push-down (requires support by core) > > Any comments and questions are welcome. > -- > Shigeru HANADA > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- KaiGai Kohei <kaigai@kaigai.gr.jp>
On Fri, Sep 14, 2012 at 9:25 AM, Shigeru HANADA <shigeru.hanada@gmail.com> wrote: > Hi all, > > I'd like to propose FDW for PostgreSQL as a contrib module again. > Attached patch is updated version of the patch proposed in 9.2 > development cycle. very nice. > - binary transfer (only against servers with same PG major version?) Unfortunately this is not enough -- at least not without some additional work. The main problem is user defined types, especially composites. Binary wire format sends internal member oids which the receiving server will have to interpret. merlin
Kaigai-san, Thanks for the review. On Thu, Oct 4, 2012 at 6:10 AM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote: > At the postgresql_fdw/deparse.c, > > * Even though deparseVar() is never invoked with need_prefix=true, > I doubt why Var reference needs to be qualified with relation alias. > It seems to me relation alias is never used in the remote query, > so isn't it a possible bug? This must be a remaining of my effort against supporting JOIN-push-down (it is one of future improvements). At the moment it is not clear what should be used as column prefix, so I removed need_prefix parameter to avoid possible confusion. I removed need_prefix from deparseRelation as well. > * deparseFuncExpr() has case handling depending on funcformat > of FuncExpr. I think all the cases can be deparsed using explicit > function call, and it can avoid a trouble when remote host has > inconsistent cast configuration. Hm, your point is that specifying underlying function, e.g. "cast_func(value)", is better than simple cast notation,e.g. "value::type" and "CAST(value AS type)", because such explicit statement prevents possible problems caused by difference of cast configuration, right? If so, I agree about explicit casts. I'm not sure about implicit casts because it seems difficult to avoid unexpected implicit cast at all. As background, I just followed the logic implemented in ruleutils.c for FuncExpr, which deparses explicit cast in format of 'value::type'. If it's sure that FuncExpr comes from cast never takes arguments more than one, we can go your way. I'll check it. > At the postgresql_fdw/connection.c, > > * I'm worry about the condition for invocation of begin_remote_tx(). > + if (use_tx && entry->refs == 1) > + begin_remote_tx(entry->conn); > + entry->use_tx = use_tx; > My preference is: if (use_tx && !entry->use_tx), instead. > Even though here is no code path to make a problem obvious, > it may cause possible difficult-to-find bug, in case when caller > tried to get a connection being already acquired by someone > but no transaction needed. I got it. In addition, I fixed ReleaseConnection to call abort_remote_tx after decrementing reference counter, as GetConnection does for begin_remote_tx. > At the postgresql_fdw/postgresql_fdw.c, > > * When pgsqlGetForeignPaths() add SQL statement into > fdw_private, it is implemented as: > + /* Construct list of SQL statements and bind it with the path. */ > + fdw_private = lappend(NIL, makeString(fpstate->sql.data)); > Could you use list_make1() instead? Fixed. > > * At the bottom half of query_row_processor(), I found these > mysterious two lines. > MemoryContextSwitchTo(festate->scan_cxt); > MemoryContextSwitchTo(festate->temp_cxt); > Why not switch temp_cxt directly? It must be a copy-and-paste mistake. Removed both. > > At the sgml/postgresql-fdw.sgml, > > * Please add this version does not support sub-transaction handling. > Especially, all we can do is abort top-level transaction in case when > an error is occurred at the remote side within sub-transaction. I don't think that abort of local top-level transaction is not necessary in such case, because now connection_cleanup() closes remote connection whenever remote error occurs in sub-transactions. For instance, we can recover from remote syntax error (it could easily happen from wrong relname setting) by ROLLBACK. Am I missing something? $ ALTER FOREIGN TABLE foo OPTIONS (SET relname 'invalid'); $ BEGIN; -- explicit transaction $ SAVEPOINT a; -- begin sub-transaction $ SELECT * FROM foo; -- this causes remote error, then remote -- connection is closed automatically $ ROLLBACK TO a; -- clears local error state $ SELECT 1; -- this should be successfully executed > I hope to take over this patch for committer soon. I hope so too :) Please examine attached v2 patch (note that is should be applied onto latest dblink_fdw_validator patch). Regards, -- Shigeru HANADA
Attachment
Hi Hanada-san, > Please examine attached v2 patch (note that is should be applied onto > latest dblink_fdw_validator patch). I've reviewed your patch quickly. I noticed that the patch has been created in a slightly different way from the guidelines: http://www.postgresql.org/docs/devel/static/fdw-planning.html The guidelines say the following, but the patch identifies the clauses in baserel->baserestrictinfo in GetForeignRelSize, not GetForeignPaths. Also, it has been implemented so that most sub_expressions are evaluated at the remote end, not the local end, though I'm missing something. For postgresql_fdw to be a good reference for FDW developers, ISTM it would be better that it be consistent with the guidelines. I think it would be needed to update the following document or redesign the function to be consistent with the following document. As an example, the FDW might identify some restriction clauses of the form foreign_variable= sub_expression, which it determines can be executed on the remote server given the locally-evaluated value of the sub_expression. The actual identification of such a clause should happen during GetForeignPaths, since it would affect the cost estimate for the path. Thanks, Best regards, Etsuro Fujita
2012/10/11 Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>: > Hi Hanada-san, > >> Please examine attached v2 patch (note that is should be applied onto >> latest dblink_fdw_validator patch). > > I've reviewed your patch quickly. I noticed that the patch has been created in > a slightly different way from the guidelines: > http://www.postgresql.org/docs/devel/static/fdw-planning.html The guidelines > say the following, but the patch identifies the clauses in > baserel->baserestrictinfo in GetForeignRelSize, not GetForeignPaths. Also, it > has been implemented so that most sub_expressions are evaluated at the remote > end, not the local end, though I'm missing something. For postgresql_fdw to be > a good reference for FDW developers, ISTM it would be better that it be > consistent with the guidelines. I think it would be needed to update the > following document or redesign the function to be consistent with the following > document. > Hmm. It seems to me Fujita-san's comment is right. Even though the latest implementation gets an estimated number of rows using EXPLAIN with qualified SELECT statement on remote side, then, it is adjusted with selectivity of local qualifiers, we shall be able to obtain same cost estimation because postgresql_fdw assumes all the pushed-down qualifiers are built-in only. So, is it available to move classifyConditions() to pgsqlGetForeignPlan(), then, separate remote qualifiers and local ones here? If we call get_remote_estimate() without WHERE clause, remote side will give just whole number of rows of remote table. Then, it can be adjusted with selectivity of "all" the RestrictInfo (including both remote and local). Sorry, I should suggest it at the beginning. This change may affects the optimization that obtains remote columns being in-use at local side. Let me suggest an expression walker that sets member of BitmapSet for columns in-use at local side or target list. Then, we can list up them on the target list of the remote query. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
Shigeru HANADA escribió: > Please examine attached v2 patch (note that is should be applied onto > latest dblink_fdw_validator patch). Tom committed parts of the dblink_fdw_validator patch, but not the removal, so it seems this patch needs to be rebased on top of that somehow. I am not able to say what's the best resolution for that conflict, however. But it seems to me that maybe you will need to choose a different name for the validator after all, to support binary upgrades. There are some other comments downthread that a followup patch probably needs to address too, as well. I am marking this patch Returned with Feedback. Please submit an updated version to CF3. Thanks. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Sorry for delayed response. On Sun, Oct 21, 2012 at 3:16 AM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote: > 2012/10/11 Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>: >> I've reviewed your patch quickly. I noticed that the patch has been created in >> a slightly different way from the guidelines: >> http://www.postgresql.org/docs/devel/static/fdw-planning.html The guidelines >> say the following, but the patch identifies the clauses in >> baserel->baserestrictinfo in GetForeignRelSize, not GetForeignPaths. Also, it >> has been implemented so that most sub_expressions are evaluated at the remote >> end, not the local end, though I'm missing something. For postgresql_fdw to be >> a good reference for FDW developers, ISTM it would be better that it be >> consistent with the guidelines. I think it would be needed to update the >> following document or redesign the function to be consistent with the following >> document. >> > Hmm. It seems to me Fujita-san's comment is right. Indeed postgresql_fdw touches baserestrictinfo in GetForeignRelSize, but it's because of optimization for better width estimate. The doc Fujita-san pointed says that: > The actual identification of such a clause should happen during > GetForeignPaths, since it would affect the cost estimate for the > path. I understood this description says that "you need to touch baserestrict info *before* GetForeignPlan to estimate costs of optimized path". I don't feel that this description prohibits FDW to touch baserestrictinfo in GetForeignRelSize, but mentioning it clearly might be better. Regards, -- Shigeru HANADA
2012/11/6 Shigeru HANADA <shigeru.hanada@gmail.com>: > Sorry for delayed response. > > On Sun, Oct 21, 2012 at 3:16 AM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote: >> >> 2012/10/11 Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>: >> >>> I've reviewed your patch quickly. I noticed that the patch has been >>> created in >>> a slightly different way from the guidelines: >>> http://www.postgresql.org/docs/devel/static/fdw-planning.html The >>> guidelines >>> say the following, but the patch identifies the clauses in >>> baserel->baserestrictinfo in GetForeignRelSize, not GetForeignPaths. >>> Also, it >>> has been implemented so that most sub_expressions are evaluated at the >>> remote >>> end, not the local end, though I'm missing something. For postgresql_fdw >>> to be >>> a good reference for FDW developers, ISTM it would be better that it be >>> consistent with the guidelines. I think it would be needed to update the >>> following document or redesign the function to be consistent with the >>> following >>> document. >>> >> Hmm. It seems to me Fujita-san's comment is right. > > > Indeed postgresql_fdw touches baserestrictinfo in GetForeignRelSize, but > it's because of optimization for better width estimate. The doc Fujita-san > pointed says that: > > >> The actual identification of such a clause should happen during >> GetForeignPaths, since it would affect the cost estimate for the >> path. > > > I understood this description says that "you need to touch baserestrict info > *before* GetForeignPlan to estimate costs of optimized path". I don't feel > that this description prohibits FDW to touch baserestrictinfo in > GetForeignRelSize, but mentioning it clearly might be better. > Hanada-san, Isn't it possible to pick-up only columns to be used in targetlist or local qualifiers, without modification of baserestrictinfo? Unless we put WHERE clause on EXPLAIN statement for cost estimation on GetForeignRelSize, all we have to know is list of columns to be fetched using underlying queries. Once we construct a SELECT statement without WHERE clause on GetForeignRelSize stage, it is never difficult to append it later according to the same criteria being implemented at classifyConditions. I'd like to see committer's opinion here. Please give Hanada-san some directions. -- KaiGai Kohei <kaigai@kaigai.gr.jp>
Kohei KaiGai <kaigai@kaigai.gr.jp> writes: > 2012/11/6 Shigeru HANADA <shigeru.hanada@gmail.com>: >> Indeed postgresql_fdw touches baserestrictinfo in GetForeignRelSize, but >> it's because of optimization for better width estimate. The doc Fujita-san >> pointed says that: >>> The actual identification of such a clause should happen during >>> GetForeignPaths, since it would affect the cost estimate for the >>> path. >> I understood this description says that "you need to touch baserestrict info >> *before* GetForeignPlan to estimate costs of optimized path". I don't feel >> that this description prohibits FDW to touch baserestrictinfo in >> GetForeignRelSize, but mentioning it clearly might be better. > Isn't it possible to pick-up only columns to be used in targetlist or > local qualifiers, without modification of baserestrictinfo? What the doc means to suggest is that you can look through the baserestrictinfo list and then record information elsewhere about interesting clauses you find. If the FDW is actually *modifying* that list, I agree that seems like a bad idea. I don't recall anything in the core system that does that, so it seems fragile. The closest parallel I can think of in the core system is indexscans pulling out restriction clauses to use as index quals. That code doesn't modify the baserestrictinfo list, only make new lists with some of the same entries. regards, tom lane
On 2012/11/07, at 1:35, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> Isn't it possible to pick-up only columns to be used in targetlist or >> local qualifiers, without modification of baserestrictinfo? > > What the doc means to suggest is that you can look through the > baserestrictinfo list and then record information elsewhere about > interesting clauses you find. If the FDW is actually *modifying* that > list, I agree that seems like a bad idea. Kaigai-san might have misunderstood that postgresql_fdw changes baserestrictinfo, since it did so in old implementation. ClassifyConditions creates new lists, local_conds and remote_conds, which have cells which point RestrictInfo(s) in baserestrictinfo. It doesn't copy RestrictInfo for new lists, but I think it's ok because baserestrictinfo list itself and RestrictInfo(s) pointed by it are never modified by postgresql_fdw. > I don't recall anything in > the core system that does that, so it seems fragile. The closest > parallel I can think of in the core system is indexscans pulling out > restriction clauses to use as index quals. That code doesn't modify > the baserestrictinfo list, only make new lists with some of the same > entries. Thanks for the advise. I found relation_excluded_by_constraints which is called by set_rel_size() creates new RestrictInfo list from baserestrictinfo, and this seems like what postgresql_fdw does in GetForeignRelSize, from the perspective of relation size estimation. Regards, -- Shigeru HANADA shigeru.hanada@gmail.com
花田 茂 <shigeru.hanada@gmail.com> writes: > ClassifyConditions creates new lists, local_conds and remote_conds, > which have cells which point RestrictInfo(s) in baserestrictinfo. > It doesn't copy RestrictInfo for new lists, but I think it's ok > because baserestrictinfo list itself and RestrictInfo(s) pointed by > it are never modified by postgresql_fdw. That's good. I think there are actually some assumptions that RestrictInfo nodes are not copied once created. You can link them into new lists all you want, but don't copy them. regards, tom lane
Hi Kaigai-san,
Shigeru HANADA
Sorry for delayed response. I updated the patch, although I didn't change any about timing issue you and Fujita-san concern.
1) add some FDW options for cost estimation. Default behavior is not changed.
2) get rid of array of libpq option names, similary to recent change of dblink
3) enhance document, especially remote query optimization
4) rename to postgres_fdw, to avoid naming conflict with the validator which exists in core
5) cope with changes about error context handling
On Tue, Nov 6, 2012 at 7:36 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
-- local qualifiers,Isn't it possible to pick-up only columns to be used in targetlist or
without modification of baserestrictinfo?
IMO, it's possible. postgres_fdw doesn't modify baserestrictinfo at all; it just create two new lists which exclusively point RestrictInfo elements in baserestrictinfo. Pulling vars up from conditions which can't be pushed down would gives us list of necessary columns. Am I missing something?
Shigeru HANADA
Attachment
2012/11/15 Shigeru Hanada <shigeru.hanada@gmail.com>: > Hi Kaigai-san, > > Sorry for delayed response. I updated the patch, although I didn't change > any about timing issue you and Fujita-san concern. > > 1) add some FDW options for cost estimation. Default behavior is not > changed. > 2) get rid of array of libpq option names, similary to recent change of > dblink > 3) enhance document, especially remote query optimization > 4) rename to postgres_fdw, to avoid naming conflict with the validator which > exists in core > 5) cope with changes about error context handling > > On Tue, Nov 6, 2012 at 7:36 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote: >> >> Isn't it possible to pick-up only columns to be used in targetlist or >> local qualifiers, >> without modification of baserestrictinfo? > > > IMO, it's possible. postgres_fdw doesn't modify baserestrictinfo at all; it > just create two new lists which exclusively point RestrictInfo elements in > baserestrictinfo. Pulling vars up from conditions which can't be pushed > down would gives us list of necessary columns. Am I missing something? > Hanada-san, Let me comments on several points randomly. I also think the new "use_remote_explain" option is good. It works fine when we try to use this fdw over the network with latency more or less. It seems to me its default is "false", thus, GetForeignRelSize will return the estimated cost according to ANALYZE, instead of remote EXPLAIN. Even though you mention the default behavior was not changed, is it an expected one? My preference is the current one, as is. The deparseFuncExpr() still has case handling whether it is explicit case, implicit cast or regular functions. If my previous proposition has no flaw, could you fix up it using regular function invocation manner? In case when remote node has incompatible implicit-cast definition, this logic can make a problem. At InitPostgresFdwOptions(), the source comment says we don't use malloc() here for simplification of code. Hmm. I'm not sure why it is more simple. It seems to me we have no reason why to avoid malloc here, even though libpq options are acquired using malloc(). Regarding to the regression test. [kaigai@iwashi sepgsql]$ cat contrib/postgres_fdw/regression.diffs *** /home/kaigai/repo/sepgsql/contrib/postgres_fdw/expected/postgres_fdw.out Sat Nov 17 21:31:19 2012 --- /home/kaigai/repo/sepgsql/contrib/postgres_fdw/results/postgres_fdw.out Tue Nov 20 13:53:32 2012 *************** *** 621,627 **** -- =================================================================== ALTER FOREIGN TABLE ft1 ALTER COLUMNc5 TYPE int; SELECT * FROM ft1 WHERE c1 = 1; -- ERROR ! ERROR: invalid input syntax for integer: "1970-01-02 00:00:00" CONTEXT: column c5 of foreign table ft1 ALTER FOREIGNTABLE ft1 ALTER COLUMN c5 TYPE timestamp; -- =================================================================== --- 621,627 ---- -- =================================================================== ALTER FOREIGN TABLE ft1 ALTER COLUMNc5 TYPE int; SELECT * FROM ft1 WHERE c1 = 1; -- ERROR ! ERROR: invalid input syntax for integer: "Fri Jan 02 00:00:00 1970" CONTEXT: column c5 of foreign table ft1 ALTER FOREIGNTABLE ft1 ALTER COLUMN c5 TYPE timestamp; -- =================================================================== ====================================================================== I guess this test tries to check a case when remote column has incompatible data type with local side. Please check timestamp_out(). Its output format follows "datestyle" setting of GUC, and it affects OS configuration on initdb time. (Please grep "datestyle" at initdb.c !) I'd like to recommend to use another data type for this regression test to avoid false-positive detection. Elsewhere, I could not find problems right now. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
Thank for the comment!
Shigeru HANADA
On Tue, Nov 20, 2012 at 10:23 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
-- I also think the new "use_remote_explain" option is good. It works fine
when we try to use this fdw over the network with latency more or less.
It seems to me its default is "false", thus, GetForeignRelSize will return
the estimated cost according to ANALYZE, instead of remote EXPLAIN.
Even though you mention the default behavior was not changed, is it
an expected one? My preference is the current one, as is.
Oops, I must have focused on only cost factors.
I too think that using local statistics is better as default behavior,
because foreign tables would be used for relatively stable tables.
If target tables are updated often, it would cause problems about
consistency, unless we provide full-fledged transaction mapping.
The deparseFuncExpr() still has case handling whether it is explicit case,
implicit cast or regular functions. If my previous proposition has no flaw,
could you fix up it using regular function invocation manner? In case when
remote node has incompatible implicit-cast definition, this logic can make
a problem.
Sorry, I overlooked this issue. Fixed to use function call notation
for all of explicit function calls, explicit casts, and implicit casts.
At InitPostgresFdwOptions(), the source comment says we don't use
malloc() here for simplification of code. Hmm. I'm not sure why it is more
simple. It seems to me we have no reason why to avoid malloc here, even
though libpq options are acquired using malloc().
I used "simple" because using palloc avoids null-check and error handling.
However, many backend code use malloc to allocate memory which lives
as long as backend process itself, so I fixed.
Regarding to the regression test.
[snip]
I guess this test tries to check a case when remote column has incompatible
data type with local side. Please check timestamp_out(). Its output
format follows
"datestyle" setting of GUC, and it affects OS configuration on initdb time.
(Please grep "datestyle" at initdb.c !) I'd like to recommend to use
another data type
for this regression test to avoid false-positive detection.
Good catch. :)
I fixed the test to use another data type, user defined enum.
Regards,
Shigeru HANADA
Attachment
2012/11/21 Shigeru Hanada <shigeru.hanada@gmail.com>: > Thank for the comment! > > On Tue, Nov 20, 2012 at 10:23 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote: >> >> I also think the new "use_remote_explain" option is good. It works fine >> when we try to use this fdw over the network with latency more or less. >> It seems to me its default is "false", thus, GetForeignRelSize will return >> the estimated cost according to ANALYZE, instead of remote EXPLAIN. >> Even though you mention the default behavior was not changed, is it >> an expected one? My preference is the current one, as is. > > > Oops, I must have focused on only cost factors. > I too think that using local statistics is better as default behavior, > because foreign tables would be used for relatively stable tables. > If target tables are updated often, it would cause problems about > consistency, unless we provide full-fledged transaction mapping. > >> >> The deparseFuncExpr() still has case handling whether it is explicit case, >> implicit cast or regular functions. If my previous proposition has no >> flaw, >> could you fix up it using regular function invocation manner? In case when >> remote node has incompatible implicit-cast definition, this logic can make >> a problem. > > > Sorry, I overlooked this issue. Fixed to use function call notation > for all of explicit function calls, explicit casts, and implicit casts. > >> At InitPostgresFdwOptions(), the source comment says we don't use >> malloc() here for simplification of code. Hmm. I'm not sure why it is more >> simple. It seems to me we have no reason why to avoid malloc here, even >> though libpq options are acquired using malloc(). > > > I used "simple" because using palloc avoids null-check and error handling. > However, many backend code use malloc to allocate memory which lives > as long as backend process itself, so I fixed. > >> >> Regarding to the regression test. > > [snip] >> >> I guess this test tries to check a case when remote column has >> incompatible >> data type with local side. Please check timestamp_out(). Its output >> format follows >> "datestyle" setting of GUC, and it affects OS configuration on initdb >> time. >> (Please grep "datestyle" at initdb.c !) I'd like to recommend to use >> another data type >> for this regression test to avoid false-positive detection. > > > Good catch. :) > I fixed the test to use another data type, user defined enum. > One other thing I noticed. At execute_query(), it stores the retrieved rows onto tuplestore of festate->tuples at once. Doesn't it make problems when remote- table has very big number of rows? IIRC, the previous code used cursor feature to fetch a set of rows to avoid over-consumption of local memory. Do we have some restriction if we fetch a certain number of rows with FETCH? It seems to me, we can fetch 1000 rows for example, and tentatively store them onto the tuplestore within one PG_TRY() block (so, no need to worry about PQclear() timing), then we can fetch remote rows again when IterateForeignScan reached end of tuplestore. Please point out anything if I missed something. Anyway, I'll check this v4 patch simultaneously. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
On Wed, Nov 21, 2012 at 7:31 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
--
Shigeru HANADA
At execute_query(), it stores the retrieved rows onto tuplestore of
festate->tuples at once. Doesn't it make problems when remote-
table has very big number of rows?
No. postgres_fdw uses single-row processing mode of libpq when
retrieving query results in execute_query, so memory usage will
be stable at a certain level.
IIRC, the previous code used cursor feature to fetch a set of rows
to avoid over-consumption of local memory. Do we have some
restriction if we fetch a certain number of rows with FETCH?
It seems to me, we can fetch 1000 rows for example, and tentatively
store them onto the tuplestore within one PG_TRY() block (so, no
need to worry about PQclear() timing), then we can fetch remote
rows again when IterateForeignScan reached end of tuplestore.
As you say, postgres_fdw had used cursor to avoid possible memory
exhaust on large result set. I switched to single-row processing mode
(it could be said "protocol-level cursor"), which was added in 9.2,
because it accomplish same task with less SQL calls than cursor.
Regards,
Shigeru HANADA
After playing with some big SQLs for testing, I came to feel that
showing every remote query in EXPLAIN output is annoying, especially
when SELECT * is unfolded to long column list.
AFAIK no plan node shows so many information in a line, so I'm
inclined to make postgres_fdw to show it only when VERBOSE was
specified. This would make EXPLAIN output easy to read, even if many
foreign tables are used in a query.
Thoughts?
--
Shigeru HANADA
2012/11/22 Shigeru Hanada <shigeru.hanada@gmail.com>: > After playing with some big SQLs for testing, I came to feel that > showing every remote query in EXPLAIN output is annoying, especially > when SELECT * is unfolded to long column list. > > AFAIK no plan node shows so many information in a line, so I'm > inclined to make postgres_fdw to show it only when VERBOSE was > specified. This would make EXPLAIN output easy to read, even if many > foreign tables are used in a query. > > Thoughts? > Indeed, I also think it is reasonable solution. -- KaiGai Kohei <kaigai@kaigai.gr.jp>
Kohei KaiGai wrote: > 2012/11/22 Shigeru Hanada <shigeru.hanada@gmail.com>: >> After playing with some big SQLs for testing, I came to feel that >> showing every remote query in EXPLAIN output is annoying, especially >> when SELECT * is unfolded to long column list. >> >> AFAIK no plan node shows so many information in a line, so I'm >> inclined to make postgres_fdw to show it only when VERBOSE was >> specified. This would make EXPLAIN output easy to read, even if many >> foreign tables are used in a query. >> >> Thoughts? >> > Indeed, I also think it is reasonable solution. +1 That's the way I do it for oracle_fdw. Yours, Laurenz Albe
2012/11/21 Shigeru Hanada <shigeru.hanada@gmail.com>: > Thank for the comment! > > On Tue, Nov 20, 2012 at 10:23 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote: >> >> I also think the new "use_remote_explain" option is good. It works fine >> when we try to use this fdw over the network with latency more or less. >> It seems to me its default is "false", thus, GetForeignRelSize will return >> the estimated cost according to ANALYZE, instead of remote EXPLAIN. >> Even though you mention the default behavior was not changed, is it >> an expected one? My preference is the current one, as is. > > > Oops, I must have focused on only cost factors. > I too think that using local statistics is better as default behavior, > because foreign tables would be used for relatively stable tables. > If target tables are updated often, it would cause problems about > consistency, unless we provide full-fledged transaction mapping. > >> >> The deparseFuncExpr() still has case handling whether it is explicit case, >> implicit cast or regular functions. If my previous proposition has no >> flaw, >> could you fix up it using regular function invocation manner? In case when >> remote node has incompatible implicit-cast definition, this logic can make >> a problem. > > > Sorry, I overlooked this issue. Fixed to use function call notation > for all of explicit function calls, explicit casts, and implicit casts. > >> At InitPostgresFdwOptions(), the source comment says we don't use >> malloc() here for simplification of code. Hmm. I'm not sure why it is more >> simple. It seems to me we have no reason why to avoid malloc here, even >> though libpq options are acquired using malloc(). > > > I used "simple" because using palloc avoids null-check and error handling. > However, many backend code use malloc to allocate memory which lives > as long as backend process itself, so I fixed. > >> >> Regarding to the regression test. > > [snip] >> >> I guess this test tries to check a case when remote column has >> incompatible >> data type with local side. Please check timestamp_out(). Its output >> format follows >> "datestyle" setting of GUC, and it affects OS configuration on initdb >> time. >> (Please grep "datestyle" at initdb.c !) I'd like to recommend to use >> another data type >> for this regression test to avoid false-positive detection. > > > Good catch. :) > I fixed the test to use another data type, user defined enum. > Hanada-san, I checked the v4 patch, and I have nothing to comment anymore. So, could you update the remaining EXPLAIN with VERBOSE option stuff? Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
On Sun, Nov 25, 2012 at 5:24 AM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
-- I checked the v4 patch, and I have nothing to comment anymore.
So, could you update the remaining EXPLAIN with VERBOSE option
stuff?
Thanks for the review. Here is updated patch.
BTW, we have one more issue around naming of new FDW, and it is discussed in another thread.
Please follow that thread for naming issue.
Shigeru HANADA
Attachment
2012/11/28 Shigeru Hanada <shigeru.hanada@gmail.com>: > > On Sun, Nov 25, 2012 at 5:24 AM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote: >> >> I checked the v4 patch, and I have nothing to comment anymore. >> >> So, could you update the remaining EXPLAIN with VERBOSE option >> stuff? >> > > Thanks for the review. Here is updated patch. > I checked the patch. The new VERBOSE option of EXPLAIN statement seems to me working fine. I think it is time to hand over this patch to committer. It is not a matter to be solved, but just my preference. postgres=# EXPLAIN(VERBOSE) SELECT * FROM ftbl WHERE a > 0 AND b like '%a%'; QUERY PLAN --------------------------------------------------------------------------------Foreign Scan on public.ftbl (cost=100.00..100.01rows=1 width=36) Output: a, b Filter: (ftbl.b ~~ '%a%'::text) Remote SQL: SELECT a, b FROM public.tblWHERE ((a OPERATOR(pg_catalog.>) 0)) (4 rows) postgres=# EXPLAIN SELECT * FROM ftbl WHERE a > 0 AND b like '%a%'; QUERY PLAN -------------------------------------------------------------Foreign Scan on ftbl (cost=100.00..100.01 rows=1 width=36) Filter: (b ~~ '%a%'::text) (2 rows) Do you think the qualifier being pushed-down should be explained if VERBOSE option was not given? > BTW, we have one more issue around naming of new FDW, and it is discussed in > another thread. > http://archives.postgresql.org/message-id/9E59E6E7-39C9-4AE9-88D6-BB0098579017@gmail.com > I don't have any strong option about this naming discussion. As long as it does not conflict with existing name and is not misleading, I think it is reasonable. So, "postgre_fdw" is OK for me. "pgsql_fdw" is also welcome. "posugure_fdw" may make sense only in Japan. "pg_fdw" is a bit misleading. "postgresql_fdw" might be the best, but do we have some clear advantage on this name to take an additional effort to solve the conflict with existing built-in postgresql_fdw_validator() function? I think, "postgres_fdw" is enough reasonable choice. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
2012/11/28 Kohei KaiGai <kaigai@kaigai.gr.jp>: > it is reasonable. So, "postgre_fdw" is OK for me. "pgsql_fdw" is also welcome. > Sorry, s/"postgre_fdw"/"postgres_fdw"/g Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
Shigeru Hanada <shigeru.hanada@gmail.com> writes: > [ postgres_fdw.v5.patch ] I started to look at this patch today. There seems to be quite a bit left to do to make it committable. I'm willing to work on it, but there are some things that need discussion: * The code seems to always use GetOuterUserId() to select the foreign user mapping to use. This seems entirely wrong. For instance it will do the wrong thing inside a SECURITY DEFINER function, where surely the relevant privileges should be those of the function owner, not the session user. I would also argue that if Alice has access to a foreign table owned by Bob, and Alice creates a view that selects from that table and grants select privilege on the view to Charlie, then when Charlie selects from the view the user mapping to use ought to be Alice's. (If anyone thinks differently about that, speak up!) To implement that for queries, we need code similar to what ExecCheckRTEPerms does, ie "rte->checkAsUser ? rte->checkAsUser : GetUserId()". It's a bit of a pain to get hold of the RTE from postgresGetForeignRelSize or postgresBeginForeignScan, but it's doable. (Should we modify the APIs for these functions to make that easier?) I think possibly postgresAcquireSampleRowsFunc should use the foreign table's owner regardless of the current user ID - if the user has permission to run ANALYZE then we don't really want the command to succeed or fail depending on exactly who the user is. That's perhaps debatable, anybody have another theory? * AFAICT, the patch expects to use a single connection for all operations initiated under one foreign server + user mapping pair. I don't think this can possibly be workable. For instance, we don't really want postgresIterateForeignScan executing the entire remote query to completion and stashing the results locally -- what if that's many megabytes? It ought to be pulling the rows back a few at a time, and that's not going to work well if multiple scans are sharing the same connection. (We might be able to dodge that by declaring a cursor for each scan, but I'm not convinced that such a solution will scale up to writable foreign tables, nested queries, subtransactions, etc.) I think we'd better be prepared to allow multiple similar connections. The main reason I'm bringing this up now is that it breaks the assumption embodied in postgres_fdw_get_connections() and postgres_fdw_disconnect() that foreign server + user mapping can constitute a unique key for identifying connections. However ... * I find postgres_fdw_get_connections() and postgres_fdw_disconnect() to be a bad idea altogether. These connections ought to be a hidden implementation matter, not something that the user has a view of, much less control over. Aside from the previous issue, I believe it's a trivial matter to crash the patch as it now stands by applying postgres_fdw_disconnect() to a connection that's in active use. I can see the potential value in being able to shut down connections when a session has stopped using them, but this is a pretty badly-designed approach to that. I suggest that we just drop these functions for now and revisit that problem later. (One idea is some sort of GUC setting to control how many connections can be held open speculatively for future use.) * deparse.c contains a depressingly large amount of duplication of logic from ruleutils.c, and can only need more as we expand the set of constructs that can be pushed to the remote end. This doesn't seem like a maintainable approach. Was there a specific reason not to try to use ruleutils.c for this? I'd much rather tweak ruleutils to expose some additional APIs, if that's what it takes, than have all this redundant logic. regards, tom lane
2013/2/14 Tom Lane <tgl@sss.pgh.pa.us>: > * deparse.c contains a depressingly large amount of duplication of logic > from ruleutils.c, and can only need more as we expand the set of > constructs that can be pushed to the remote end. This doesn't seem like > a maintainable approach. Was there a specific reason not to try to use > ruleutils.c for this? I'd much rather tweak ruleutils to expose some > additional APIs, if that's what it takes, than have all this redundant > logic. > The original pgsql_fdw design utilized ruleutils.c logic. Previously, you suggested to implement its own logic for query deparsing, then Hanada-san rewrite the relevant code. http://www.postgresql.org/message-id/12181.1331223482@sss.pgh.pa.us Indeed, most of the logic is duplicated. However, it is to avoid bugs in some corner cases, for instance, function name is not qualified with schema even if this function is owned by different schema in remote side. Do we add a flag on deparse_expression() to show this call intend to construct remote executable query? It may be reasonable, but case- branch makes code complicated in general.... Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
On Thu, Feb 14, 2013 at 1:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > * The code seems to always use GetOuterUserId() to select the foreign > user mapping to use. This seems entirely wrong. For instance it will > do the wrong thing inside a SECURITY DEFINER function, where surely the > relevant privileges should be those of the function owner, not the > session user. I would also argue that if Alice has access to a foreign > table owned by Bob, and Alice creates a view that selects from that > table and grants select privilege on the view to Charlie, then when > Charlie selects from the view the user mapping to use ought to be > Alice's. (If anyone thinks differently about that, speak up!) Agreed that OuterUserId is wrong for user mapping. Also agreed that Charlie doesn't need his own mapping for the server, if he is accessing via a valid view. > To implement that for queries, we need code similar to what > ExecCheckRTEPerms does, ie "rte->checkAsUser ? rte->checkAsUser : > GetUserId()". It's a bit of a pain to get hold of the RTE from > postgresGetForeignRelSize or postgresBeginForeignScan, but it's doable. > (Should we modify the APIs for these functions to make that easier?) This issue seems not specific to postgres_fdw. Currently GetUserMapping takes userid and server's oid as parameters, but we would able to hide the complex rule by replacing userid with RTE or something. > I think possibly postgresAcquireSampleRowsFunc should use the foreign > table's owner regardless of the current user ID - if the user has > permission to run ANALYZE then we don't really want the command to > succeed or fail depending on exactly who the user is. That's perhaps > debatable, anybody have another theory? +1. This allows non-owner to ANALYZE foreign tables without having per-user mapping, though public mapping also solves this issue. In implementation level, postgresAcquireSampleRowsFunc has Relation of the target table, so we can get owner's oid by reading rd_rel->relowner. > * AFAICT, the patch expects to use a single connection for all > operations initiated under one foreign server + user mapping pair. > I don't think this can possibly be workable. For instance, we don't > really want postgresIterateForeignScan executing the entire remote query > to completion and stashing the results locally -- what if that's many > megabytes? It uses single-row-mode of libpq and TuplestoreState to keep result locally, so it uses limited memory at a time. If the result is larger than work_mem, overflowed tuples are written to temp file. I think this is similar to materializing query results. > It ought to be pulling the rows back a few at a time, and > that's not going to work well if multiple scans are sharing the same > connection. (We might be able to dodge that by declaring a cursor > for each scan, but I'm not convinced that such a solution will scale up > to writable foreign tables, nested queries, subtransactions, etc.) Indeed the FDW used CURSOR in older versions. Sorry for that I have not looked writable foreign table patch closely yet, but it would require (may be multiple) remote update query executions during scanning? > I think we'd better be prepared to allow multiple similar connections. > The main reason I'm bringing this up now is that it breaks the > assumption embodied in postgres_fdw_get_connections() and > postgres_fdw_disconnect() that foreign server + user mapping can > constitute a unique key for identifying connections. However ... Main reason to use single connection is to make multiple results retrieved from same server in a local query consistent. Shared snapshot might be helpful for this consistency issue, but I've not tried that with FDW. > * I find postgres_fdw_get_connections() and postgres_fdw_disconnect() > to be a bad idea altogether. These connections ought to be a hidden > implementation matter, not something that the user has a view of, much > less control over. Aside from the previous issue, I believe it's a > trivial matter to crash the patch as it now stands by applying > postgres_fdw_disconnect() to a connection that's in active use. I can > see the potential value in being able to shut down connections when a > session has stopped using them, but this is a pretty badly-designed > approach to that. I suggest that we just drop these functions for now > and revisit that problem later. (One idea is some sort of GUC setting > to control how many connections can be held open speculatively for > future use.) Actually these functions follows dblink's similar functions, but having them is a bad decision because FDW can't connect explicitly. As you mentioned, postgres_fdw_disconnect is provided for clean shutdown on remote side (I needed it in my testing). I agree that separate the issue from FDW core. > * deparse.c contains a depressingly large amount of duplication of logic > from ruleutils.c, and can only need more as we expand the set of > constructs that can be pushed to the remote end. This doesn't seem like > a maintainable approach. Was there a specific reason not to try to use > ruleutils.c for this? I'd much rather tweak ruleutils to expose some > additional APIs, if that's what it takes, than have all this redundant > logic. I got a comment about that issue from you, but I might have misunderstood. (2012/03/09 1:18), Tom Lane wrote: > I've been looking at this patch a little bit over the past day or so. > I'm pretty unhappy with deparse.c --- it seems like a real kluge, > inefficient and full of corner-case bugs. After some thought I believe > that you're ultimately going to have to abandon depending on ruleutils.c > for reverse-listing services, and it would be best to bite that bullet > now and rewrite this code from scratch. I thought that writing ruleutils-free SQL constructor in postgres_fdw is better approach because ruleutils might be changed from its own purpose in future. Besides that, as Kaigai-san mentioned in upthread, ruleutils seems to have insufficient capability for building remote PostgreSQL query. -- Shigeru HANADA
Shigeru Hanada wrote: > Tom Lane wrote: >> It ought to be pulling the rows back a few at a time, and >> that's not going to work well if multiple scans are sharing the same >> connection. (We might be able to dodge that by declaring a cursor >> for each scan, but I'm not convinced that such a solution will scale up >> to writable foreign tables, nested queries, subtransactions, etc.) > > Indeed the FDW used CURSOR in older versions. Sorry for that I have > not looked writable foreign table patch closely yet, but it would > require (may be multiple) remote update query executions during > scanning? It would for example call ExecForeignUpdate after each call to IterateForeignScan that produces a row that meets the UPDATE condition. Yours, Laurenz Albe
On Thu, Feb 14, 2013 at 6:45 PM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote: > Shigeru Hanada wrote: >> Tom Lane wrote: >>> It ought to be pulling the rows back a few at a time, and >>> that's not going to work well if multiple scans are sharing the same >>> connection. (We might be able to dodge that by declaring a cursor >>> for each scan, but I'm not convinced that such a solution will scale up >>> to writable foreign tables, nested queries, subtransactions, etc.) >> >> Indeed the FDW used CURSOR in older versions. Sorry for that I have >> not looked writable foreign table patch closely yet, but it would >> require (may be multiple) remote update query executions during >> scanning? > > It would for example call ExecForeignUpdate after each call to > IterateForeignScan that produces a row that meets the UPDATE > condition. Thanks! It seems that ExecForeignUpdate needs another connection for update query, or we need to retrieve all results at the first Iterate call to prepare for possible subsequent update query. -- Shigeru HANADA
Kohei KaiGai <kaigai@kaigai.gr.jp> writes: > 2013/2/14 Tom Lane <tgl@sss.pgh.pa.us>: >> * deparse.c contains a depressingly large amount of duplication of logic >> from ruleutils.c, and can only need more as we expand the set of >> constructs that can be pushed to the remote end. This doesn't seem like >> a maintainable approach. Was there a specific reason not to try to use >> ruleutils.c for this? > Previously, you suggested to implement its own logic for query deparsing, > then Hanada-san rewrite the relevant code. > http://www.postgresql.org/message-id/12181.1331223482@sss.pgh.pa.us [ rereads that... ] Hm, I did make some good points. But having seen the end result of this way, I'm still not very happy; it still looks like a maintenance problem. Maybe some additional flags in ruleutils.c is the least evil way after all. Needs more thought. > Indeed, most of the logic is duplicated. However, it is to avoid bugs in > some corner cases, for instance, function name is not qualified with > schema even if this function is owned by different schema in remote side. That particular reason doesn't seem to hold a lot of water when we're restricting the code to only push over built-in functions/operators anyway. I find it tempting to think about setting search_path explicitly to "pg_catalog" (only) on the remote side, whereupon we'd have to explicitly schema-qualify references to user tables, but built-in functions/operators would not need that (and it wouldn't really matter if ruleutils did try to qualify them). regards, tom lane
Shigeru Hanada <shigeru.hanada@gmail.com> writes: > On Thu, Feb 14, 2013 at 1:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> * AFAICT, the patch expects to use a single connection for all >> operations initiated under one foreign server + user mapping pair. >> I don't think this can possibly be workable. For instance, we don't >> really want postgresIterateForeignScan executing the entire remote query >> to completion and stashing the results locally -- what if that's many >> megabytes? > It uses single-row-mode of libpq and TuplestoreState to keep result > locally, so it uses limited memory at a time. If the result is larger > than work_mem, overflowed tuples are written to temp file. I think > this is similar to materializing query results. Well, yeah, but that doesn't make it an acceptable solution. Consider for instance "SELECT * FROM huge_foreign_table LIMIT 10". People are not going to be satisfied if that pulls back the entire foreign table before handing them the 10 rows. Comparable performance problems can arise even without LIMIT, for instance in handling of nestloop inner scans. >> I think we'd better be prepared to allow multiple similar connections. > Main reason to use single connection is to make multiple results > retrieved from same server in a local query consistent. Hmm. That could be a good argument, although the current patch pretty much destroys any such advantage by being willing to use READ COMMITTED mode on the far end --- with that, you lose any right to expect snapshot-consistent data anyway. I'm inclined to think that maybe we should always use at least REPEATABLE READ mode, rather than blindly copying the local transaction mode. Or maybe this should be driven by a foreign-server option instead of looking at the local mode at all? Anyway, it does seem like maybe we need to use cursors so that we can have several active scans that we are pulling back just a few rows at a time from. I'm not convinced that that gets us out of the woods though WRT needing only one connection. Consider a query that is scanning some foreign table, and it calls a plpgsql function, and that function (inside an EXCEPTION block) does a query that scans another foreign table on the same server. This second query gets an error on the remote side. If the error is caught via the exception block, and the outer query continues, what then? We could imagine adding enough infrastructure to establish a remote savepoint for each local subtransaction and clean things up on failure, but no such logic is in the patch now, and I think it wouldn't be too simple either. The least painful way to make this scenario work, given what's in the patch, is to allow such a subtransaction to use a separate connection. In any case, I'm pretty well convinced that the connection-bookkeeping logic needs a major rewrite to have any hope of working in subtransactions. I'm going to work on that first and see where it leads. >> * I find postgres_fdw_get_connections() and postgres_fdw_disconnect() >> to be a bad idea altogether. > I agree that separate the issue from FDW core. OK, so we'll drop these from the current version of the patch and revisit the problem of closing connections later. regards, tom lane
Continuing to look at this patch ... I'm wondering if any particular discussion went into choosing the FDW option names "nspname", "relname", and "colname". These don't seem to me like names that we ought to be exposing at the SQL command level. Why not just "schema", "table", "column"? Or perhaps "schema_name", "table_name", "column_name" if you feel it's essential to distinguish that these are names. regards, tom lane
On Sun, Feb 17, 2013 at 8:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Continuing to look at this patch ... I'm wondering if any particular > discussion went into choosing the FDW option names "nspname", "relname", > and "colname". IIRC, there was no deep discussion about those option names. I simply chose "relname" and "nspname" from pg_class and pg_namespace. At that time I thought users would understand those options easily if those names are used in catalog. > These don't seem to me like names that we ought to be > exposing at the SQL command level. Why not just "schema", "table", > "column"? Or perhaps "schema_name", "table_name", "column_name" if you > feel it's essential to distinguish that these are names. I think not-shortened names (words used in documents of conversations) are better now. I prefer "table_name" to "table", because it would be easy to distinguish as name, even if we add new options like "table_foo". Besides, I found a strange(?) behavior in psql \d+ command in no-postfix case, though it wouldn't be a serious problem. In psql \d+ result for postgres_fdw foreign tables, "table" and "column" are quoted, but "schema" is not. Is this behavior of quote_ident() intentional? postgres=# \d+ pgbench1_branches Foreign table "public.pgbench1_branches" Column | Type | Modifiers | FDW Options | Storage | Stats target | Description ----------+---------------+-----------+------------------+----------+--------------+-------------bid | integer | not null | ("column" 'bid') | plain | |bbalance | integer | | | plain | |filler | character(88) | | | extended | | Server: pgbench1 FDW Options: (schema 'public', "table" 'foo') Has OIDs: no We can use "table" and "column" options without quoting (or with quote of course) in CREATE/ALTER FOREIGN TABLE commands, so this is not a barrier against choosing no-postfix names. -- Shigeru HANADA
Shigeru Hanada <shigeru.hanada@gmail.com> writes: > On Sun, Feb 17, 2013 at 8:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> These don't seem to me like names that we ought to be >> exposing at the SQL command level. Why not just "schema", "table", >> "column"? Or perhaps "schema_name", "table_name", "column_name" if you >> feel it's essential to distinguish that these are names. > I think not-shortened names (words used in documents of conversations) > are better now. I prefer "table_name" to "table", because it would be > easy to distinguish as name, even if we add new options like > "table_foo". Yeah. I doubt that these options will be commonly used anyway --- surely it's easier and less confusing to choose names that match the remote table in the first place. So there's no very good reason to keep the option names short. I'll go with "schema_name", "table_name", "column_name" unless someone comes along with a contrary opinion. > In psql \d+ result for postgres_fdw foreign tables, "table" and > "column" are quoted, but "schema" is not. Is this behavior of > quote_ident() intentional? That's probably a consequence of these being keywords of different levels of reserved-ness. If we go with the longer names it won't happen. regards, tom lane
On Fri, Feb 15, 2013 at 12:58 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > [ rereads that... ] Hm, I did make some good points. But having seen > the end result of this way, I'm still not very happy; it still looks > like a maintenance problem. Maybe some additional flags in ruleutils.c > is the least evil way after all. Needs more thought. I'm working on revising deparser so that it uses ruleutils routines to construct remote query, and re-found an FDW-specific problem which I encountered some months ago. So far ruleutils routines require "deparse context", which is a list of namespace information. Currently deparse_context_for() seems to fit postgres_fdw's purpose, but it always uses names stored in catalogs (pg_class, pg_attribute and pg_namespace), though postgres_fdw wants to replace column/table/schema name with the name specified in relevant FDW options if any. Proper remote query will be generated If postgres_fdw can modify deparse context, but deparse_context is hidden detail of ruleutils.c. IMO disclosing it is bad idea. Given these, I'm thinking to add new deparse context generator which basically construct namespaces from catalogs, but replace one if FDW option *_name was specified for an object. With this context, existing ruleutils would generate expression-strings with proper names, without any change. Is this idea acceptable? -- Shigeru HANADA
Shigeru Hanada <shigeru.hanada@gmail.com> writes: > On Fri, Feb 15, 2013 at 12:58 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> [ rereads that... ] Hm, I did make some good points. But having seen >> the end result of this way, I'm still not very happy; it still looks >> like a maintenance problem. Maybe some additional flags in ruleutils.c >> is the least evil way after all. Needs more thought. > I'm working on revising deparser so that it uses ruleutils routines to > construct remote query, and re-found an FDW-specific problem which I > encountered some months ago. After further review I'm unconvinced that we can really do much better than what's there now --- the idea of sharing code with ruleutils sounds attractive, but once you look at all the specific issues that ruleutils would have to be taught about, it gets much less so. (In particular I fear we'll find that we have to do some weird stuff to deal with cross-server-version issues.) I've been revising the patch on the assumption that we'll keep deparse.c more or less as is. Having said that, I remain pretty unhappy with the namespace handling in deparse.c. I don't think it serves much purpose to schema-qualify everything when we're restricting what we can access to built-in operators and functions --- the loss of readability outweighs the benefits IMO. Also, there is very little point in schema-qualifying almost everything rather than everything; if you're not 100% then you have no safety against search_path issues. But that's what we've got because the code still relies on format_type to print type names. Now we could get around that complaint by duplicating format_type as well as ruleutils, but I don't think that's the right direction to proceed. I still think it might be a good idea to set search_path to pg_catalog on the remote side, and then schema-qualify only what is not in pg_catalog (which would be nothing, in the current code, so far as types/functions/operators are concerned). regards, tom lane
Shigeru Hanada <shigeru.hanada@gmail.com> writes: > [ postgres_fdw.v5.patch ] Applied with a lot of revisions. There are still a number of loose ends and things that need to be discussed: I'm not at all happy with the planner support functions --- as designed, that code is basically incapable of thinking about more than one remote access path. It needs to be restructured and then extended to be able to generate parameterized remote paths. The "local estimation" mode is pretty bogus as well. I thought the patch could be committed anyway, but I'm going to go back and work on that part later. I doubt that deparseFuncExpr is doing the right thing by printing casts as their underlying function calls. The comment claims that this way is more robust but I rather think it's less so, because it's effectively assuming that the remote server implements casts exactly like the local one, which might be incorrect if the remote is a different Postgres version. I think we should probably change that, but would like to know what the argument was for coding it like this. Also, if this is to be the approach to printing casts, why is RelabelType handled differently? As I mentioned earlier, I think it would be better to force the remote session's search_path setting to just "pg_catalog" and then reduce the amount of explicit schema naming in the queries --- any opinions about that? I took out the checks on collations of operators because I thought they were thoroughly broken. In the first place, looking at operator names to deduce semantics is unsafe (if we were to try to distinguish equality, looking up btree opclass membership would be the way to do that). In the second place, restricting only collation-sensitive operators and not collation-sensitive functions seems just about useless for guaranteeing safety. But we don't have any very good handle on which functions might be safe to send despite having collatable input types, so taking that approach would greatly restrict our ability to send function calls at all. The bigger picture here though is that we're already relying on the user to make sure that remote tables have column data types matching the local definition, so why can't we say that they've got to make sure collations match too? So I think this is largely a documentation issue and we don't need any automated enforcement mechanism, or at least it's silly to try to enforce this when we're not enforcing column type matching (yet?). What might make sense is to try to determine whether a WHERE clause uses any collations different from those of the contained foreign-column Vars, and send it over only if not. That would prevent us from sending clauses that explicitly use collations that might not exist on the remote server. I didn't try to code this though. Another thing that I find fairly suspicious in this connection is that deparse.c isn't bothering to print collations attached to Const nodes. That may be a good idea to avoid needing the assumption that the remote server uses the same collation names we do, but if we're going to do it like this, those Const collations need to be considered when deciding if the expression is safe to send at all. A more general idea that follows on from that is that if we're relying on the user to be sure the semantics are the same, maybe we don't need to be quite so tight about what we'll send over. In particular, if the user has declared a foreign-table column of a non-built-in type, the current code will never send any constraints for that column at all, which seems overly conservative if you believe he's matched the type correctly. I'm not sure exactly how to act on that thought, but I think there's room for improvement there. A related issue is that as coded, is_builtin() is pretty flaky, because what's built-in on our server might not exist at all on the remote side, if it's a different major Postgres version. So what we've got is that the code is overly conservative about what it will send and yet still perfectly capable of sending remote queries that will fail outright, which is not a happy combination. I have no very good idea how to deal with that though. Another thing I was wondering about, but did not change, is that if we're having the remote transaction inherit the local transaction's isolation level, shouldn't it inherit the READ ONLY property as well? regards, tom lane
Tom Lane wrote: > Applied with a lot of revisions. I am thrilled. > As I mentioned earlier, I think it would be better to force the remote > session's search_path setting to just "pg_catalog" and then reduce the > amount of explicit schema naming in the queries --- any opinions about > that? I think that that would make the remore query much more readable. That would improve EXPLAIN VERBOSE output, which is a user visible improvement. > I took out the checks on collations of operators because I thought they > were thoroughly broken. In the first place, looking at operator names > to deduce semantics is unsafe (if we were to try to distinguish equality, > looking up btree opclass membership would be the way to do that). In the > second place, restricting only collation-sensitive operators and not > collation-sensitive functions seems just about useless for guaranteeing > safety. But we don't have any very good handle on which functions might > be safe to send despite having collatable input types, so taking that > approach would greatly restrict our ability to send function calls at all. > > The bigger picture here though is that we're already relying on the user > to make sure that remote tables have column data types matching the local > definition, so why can't we say that they've got to make sure collations > match too? So I think this is largely a documentation issue and we don't > need any automated enforcement mechanism, or at least it's silly to try > to enforce this when we're not enforcing column type matching (yet?). I think that the question what to push down is a different question from checking column data types, because there we can rely on the type input functions to reject bad values. Being permissive on collation issues would lead to user problems along the lines of "my query results are different when I select from a remote table on a different operating system". In my experience many users are blissfully ignorant of issues like collation and encoding. What about the following design principle: Only push down conditions which are sure to return the correct result, provided that the PostgreSQL system objects have not been tampered with. Would it be reasonable to push down operators and functions only if a) they are in the pg_catalog schema b) they have been around for a couple of releases c) they are not collation sensitive? It makes me uncomfortable to think of a FDW that would happily push down conditions that may lead to wrong query results. > Another thing I was wondering about, but did not change, is that if we're > having the remote transaction inherit the local transaction's isolation > level, shouldn't it inherit the READ ONLY property as well? That seems to me like it would be the right thing to do. Yours, Laurenz Albe
On 2013-02-21 14:23:35 +0000, Albe Laurenz wrote: > Tom Lane wrote: > > Another thing I was wondering about, but did not change, is that if we're > > having the remote transaction inherit the local transaction's isolation > > level, shouldn't it inherit the READ ONLY property as well? > > That seems to me like it would be the right thing to do. I am not 100% convinced of that. There might be valid usecases where a standby executes queries on the primary that executes that do DML. And there would be no way out of it I think? Greetings, Andres Freund
Albe Laurenz <laurenz.albe@wien.gv.at> writes: > Tom Lane wrote: >> As I mentioned earlier, I think it would be better to force the remote >> session's search_path setting to just "pg_catalog" and then reduce the >> amount of explicit schema naming in the queries --- any opinions about >> that? > I think that that would make the remore query much more readable. > That would improve EXPLAIN VERBOSE output, which is a user visible > improvement. Yeah, that's really the main point. OPERATOR() is tremendously ugly... >> The bigger picture here though is that we're already relying on the user >> to make sure that remote tables have column data types matching the local >> definition, so why can't we say that they've got to make sure collations >> match too? So I think this is largely a documentation issue and we don't >> need any automated enforcement mechanism, or at least it's silly to try >> to enforce this when we're not enforcing column type matching (yet?). > I think that the question what to push down is a different question > from checking column data types, because there we can rely on the > type input functions to reject bad values. Unfortunately, that's a very myopic view of the situation: there are many cases where datatype semantics can vary without the I/O functions having any idea that anything is wrong. To take one example, what if the underlying column is type citext but the user wrote "text" in the foreign table definition? postgres_fdw would see no reason not to push "col = 'foo'" across, but that clause would behave quite differently on the remote side. Another example is that float8 and numeric will have different opinions about the truth of "1.000000000000000000001 = 1.000000000000000000002", so you're going to get into trouble if you declare an FT column as one when the underlying column is the other, even though the I/O functions for these types will happily take each other's output. So I think (and have written in the committed docs) that users had better be careful to ensure that FT columns are declared as the same type as the underlying columns, even though we can't readily enforce that, at least not for non-builtin types. And there's really no difference between that situation and the collation situation, though I agree with you that the latter is a lot more likely to bite careless users. > What about the following design principle: > Only push down conditions which are sure to return the correct > result, provided that the PostgreSQL system objects have not > been tampered with. That's a nice, simple, and useless-in-practice design principle, because it will toss out many situations that users will want to work; situations that in fact *would* work as long as the users adhere to safe coding conventions. I do not believe that when people ask "why does performance of LIKE suck on my foreign table", they will accept an answer of "we don't allow that to be pushed across because we think you're too stupid to make the remote collation match". If you want something provably correct, the way to get there is to work out a way to check if the remote types and collations really match. But that's a hard problem AFAICT, so if we want something usable in the meantime, we are going to have to accept some uncertainty about what will happen if the user messes up. > Would it be reasonable to push down operators and functions > only if > a) they are in the pg_catalog schema > b) they have been around for a couple of releases > c) they are not collation sensitive? We don't track (b) nor (c), so this suggestion is entirely unimplementable in the 9.3 time frame; nor is it provably safe, unless by (b) you mean "must have been around since 7.4 or so". On a longer time horizon this might be doable, but it would be a lot of work to implement a solution that most people will find far too restrictive. I'd rather see the long-term focus be on doing type/collation matching, so that we can expand not restrict the set of things we can push across. regards, tom lane
Andres Freund <andres@2ndquadrant.com> writes: > On 2013-02-21 14:23:35 +0000, Albe Laurenz wrote: >> Tom Lane wrote: >>> Another thing I was wondering about, but did not change, is that if we're >>> having the remote transaction inherit the local transaction's isolation >>> level, shouldn't it inherit the READ ONLY property as well? >> That seems to me like it would be the right thing to do. > I am not 100% convinced of that. There might be valid usecases where a > standby executes queries on the primary that executes that do DML. And > there would be no way out of it I think? How exactly would it do that via an FDW? Surely if the user tries to execute INSERT/UPDATE/DELETE against a foreign table, the command would get rejected in a read-only transaction, long before we even figure out that the target is a foreign table? Even granting that there's some loophole that lets the command get sent to the foreign server, why's it a good idea to allow that? I rather thought the idea of READ ONLY was to prevent the transaction from making any permanent changes. It's not clear why changes on a remote database would be exempted from that. (Doubtless you could escape the restriction anyway with dblink, but that doesn't mean that postgres_fdw should be similarly ill-defined.) regards, tom lane
On 2013-02-21 09:58:57 -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2013-02-21 14:23:35 +0000, Albe Laurenz wrote: > >> Tom Lane wrote: > >>> Another thing I was wondering about, but did not change, is that if we're > >>> having the remote transaction inherit the local transaction's isolation > >>> level, shouldn't it inherit the READ ONLY property as well? > > >> That seems to me like it would be the right thing to do. > > > I am not 100% convinced of that. There might be valid usecases where a > > standby executes queries on the primary that executes that do DML. And > > there would be no way out of it I think? > > How exactly would it do that via an FDW? Surely if the user tries to > execute INSERT/UPDATE/DELETE against a foreign table, the command would > get rejected in a read-only transaction, long before we even figure out > that the target is a foreign table? I was thinking of querying a remote table thats actually a view. Which might be using a function that does caching into a table or something. Not a completely unreasonable design. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2013-02-21 09:58:57 -0500, Tom Lane wrote: >> How exactly would it do that via an FDW? Surely if the user tries to >> execute INSERT/UPDATE/DELETE against a foreign table, the command would >> get rejected in a read-only transaction, long before we even figure out >> that the target is a foreign table? > I was thinking of querying a remote table thats actually a view. Which > might be using a function that does caching into a table or something. > Not a completely unreasonable design. Yeah, referencing a remote view is something that should work fine, but it's not clear to me why it should work differently than it does on the remote server. If you select from that same view in a READ ONLY transaction on the remote, won't it fail? If so, why should that work if it's selected from via a foreign table? regards, tom lane
On 2013-02-21 10:21:34 -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2013-02-21 09:58:57 -0500, Tom Lane wrote: > >> How exactly would it do that via an FDW? Surely if the user tries to > >> execute INSERT/UPDATE/DELETE against a foreign table, the command would > >> get rejected in a read-only transaction, long before we even figure out > >> that the target is a foreign table? > > > I was thinking of querying a remote table thats actually a view. Which > > might be using a function that does caching into a table or something. > > Not a completely unreasonable design. > > Yeah, referencing a remote view is something that should work fine, but > it's not clear to me why it should work differently than it does on the > remote server. If you select from that same view in a READ ONLY > transaction on the remote, won't it fail? If so, why should that work > if it's selected from via a foreign table? Sure, it might fail if you use READ ONLY explicitly. Or the code might check it. The point is that one might not have choice about the READ ONLY state of the local transaction if its a HS standby as all transactions are READ ONLY there. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > Sure, it might fail if you use READ ONLY explicitly. Or the code might > check it. The point is that one might not have choice about the READ > ONLY state of the local transaction if its a HS standby as all > transactions are READ ONLY there. [ shrug... ] If you want to use a remote DB to cheat on READ ONLY, there's always dblink. It's not apparent to me that the FDW implementation should try to be complicit in such cheating. (Not that it would work anyway given the command-level checks.) regards, tom lane
Tom Lane wrote: >> I think that the question what to push down is a different question >> from checking column data types, because there we can rely on the >> type input functions to reject bad values. > > Unfortunately, that's a very myopic view of the situation: there > are many cases where datatype semantics can vary without the I/O > functions having any idea that anything is wrong. To take one example, > what if the underlying column is type citext but the user wrote "text" > in the foreign table definition? postgres_fdw would see no reason not > to push "col = 'foo'" across, but that clause would behave quite > differently on the remote side. Another example is that float8 and > numeric will have different opinions about the truth of > "1.000000000000000000001 = 1.000000000000000000002", so you're going > to get into trouble if you declare an FT column as one when the > underlying column is the other, even though the I/O functions for these > types will happily take each other's output. You are right. > So I think (and have written in the committed docs) that users had > better be careful to ensure that FT columns are declared as the same > type as the underlying columns, even though we can't readily enforce > that, at least not for non-builtin types. > > And there's really no difference between that situation and the > collation situation, though I agree with you that the latter is a lot > more likely to bite careless users. That's what I am worried about. >> What about the following design principle: >> Only push down conditions which are sure to return the correct >> result, provided that the PostgreSQL system objects have not >> been tampered with. > > That's a nice, simple, and useless-in-practice design principle, > because it will toss out many situations that users will want to work; > situations that in fact *would* work as long as the users adhere to > safe coding conventions. I do not believe that when people ask "why > does performance of LIKE suck on my foreign table", they will accept an > answer of "we don't allow that to be pushed across because we think > you're too stupid to make the remote collation match". I think that it will be pretty hard to get both reliability and performance to an optimum. I'd rather hear complaints about bad performance than about bad results, but that's just my opinion. >> Would it be reasonable to push down operators and functions >> only if >> a) they are in the pg_catalog schema >> b) they have been around for a couple of releases >> c) they are not collation sensitive? > > We don't track (b) nor (c), so this suggestion is entirely > unimplementable in the 9.3 time frame; nor is it provably safe, > unless by (b) you mean "must have been around since 7.4 or so". My idea was to have a (hand picked) list of functions and operators that are considered safe, but I understand that that is rather ugly. There could of course be no generic method because, as you say, b) and c) are not tracked. > On a longer time horizon this might be doable, but it would be a lot > of work to implement a solution that most people will find far too > restrictive. I'd rather see the long-term focus be on doing > type/collation matching, so that we can expand not restrict the set > of things we can push across. I like that vision, and of course my above idea does not go well with it. Yours, Laurenz Albe
On 21 February 2013 10:30, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Shigeru Hanada <shigeru.hanada@gmail.com> writes: >> [ postgres_fdw.v5.patch ] > > Applied with a lot of revisions. Bit of an issue with selecting rows: postgres=# SELECT * FROM animals;id | animal_name | animal_type | lifespan ----+-------------+-------------+---------- 1 | cat | mammal | 20 2 | dog | mammal | 12 3 | robin | bird | 12 4 | dolphin | mammal | 30 5 | gecko | reptile | 18 6 | human | mammal | 85 7 | elephant | mammal | 70 8 | tortoise | reptile | 150 (8 rows) postgres=# SELECT animals FROM animals;animals ---------(,,,)(,,,)(,,,)(,,,)(,,,)(,,,)(,,,)(,,,) (8 rows) postgres=# SELECT animals, animal_name FROM animals; animals | animal_name ---------------+-------------(,cat,,) | cat(,dog,,) | dog(,robin,,) | robin(,dolphin,,) | dolphin(,gecko,,) | gecko(,human,,) | human(,elephant,,) | elephant(,tortoise,,) | tortoise (8 rows) postgres=# EXPLAIN (ANALYSE, VERBOSE) SELECT animals FROM animals; QUERYPLAN -----------------------------------------------------------------------------------------------------------------Foreign Scanon public.animals (cost=100.00..100.24 rows=8 width=45) (actual time=0.253..0.255 rows=8 loops=1) Output: animals.* Remote SQL: SELECT NULL, NULL, NULL, NULL FROM public.animalsTotalruntime: 0.465 ms (4 rows) -- Thom
Thom Brown <thom@linux.com> writes: > Bit of an issue with selecting rows: Ooops, looks like I screwed up the logic for whole-row references. Will fix, thanks for the test case! regards, tom lane
On 22 February 2013 14:10, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Thom Brown <thom@linux.com> writes: >> Bit of an issue with selecting rows: > > Ooops, looks like I screwed up the logic for whole-row references. > Will fix, thanks for the test case! Retried after your changes and all is well. Thanks Tom. -- Thom
Tom, all, * Tom Lane (tgl@sss.pgh.pa.us) wrote: > Another thing I was wondering about, but did not change, is that if we're > having the remote transaction inherit the local transaction's isolation > level, shouldn't it inherit the READ ONLY property as well? Apologies for bringing this up pretty late, but wrt writable FDW transaction levels, I was *really* hoping that we'd be able to implement autonomous transactions on top of writeable FDWs. It looks like there's no way to do this using the postgres_fdw due to it COMMIT'ing only when the client transaction commits. Would it be possible to have a simply function which could be called to say "commit the transaction on the foreign side for this server/table/connection/whatever"? A nice addition on top of that would be able to define 'auto-commit' for a given table or server. I'll try and find time to work on this, but I'd love feedback on if this is possible and where the landmines are. Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > Apologies for bringing this up pretty late, but wrt writable FDW > transaction levels, I was *really* hoping that we'd be able to implement > autonomous transactions on top of writeable FDWs. It looks like there's > no way to do this using the postgres_fdw due to it COMMIT'ing only when > the client transaction commits. Would it be possible to have a simply > function which could be called to say "commit the transaction on the > foreign side for this server/table/connection/whatever"? A nice > addition on top of that would be able to define 'auto-commit' for a > given table or server. TBH I think this is a fairly bad idea. You can get that behavior via dblink if you need it, but there's no way to do it in an FDW without ending up with astonishing (and not in a good way) semantics. A commit would force committal of everything that'd been done through that connection, regardless of transaction/subtransaction structure up to that point; and it would also destroy open cursors. The only way to make this sane at all would be to provide user control of which operations go to which connections; which is inherent in dblink's API but is simply not a concept in the FDW universe. And I don't want to try to plaster it on, either. regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > TBH I think this is a fairly bad idea. You can get that behavior via > dblink if you need it, While I appreciate that dblink can do it, I simply don't see it as a good solution to this. > but there's no way to do it in an FDW without > ending up with astonishing (and not in a good way) semantics. A commit > would force committal of everything that'd been done through that > connection, regardless of transaction/subtransaction structure up to > that point; and it would also destroy open cursors. The only way to > make this sane at all would be to provide user control of which > operations go to which connections; which is inherent in dblink's API > but is simply not a concept in the FDW universe. And I don't want to > try to plaster it on, either. This concern would make a lot more sense to me if we were sharing a given FDW connection between multiple client backends/sessions; I admit that I've not looked through the code but the documentation seems to imply that we create one-or-more FDW connection per backend session and there's no sharing going on. A single backend will be operating in a linear fashion through the commands sent to it. As such, I'm not sure that it's quite as bad as it may seem. Perhaps a reasonable compromise would be to have a SERVER option which is along the lines of 'autocommit', where a user could request that any query to this server is automatically committed independent of the client transaction. I'd be happier if we could allow the user to control it, but this would at least allow for 'log tables', which are defined using this server definition, where long-running pl/pgsql code could log progress where other connections could see it. Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > * Tom Lane (tgl@sss.pgh.pa.us) wrote: >> ... The only way to >> make this sane at all would be to provide user control of which >> operations go to which connections; which is inherent in dblink's API >> but is simply not a concept in the FDW universe. And I don't want to >> try to plaster it on, either. > This concern would make a lot more sense to me if we were sharing a > given FDW connection between multiple client backends/sessions; I admit > that I've not looked through the code but the documentation seems to > imply that we create one-or-more FDW connection per backend session and > there's no sharing going on. Well, ATM postgres_fdw shares connections across tables and queries; but my point is that that's all supposed to be transparent and invisible to the user. I don't want to have API features that make connections explicit, because I don't think that can be shoehorned into the FDW model without considerable strain and weird corner cases. regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > I don't want to have API features that make connections > explicit, because I don't think that can be shoehorned into the FDW > model without considerable strain and weird corner cases. It seems we're talking past each other here. I'm not particularly interested in exposing what connections have been made to other servers via some API (though I could see some debugging use there). What I was hoping for is a way for a given user to say "I want this specific change, to this table, to be persisted immediately". I'd love to have that ability *without* FDWs too. It just happens that FDWs provide a simple way to get there from here and without a lot of muddying of the waters, imv. FDWs are no stranger to remote connections which don't have transactions either, file_fdw will happily return whatever the current contents of the file are with no concern for PG transactions. I would expect a writable file_fdw to act the same and immediately write out any data written to it. Hope that clarifies things a bit. Thanks, Stephen