Re: Transactions involving multiple postgres foreign servers, take 2 - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: Transactions involving multiple postgres foreign servers, take 2 |
Date | |
Msg-id | 27762b24-73f4-56c9-d03e-79139b7cdd5a@oss.nttdata.com Whole thread Raw |
In response to | Re: Transactions involving multiple postgres foreign servers, take 2 (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: Transactions involving multiple postgres foreign servers, take 2
|
List | pgsql-hackers |
On 2021/01/27 14:08, Masahiko Sawada wrote: > On Wed, Jan 27, 2021 at 10:29 AM Fujii Masao > <masao.fujii@oss.nttdata.com> wrote: >> >> >> You fixed some issues. But maybe you forgot to attach the latest patches? > > Yes, I've attached the updated patches. Thanks for updating the patch! I tried to review 0001 and 0002 as the self-contained change. + * An FDW that implements both commit and rollback APIs can request to register + * the foreign transaction by FdwXactRegisterXact() to participate it to a + * group of distributed tranasction. The registered foreign transactions are + * identified by OIDs of server and user. I'm afraid that the combination of OIDs of server and user is not unique. IOW, more than one foreign transactions can havethe same combination of OIDs of server and user. For example, the following two SELECT queries start the different foreigntransactions but their user OID is the same. OID of user mapping should be used instead of OID of user? CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw; CREATE USER MAPPING FOR postgres SERVER loopback OPTIONS (user 'postgres'); CREATE USER MAPPING FOR public SERVER loopback OPTIONS (user 'postgres'); CREATE TABLE t(i int); CREATE FOREIGN TABLE ft(i int) SERVER loopback OPTIONS (table_name 't'); BEGIN; SELECT * FROM ft; DROP USER MAPPING FOR postgres SERVER loopback ; SELECT * FROM ft; COMMIT; + /* Commit foreign transactions if any */ + AtEOXact_FdwXact(true); Don't we need to pass XACT_EVENT_PARALLEL_PRE_COMMIT or XACT_EVENT_PRE_COMMIT flag? Probably we don't need to do this ifpostgres_fdw is only user of this new API. But if we make this new API generic one, such flags seem necessary so that someforeign data wrappers might have different behaviors for those flags. Because of the same reason as above, AtEOXact_FdwXact() should also be called after CallXactCallbacks(is_parallel_worker? XACT_EVENT_PARALLEL_COMMIT : XACT_EVENT_COMMIT)? + /* + * Abort foreign transactions if any. This needs to be done before marking + * this transaction as not running since FDW's transaction callbacks might + * assume this transaction is still in progress. + */ + AtEOXact_FdwXact(false); Same as above. +/* + * This function is called at PREPARE TRANSACTION. Since we don't support + * preparing foreign transactions yet, raise an error if the local transaction + * has any foreign transaction. + */ +void +AtPrepare_FdwXact(void) +{ + if (FdwXactParticipants != NIL) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cannot PREPARE a transaction that has operated on foreign tables"))); +} This means that some foreign data wrappers suppporting the prepare transaction (though I'm not sure if such wappers actuallyexist or not) cannot use the new API? If we want to allow those wrappers to use new API, AtPrepare_FdwXact() shouldcall the prepare callback and each wrapper should emit an error within the callback if necessary. + foreach(lc, FdwXactParticipants) + { + FdwXactParticipant *fdw_part = (FdwXactParticipant *) lfirst(lc); + + if (fdw_part->server->serverid == serverid && + fdw_part->usermapping->userid == userid) Isn't this ineffecient when starting lots of foreign transactions because we need to scan all the entries in the list everytime? +static ConnCacheEntry * +GetConnectionCacheEntry(Oid umid) +{ + bool found; + ConnCacheEntry *entry; + ConnCacheKey key; + + /* First time through, initialize connection cache hashtable */ + if (ConnectionHash == NULL) + { + HASHCTL ctl; + + ctl.keysize = sizeof(ConnCacheKey); + ctl.entrysize = sizeof(ConnCacheEntry); + ConnectionHash = hash_create("postgres_fdw connections", 8, + &ctl, + HASH_ELEM | HASH_BLOBS); Currently ConnectionHash is created under TopMemoryContext. With the patch, since GetConnectionCacheEntry() can be calledin other places, ConnectionHash may be created under the memory context other than TopMemoryContext? If so, that'ssafe? - if (PQstatus(entry->conn) != CONNECTION_OK || - PQtransactionStatus(entry->conn) != PQTRANS_IDLE || - entry->changing_xact_state || - entry->invalidated) ... + if (PQstatus(entry->conn) != CONNECTION_OK || + PQtransactionStatus(entry->conn) != PQTRANS_IDLE || + entry->changing_xact_state) Why did you get rid of the condition "entry->invalidated"? > >> >> I'm reading 0001 and 0002 patches to pick up the changes for postgres_fdw that worth applying independent from 2PC feature.If there are such changes, IMO we can apply them in advance, and which would make the patches simpler. > > Thank you for reviewing the patches! > >> >> + if (PQresultStatus(res) != PGRES_COMMAND_OK) >> + ereport(ERROR, (errmsg("could not commit transaction on server %s", >> + frstate->server->servername))); >> >> You changed the code this way because you want to include the server name in the error message? I agree that it's helpfulto report also the server name that caused an error. OTOH, since this change gets rid of call to pgfdw_rerport_error()for the returned PGresult, the reported error message contains less information. If this understandingis right, I don't think that this change is an improvement. > > Right. It's better to use do_sql_command() instead. > >> Instead, if the server name should be included in the error message, pgfdw_report_error() should be changed so that italso reports the server name? If we do that, the server name is reported not only when COMMIT fails but also when othercommands fail. >> >> Of course, if this change is not essential, we can skip doing this in the first version. > > Yes, I think it's not essential for now. We can improve it later if we want. > >> >> - /* >> - * Regardless of the event type, we can now mark ourselves as out of the >> - * transaction. (Note: if we are here during PRE_COMMIT or PRE_PREPARE, >> - * this saves a useless scan of the hashtable during COMMIT or PREPARE.) >> - */ >> - xact_got_connection = false; >> >> With this change, xact_got_connection seems to never be set to false. Doesn't this break pgfdw_subxact_callback() usingxact_got_connection? > > I think xact_got_connection is set to false in > pgfdw_cleanup_after_transaction() that is called at the end of each > foreign transaction (i.g., in postgresCommitForeignTransaction() and > postgresRollbackForeignTransaction()). > > But as you're concerned below, it's reset for each foreign transaction > end rather than the parent's transaction end. > >> >> + /* Also reset cursor numbering for next transaction */ >> + cursor_number = 0; >> >> Originally this variable is reset to 0 once per transaction end. But with the patch, it's reset to 0 every time when aforeign transaction ends at each connection. This change would be harmless fortunately in practice, but seems not righttheoretically. >> >> This makes me wonder if new FDW API is not good at handling the case where some operations need to be performed once pertransaction end. > > I think that the problem comes from the fact that FDW needs to use > both SubXactCallback and new FDW API. > > If we want to perform some operations at the end of the top > transaction per FDW, not per foreign transaction, we will either still > need to use XactCallback or need to rethink the FDW API design. But > given that we call commit and rollback FDW API for only foreign > servers that actually started a transaction, I’m not sure if there are > such operations in practice. IIUC there is not at least from the > normal (not-sub) transaction termination perspective. One feature in my mind that may not match with this new API is to perform transaction commits on multiple servers in parallel.That's something like the following. As far as I can recall, another proposed version of 2pc on postgres_fdw patchincluded that feature. If we want to implement this to increase the performance of transaction commit in the future,I'm afraid that new API will prevent that. foreach(foreign transactions) send commit command foreach(foreign transactions) wait for reply of commit On second thought, new per-transaction commit/rollback callback is essential when users or the resolver process want to resolvethe specifed foreign transaction, but not essential when backends commit/rollback foreign transactions. That is, evenif we add per-transaction new API for users and resolver process, backends can still use CallXactCallbacks() when theycommit/rollback foreign transactions. Is this understanding right? > > IIUC xact_got_transaction is used to skip iterating over all cached > connections to find open remote (sub) transactions. This is not > necessary anymore at least from the normal transaction termination > perspective. So maybe we can improve it so that it tracks whether any > of the cached connections opened a subtransaction. That is, we set it > true when we created a savepoint on any connections and set it false > at the end of pgfdw_subxact_callback() if we see that xact_depth of > all cached entry is less than or equal to 1 after iterating over all > entries. OK. > Regarding cursor_number, it essentially needs to be unique at least > within a transaction so we can manage it per transaction or per > connection. But the current postgres_fdw rather ensure uniqueness > across all connections. So it seems to me that this can be fixed by > making individual connection have cursor_number and resetting it in > pgfdw_cleanup_after_transaction(). I think this can be in a separate > patch. Maybe, so let's work on this later, at least after we confirm that this change is really necessary. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
pgsql-hackers by date: