From Tomas Vondra
Subject Re: POC: postgres_fdw insert batching
Msg-id 71113ab1-ed31-3dbc-ce07-e081e17473b6@enterprisedb.com
On 11/19/20 3:43 AM, tsunakawa.takay@fujitsu.com wrote:
> From: Tomas Vondra <tomas.vondra@enterprisedb.com>
>> Unfortunately, this does not compile for me, because
>> nodeModifyTable calls ExecGetTouchedPartitions, which is not
>> defined anywhere. Not sure what that's about, so I simply
>> commented it out. That probably breaks the partitioned cases, but
>> it allowed me to do some review and testing.
> 
> Ouch, sorry.  I'm ashamed to have forgotten to include
> execPartition.c.
> 

No reason to feel ashamed. Mistakes do happen from time to time.

> 
>> There are a couple of other smaller changes. E.g. it undoes changes
>> to finish_foreign_modify, and instead calls separate functions to
>> prepare the bulk statement. It also adds list_make5/list_make6
>> macros, so as to not have to do strange stuff with the parameter
>> lists.
> 
> Thanks, I'll gladly take them!  I wonder why I didn't think of
> separating deallocate_query() from finish_foreign_modify() ...
> perhaps my brain was dying.  As for list_make5/6(), I saw your first
> patch avoided adding them, so I thought you found them ugly (and I
> felt so, too.)  But thinking about it now, there's no reason to
> hesitate.
> 

I think it's often easier to look at changes like deallocate_query with
a bit of distance, not while hacking on the patch and just trying to
make it work somehow.

For the list_make# stuff, I think I decided to do the simplest thing
possible in the extension, without having to recompile the server. But
for a proper patch I think it's better to keep it more readable.
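
FWIW a minimal way to define those macros is to just build on lappend()
and the existing list_make4() - a sketch only, the actual patch may
define them differently (e.g. through a list_make5_impl, like the
existing macros in pg_list.h):

    #include "nodes/pg_list.h"

    /* sketch: extend the list one element at a time */
    #define list_make5(x1, x2, x3, x4, x5) \
        lappend(list_make4(x1, x2, x3, x4), x5)
    #define list_make6(x1, x2, x3, x4, x5, x6) \
        lappend(list_make5(x1, x2, x3, x4, x5), x6)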

> ...
> 
>> 1) As I mentioned before, I really don't think we should be doing
>> deparsing in execute_foreign_modify - that's something that should
>> happen earlier, and should be in a deparse.c function.
> ...
>> The attached patch tries to address both of these points.
>> 
>> Firstly, it adds a new deparseBulkInsertSql function that builds a
>> query for the "full" batch, and then uses those two queries - when
>> we get a full batch we use the bulk query, otherwise we use the
>> single-row query in a loop. IMO this is cleaner than deparsing
>> queries ad hoc in execute_foreign_modify.
> ...
>> Of course, this might be worse when we don't have a full batch,
>> e.g. for a query that inserts only 50 rows with batch_size=100. If
>> this case is common, one option would be lowering the batch_size
>> accordingly. If we really want to improve this case too, I suggest
>> we pass more info than just a position of the VALUES clause - that
>> seems a bit too hackish.
> ...
>> Secondly, it adds the batch_size option to server/foreign table,
>> and uses that. This is not complete, though.
>> postgresPlanForeignModify currently passes a hard-coded value; it
>> needs to look up the correct value for the server/table from
>> RelOptInfo or something. And I suppose the ModifyTable
>> infrastructure will need to determine the value in order to pass
>> the correct number of slots to the FDW API.
> 
> I can sort of understand your feeling, but I'd like to reconstruct
> the query and prepare it in execute_foreign_modify() because:
> 
> * Some of our customers use bulk insert in ECPG (INSERT ...
> VALUES (record1), (record2), ...) to insert a variable number of
> records per query.  (Oracle's Pro*C has such a feature.)  So, I want
> to be prepared to enable such a thing with FDW.
> 
> * The number of records to insert is not known during planning (in
> general), so it feels natural to prepare the query during the
> execution phase, or at least not unnatural.
> 

I think we should differentiate between "deparsing" and "preparing".
Deparsing builds the query string, which needs only catalog information
and so can happen during planning. Preparing creates the statement on
the remote server, which requires the live connection and thus has to
happen during execution.
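
To illustrate the difference (a made-up fragment, not actual
postgres_fdw code):

    /* deparsing: build the SQL text - needs only catalog
     * information, so it could happen at plan time */
    StringInfoData sql;

    initStringInfo(&sql);
    appendStringInfoString(&sql, "INSERT INTO t (a, b) VALUES ($1, $2)");

    /* preparing: create the named statement on the remote server -
     * needs the live connection, so it must happen at execution time */
    PGresult *res = PQprepare(conn, "pgsql_fdw_prep_1", sql.data, 0, NULL);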

> * I wanted to avoid the overhead of building the full query string
> for 100-record insert statement during query planning, because it may
> be a bit costly for usual 1-record inserts.  (The overhead may be
> hidden behind the high communication cost of postgres_fdw, though.)
> 

Hmm, OK. I haven't measured how expensive that would be, but my
assumption was that it's much cheaper than the network latency we save.
But maybe I'm wrong.

> So, in terms of code cleanness, how about moving my code for
> rebuilding query string from execute_foreign_modify() to some new
> function in deparse.c?
> 

That might work, yeah. I suggest we do this:

1) try to use the same approach for both single-row inserts and larger
batches, to not have a lot of different branches

2) modify deparseInsertSql to produce not the "final" query but some
intermediate representation useful for generating queries inserting an
arbitrary number of rows

3) in execute_foreign_modify remember the last number of rows, and only
rebuild/replan the query when it changes (see the sketch below)
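
For (3), roughly something like this - just a sketch, where
last_num_slots and rebuild_insert_query() are hypothetical names, while
the other identifiers exist in postgres_fdw / in the patch:

    static void
    prepare_foreign_modify_batch(PgFdwModifyState *fmstate, int numSlots)
    {
        /* only re-deparse/re-prepare when the batch size changes */
        if (fmstate->p_name == NULL || fmstate->last_num_slots != numSlots)
        {
            /* drop the remote prepared statement, if we have one */
            if (fmstate->p_name)
                deallocate_query(fmstate);

            /* build the INSERT for numSlots rows from the intermediate
             * representation produced by deparseInsertSql */
            fmstate->query = rebuild_insert_query(fmstate, numSlots);

            /* PREPARE it on the remote server */
            prepare_foreign_modify(fmstate);

            fmstate->last_num_slots = numSlots;
        }
    }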

> 
>> 2) I think the GUC should be replaced with an server/table option,
>> similar to fetch_size.
> 
> Hmm, batch_size differs from fetch_size.  fetch_size is a
> postgres_fdw-specific feature with no relevant FDW routine, while
> batch_size is a configuration parameter for all FDWs that implement
> ExecForeignBulkInsert().  The ideas I can think of are:
> 
> 1. Follow JDBC/ODBC and add standard FDW properties.  For example,
> the JDBC standard defines standard connection pool properties such as
> maxPoolSize and minPoolSize.  JDBC drivers have to provide them under
> those defined names.  Likewise, the FDW interface requires FDW
> implementors to handle the foreign server option name
> "max_bulk_insert_tuples" if they want to provide the bulk insert
> feature and implement ExecForeignBulkInsert().  The core executor
> gets that setting from the FDW by calling a new FDW routine like
> GetMaxBulkInsertTuples().  Sigh...
> 
> 2. Add a new max_bulk_insert_tuples reloption to CREATE/ALTER FOREIGN
> TABLE.  The executor gets the value from the Relation and uses it.
> (But is this a table-specific configuration?  I don't think so,
> sigh...)
> 

I do agree there's a difference between fetch_size and batch_size. For
fetch_size, it's internal to postgres_fdw - no external code needs to
know about it. For batch_size that's not the case; the core ModifyTable
code needs to be aware of it.

That means the "batch_size" is becoming part of the API, and IMO the way
to do that is by exposing it as an explicit API method. So +1 to add
something like GetMaxBulkInsertTuples.

It still needs to be configurable at the server/table level, though. The
new API method should only inform ModifyTable about the final max batch
size the FDW decided to use.
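
Something along these lines, perhaps (a sketch - the callback signature
and the get_batch_size_option() helper are guesses, not a worked-out
design):

    /* hypothetical addition to FdwRoutine in foreign/fdwapi.h */
    typedef int (*GetMaxBulkInsertTuples_function) (ResultRelInfo *rinfo);

    /* postgres_fdw: report the effective batch size, resolved from
     * the foreign table / server options */
    static int
    postgresGetMaxBulkInsertTuples(ResultRelInfo *rinfo)
    {
        return get_batch_size_option(RelationGetRelid(rinfo->ri_RelationDesc));
    }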

> 3. Adopt the current USERSET GUC max_bulk_insert_tuples.  I think
> this is enough because the user can change the setting per session,
> application, and database.
> 

I don't think this is usable in practice, because a single session may
be using multiple FDW servers, with different implementations, latency
to the data nodes, etc. It's unlikely a single GUC value will be
suitable for all of them.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


