POC: postgres_fdw insert batching - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | POC: postgres_fdw insert batching |
Date | |
Msg-id | 20200628151002.7x5laxwpgvkyiu3q@development Whole thread Raw |
Responses |
Re: POC: postgres_fdw insert batching
Re: POC: postgres_fdw insert batching Re: POC: postgres_fdw insert batching Re: POC: postgres_fdw insert batching |
List | pgsql-hackers |
Hi, One of the issues I'm fairly regularly reminded by users/customers is that inserting into tables sharded using FDWs are rather slow. We do even get it reported on pgsql-bugs from time to time [1]. Some of the slowness / overhead is expected, doe to the latency between machines in the sharded setup. Even just 1ms latency will make it way more expensive than a single instance. But let's do a simple experiment, comparing a hash-partitioned regular partitions, and one with FDW partitions in the same instance. Scripts to run this are attached. The duration of inserting 1M rows to this table (average of 10 runs on my laptop) looks like this: regular: 2872 ms FDW: 64454 ms Yep, it's ~20x slower. On setup with ping latency well below 0.05ms. Imagine how would it look on sharded setups with 0.1ms or 1ms latency, which is probably where most single-DC clusters are :-( Now, the primary reason why the performance degrades like this is that while FDW has batching for SELECT queries (i.e. we read larger chunks of data from the cursors), we don't have that for INSERTs (or other DML). Every time you insert a row, it has to go all the way down into the partition synchronously. For some use cases this may be reduced by having many independent connnections from different users, so the per-user latency is higher but acceptable. But if you need to import larger amounts of data (say, a CSV file for analytics, ...) this may not work. Some time ago I wrote an ugly PoC adding batching, just to see how far would it get us, and it seems quite promising - results for he same INSERT benchmarks look like this: FDW batching: 4584 ms So, rather nice improvement, I'd say ... Before I spend more time hacking on this, I have a couple open questions about the design, restrictions etc. 1) Extend the FDW API? In the patch, the batching is simply "injected" into the existing insert API method, i.e. ExecForeignInsert et al. I wonder if it'd be better to extend the API with a "batched" version of the method, so that we can easily determine whether the FDW supports batching or not - it would require changes in the callers, though. OTOH it might be useful for COPY, where we could do something similar to multi_insert (COPY already benefits from this patch, but it does not use the batching built-into COPY). 2) What about the insert results? I'm not sure what to do about "result" status for the inserted rows. We only really "stash" the rows into a buffer, so we don't know if it will succeed or not. The patch simply assumes it will succeed, but that's clearly wrong, and it may result in reporting a wrong number or rows. The patch also disables the batching when the insert has a RETURNING clause, because there's just a single slot (for the currently inserted row). I suppose a "batching" method would take an array of slots. 3) What about the other DML operations (DELETE/UPDATE)? The other DML operations could probably benefit from the batching too. INSERT was good enough for a PoC, but having batching only for INSERT seems somewhat asmymetric. DELETE/UPDATE seem more complicated because of quals, but likely doable. 3) Should we do batching for COPY insteads? While looking at multi_insert, I've realized it's mostly exactly what the new "batching insert" API function would need to be. But it's only really used in COPY, so I wonder if we should just abandon the idea of batching INSERTs and do batching COPY for FDW tables. For cases that can replace INSERT with COPY this would be enough, but unfortunately it does nothing for DELETE/UPDATE so I'm hesitant to do this :-( 4) Expected consistency? I'm not entirely sure what are the consistency expectations for FDWs. Currently the FDW nodes pointing to the same server share a connection, so the inserted rows might be visible to other nodes. But if we only stash the rows in a local buffer for a while, that's no longer true. So maybe this breaks the consistency expectations? But maybe that's OK - I'm not sure how the prepared statements/cursors affect this. I can imagine restricting the batching only to plans where this is not an issue (single FDW node or something), but it seems rather fragile and undesirable. I was thinking about adding a GUC to enable/disable the batching at some level (global, server, table, ...) but it seems like a bad match because the consistency expectations likely depend on a query. There should be a GUC to set the batch size, though (it's hardcoded to 100 for now). regards [1] https://www.postgresql.org/message-id/CACnz%2BQ1q0%2B2KoJam9LyNMk8JmdC6qYHXWB895Wu2xcpoip18xQ%40mail.gmail.com -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
pgsql-hackers by date: