Re: Parallel Apply - Mailing list pgsql-hackers
| From | Tomas Vondra |
|---|---|
| Subject | Re: Parallel Apply |
| Date | |
| Msg-id | e4442437-6821-424f-a6bd-5ef8d9b5f9e9@vondra.me |
| In response to | RE: Parallel Apply ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>) |
| Responses | RE: Parallel Apply |
| List | pgsql-hackers |
Hello Kuroda-san,

On 11/18/25 12:00, Hayato Kuroda (Fujitsu) wrote:
> Dear Amit,
>
>> It seems you haven't sent the patch that preserves commit order, or the
>> commit message of the attached patch is wrong. I think the first patch
>> in the series should be the one that preserves commit order, and then we
>> can build a patch that tracks dependencies and allows parallelization
>> without preserving commit order.
>
> I think I attached the correct file. Since we are trying to preserve
> the commit order by default, everything was merged into one patch.

I agree the goal should be preserving the commit order, unless someone can
demonstrate (a) clear performance benefits and (b) correctness. It's not
clear to me how that would deal e.g. with crashes where some of the
"future" replicated transactions had committed. Maybe it's fine, not sure.
But keeping the same commit order just makes it easier to reason about the
consistency model, no? So it seems natural to target the same commit order
first, and then maybe explore whether relaxing that would be beneficial
for some cases.

However, the patch seems fairly large (~80kB, although a fair bit of that
is comments). Would it be possible to split it into smaller chunks? Is
there some "minimal patch" which could be moved to 0001, and then be
followed by improvements in 0002, 0003, ...? I sometimes do the
"infrastructure" parts first, with the actual patch in the last part
(simply using the earlier parts). I'm not saying it has to be split (or
how exactly), but I personally find smaller patches easier to review ...

> One point to clarify is that dependency tracking is essential even if we
> fully preserve the commit ordering, so as not to violate constraints
> like a PK. Assuming there is a table which has a PK, txn1 inserts a
> tuple and txn2 updates it. The UPDATE in txn2 must be done after
> committing txn1.

Right. I don't see how we could do parallel apply correctly in the
general case without tracking these dependencies.
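The PK example above can be sketched roughly as follows. This is a hypothetical, simplified model of writeset-based dependency tracking (class and method names are mine, not the patch's): the leader keys each change by (relation, replica identity) and records a dependency whenever an earlier in-flight transaction touched the same key.

```python
# Illustrative sketch only, not code from the patch: a leader-side map
# from (relation, replica identity key) to the last transaction that
# wrote it, used to detect row-level dependencies between transactions.

class DependencyTracker:
    def __init__(self):
        # (relation, identity_key) -> xid of the last in-flight writer
        self.writeset = {}

    def record_change(self, xid, relation, identity_key):
        """Record a change; return the xid it depends on, or None."""
        key = (relation, identity_key)
        dep = self.writeset.get(key)
        self.writeset[key] = xid
        return dep if dep is not None and dep != xid else None

    def transaction_done(self, xid):
        # Once a transaction has committed, its entries can no longer
        # create dependencies; drop them to bound memory usage.
        self.writeset = {k: v for k, v in self.writeset.items()
                         if v != xid}

# txn1 inserts a row, txn2 updates the same row: txn2 must wait for txn1.
tracker = DependencyTracker()
assert tracker.record_change(1, "tab", "pk=5") is None  # txn1 INSERT
assert tracker.record_change(2, "tab", "pk=5") == 1     # txn2 UPDATE
```

This also hints at the memory question below: the map grows with the number of distinct rows touched by all in-flight transactions, which is exactly what makes large batch transactions interesting.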
>> I feel it may be better to just discuss the preserve-commit-order
>> patch, with some comments as to how to extend it further; once that is
>> done, we can discuss the other patch.
>
> I do agree, let me implement them one by one.

Some comments / questions after looking at the patch today:

1) The way the patch determines dependencies seems to be the "writeset"
approach from other replication systems (e.g. MySQL does that). Maybe we
should stick to the same naming?

2) If I understand correctly, the patch maintains a "replica_identity"
hash table, with replica identity keys for all changes of all concurrent
transactions. How expensive can this be, in terms of CPU and memory? What
if I have multiple large batch transactions, each updating millions of
rows?

3) Would it make sense to use some alternative data structure? A bloom
filter, for example. Just a random idea, not sure if that's a good fit.

4) I've seen the benchmarks posted a couple of days ago, and I'm running
some tests myself. But it's hard to say whether the result is good or bad
without knowing what fraction of transactions finds a dependency and has
to wait for an earlier one. Would it be possible to track this somewhere?
Is there a suitable pg_stats_ view?

5) It's not clear to me how you measured the TPS in your benchmark. Did
you measure how long it takes for the standby to catch up, or what did
you do?

6) Did you investigate why the speedup is just ~2.1 with 4 workers, i.e.
about half of the "ideal" speedup? Is it bottlenecked on WAL, on the
leader having to determine dependencies, or on something else?

7) I'm a bit confused about the different types of dependencies, and at
which point they make the workers wait. There are dependencies due to
modifying the same row, in which case the worker waits before starting to
apply the change that hits the dependency. And then there's a dependency
to enforce commit order, in which case it waits before commit. Right? Or
did I get that wrong?
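To make the bloom-filter idea in (3) concrete, here is a rough sketch of what I have in mind; it is purely illustrative and not code from the patch. Each in-flight transaction could keep a small filter over its replica identity keys; checking a new change against the filters of earlier transactions may produce false positives (an unnecessary wait) but never false negatives (a missed dependency), and the memory use is fixed regardless of transaction size.

```python
# Hypothetical sketch: a tiny bloom filter over replica identity keys.
# Parameters (nbits, nhashes) are arbitrary choices for illustration.
import hashlib

class BloomFilter:
    def __init__(self, nbits=1024, nhashes=3):
        self.nbits = nbits
        self.nhashes = nhashes
        self.bits = 0  # fixed-size bitmap, independent of txn size

    def _positions(self, item):
        # Derive nhashes bit positions from a salted hash of the key.
        for i in range(self.nhashes):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.nbits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False positives possible; false negatives are not.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

# txn1's writeset; a later change to the same row is always flagged.
txn1 = BloomFilter()
txn1.add(("tab", "pk=5"))
assert txn1.might_contain(("tab", "pk=5"))
```

The trade-off versus the hash table is that a filter cannot be queried for *which* transaction wrote the key, only that *some* tracked transaction may have, so it would likely complement rather than replace the exact structure.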
8) The commit message says:

> It would be a challenge to check the dependency if the table has a
> user-defined trigger or constraints. The most viable solution might be
> to disallow parallel apply for relations whose triggers and constraints
> are not marked as parallel-safe or immutable.

Wouldn't this have similar issues with verifying these features on
partitioned tables as the patch that attempted to allow parallelism for
INSERT ... SELECT [1]? AFAICS it was too expensive to do with large
partitioning hierarchies.

9) I think it'd be good to make sure the "design" comments explain how
the new parts work in more detail. For example, the existing comment at
the beginning of applyparallelworker.c goes into a lot of detail, but the
patch adds only two fairly short paragraphs. Even the commit message has
more detail, which seems a bit strange.

10) For example, it would be good to explain what "internal dependency"
and "internal relation" are for. I think I understand the internal
dependency; I'm still not quite sure why we need the internal relation
(or rather, why we didn't need it before).

11) I think it might be good to have TAP tests that stress this in
various ways. Say, a test that randomly restarts the standby during
parallel apply and checks that it does not miss any records, etc. In the
online checksums patch this was quite useful. It wouldn't be part of
regular check-world, of course. Or maybe it'd be for development only?

regards

[1] https://www.postgresql.org/message-id/flat/E1lJoQ6-0005BJ-DY%40gemulon.postgresql.org

--
Tomas Vondra