parallel mode and parallel contexts - Mailing list pgsql-hackers
From: Robert Haas
Subject: parallel mode and parallel contexts
Msg-id: CA+Tgmob8u=J-D-_5SCXOQ9-ZtK_xCHjwa3M29CxBhU3VPPnezw@mail.gmail.com
List: pgsql-hackers
Attached is a patch that adds two new concepts: parallel mode, and parallel contexts. The idea of this code is to provide a framework for specific parallel things that you want to do, such as parallel sequential scan or parallel sort. When you're in parallel mode, certain operations - like DDL, and anything that would update the command counter - are prohibited. But you gain the ability to create a parallel context, which in turn can be used to fire up parallel workers. And if you do that, then your snapshot, combo CID hash, and GUC values will be copied to the worker, which is handy.

This patch is very much half-baked. Among the things that aren't right yet:

- There's no handling of heavyweight locking, so I'm quite sure it'll be possible to cause undetected deadlocks if you work at it. There are some existing threads on this topic, and perhaps we can incorporate one of those concepts into this patch, but this version does not.

- There's no provision for copying the parent's XID and sub-XIDs, if any, to the background workers, which means that if you use this and your transaction has written data, you will get wrong answers, because TransactionIdIsCurrentTransactionId() will do the wrong thing.

- There's no really deep integration with the transaction system yet. Previous discussions seem to point toward the need to do various types of coordinated cleanup when the parallel phase is done, or when an error happens. In particular, you probably don't want the abort record to be written while there are still backends that are part of that transaction doing work; and you certainly don't want files created by the current transaction to be removed while some other backend is still writing them. The right way to work all of this out needs some deep thought; agreeing on what the design should be is probably harder than implementing it.
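To make the intended usage pattern concrete, here is a rough C sketch of the lifecycle described above: enter parallel mode, create a parallel context, launch workers (which receive the leader's snapshot, combo CID hash, and GUCs), then tear everything down. This is illustrative only, not compilable in isolation; the function names follow the parallel-context API as it eventually appeared in src/include/access/parallel.h, and may not match the attached patch exactly.

```c
#include "postgres.h"
#include "access/parallel.h"

/*
 * Worker entry point, looked up by library and function name.  By the time
 * it runs, the parallel-context machinery has restored the leader's
 * snapshot, combo CID hash, and GUC state in this worker.
 */
void
my_worker_main(dsm_segment *seg, shm_toc *toc)
{
    /* ... do one slice of the parallel computation ... */
}

static void
run_in_parallel(void)
{
    ParallelContext *pcxt;

    EnterParallelMode();    /* DDL, command-counter updates now prohibited */

    /* "my_extension" / nworkers = 2 are placeholders for this example */
    pcxt = CreateParallelContext("my_extension", "my_worker_main", 2);
    InitializeParallelDSM(pcxt);    /* serialize leader state into shmem */
    LaunchParallelWorkers(pcxt);

    /* ... leader may participate in the computation here ... */

    WaitForParallelWorkersToFinish(pcxt);
    DestroyParallelContext(pcxt);
    ExitParallelMode();
}
```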
Despite the above, I think this does a fairly good job of laying out how I believe parallelism can be made to work in PostgreSQL: copy a bunch of state from the user backend to the parallel workers, compute for a while, and then shut everything down. Meanwhile, while parallelism is running, forbid changes to state that's already been synchronized, so that things don't get out of step. I think the patch shows how the act of synchronizing state from the master to the workers can be made quite modular and painless, even though it doesn't yet synchronize everything relevant.

I'd really appreciate any design thoughts anyone may have on how to fix the problems mentioned above, how to fix any other problems you foresee, or even just a list of reasons why you think this will blow up. What I think is that we're really pretty close to doing real parallelism, and that this is probably the last major piece of infrastructure that we need in order to support parallel execution in a reasonable way. That's a pretty bold statement, but I believe it to be true: despite the limitations of the current version of this patch, I think we're very close to being able to sit down and code up a parallel algorithm in PostgreSQL and have that not be all that hard. Once we get the first one, I expect a whole bunch more to come together far more quickly than the first one did.

I would be remiss if I failed to mention that this patch includes work by my colleagues Amit Kapila, Rushabh Lathia, and Jeevan Chalke, as well as my former colleague Noah Misch; and that it would not have been possible without the patient support of EnterpriseDB management.

Thanks,

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company