wCTE: why not finish sub-updates at the end, not the beginning? - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | wCTE: why not finish sub-updates at the end, not the beginning? |
Date | |
Msg-id | 20848.1298645916@sss.pgh.pa.us |
Responses | Re: wCTE: why not finish sub-updates at the end, not the beginning? |
List | pgsql-hackers |
I had what seems to me a remarkably good idea, though maybe someone else can spot a problem with it. Given that we've decided to run the modifying sub-queries all with the same command counter ID, they are logically executing "in parallel". The current implementation takes no advantage of that fact, though: it's based around the idea of running the updates strictly sequentially. I think we should change it so that the updates happen physically, not only logically, concurrently.

Specifically, I'm imagining getting rid of the patch's additions to InitPlan and ExecutePlan that find all the modifying sub-queries and force them to be cycled to completion before the main plan runs. Just run the main plan and let it pull tuples from the CTEs as needed. Then, in ExecutorEnd, cycle any unfinished ModifyTable nodes to completion before shutting down the plan. (In the event of an error, we'd never get to ExecutorEnd, but it doesn't matter since whatever updates we did apply are nullified anyhow.)

This has a number of immediate and future implementation benefits:

1. RETURNING tuples that aren't actually needed by the main plan don't need to be buffered anywhere. (ExecutorEnd would just pull directly from the ModifyTable nodes, ignoring their parent CTE nodes, in all cases.)

2. In principle, in many common cases the RETURNING tuples wouldn't have to be buffered at all, but could be consumed on-the-fly. I think that right now the CTEScan nodes might still buffer the tuples so they can regurgitate them in case of being rescanned, but it's not hard to see how that could be improved later if it doesn't work immediately.

3. The code could be significantly simpler. Instead of that rather complex and fragile logic in InitPlan to try to locate all the ModifyTable nodes and their CTEScan parents, we could just have ModifyTable nodes add themselves to a list in the EState during ExecInitNode. Then ExecutorEnd just traverses that list.
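To make the register-then-drain idea in point 3 concrete, here is a minimal toy sketch in C. It is not PostgreSQL executor code: the types ToyEState and ToyModifyTable and the toy_* functions are hypothetical stand-ins invented for illustration, and a "node" here is just a counter of rows still to be updated. It only shows the shape of the proposal: each modifying node adds itself to a list at init time, the main plan pulls from it on demand, and the end-of-execution step runs whatever is left to completion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_MODIFY_NODES 8

/* Hypothetical stand-in for a ModifyTable node: it "updates" one row
 * per call until rows_total rows have been processed. */
typedef struct ToyModifyTable
{
    const char *name;
    int         rows_total;
    int         rows_done;
} ToyModifyTable;

/* Hypothetical stand-in for the EState: it only holds the list of
 * modifying nodes that registered themselves during initialization. */
typedef struct ToyEState
{
    ToyModifyTable *modify_nodes[TOY_MAX_MODIFY_NODES];
    size_t          n_modify_nodes;
} ToyEState;

/* Init step: the node adds itself to the list, so nothing has to
 * search the plan tree for modifying nodes later. */
static void
toy_init_modify_table(ToyEState *estate, ToyModifyTable *node,
                      const char *name, int rows_total)
{
    assert(estate->n_modify_nodes < TOY_MAX_MODIFY_NODES);
    node->name = name;
    node->rows_total = rows_total;
    node->rows_done = 0;
    estate->modify_nodes[estate->n_modify_nodes++] = node;
}

/* Pull one RETURNING tuple: performs one row update and returns true,
 * or returns false once the node is exhausted. */
static bool
toy_exec_modify_table(ToyModifyTable *node)
{
    if (node->rows_done >= node->rows_total)
        return false;
    node->rows_done++;
    return true;
}

/* End-of-execution step: run every registered node to completion,
 * discarding its output. Returns how many rows were processed here
 * rather than on demand by the main plan. */
static int
toy_executor_end(ToyEState *estate)
{
    int drained = 0;

    for (size_t i = 0; i < estate->n_modify_nodes; i++)
    {
        while (toy_exec_modify_table(estate->modify_nodes[i]))
            drained++;
    }
    return drained;
}
```

In this toy model, if the main plan pulls only two tuples from a five-row node and none from a three-row node, toy_executor_end processes the remaining six rows, so every update is still applied even though nobody asked for its RETURNING output.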
However, the real reason for doing it isn't any of those, but rather to establish the principle that the executions of the modifying sub-queries are interleaved, not sequential. We're never going to be able to do any significant optimization of such queries if we have to preserve the behavior that the sub-queries execute sequentially. And I think it's inevitable that users will manage to build such an assumption into their queries if the first release with the feature behaves that way.

Comments?

regards, tom lane