Re: Using Expanded Objects other than Arrays from plpgsql - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Using Expanded Objects other than Arrays from plpgsql |
Date | |
Msg-id | 2581216.1729794746@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Using Expanded Objects other than Arrays from plpgsql (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
I wrote: > ... I'm still writing up > details, but right now I'm envisioning completely separate sets of > rules for the prosupport case versus the no-prosupport case. So here is the design I've come up with for optimizing R/W expanded object updates in plpgsql without any special knowledge from a prosupport function. AFAICS this requires no assumptions at all about the behavior of called functions, other than the bare minimum "you can't corrupt the object to the point where it wouldn't be cleanly free-able". In particular that means it can work for user-written called functions in plpgsql, SQL, or whatever, not only for C-coded functions. There are two requirements to apply the optimization: * If the assignment statement is within a BEGIN ... EXCEPTION block, its target variable must be declared inside the most-closely-nested such block. This ensures that if an error is thrown from within the assignment statement's expression, we do not care about the value of the target variable, except to the extent of being able to clean it up. * The target variable must be referenced exactly once within the RHS expression. This avoids aliasing hazards such as we discussed upthread. But unlike the existing rule, that reference can be anywhere in the RHS --- it doesn't have to be an argument of the topmost function. So for example we can optimize foo := fee(fi(fo(fum(foo), other_variable), ...)); While I've not tried to write any code yet, I think both of these conditions should be reasonably easy to verify. Given that those conditions are met and the current value of the assignment target variable is a R/W expanded pointer, we can execute the assignment as follows: * Provide the R/W expanded pointer as the value of the Param representing the sole reference. At the same time, apply TransferExpandedObject to reparent the object under the transient eval_mcontext memory context that's being used to evaluate the RHS expression, and then set the target variable's value to NULL. (At this point, the R/W object has exactly the same logical status as any intermediate calculation result that's palloc'd in the eval_mcontext.) * At successful completion of the RHS, assign the result to the target variable normally. This includes, if it's an R/W expanded object, reparenting it under the calling function's main context. If the RHS expression winds up returning the same expanded object (probably mutated), then the outer function regains ownership of it after no more than a couple of TransferExpandedObject calls, which are cheap. If the RHS returns something different, then either the original expanded object got discarded during RHS evaluation, or it will be cleaned up when we reset the eval_mcontext, so that it won't be leaked. I didn't originally foresee the need to transfer the object into the transient memory context. But this design avoids any possibility of double-free attempts, which would be likely to happen if we allow the outer function's variable to hold onto a reference to the object while the RHS is evaluated. A function receiving an R/W reference is entitled to assume that that value is not otherwise referenced and can be freed when it's no longer needed, so it might well get freed during RHS evaluation. By converting the original R/W object into (effectively) a temporary value within the RHS evaluation, we make that assumption valid. So, while this design greatly expands the set of cases we can optimize, it does lose some cases that the old approach could support. I envision addressing that by allowing a prosupport function attached to the RHS' topmost function to "bless" other cases as safe, using reasoning similar to the old rules. (Or different rules, even, but it's on the prosupport function to be sure it's safe.) I don't have a detailed design in mind, but I'm thinking along the lines of just passing the whole RHS expression to the prosupport function and letting it decide what's safe. In any case, we don't need to even call the prosupport function unless there's an exception block or multiple RHS references to the target variable. regards, tom lane
pgsql-hackers by date: