Re: Using Expanded Objects other than Arrays from plpgsql - Mailing list pgsql-hackers

From Michel Pelletier
Subject Re: Using Expanded Objects other than Arrays from plpgsql
Date
Msg-id CACxu=v+dn37zr8gx5xNP-EZY3OLtGLTHrbx_ZkCQc40HpyMLKA@mail.gmail.com
In response to Re: Using Expanded Objects other than Arrays from plpgsql  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, Oct 24, 2024 at 11:32 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
> ... I'm still writing up
> details, but right now I'm envisioning completely separate sets of
> rules for the prosupport case versus the no-prosupport case.

So here is the design I've come up with for optimizing R/W expanded
object updates in plpgsql without any special knowledge from a
prosupport function.  AFAICS this requires no assumptions at all
about the behavior of called functions, other than the bare minimum
"you can't corrupt the object to the point where it wouldn't be
cleanly free-able".  In particular that means it can work for
user-written called functions in plpgsql, SQL, or whatever, not
only for C-coded functions.

Great. I checked with the upstream library authors, and they confirmed that the object can't be corrupted to the point where it can't be freed.  Since my expanded objects are just a box around a library handle, I use a MemoryContext callback to call the library's free function when the context is cleaned up, and we can't think of a path where that would fail.
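
For anyone following along, here is a rough, self-contained sketch of that cleanup pattern. It is a toy model, not my extension's code and not PostgreSQL's implementation: ToyContext, ToyCallback, LibHandle, lib_new, lib_free and BoxedObject are all made-up names for illustration. In a real extension the corresponding pieces would be a MemoryContextCallback registered with MemoryContextRegisterResetCallback(), and the library's own free function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a memory context: just a list of cleanup callbacks. */
typedef struct ToyCallback
{
    void        (*func) (void *arg);
    void       *arg;
    struct ToyCallback *next;
} ToyCallback;

typedef struct ToyContext
{
    ToyCallback *callbacks;
} ToyContext;

/* Push a callback onto the context's list. */
static void
toy_register_reset_callback(ToyContext *cxt, ToyCallback *cb)
{
    assert(cb->func != NULL);
    cb->next = cxt->callbacks;
    cxt->callbacks = cb;
}

/* Unlink and run every callback, most recently registered first. */
static void
toy_context_reset(ToyContext *cxt)
{
    while (cxt->callbacks != NULL)
    {
        ToyCallback *cb = cxt->callbacks;

        cxt->callbacks = cb->next;
        cb->func(cb->arg);
    }
}

/* Hypothetical external library handle, with a live-handle counter. */
typedef struct LibHandle
{
    int         id;
} LibHandle;

static int  lib_live_handles = 0;

static LibHandle *
lib_new(int id)
{
    LibHandle  *h = malloc(sizeof *h);

    assert(h != NULL);
    h->id = id;
    lib_live_handles++;
    return h;
}

static void
lib_free(LibHandle *h)
{
    free(h);
    lib_live_handles--;
}

/* The "box": a library handle plus the callback that releases it. */
typedef struct BoxedObject
{
    LibHandle  *handle;
    ToyCallback cleanup;
} BoxedObject;

/* Cleanup callback: free the handle once, then forget it. */
static void
boxed_cleanup(void *arg)
{
    BoxedObject *box = arg;

    if (box->handle != NULL)
    {
        lib_free(box->handle);
        box->handle = NULL;
    }
}

/* Create the handle and tie its lifetime to the context. */
static void
box_init(BoxedObject *box, ToyContext *cxt, int id)
{
    box->handle = lib_new(id);
    box->cleanup.func = boxed_cleanup;
    box->cleanup.arg = box;
    toy_register_reset_callback(cxt, &box->cleanup);
}
```

The point of the sketch is only that the handle is released whenever the owning context is cleaned up, whatever state the object was left in, which is the "cleanly free-able" property asked for above.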
 

There are two requirements to apply the optimization:

* If the assignment statement is within a BEGIN ... EXCEPTION block,
its target variable must be declared inside the most-closely-nested
such block.  This ensures that if an error is thrown from within the
assignment statement's expression, we do not care about the value
of the target variable, except to the extent of being able to clean
it up.

My users write algebraic expressions that are evaluated in bulk on GPUs and similar hardware.  I don't think I have to worry too much about them wrapping things in exception blocks while handling my library objects.

<snip>
While I've not tried to write any code yet, I think both of these
conditions should be reasonably easy to verify.

Given that those conditions are met and the current value of the
assignment target variable is a R/W expanded pointer, we can
execute the assignment as follows:

<snip>
So, while this design greatly expands the set of cases we can
optimize, it does lose some cases that the old approach could
support.  I envision addressing that by allowing a prosupport
function attached to the RHS' topmost function to "bless"
other cases as safe, using reasoning similar to the old rules.
(Or different rules, even, but it's on the prosupport function
to be sure it's safe.)  I don't have a detailed design in mind,
but I'm thinking along the lines of just passing the whole RHS
expression to the prosupport function and letting it decide
what's safe.  In any case, we don't need to even call the
prosupport function unless there's an exception block or
multiple RHS references to the target variable.

That all sounds great, and it sounds like my prosupport function would just need to return true, or set some kind of flag saying aliasing is OK.  I'd like to help as much as possible, but some of that reparenting stuff was pretty deep for me.  Other than serving as a quick sanity-check case, is there anything I can do to help?

                        regards, tom lane
