Detection of nested function calls - Mailing list pgsql-hackers

From Hugo Mercier
Subject Detection of nested function calls
Msg-id 526A61FB.1050209@oslandia.com
Responses Re: Detection of nested function calls  (Pavel Stehule <pavel.stehule@gmail.com>)
Re: Detection of nested function calls  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi all,

The Oslandia team has been involved in the PostGIS project for years,
with a current focus on PostGIS 3D support.
In PostGIS queries, nested function calls that manipulate geometries
are quite common, e.g.: SELECT ST_Union(ST_Intersection(a.geom,
ST_Buffer(b.geom, 50)))

PostGIS functions that manipulate geometries have to unserialize their
input geometries from the 'flat' varlena representation into their own
internal representation, and serialize the processed geometries back
when returning.
But in such nested calls, the result of an inner function is serialized
only to be unserialized again by the outer function, so this round trip
is pure overhead.

Avoiding it could lead to a real performance gain [1], especially when
the internal type takes time to serialize (which matters a lot for newer
PostGIS types such as rasters or 3D geometries).

So we thought that giving user functions a way to know whether they are
part of a nested call could allow them to skip this serialization phase.

The idea would be to have a boolean flag reachable from a user function
(within FunctionCallInfoData) that says whether the current function
call is nested or not.
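
To make the idea concrete, here is a minimal, self-contained sketch in
plain C. It is a toy model, not PostgreSQL or PostGIS code: CallInfo
stands in for FunctionCallInfoData, and Geom, Value, geom_arg,
geom_return and translate_model are hypothetical names invented for
illustration. It also assumes that a value passed between functions
carries a tag saying whether it is flat, which the proposal itself does
not specify.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for FunctionCallInfoData: only the proposed flag. */
typedef struct CallInfo {
    bool nested;            /* true when the result feeds another function */
} CallInfo;

/* Toy internal geometry: a heap object that must be flattened to be stored. */
typedef struct Geom {
    size_t npoints;
    double *points;
} Geom;

/* Assumption: a value is tagged as either a flat buffer or a live Geom. */
typedef struct Value {
    bool is_flat;
    size_t flat_len;
    void *ptr;              /* flat bytes if is_flat, otherwise a Geom* */
} Value;

int serialize_calls = 0;    /* counters, only there to observe the saving */
int unserialize_calls = 0;

/* Flatten a Geom into one contiguous buffer: [npoints][points...]. */
Value geom_serialize(const Geom *g)
{
    size_t bytes = sizeof(size_t) + g->npoints * sizeof(double);
    char *buf = malloc(bytes);
    Value v = { true, bytes, buf };

    assert(buf != NULL && g->npoints > 0);
    memcpy(buf, &g->npoints, sizeof(size_t));
    memcpy(buf + sizeof(size_t), g->points, g->npoints * sizeof(double));
    serialize_calls++;
    return v;
}

/* Rebuild a Geom from its flat form. */
Geom *geom_unserialize(const Value *v)
{
    const char *buf = v->ptr;
    Geom *g = malloc(sizeof(Geom));

    assert(g != NULL && v->is_flat);
    memcpy(&g->npoints, buf, sizeof(size_t));
    g->points = malloc(g->npoints * sizeof(double));
    assert(g->points != NULL);
    memcpy(g->points, buf + sizeof(size_t), g->npoints * sizeof(double));
    unserialize_calls++;
    return g;
}

/* Fetch an argument: reuse the live object if the inner call passed one. */
Geom *geom_arg(const Value *v)
{
    return v->is_flat ? geom_unserialize(v) : (Geom *) v->ptr;
}

/* Return a result: skip serialization when the call is marked as nested. */
Value geom_return(const CallInfo *ci, Geom *g)
{
    if (ci->nested) {
        Value v = { false, 0, g };
        return v;
    }
    return geom_serialize(g);
}

/* Model of a geometry function: adds dx to every coordinate. */
Value translate_model(const CallInfo *ci, Value in, double dx)
{
    Geom *g = geom_arg(&in);

    for (size_t i = 0; i < g->npoints; i++)
        g->points[i] += dx;
    return geom_return(ci, g);
}
```

With an inner call marked as nested and an outer call that is not, a
two-level chain such as translate_model(&outer, translate_model(&inner,
flat, 1.0), 1.0) unserializes once on entry and serializes once on
exit, instead of doing both at every level. Memory management is
ignored in this sketch.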

We have already investigated such a modification, and here is where we
are so far:
  - we modified the parser, adding a new boolean member 'nested' to the
FuncExpr struct. Within the parser, we know whether a function call is
nested inside another one, so we can mark the FuncExpr accordingly
  - the executor has been modified to take this nested member into
account and pass it to the FunctionCallInfoData structure before
evaluating the function
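
The two steps above can be sketched as follows. Again, this is a
simplified stand-alone model in plain C and not the attached patch:
Expr stands in for the parser's expression nodes, CallInfo for
FunctionCallInfoData, and mark_nested and make_call_info are
hypothetical helper names. The model treats every call as a plain
function node and defines "nested" as "direct argument of another
function call".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ARGS 4

typedef enum { NODE_COLUMN, NODE_CONST, NODE_FUNC } NodeKind;

/* Simplified expression node; 'nested' models the proposed FuncExpr member. */
typedef struct Expr {
    NodeKind kind;
    const char *name;
    bool nested;
    size_t nargs;
    struct Expr *args[MAX_ARGS];
} Expr;

/* Hypothetical stand-in for FunctionCallInfoData. */
typedef struct CallInfo {
    bool nested;
} CallInfo;

/*
 * "Parser" step: walk the tree and mark each function node whose direct
 * parent is itself a function call. Call with parent_is_func = false on
 * the root expression.
 */
void mark_nested(Expr *e, bool parent_is_func)
{
    assert(e != NULL);
    if (e->kind != NODE_FUNC)
        return;                 /* columns and constants are never marked */
    e->nested = parent_is_func;
    for (size_t i = 0; i < e->nargs; i++)
        mark_nested(e->args[i], true);
}

/* "Executor" step: copy the flag into the per-call structure. */
CallInfo make_call_info(const Expr *e)
{
    CallInfo ci = { e->nested };
    return ci;
}
```

Applied to the example query, ST_Union(ST_Intersection(a.geom,
ST_Buffer(b.geom, 50))), this marks ST_Intersection and ST_Buffer as
nested and leaves the outermost ST_Union unmarked, so only the
outermost call would have to return a serialized result.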

We are working on a PostGIS branch that takes advantage of this
functionality [2].

A first draft of the patch is attached.

Obviously, even though this is about a PostGIS use case, the change
could be helpful for any other query that uses both nested functions and
serialization.

I am quite new to PostgreSQL hacking, so I'm sure there is room for
improvement. What do you think of this first proposal?

I'll be at the PGDay conf in Dublin next week, so we could discuss this
topic.

[1] Regarding performance, we have already investigated such a
"pass-by-reference" mechanism with PostGIS. A dummy function "st_copy"
that only copies its input geometry to its output, called with 4
levels of nesting, gives encouraging results (passing geometries by
reference is more than 2x faster than (un)serializing):
https://github.com/Oslandia/sfcgal-tests/blob/master/bench/report_serialization_referenced_vs_native.pdf

[2] https://github.com/Oslandia/postgis/tree/nested_ref_passing
--
Hugo Mercier
Oslandia
