Thread: SQL functions not being inlined

SQL functions not being inlined

From
Evan Martin
Date:
Some of my functions are running much slower than doing the same query
"inline" and I'd like to know if there's a way to fix that.

I have a number of tables that store data valid at different times. For
each logical entity there may be multiple rows, valid at different times
(sometimes overlapping). Each such table has valid_time_begin and
valid_time_end columns that specify when that data is valid, eg.

CREATE TABLE thing_timeslice
(
   timeslice_id serial NOT NULL,
   thing_id integer NOT NULL,
   valid_time_begin timestamp NOT NULL,
   valid_time_end timestamp NOT NULL,
   ... other columns ...
   CONSTRAINT pk_thing_timeslice PRIMARY KEY (timeslice_id),
   CONSTRAINT fk_thing_timeslice_thing FOREIGN KEY (thing_id) REFERENCES
thing (thing_id)
)
CREATE INDEX ix_thing_timeslice_thing_id ON thing_timeslice (thing_id);

To simplify querying for data valid at a given time I've created
functions like this:

CREATE OR REPLACE FUNCTION thing_asof(effective_time timestamp)
RETURNS SETOF thing_timeslice
AS $BODY$
     SELECT *
     FROM thing_timeslice
     WHERE ($1, '0'::interval) OVERLAPS (valid_time_begin,
COALESCE(valid_time_end, 'infinity'::timestamp))
$BODY$ LANGUAGE SQL STABLE;

I then select from these functions as though they were tables, often
joining them, eg.

SELECT *
FROM thing_asof('2012-05-01') a
JOIN another_thing_asof('2012-05-01') b ON a.thing_id =
b.referenced_thing_id

This is quite slow, especially when I have a WHERE clause that narrows
down the set of rows from 100,000 to 10 or so. I suspect it's evaluating
the function first and filtering it afterwards and perhaps not using the
index, either. If I manually "inline" my functions the queries get 10x
faster! Eg.

SELECT *
FROM
(
     SELECT *
     FROM thing_timeslice
     WHERE ('2012-05-01', '0'::interval) OVERLAPS (valid_time_begin,
COALESCE(valid_time_end, 'infinity'::timestamp))
) a
JOIN ...

My question is: why is PostgreSQL not doing this inlining automatically?
Are there some gotchas I need to be aware of or is it just not
supported? I'm running 9.1.3.

Regards,

Evan

Re: SQL functions not being inlined

From
Chris Angelico
Date:
On Wed, May 2, 2012 at 12:43 PM, Evan Martin
<postgresql@realityexists.net> wrote:
> Some of my functions are running much slower than doing the same query
> "inline" and I'd like to know if there's a way to fix that.
> [chomp analysis and examples]

Is there any possibility that you could recode your functions as
views? The optimizer can push predicates down into views but not, it
would appear, into functions.

In the specific example given, I would be looking, not for an
OVERLAPS, but a simple comparison:

SELECT * FROM thing_timeslice
WHERE '2012-05-01' BETWEEN valid_time_begin AND
COALESCE(valid_time_end, 'infinity'::timestamp)

That may make it easier for the optimizer.

ChrisA

Re: SQL functions not being inlined

From
Tom Lane
Date:
Evan Martin <postgresql@realityexists.net> writes:
> Some of my functions are running much slower than doing the same query
> "inline" and I'd like to know if there's a way to fix that. ...
> This is quite slow, especially when I have a WHERE clause that narrows
> down the set of rows from 100,000 to 10 or so. I suspect it's evaluating
> the function first and filtering it afterwards and perhaps not using the
> index, either.

Well, rather than suspecting, you should use EXPLAIN to find out whether
the functions are inlined or not.  The particular example you give here
seems inlinable to me, but maybe some of your other cases aren't.

I concur with the other respondent that OVERLAPS is not the most
efficient way to deal with the sort of condition you have here, either.
Postgres basically doesn't know how to optimize OVERLAPS at all, whereas
the planner has heard of BETWEEN and other simple-comparison constructs.

            regards, tom lane

Re: SQL functions not being inlined

From
Evan Martin
Date:
Thanks, Tom (and Chris). Yes, the EXPLAIN output showed a function scan:

SELECT * FROM thing_asof('2012-04-01') WHERE timeslice_id = 1234

Function Scan on thing_asof  (cost=0.25..12.75 rows=5 width=353)
   Filter: ((timeslice_id)::integer = 12345)

I replaced the OVERLAPS with < and <= comparisons (since I want the end
time to be exclusive), so the functions now look like this:

CREATE OR REPLACE FUNCTION thing_asof(effective_time timestamp)
RETURNS SETOF thing_timeslice
AS $BODY$
     SELECT *
     FROM thing_timeslice
     WHERE valid_time_begin <= $1 AND (valid_time_end IS NULL OR $1 <
valid_time_end)
$BODY$ LANGUAGE SQL STABLE;

This worked... at first. I did some simple queries and they showed the
function being inlined (index scan on primary key, seq scan - no
function scan). Very happy with that, I tried changing some other
functions (that depend on these) and then found that the _asof functions
are not being inlined anymore! I swear, I'm not making this up. Nothing
changed in those functions. Same simple query. It was inlined before and
now it's not. I've dropped and re-created the functions, did an ANALYZE,
even restarted PostgreSQL - they're not inlined any more. I really don't
know what to think!

Regards,

Evan


On 2/05/2012 1:44 PM, Tom Lane wrote:
> Evan Martin<postgresql@realityexists.net>  writes:
>> Some of my functions are running much slower than doing the same query
>> "inline" and I'd like to know if there's a way to fix that. ...
>> This is quite slow, especially when I have a WHERE clause that narrows
>> down the set of rows from 100,000 to 10 or so. I suspect it's evaluating
>> the function first and filtering it afterwards and perhaps not using the
>> index, either.
> Well, rather than suspecting, you should use EXPLAIN to find out whether
> the functions are inlined or not.  The particular example you give here
> seems inlinable to me, but maybe some of your other cases aren't.
>
> I concur with the other respondent that OVERLAPS is not the most
> efficient way to deal with the sort of condition you have here, either.
> Postgres basically doesn't know how to optimize OVERLAPS at all, whereas
> the planner has heard of BETWEEN and other simple-comparison constructs.
>
>             regards, tom lane
>


Re: SQL functions not being inlined

From
Evan Martin
Date:
OK, it's now changed back again! I suspended the virtual machine in
which PostgreSQL runs, later resumed it, did some unrelated queries
(SELECT only) and then ran the exact same query as before. It now
returns in 15 ms and uses the index. Here's the query plan:

Index Scan using pk_thing_timeslice on thing_timeslice  (cost=0.00..8.32
rows=1 width=382) (actual time=0.058..0.060 rows=1 loops=1)
   Index Cond: ((timeslice_id)::integer = 12345)
   Filter: (((valid_time_begin)::timestamp without time zone <=
'2012-04-01 00:00:00'::timestamp without time zone) AND ((valid_time_end
IS NULL) OR ('2012-04-01 00:00:00'::timestamp without time zone <
(valid_time_end)::timestamp without time zone)))

Any idea what's going on? It seems like the query optimizer randomly
changes its mind about inlining it.

Regards,

Evan

On 2/05/2012 3:34 PM, Evan Martin wrote:
> Thanks, Tom (and Chris). Yes, the EXPLAIN output showed a function scan:
>
> SELECT * FROM thing_asof('2012-04-01') WHERE timeslice_id = 1234
>
> Function Scan on thing_asof  (cost=0.25..12.75 rows=5 width=353)
>   Filter: ((timeslice_id)::integer = 12345)
>
> I replaced the OVERLAPS with < and <= comparisons (since I want the
> end time to be exclusive), so the functions now look like this:
>
> CREATE OR REPLACE FUNCTION thing_asof(effective_time timestamp)
> RETURNS SETOF thing_timeslice
> AS $BODY$
>     SELECT *
>     FROM thing_timeslice
>     WHERE valid_time_begin <= $1 AND (valid_time_end IS NULL OR $1 <
> valid_time_end)
> $BODY$ LANGUAGE SQL STABLE;
>
> This worked... at first. I did some simple queries and they showed the
> function being inlined (index scan on primary key, seq scan - no
> function scan). Very happy with that, I tried changing some other
> functions (that depend on these) and then found that the _asof
> functions are not being inlined anymore! I swear, I'm not making this
> up. Nothing changed in those functions. Same simple query. It was
> inlined before and now it's not. I've dropped and re-created the
> functions, did an ANALYZE, even restarted PostgreSQL - they're not
> inlined any more. I really don't know what to think!
>
> Regards,
>
> Evan
>
>
> On 2/05/2012 1:44 PM, Tom Lane wrote:
>> Evan Martin<postgresql@realityexists.net>  writes:
>>> Some of my functions are running much slower than doing the same query
>>> "inline" and I'd like to know if there's a way to fix that. ...
>>> This is quite slow, especially when I have a WHERE clause that narrows
>>> down the set of rows from 100,000 to 10 or so. I suspect it's
>>> evaluating
>>> the function first and filtering it afterwards and perhaps not using
>>> the
>>> index, either.
>> Well, rather than suspecting, you should use EXPLAIN to find out whether
>> the functions are inlined or not.  The particular example you give here
>> seems inlinable to me, but maybe some of your other cases aren't.
>>
>> I concur with the other respondent that OVERLAPS is not the most
>> efficient way to deal with the sort of condition you have here, either.
>> Postgres basically doesn't know how to optimize OVERLAPS at all, whereas
>> the planner has heard of BETWEEN and other simple-comparison constructs.
>>
>>             regards, tom lane
>>
>


Re: SQL functions not being inlined

From
Tom Lane
Date:
Evan Martin <postgresql@realityexists.net> writes:
> This worked... at first. I did some simple queries and they showed the
> function being inlined (index scan on primary key, seq scan - no
> function scan). Very happy with that, I tried changing some other
> functions (that depend on these) and then found that the _asof functions
> are not being inlined anymore! I swear, I'm not making this up. Nothing
> changed in those functions. Same simple query. It was inlined before and
> now it's not. I've dropped and re-created the functions, did an ANALYZE,
> even restarted PostgreSQL - they're not inlined any more. I really don't
> know what to think!

[ squint... ]  There are a lot of undocumented restrictions on inlining
in inline_set_returning_function(), but AFAICS none of them are
nondeterministic, nor would the decision depend on anything outside
the function and its arguments.  Can you provide a concrete test case?

            regards, tom lane

Re: SQL functions not being inlined

From
Evan Martin
Date:
Thanks, I went into that function, added log statements everywhere and
figured which check it's failing on:

!heap_attisnull(func_tuple, Anum_pg_proc_proconfig)

and it's because my real function had this at the end:

SET search_path FROM CURRENT;

which I never imagined would make any difference. This still doesn't
explain why it was being inlined sometimes - I didn't add and remove
that bit, it was there the whole time! But at least the fix is simple -
remove it.

Is there any reason this stuff isn't documented? It can have huge
performance implications, so I'm surprised more people don't run into
it. Even better would be some query that checks whether a function is
inlineable - maybe not perfectly, but it could detect a few of the
reasons just from pg_proc, right?

Regards,

Evan

On 2/05/2012 11:41 PM, Tom Lane wrote:
> Evan Martin<postgresql@realityexists.net>  writes:
>> This worked... at first. I did some simple queries and they showed the
>> function being inlined (index scan on primary key, seq scan - no
>> function scan). Very happy with that, I tried changing some other
>> functions (that depend on these) and then found that the _asof functions
>> are not being inlined anymore! I swear, I'm not making this up. Nothing
>> changed in those functions. Same simple query. It was inlined before and
>> now it's not. I've dropped and re-created the functions, did an ANALYZE,
>> even restarted PostgreSQL - they're not inlined any more. I really don't
>> know what to think!
> [ squint... ]  There are a lot of undocumented restrictions on inlining
> in inline_set_returning_function(), but AFAICS none of them are
> nondeterministic, nor would the decision depend on anything outside
> the function and its arguments.  Can you provide a concrete test case?
>
>             regards, tom lane
>


Re: SQL functions not being inlined

From
Tom Lane
Date:
Evan Martin <postgresql@realityexists.net> writes:
> Thanks, I went into that function, added log statements everywhere and
> figured which check it's failing on:
> !heap_attisnull(func_tuple, Anum_pg_proc_proconfig)
> and it's because my real function had this at the end:

> SET search_path FROM CURRENT;

Well, shame on you for omitting that critical detail from your example.

> which I never imagined would make any difference.

The reason SET options disable inlining is that inlining would leave
noplace for the SET to be applied and then removed.  In some cases it's
possible that we could prove that the SET need not occur at runtime,
but the inlining mechanism doesn't have that much knowledge about
configuration parameters.

> This still doesn't
> explain why it was being inlined sometimes - I didn't add and remove
> that bit, it was there the whole time!

That does seem improbable.  You sure you were looking at just one
function?

> Is there any reason this stuff isn't documented? It can have huge
> performance implications, so I'm surprised more people don't run into
> it.

There are a huge number of details of planner behavior that "can have
huge performance implications", and they change frequently.  It's not
clear to me that we can document this stuff in a way that's better than
referring to the source code.

            regards, tom lane

Re: SQL functions not being inlined

From
Evan Martin
Date:
Of course, it seems silly now to not have included "SET search_path FROM current" in my post, but I had no idea what was and wasn't critical - that was the whole problem. Wisdom begins with knowing the right questions to ask! Yes, I was looking at the same function - even checked SVN logs to make sure and it had that "SET" for a long time. Still have no idea what happened there.

Re documentation - yes, the source code is well-commented, but I wouldn't have known where to look in the code if you hadn't told me. As a user, I didn't even know how to tell whether the function was being inlined or not. The EXPLAIN output just said "Function scan" - but it's always said that for me, so how was I to know what it should say? What about a section under chapter 14 (Performance Tips) on functions, which explains:

1) Why a function might run slower than running the same query directly. (Not inlined, plan caching, other reasons?)
2) How to tell whether a function is being inlined.
3) Things that would prevent a function from being inlined. I understand that some of these can change and ideally the doc should note which are inherent limitations and which are current implementation restrictions.

I'm going to write this up as a blog post just for my own reference, but nobody else is going to find it there. I'd love to see it documented properly by someone who knows what they're talking about.

Regards,

Evan

On 4/05/2012 1:30 AM, Tom Lane wrote:
Evan Martin <postgresql@realityexists.net> writes:
Thanks, I went into that function, added log statements everywhere and 
figured which check it's failing on:
!heap_attisnull(func_tuple, Anum_pg_proc_proconfig)
and it's because my real function had this at the end:
SET search_path FROM CURRENT;
Well, shame on you for omitting that critical detail from your example.

which I never imagined would make any difference.
The reason SET options disable inlining is that inlining would leave
noplace for the SET to be applied and then removed.  In some cases it's
possible that we could prove that the SET need not occur at runtime,
but the inlining mechanism doesn't have that much knowledge about
configuration parameters.

This still doesn't 
explain why it was being inlined sometimes - I didn't add and remove 
that bit, it was there the whole time!
That does seem improbable.  You sure you were looking at just one
function?

Is there any reason this stuff isn't documented? It can have huge 
performance implications, so I'm surprised more people don't run into 
it.
There are a huge number of details of planner behavior that "can have
huge performance implications", and they change frequently.  It's not
clear to me that we can document this stuff in a way that's better than
referring to the source code.
		regards, tom lane