Re: Inlining functions with "expensive" parameters - Mailing list pgsql-hackers

From Paul Ramsey
Subject Re: Inlining functions with "expensive" parameters
Date
Msg-id CACowWR1gqzUQ-vrVsfNZft0--mv6accnyQjKGGD5o72WtddUOw@mail.gmail.com
Whole thread Raw
In response to [HACKERS] Inlining functions with "expensive" parameters  (Paul Ramsey <pramsey@cleverelephant.ca>)
Responses Re: Inlining functions with "expensive" parameters  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers

On Thu, Nov 9, 2017 at 12:11 PM, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
All,

As we try and make PostGIS more "parallel sensitive" we have been added costs to our functions, so that their relative CPU cost is more accurately reflected in parallel plans. 

This has resulted in an odd side effect: some of our "wrapper" functions stop giving index scans in plans [1]. This is a problem!

An example of a "wrapper" is ST_Intersects(geom1, geom2). It combines an index operation (geom1 && geom2) with an exact spatial test (_ST_Intersects(geom1, geom2). This is primarily for user convenience, and has worked for us well for a decade and more. Having this construct stop working is definitely a problem.

As we add costs to our functions, the odds increase that one of the parameters to a wrapper might be a costed function. It's not uncommon to see:

ST_Interects(geom, ST_SetSRID('POLYGON(...)', 4326))

It's fair to say that we really do depend on our wrappers getting inlined basically all the time. They are simple functions, they do nothing other than 'SELECT func1() AND func2() AND arg1 && arg2'.

However, once costs are added to the parameters, the inlining can be turned off relatively quickly. Here's a PgSQL native example:

    -- Create data table and index. Analyze.
    DROP TABLE IF EXISTS boxen;
    CREATE TABLE boxen AS 
        SELECT row_number() OVER() As gid, 
        box(point(x, y),point(x+1, y+1)) AS b, x, y
        FROM generate_series(-100,100) As y, generate_series(-100,100) As x;
    CREATE INDEX idx_b_geom_gist ON boxen USING gist(b);
    ANALYZE boxen;

    -- An inlined function
    -- When set 'STRICT' it breaks index access
    -- However 'IMMUTABLE' doesn't seem to bother it
    CREATE OR REPLACE FUNCTION good_box(box, box)
        RETURNS boolean
        AS 'SELECT $1 OPERATOR(&&) $2 AND length(lseg(point($1),point($2))) < 3'
        LANGUAGE 'sql';

    -- Start with a low cost circle()
    ALTER FUNCTION circle(point, double precision) COST 1;

    -- [A] Query plan hits index
    EXPLAIN SELECT gid 
    FROM boxen
    WHERE good_box(
        boxen.b, 
        box(circle(point(20.5, 20.5), 2))
        );

    -- [B] Query plan hits index
    EXPLAIN SELECT gid 
    FROM boxen, 
        (SELECT x, y FROM boxen WHERE x < 0 and y < 0) AS c 
    WHERE good_box(
        boxen.b, 
        box(circle(point(c.x, c.y), 2))
        );

    -- Increase cost of circle
    ALTER FUNCTION circle(point, double precision) COST 100;

    -- [B] Query plan does not hit index!
    EXPLAIN SELECT gid 
    FROM boxen, 
        (SELECT x, y FROM boxen WHERE x < 0 and y < 0) AS c 
    WHERE good_box(
        boxen.b, 
        box(circle(point(c.x, c.y), 2))
        );
    
The inlining is getting tossed out on a test of how expensive the function parameters are [2]. As a result, we lose what is really the correct plan, and get a sequence scan instead of an index scan.

The test of parameter cost seems quite old (15+ years) and perhaps didn't anticipate highly variable individual function costs (or maybe it did). As it stands though, PostGIS is currently stuck choosing between having costs on our functions or having our inlined wrappers, because we cannot have both at the same time.

I personally find it hard to make the case for not inlining all the time, but that's probably a lack of imagination. The functions in question pass all the other tests of "inlinability" the function cost one seems arbitrary.

 

pgsql-hackers by date:

Previous
From: Antonin Houska
Date:
Subject: Re: [HACKERS] Proposal: generic WAL compression
Next
From: Fabrízio de Royes Mello
Date:
Subject: Re: pgsql: Disable installcheck tests for test_session_hooks