Thread: BUG #7651: Superfluous calls to functions used for indexing

BUG #7651: Superfluous calls to functions used for indexing

From
m-pg@8d.no
Date:
The following bug has been logged on the website:

Bug reference:      7651
Logged by:          Morten Minde Neergaard
Email address:      m-pg@8d.no
PostgreSQL version: 9.1.6
Operating system:   Debian Wheezy
Description:        =


Bug: When making an index over a function, then selecting a result that does
not contain the function call (but orders on it), Superfluous function calls
are made. This possibly because the plan creates a projection containing the
function value.

Steps to reproduce:

1. Create a function that gives you a immutable reply. In my real function,
runtime is several ms. The SQL script referred to at the bottom contains a
more minimal, if horribly ugly, example. It prints a NOTICE every time it is
run.
     =

2. Use this function to create an index. In the attached example, this
prints one NOTICE for each row, as expected.
     =

3. Select a small number of rows using e.g. ORDER BY and LIMIT. This returns
correctly, but takes longer than expected to execute. The NOTICE is printed
five times, even though the function should not need be called.


I have made a minimal script reproducing this, and run it with simple test
data. All URLs below.


SQL script: http://xim.akuma.no/psql_bug/psql_bug.txt
Script output: http://xim.akuma.no/psql_bug/psql_bug_log.txt
Data for test: http://xim.akuma.no/psql_bug/file_containing_ints

Re: BUG #7651: Superfluous calls to functions used for indexing

From
Jeff Davis
Date:
On Sun, 2012-11-11 at 15:45 +0000, m-pg@8d.no wrote:
> Bug: When making an index over a function, then selecting a result that does
> not contain the function call (but orders on it), Superfluous function calls
> are made. This possibly because the plan creates a projection containing the
> function value.

It's possible that the function call may be unnecessary, but that is
more of a performance enhancement, not a bug.

Also, the example function has side effects. If you declare functions
with side effects to be IMMUTABLE, you can get all kinds of problems.
You should certainly not rely on an IMMUTABLE function to be called a
specific number of times.

Regards,
    Jeff Davis

Re: BUG #7651: Superfluous calls to functions used for indexing

From
Stuart Bishop
Date:
On Sun, Nov 11, 2012 at 10:45 PM,  <m-pg@8d.no> wrote:

> Bug: When making an index over a function, then selecting a result that does
> not contain the function call (but orders on it), Superfluous function calls
> are made. This possibly because the plan creates a projection containing the
> function value.

I think this is the same issue as was discussed here, dating from
PostgreSQL 8.1:

http://postgresql.1045698.n5.nabble.com/Slow-functional-indexes-td2059587.html


--
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/

Re: BUG #7651: Superfluous calls to functions used for indexing

From
Morten Minde Neergaard
Date:
At 14:44, Thu 2012-11-15, Jeff Davis wrote:
[…]
> It's possible that the function call may be unnecessary, but that is
> more of a performance enhancement, not a bug.

For me it's a matter of orders of size of performance, so it feels like
a bug ;p

> Also, the example function has side effects. If you declare functions
> with side effects to be IMMUTABLE, you can get all kinds of problems.
> You should certainly not rely on an IMMUTABLE function to be called a
> specific number of times.

The function was only a minimal example, not what I'm running in my
production code =)

At 11:40, Fri 2012-11-16, Stuart Bishop wrote:
> I think this is the same issue as was discussed here, dating from
> PostgreSQL 8.1:
>
> http://postgresql.1045698.n5.nabble.com/Slow-functional-indexes-td2059587.html

Yup, looks like the same thing. Would be nice to have this fixed.

Did, however, find two workarounds possible from PostgreSQL 9.2:

1. Create an index over all the columns you want. The index-only scan
   eliminates recalculation

2. Create an index over (my_complex_function(whatever_columns),
   table_primary_key) and wrap the query as such:
   SELECT a_lot_of_columns FROM my_table WHERE table_primary_key IN
       (SELECT table_primary_key FROM my_table ORDER BY
           my_complex_function(whatever_columns));

Did stumble across a situation where the function was rerun for *all*
rows at some point *despite* having such an index. Will see if I can
reproduce that if I have the time!


Smiles,
--
Morten Minde Neergaard