Thread: Index on immutable function call

Index on immutable function call

From
"Philippe Lang"
Date:
Hi,

I'm trying to figure out how to use an index on an immutable function
call in order to speed up queries.

I came up with this small test:

----------------------------------------
--create database foo;

--drop table indexed_table;

create table indexed_table (
  id serial primary key,
  data1 integer,
  data2 integer
);

create or replace function this_is_a_long_transformation(d integer)
returns integer as $$
  declare
    l    integer;
  begin
    -- wait
    l = 0;
    while l < 100 loop
      l = l + 1;
    end loop;
    -- return same value
    return d;
  end
$$
language plpgsql immutable;

-- insert data into table
insert into indexed_table
  select
    i,
    cast((select random() * 1000 * i) as integer),
    cast((select random() * 1000 * i) as integer)
  from generate_series(1, 100000) as i;

-- create index
create index long_transformation_index on indexed_table
(this_is_a_long_transformation(data2));

--select * from indexed_table WHERE data1 > data2;
select * from indexed_table WHERE data1 >
this_is_a_long_transformation(data2);
----------------------------------------

My goal is to make query...

  select * from indexed_table WHERE data1 >
this_is_a_long_transformation(data2);

... as fast as

  select * from indexed_table WHERE data1 > data2;

... with the help of the index "long_transformation_index".


Unfortunately, Postgreql does not use the index at all.

What am I doing wrong? I use the default query tuning options of
Postgresql 8.3.7.

Best regards,

-----------------------------------------------------------------------
Philippe Lang                   Web    : www.attiksystem.ch
Attik System                    Email  : philippe.lang@attiksystem.ch
rte de la Fonderie 2            Phone  : +41 26 422 13 75
1700 Fribourg                   Mobile : +41 79 351 49 94
Switzerland                     Fax    : +41 26 422 13 76






Re: Index on immutable function call

From
"A. Kretschmer"
Date:
In response to Philippe Lang :
> Hi,
>
> I'm trying to figure out how to use an index on an immutable function
> call in order to speed up queries.
> My goal is to make query...
>
>   select * from indexed_table WHERE data1 >
> this_is_a_long_transformation(data2);
>
> ... as fast as
>
>   select * from indexed_table WHERE data1 > data2;
>
> ... with the help of the index "long_transformation_index".
>
>
> Unfortunately, Postgreql does not use the index at all.
>
> What am I doing wrong? I use the default query tuning options of
> Postgresql 8.3.7.

The planner doesn't know the actual parameter for the function, so he
picked out the wrong plan. You can force the planner to re-planning with
a dynamical statement within the function and EXECUTE that.

http://www.postgresql.org/docs/current/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN

In short: build a string that contains your query and EXECUTE that
string, within your function.


Regards, Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431  2EB0 389D 1DC2 3172 0C99

Re: Index on immutable function call

From
Alban Hertroys
Date:
On 19 Jan 2010, at 7:40, Philippe Lang wrote:

> Hi,
>
> I'm trying to figure out how to use an index on an immutable function
> call in order to speed up queries.
>
> I came up with this small test:
>
> ----------------------------------------
> --create database foo;
>
> --drop table indexed_table;
>
> create table indexed_table (
>  id serial primary key,
>  data1 integer,
>  data2 integer
> );
>
> create or replace function this_is_a_long_transformation(d integer)
> returns integer as $$
>  declare
>    l    integer;
>  begin
>    -- wait
>    l = 0;
>    while l < 100 loop
>      l = l + 1;
>    end loop;
>    -- return same value
>    return d;
>  end
> $$
> language plpgsql immutable;
>
> -- insert data into table
> insert into indexed_table
>  select
>    i,
>    cast((select random() * 1000 * i) as integer),
>    cast((select random() * 1000 * i) as integer)
>  from generate_series(1, 100000) as i;
>
> -- create index
> create index long_transformation_index on indexed_table
> (this_is_a_long_transformation(data2));
>
> --select * from indexed_table WHERE data1 > data2;
> select * from indexed_table WHERE data1 >
> this_is_a_long_transformation(data2);
> ----------------------------------------
>
> My goal is to make query...
>
>  select * from indexed_table WHERE data1 >
> this_is_a_long_transformation(data2);
>
> ... as fast as
>
>  select * from indexed_table WHERE data1 > data2;
>
> ... with the help of the index "long_transformation_index".
>
>
> Unfortunately, Postgreql does not use the index at all.
>
> What am I doing wrong? I use the default query tuning options of
> Postgresql 8.3.7.


Did you analyse the table?
Can you show us an explain analyse?

What I notice off-hand is that you don't appear to have an index on data1, so Postgres doesn't know for which rows that
is> some_immutable_function(data2). 

Alban Hertroys

--
Screwing up is the best way to attach something to the ceiling.


!DSPAM:737,4b5579a310607798915529!



Re: Index on immutable function call

From
Alban Hertroys
Date:
On 19 Jan 2010, at 8:26, A. Kretschmer wrote:
> The planner doesn't know the actual parameter for the function, so he
> picked out the wrong plan. You can force the planner to re-planning with
> a dynamical statement within the function and EXECUTE that.


There are no queries in his function, I think you misread his post.

Alban Hertroys

--
Screwing up is the best way to attach something to the ceiling.


!DSPAM:737,4b5579d610602488920658!



Re: Index on immutable function call

From
Tore Halvorsen
Date:
On Tue, Jan 19, 2010 at 10:22 AM, Alban Hertroys
<dalroi@solfertje.student.utwente.nl> wrote:
>
> On 19 Jan 2010, at 8:26, A. Kretschmer wrote:
>> The planner doesn't know the actual parameter for the function, so he
>> picked out the wrong plan. You can force the planner to re-planning with
>> a dynamical statement within the function and EXECUTE that.
>
>
> There are no queries in his function, I think you misread his post.

Perhaps this might work?

create index transform_index on indexed_table(serial)
where data1 > this_is_a_long_transformation(data2);

--
Eld på åren og sol på eng gjer mannen fegen og fjåg. [Jøtul]
<demo> 2010 Tore Halvorsen || +052 0553034554

Re: Index on immutable function call

From
"A. Kretschmer"
Date:
In response to Alban Hertroys :
>
> On 19 Jan 2010, at 8:26, A. Kretschmer wrote:
> > The planner doesn't know the actual parameter for the function, so he
> > picked out the wrong plan. You can force the planner to re-planning with
> > a dynamical statement within the function and EXECUTE that.
>
>
> There are no queries in his function, I think you misread his post.

You are right, sorry.


Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431  2EB0 389D 1DC2 3172 0C99

Re: Index on immutable function call

From
"Philippe Lang"
Date:
pgsql-general-owner@postgresql.org wrote:

> Did you analyse the table?

Hi,

Yes, I did.

> Can you show us an explain analyse?

Here is the explain with my initial index:

create index long_transformation2_index on indexed_table
(this_is_a_long_transformation(data2));

-------------------------------------
Seq Scan on indexed_table  (cost=0.00..26791.00 rows=33333 width=12)
(actual time=0.158..5379.933 rows=49740 loops=1)
  Filter: (data1 > this_is_a_long_transformation(data2))
Total runtime: 5606.855 ms
-------------------------------------

> What I notice off-hand is that you don't appear to have an index on
> data1, so Postgres doesn't know for which rows that is >
> some_immutable_function(data2).

I tried adding an index on data1:

create index long_transformation1_index on indexed_table (data1);
create index long_transformation2_index on indexed_table
(this_is_a_long_transformation(data2));

But I still have an sequential scan:

-------------------------------------
Seq Scan on indexed_table  (cost=0.00..26791.00 rows=33333 width=12)
(actual time=0.199..5284.322 rows=49739 loops=1)
  Filter: (data1 > this_is_a_long_transformation(data2))
Total runtime: 5513.676 ms
-------------------------------------

-----------------------------------------------------------------------
Philippe Lang                   Web    : www.attiksystem.ch
Attik System                    Email  : philippe.lang@attiksystem.ch
rte de la Fonderie 2            Phone  : +41 26 422 13 75
1700 Fribourg                   Mobile : +41 79 351 49 94
Switzerland                     Fax    : +41 26 422 13 76






Re: Index on immutable function call

From
"Philippe Lang"
Date:
pgsql-general-owner@postgresql.org wrote:
> On Tue, Jan 19, 2010 at 10:22 AM, Alban Hertroys
> <dalroi@solfertje.student.utwente.nl> wrote:
>>
>> On 19 Jan 2010, at 8:26, A. Kretschmer wrote:
>>> The planner doesn't know the actual parameter for the function, so
>>> he picked out the wrong plan. You can force the planner to
>>> re-planning with a dynamical statement within the function and
>>> EXECUTE that.
>>
>>
>> There are no queries in his function, I think you misread his post.
>
> Perhaps this might work?
>
> create index transform_index on indexed_table(serial) where data1 >
> this_is_a_long_transformation(data2);

Hi,

It does!

With your index alone:

---------------------------
Bitmap Heap Scan on indexed_table  (cost=815.17..10106.08 rows=33333
width=12) (actual time=7.796..236.722 rows=50116 loops=1)
  Recheck Cond: (data1 > this_is_a_long_transformation(data2))
  ->  Bitmap Index Scan on transform_index  (cost=0.00..806.84
rows=33333 width=0) (actual time=7.665..7.665 rows=50116 loops=1)
Total runtime: 459.380 ms
---------------------------

That works just fine, but is there maybe a way of creating a slighly
more "generic" index? If I change the ">" with a "<" in the query, index
cannot of course be used. According to documentation, answer seems to be
"no"...

Thanks, and best regards.

-----------------------------------------------------------------------
Philippe Lang                   Web    : www.attiksystem.ch
Attik System                    Email  : philippe.lang@attiksystem.ch
rte de la Fonderie 2            Phone  : +41 26 422 13 75
1700 Fribourg                   Mobile : +41 79 351 49 94
Switzerland                     Fax    : +41 26 422 13 76






Re: Index on immutable function call

From
Alban Hertroys
Date:
On 19 Jan 2010, at 10:38, Philippe Lang wrote:

>> What I notice off-hand is that you don't appear to have an index on
>> data1, so Postgres doesn't know for which rows that is >
>> some_immutable_function(data2).
>
> I tried adding an index on data1:
>
> create index long_transformation1_index on indexed_table (data1);
> create index long_transformation2_index on indexed_table
> (this_is_a_long_transformation(data2));
>
> But I still have an sequential scan:
>
> -------------------------------------
> Seq Scan on indexed_table  (cost=0.00..26791.00 rows=33333 width=12)
> (actual time=0.199..5284.322 rows=49739 loops=1)
>  Filter: (data1 > this_is_a_long_transformation(data2))
> Total runtime: 5513.676 ms
> -------------------------------------


Ah yes, you would. I did kind of expect this answer, but this confirms it.
About every other row matches your query. Although the planner thinks every one out of three does, that still means a
sequentialscan is probably going to be faster than an index scan. 

I'm quite sure you would get an index scan if you'd reduce the number of rows that match your query significantly, for
exampleby querying for data1 * 100 > this_is... 

Alban Hertroys

--
Screwing up is the best way to attach something to the ceiling.


!DSPAM:737,4b55861110603998611157!



Re: Index on immutable function call

From
Alban Hertroys
Date:
> Hi,
>
> It does!
>
> With your index alone:
>
> ---------------------------
> Bitmap Heap Scan on indexed_table  (cost=815.17..10106.08 rows=33333
> width=12) (actual time=7.796..236.722 rows=50116 loops=1)
>  Recheck Cond: (data1 > this_is_a_long_transformation(data2))
>  ->  Bitmap Index Scan on transform_index  (cost=0.00..806.84
> rows=33333 width=0) (actual time=7.665..7.665 rows=50116 loops=1)
> Total runtime: 459.380 ms
> ---------------------------
>
> That works just fine, but is there maybe a way of creating a slighly
> more "generic" index? If I change the ">" with a "<" in the query, index
> cannot of course be used. According to documentation, answer seems to be
> "no"...

I have to say I'm a bit surprised this works, as the database still needs to check all these rows for existence.
Apparentlythe index is sufficiently selective with your database tuning parameters. 

That said, if this works then a combined index on (data1, this_is_a_long_transformation(data2)) will probably also work
andgive you the flexibility you need. 

Alban Hertroys

--
Screwing up is the best way to attach something to the ceiling.


!DSPAM:737,4b55875b10601514515279!



Re: Index on immutable function call

From
"A. Kretschmer"
Date:
In response to Philippe Lang :
>
> My goal is to make query...
>
>   select * from indexed_table WHERE data1 >
> this_is_a_long_transformation(data2);
>
> ... as fast as
>
>   select * from indexed_table WHERE data1 > data2;
>
> ... with the help of the index "long_transformation_index".
>

Around 50% of your data comply with your where-condition, i think, this
is the reason why the planner decides for a seq-scan.

Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431  2EB0 389D 1DC2 3172 0C99

Re: Index on immutable function call

From
"Philippe Lang"
Date:
Alban Hertroys wrote:
>> Hi,
>>
>> It does!
>>
>> With your index alone:
>>
>> ---------------------------
>> Bitmap Heap Scan on indexed_table  (cost=815.17..10106.08 rows=33333
>> width=12) (actual time=7.796..236.722 rows=50116 loops=1)  Recheck
>> Cond: (data1 > this_is_a_long_transformation(data2))
>>  ->  Bitmap Index Scan on transform_index  (cost=0.00..806.84
>> rows=33333 width=0) (actual time=7.665..7.665 rows=50116 loops=1)
>> Total runtime: 459.380 ms ---------------------------
>>
>> That works just fine, but is there maybe a way of creating a slighly
>> more "generic" index? If I change the ">" with a "<" in the query,
>> index cannot of course be used. According to documentation, answer
>> seems to be "no"...
>
> I have to say I'm a bit surprised this works, as the database still
> needs to check all these rows for existence. Apparently the index is
> sufficiently selective with your database tuning parameters.
>
> That said, if this works then a combined index on (data1,
> this_is_a_long_transformation(data2)) will probably also work and
> give you the flexibility you need.

I have tried with a combined index:

create index long_transformation_index on indexed_table (data1,
this_is_a_long_transformation(data2));

Unfortunately, it does not work:

-------------------------------
Seq Scan on indexed_table  (cost=0.00..26791.00 rows=33333 width=12)
(actual time=0.327..5805.199 rows=49959 loops=1)
  Filter: (data1 > this_is_a_long_transformation(data2))
Total runtime: 6340.772 ms
-------------------------------


-----------------------------------------------------------------------
Philippe Lang                   Web    : www.attiksystem.ch
Attik System                    Email  : philippe.lang@attiksystem.ch
rte de la Fonderie 2            Phone  : +41 26 422 13 75
1700 Fribourg                   Mobile : +41 79 351 49 94
Switzerland                     Fax    : +41 26 422 13 76






Re: Index on immutable function call

From
"Philippe Lang"
Date:
pgsql-general-owner@postgresql.org wrote:
> In response to Philippe Lang :
>>
>> My goal is to make query...
>>
>>   select * from indexed_table WHERE data1 >
>> this_is_a_long_transformation(data2);
>>
>> ... as fast as
>>
>>   select * from indexed_table WHERE data1 > data2;
>>
>> ... with the help of the index "long_transformation_index".
>>
>
> Around 50% of your data comply with your where-condition, i think,
> this is the reason why the planner decides for a seq-scan.

I tried to insert data differently, with a strong dissymetry:

-- insert data into table
insert into indexed_table
  select
    i,
    cast((select random() * 10 * i) as integer),
    cast((select random() * 2000 * i) as integer)
  from generate_series(1, 100000) as i;

OR

-- insert data into table
insert into indexed_table
  select
    i,
    cast((select random() * 2000 * i) as integer),
    cast((select random() * 10 * i) as integer)
  from generate_series(1, 100000) as i;


I use 3 indexes, just in case:



-----------------------------------------------------------------------
Philippe Lang                   Web    : www.attiksystem.ch
Attik System                    Email  : philippe.lang@attiksystem.ch
rte de la Fonderie 2            Phone  : +41 26 422 13 75
1700 Fribourg                   Mobile : +41 79 351 49 94
Switzerland                     Fax    : +41 26 422 13 76






Re: Index on immutable function call

From
"Philippe Lang"
Date:
pgsql-general-owner@postgresql.org wrote:
> In response to Philippe Lang :
>>
>> My goal is to make query...
>>
>>   select * from indexed_table WHERE data1 >
>> this_is_a_long_transformation(data2);
>>
>> ... as fast as
>>
>>   select * from indexed_table WHERE data1 > data2;
>>
>> ... with the help of the index "long_transformation_index".
>>
>
> Around 50% of your data comply with your where-condition, i think,
> this is the reason why the planner decides for a seq-scan.
>
> Andreas
> --
> Andreas Kretschmer
> Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
> GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431  2EB0 389D 1DC2 3172 0C99


Sorry...


I tried to insert data differently, with a strong dissymetry:

-- insert data into table
insert into indexed_table
  select
    i,
    cast((select random() * 10 * i) as integer),
    cast((select random() * 2000 * i) as integer)
  from generate_series(1, 100000) as i;

OR

-- insert data into table
insert into indexed_table
  select
    i,
    cast((select random() * 2000 * i) as integer),
    cast((select random() * 10 * i) as integer)
  from generate_series(1, 100000) as i;


I use 3 indexes, just in case:

create index long_transformation1_index on indexed_table (data1);
create index long_transformation2_index on indexed_table
(this_is_a_long_transformation(data2));
create index long_transformation3_index on indexed_table (data1,
this_is_a_long_transformation(data2));

But in every case, I end up with a sequential scan. The partial index is
for the moment the only working solution...

-----------------------------------------------------------------------
Philippe Lang                   Web    : www.attiksystem.ch
Attik System                    Email  : philippe.lang@attiksystem.ch
rte de la Fonderie 2            Phone  : +41 26 422 13 75
1700 Fribourg                   Mobile : +41 79 351 49 94
Switzerland                     Fax    : +41 26 422 13 76






Re: Index on immutable function call

From
"Timo Klecker"
Date:
Hello Philippe,

if you always select data1 > this_is_a_long_transformation(data2) you could
use the following index:


create index long_transformation_index_2 on indexed_table ( ( data1 >
this_is_a_long_transformation(data2) ) );



Index Scan using long_transformation_index_2 on indexed_table
(cost=0.25..2450.96 rows=33333 width=12)
  Index Cond: ((data1 > this_is_a_long_transformation(data2)) = true)
  Filter: (data1 > this_is_a_long_transformation(data2))



Greetings
Timo Klecker



-----Ursprüngliche Nachricht-----
Von: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] Im Auftrag von Philippe Lang
Gesendet: Dienstag, 19. Januar 2010 11:19
An: A. Kretschmer; pgsql-general@postgresql.org
Betreff: Re: [GENERAL] Index on immutable function call

pgsql-general-owner@postgresql.org wrote:
> In response to Philippe Lang :
>>
>> My goal is to make query...
>>
>>   select * from indexed_table WHERE data1 >
>> this_is_a_long_transformation(data2);
>>
>> ... as fast as
>>
>>   select * from indexed_table WHERE data1 > data2;
>>
>> ... with the help of the index "long_transformation_index".
>>
>
> Around 50% of your data comply with your where-condition, i think,
> this is the reason why the planner decides for a seq-scan.
>
> Andreas
> --
> Andreas Kretschmer
> Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
> GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431  2EB0 389D 1DC2 3172 0C99


Sorry...


I tried to insert data differently, with a strong dissymetry:

-- insert data into table
insert into indexed_table
  select
    i,
    cast((select random() * 10 * i) as integer),
    cast((select random() * 2000 * i) as integer)
  from generate_series(1, 100000) as i;

OR

-- insert data into table
insert into indexed_table
  select
    i,
    cast((select random() * 2000 * i) as integer),
    cast((select random() * 10 * i) as integer)
  from generate_series(1, 100000) as i;


I use 3 indexes, just in case:

create index long_transformation1_index on indexed_table (data1);
create index long_transformation2_index on indexed_table
(this_is_a_long_transformation(data2));
create index long_transformation3_index on indexed_table (data1,
this_is_a_long_transformation(data2));

But in every case, I end up with a sequential scan. The partial index is
for the moment the only working solution...

-----------------------------------------------------------------------
Philippe Lang                   Web    : www.attiksystem.ch
Attik System                    Email  : philippe.lang@attiksystem.ch
rte de la Fonderie 2            Phone  : +41 26 422 13 75
1700 Fribourg                   Mobile : +41 79 351 49 94
Switzerland                     Fax    : +41 26 422 13 76






--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: Index on immutable function call

From
Alban Hertroys
Date:
> I have tried with a combined index:
>
> create index long_transformation_index on indexed_table (data1,
> this_is_a_long_transformation(data2));
>
> Unfortunately, it does not work:
>
> -------------------------------
> Seq Scan on indexed_table  (cost=0.00..26791.00 rows=33333 width=12)
> (actual time=0.327..5805.199 rows=49959 loops=1)
>  Filter: (data1 > this_is_a_long_transformation(data2))
> Total runtime: 6340.772 ms
> -------------------------------


Strange. I noticed that the number of records you get from each method differs somewhat, are you recreating the
databaseeach time? 

With the combined index, or just an index on each column; if you disable seqscans (set enable_seqscan to false), at
whatcost does the planner estimate the bitmap index scan that I expect you'll get in that case? Can you show us the
outputof explain for that case? 

I don't get why it'd be estimated so much more expensive than the partial index Tore came up with that it would prefer
aseqscan. Tore's index would create a better balanced tree as serial is guaranteed to be unique, while data1 and data2
aren't(collisions). 

It's all probably an artefact of the randomness of your data - many of the statistics the planner tracks are quite
uselesshere. Real data tends to be a lot less random so estimates are usually much better there. 

Alban Hertroys

--
Screwing up is the best way to attach something to the ceiling.


!DSPAM:737,4b55902010601090241314!



Re: Index on immutable function call

From
Tore Halvorsen
Date:
On Tue, Jan 19, 2010 at 11:57 AM, Alban Hertroys
> Strange. I noticed that the number of records you get from each method differs somewhat, are you recreating the
databaseeach time? 
>
> With the combined index, or just an index on each column; if you disable seqscans (set enable_seqscan to false), at
whatcost does the planner estimate the bitmap index scan that I expect you'll get in that case? Can you show us the
outputof explain for that case? 
>
> I don't get why it'd be estimated so much more expensive than the partial index Tore came up with that it would
prefera seqscan. Tore's index would create a better balanced tree as serial is guaranteed to be unique, while data1 and
data2aren't (collisions). 
>
> It's all probably an artefact of the randomness of your data - many of the statistics the planner tracks are quite
uselesshere. Real data tends to be a lot less random so estimates are usually much better there. 

I'm not sure, but I think a comparison of the index-termes may be hard
to use as a lookup. I can't quite formulate why, though...

A slightly more generic solution may be somthing like...

CREATE OR REPLACE FUNCTION compare(a numeric, b numeric)
  RETURNS smallint AS // or create an enum to return
$BODY$
select case $1 = $2 when true then 0::smallint else case $1 < $2 when
true then -1::smallint else 1::smallint end end
$BODY$
  LANGUAGE 'sql' IMMUTABLE STRICT;

CREATE INDEX compare_index
  ON indexed_table(compare(data1, this_is_a_long_transformation(data2)));

SELECT * FROM indexed_table where
compare(data1, this_is_a_long_transformation(data2)) in (0, 1); // for
greater-equal

--
Eld på åren og sol på eng gjer mannen fegen og fjåg. [Jøtul]
<demo> 2010 Tore Halvorsen || +052 0553034554

Re: Index on immutable function call

From
Sam Mason
Date:
On Tue, Jan 19, 2010 at 07:40:00AM +0100, Philippe Lang wrote:
> I'm trying to figure out how to use an index on an immutable function
> call in order to speed up queries.
[..]
> Unfortunately, Postgreql does not use the index at all.

Yup, an index isn't going to be very useful in what you're doing.  It's
arranged the "wrong" way around---you want something in the table itself
to compare against.  I'd just have another column in the table that
stores the result of the function call (use a trigger to keep it up to
date) and then create an index on this column if you want/need.

--
  Sam  http://samason.me.uk/

Re: Index on immutable function call

From
"Philippe Lang"
Date:
pgsql-general-owner@postgresql.org wrote:
> Hello Philippe,
>
> if you always select data1 > this_is_a_long_transformation(data2) you
> could use the following index:
>
>
> create index long_transformation_index_2 on indexed_table ( ( data1 >
> this_is_a_long_transformation(data2) ) );
>
>
>
> Index Scan using long_transformation_index_2 on indexed_table
> (cost=0.25..2450.96 rows=33333 width=12)
>   Index Cond: ((data1 > this_is_a_long_transformation(data2)) = true)
>   Filter: (data1 > this_is_a_long_transformation(data2))

Hi Timo,

Thanks, that was certainly what I was searching for...

I tried your solution, but it's slower than the partial index:


1) Index
--------

create index long_transformation4_index on indexed_table ( ( data1 >
this_is_a_long_transformation(data2) ) );
------------------------------
"Index Scan using long_transformation4_index on indexed_table
(cost=0.25..3466.51 rows=33333 width=12) (actual time=0.252..3125.308
rows=50281 loops=1)"
"  Index Cond: ((data1 > this_is_a_long_transformation(data2)) = true)"
"  Filter: (data1 > this_is_a_long_transformation(data2))"
"Total runtime: 3505.435 ms"
------------------------------


2) Partial index
----------------

create index transform_index on indexed_table(id) where data1 >
this_is_a_long_transformation(data2);
------------------------------
"Bitmap Heap Scan on indexed_table  (cost=815.09..10106.01 rows=33333
width=12) (actual time=7.477..237.331 rows=50101 loops=1)"
"  Recheck Cond: (data1 > this_is_a_long_transformation(data2))"
"  ->  Bitmap Index Scan on transform_index  (cost=0.00..806.76
rows=33333 width=0) (actual time=7.339..7.339 rows=50101 loops=1)"
"Total runtime: 459.657 ms"
------------------------------


I guess it's because the partial index is smaller?

-----------------------------------------------------------------------
Philippe Lang                   Web    : www.attiksystem.ch
Attik System                    Email  : philippe.lang@attiksystem.ch
rte de la Fonderie 2            Phone  : +41 26 422 13 75
1700 Fribourg                   Mobile : +41 79 351 49 94
Switzerland                     Fax    : +41 26 422 13 76






Re: Index on immutable function call

From
"Philippe Lang"
Date:
pgsql-general-owner@postgresql.org wrote:
> On Tue, Jan 19, 2010 at 07:40:00AM +0100, Philippe Lang wrote:
>> I'm trying to figure out how to use an index on an immutable function
>> call in order to speed up queries.
> [..]
>> Unfortunately, Postgreql does not use the index at all.
>
> Yup, an index isn't going to be very useful in what you're doing.
> It's arranged the "wrong" way around---you want something in the
> table itself to compare against.  I'd just have another column in the
> table that stores the result of the function call (use a trigger to
> keep it up to date) and then create an index on this column if you
> want/need.

Hi,

I'm already doing "caching with triggers", it works great. The only
problem is that it costs an extra "machinery": a column and triggers.

I was wondering this morning if a proper index could replace this. It
looks like the answer is "yes", but I'm worried about stability: you are
never sure in advance how the planner will work exactly...

Best regards,

Philippe

-----------------------------------------------------------------------
Philippe Lang                   Web    : www.attiksystem.ch
Attik System                    Email  : philippe.lang@attiksystem.ch
rte de la Fonderie 2            Phone  : +41 26 422 13 75
1700 Fribourg                   Mobile : +41 79 351 49 94
Switzerland                     Fax    : +41 26 422 13 76






Re: Index on immutable function call

From
Dean Rasheed
Date:
2010/1/19 Philippe Lang <philippe.lang@attiksystem.ch>:
> That works just fine, but is there maybe a way of creating a slighly
> more "generic" index? If I change the ">" with a "<" in the query, index
> cannot of course be used. According to documentation, answer seems to be
> "no"...
>

You could create an index on the difference:

create index long_transformation_index on indexed_table
((data1-this_is_a_long_transformation(data2)));

then rewrite your queries accordingly:

... WHERE data1-this_is_a_long_transformation(data2) > some const (or
< some const)

HTH,
Dean

Re: Index on immutable function call

From
Tom Lane
Date:
Dean Rasheed <dean.a.rasheed@googlemail.com> writes:
> 2010/1/19 Philippe Lang <philippe.lang@attiksystem.ch>:
>> That works just fine, but is there maybe a way of creating a slighly
>> more "generic" index? If I change the ">" with a "<" in the query, index
>> cannot of course be used. According to documentation, answer seems to be
>> "no"...

> You could create an index on the difference:
> create index long_transformation_index on indexed_table
> ((data1-this_is_a_long_transformation(data2)));

> then rewrite your queries accordingly:
> ... WHERE data1-this_is_a_long_transformation(data2) > some const (or
> < some const)

Yeah.  There's been a lot of noise in this thread about statistics,
but the real problem is this: an indexable condition compares the
indexed value to a constant (or at least something that doesn't
change across rows).  "col1 > foo(col2)" is not able to use an index
on col1, nor an index on foo(col2).  But if you indexed the whole
expression col1 - foo(col2), you could get an index search on
comparisons of that to a constant.

            regards, tom lane