Re: [BUGS] BUG #6325: Useless Index updates - Mailing list pgsql-general

From Craig Ringer
Subject Re: [BUGS] BUG #6325: Useless Index updates
Date
Msg-id 4EDEF25C.4010401@ringerc.id.au
Whole thread Raw
In response to Re: [BUGS] BUG #6325: Useless Index updates  (Daniel Migowski <dmigowski@ikoffice.de>)
List pgsql-general
On 06/12/11 22:51, Daniel Migowski wrote:
> Continuing this talk on general, as requested by Craig.
>
> I have a functional Index on a table that is relative expensive to calculate. Now I noticed on every update of even
index-unrelatedfields of the table the index function is calculated again and again.  

I thought I'd test this out, so I put together a quick test case. It's
clear that HOT is doing its job, because it's not easy to reproduce your
issue in a trivial dummy table. I had to fill up the table so there was
no free space on most of the pages before I could force repeat
evaluation of the expensive function. Anyway, here's a test-case:


DROP TABLE IF EXISTS funcindex;

CREATE OR REPLACE FUNCTION expensive(integer) RETURNS integer AS $$
BEGIN
  RAISE NOTICE 'Sleeping for 1s in expensive(%)',$1;
  PERFORM pg_sleep(1);
  RETURN $1 / 2;
END;
$$ LANGUAGE 'plpgsql' STRICT IMMUTABLE;

CREATE TABLE funcindex (
  id SERIAL PRIMARY KEY,
  somenumber integer not null,
  nonindexed integer not null
);

CREATE INDEX funcindex_somenumber_expensive_idx
ON funcindex( (expensive(somenumber)) );

-- Sleeps 3 times, once per expensive() invocation
INSERT INTO funcindex(somenumber, nonindexed) VALUES (1,1), (2,2), (3,3);

-- Uses index, avoiding evaluating expensive()
SELECT * from funcindex where expensive(somenumber) = 1;

-- Avoids evaluating expensive(): no changes
UPDATE funcindex SET somenumber = somenumber;

-- Avoids evaluating expensive(): no change to indexed col
UPDATE funcindex SET nonindexed = nonindexed + 1;

-- Evaluates expensive(): indexed col changed
UPDATE funcindex SET somenumber = somenumber + 1;

-- Let's put in some proper dummy data to fill the table
-- up and try to defeat HOT:
TRUNCATE TABLE funcindex;

-- Replace the wait while we populate the table
CREATE OR REPLACE FUNCTION expensive(integer) RETURNS integer AS $$
BEGIN
  RETURN $1 / 2;
END;
$$ LANGUAGE 'plpgsql' STRICT IMMUTABLE;

INSERT INTO funcindex(somenumber,nonindexed)
SELECT x.n, x.n*2 FROM (SELECT generate_series(1,10000) AS n) AS x;

-- and make it slow again after
CREATE OR REPLACE FUNCTION expensive(integer) RETURNS integer AS $$
BEGIN
  RAISE NOTICE 'Sleeping for 1s in expensive(%)',$1;
  PERFORM pg_sleep(1);
  RETURN $1 / 2;
END;
$$ LANGUAGE 'plpgsql' STRICT IMMUTABLE;

-- Because there's no free space in the pages to allow HOT to work, this
will
-- basically never end, it'll just sit in endless sleeps.
-- Using a FILLFACTOR makes this complete nearly instantly.
UPDATE funcindex SET somenumber = somenumber+1;




>
> I currenly understand that if the update moves the row to a new location (no HOT replacement), the key to the index
hasto be calculated from the old and the new row to update the index. 
>
> This is expensive in my case, and useless, if the input to the immutable index function has not changed in my update
statement,and as such the calculation should always be done just once. In case of HOT replacement, it hasn't to be done
atall. 

At least on 8.4, when HOT kicks in it appears the evaluation is *not*
done. See test case above.

If HOT can't do its job due to lack of space in the page, then Pg will
evaluate the function, and it would be nice if it didn't have to. It's
not as simple as skipping this for functions, though; Pg would need a
way to determine which fields the result of any expression depended on,
as the slow index expression might be something like:

  ( f(a) * f(g(b)) )

where the result depends on the values of (a) and (b).

pgsql-general by date:

Previous
From: John R Pierce
Date:
Subject: Re: Installing different PostgreSQL versions in parallel
Next
From: Rafael Ostertag
Date:
Subject: Re: Installing different PostgreSQL versions in parallel