Re: Refreshing functional index - Mailing list pgsql-general

From David Johnston
Subject Re: Refreshing functional index
Date
Msg-id 064001cd862c$5a2660c0$0e732240$@yahoo.com
Whole thread Raw
In response to Refreshing functional index  (Grzegorz Tańczyk <goliatus@polzone.pl>)
List pgsql-general
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Grzegorz Tanczyk
Sent: Wednesday, August 29, 2012 5:02 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] Refreshing functional index

Hello,

I have a problem with functional index feature in Postgres 8.3

There are two tables, lets call them: PARENTS and CHILDREN(with timestamp column)

I created functional index on parents with function, which selects max value of timestamp from child elements(for given
parent_id). 

The problem is that plpgsql function, which returns the value is IMMUTABLE and it works like a cache.  When I insert
newrecord to children table, select over parents with function gives wrong(outdated) results.  

So far I figured out only one way to flush this "cache". It's by calling REINDEX on my index. I guess I should call it
afterevery insert to children table. It's not good for me since it locks the table.  

I'm thinking about partitioning the index by my app, so reindexing will be less painful, but perhaps there is some
othereasier way to solve tihs problem?  

Thanks

=========================================================================================

This is not really a problem rather the behavior is working as designed and you are trying or expecting it to behave
differently.

First I would see whether I actually need an INDEX for whatever problem you are trying to solve.  A view with an
embeddedsub-query to dynamically determine the max(timestamp) is likely to perform decently if there are not too many
rowsfor each child. 

Any other suggestions require guessing about your data properties but if said timestamp is auto-assigned and thus
alwaysincreasing as new children are added (and children are not deleted or change their timestamp) then adding an
INSERTtrigger on the child table that updates either the parent or some other relation-maintaining table would likely
suffice.

Also, your function is STABLE, not IMMUTABLE.  Saying it is IMMUTABLE in order to create the index doesn't magically
overcomethe very reason that STABLE functions cannot be indexed. 

I would also suggest that using a timestamp is probably not the best decision.  It is possible that two transactions
couldbe started at the same time and thus the resultant timestamps would match as well.  Also are you guaranteed that
thetimes with always come from the same source?  Again, they WHY behind your decision is unknown but storing the PK of
the"most recent child" would make more sense conceptually rather than storing a time and having to back into the child. 

If you go that route upon INSERT you simply "UPDATE parent SET child_id_mostrecent = child_pk".  In the face of
concurrencythe last one to commit remains.  Upon deletion, if allowed, you simply set it to NULL so that there is no
"mostrecent".  Another option in that case would be to store an array and pre-pend each new child PK but also truncate
thearray to a maximum of, say 10 children.  Upon delete you would then just remove any references to the deleted child
fromthe array.  For query purposes the first child is the one that matters (if present since an empty array is still
possible). Updates could work the same way if you want to "refresh" the recentness of the children in that situation. 

Lots of options (mostly trigger on the child oriented) but a functional index is not one of them.

David J.






pgsql-general by date:

Previous
From: Merlin Moncure
Date:
Subject: Re: Refreshing functional index
Next
From: Craig Ringer
Date:
Subject: Re: String comparision in PostgreSQL