Re: Including Snapshot Info with Indexes - Mailing list pgsql-hackers

From Gokulakannan Somasundaram
Subject Re: Including Snapshot Info with Indexes
Date
Msg-id 9362e74e0710090041o79817945mc4f91d2c54ede69d@mail.gmail.com
Whole thread Raw
In response to Re: Including Snapshot Info with Indexes  ("Florian G. Pflug" <fgp@phlo.org>)
Responses Re: Including Snapshot Info with Indexes
List pgsql-hackers


On 10/8/07, Florian G. Pflug <fgp@phlo.org> wrote:
Gokulakannan Somasundaram wrote:
> Hi Heikki, I am always slightly late in understanding things. Let me
> try to understand the use of DSM. It is a bitmap index on whether all
> the tuples in a particular block is visible to all the backends,
> whether a particular block contains tuples which are invisible to
> everyone. But i think this will get subjected to the same limitations
> of Bitmap index. Even Oracle suggests the use of Bitmap index for
> only data warehousing tables, where the Bitmap indexes will be
> dropped and recreated after every bulk load. This is not a viable
> alternative for OLTP  transactions. But i think i am late in the game
>  as i haven't participated in those discussions
While the DSM might be similar in spirit to a bitmap index, the actual
implementation has a lot more freedome I'd say, since you can tailor it
exactly to the need of tracking some summarized visibility info. So not
all shortcomings of bitmap indices must necessarily apply to the DSM
also. But of course thats mostly handwavering...

> One Bitmap index block usually maps to lot of blocks in the heap. So
> locking of one page to update the DSM for update/delete/insert would
> hit the concurrency. But again all these are my observation w.r.t
> oracle bitmap indexes. May be i am missing something in DSM.
A simple DSM would probably contain a bit per page that says "all xmin <
GlobalXmin, and all xmax unset or aborted". That bit would only get SET
during VACUUM, and only unset during INSERT/UPDATE/DELETE. If setting it
is protected by a VACUUM-grade lock on the page, we might get away with
no locking during the unset, making the locking overhead pretty small.

Let me try to understand. Do you mean to say some kind of Test and Set implementation for Insert/Update/Delete?
So that would mean that there won't be any lock during the change of bit flags. Why do we need lock to set it then?
It looks like a great idea.

> I couldn't get that piece of discussion in the archive, which
> discusses the design of Retail Vacuum. So please advise me again
> here. Let's take up Retail Vacuuming again. The User defined function
>  which would return different values at different time can be
> classified as non-deterministic  functions. We can say that this
> index cannot be created on a non-deterministic function. This is the
> way it is implemented in Oracle. What they have done is they have
> classified certain built-in operators and functions as deterministic.
> Similarly they have classified a few as non-deterministic operators
> and functions. Can we  follow a similar approach?
Postgres already distinguishes VOLATILE,STABLE and IMMUTABLE functions.
It doesn't, however, risk physical data corruption, even if you get that
classification wrong. The worst that happens AFAIK are wrong query
results - but fixing your function, followed by a REINDEX always
corrects the problme. If you start poking holes into that safety net,
there'll be a lot of pushback I believe - and IMHO rightly so, because
people do, and always will, get such classifications wrong.

A deterministic function is classified as one, which returns the same results, irrespective of how many times, it is invoked. So if we form a classification like that, do you think we will resolve the issue of Retail Vaccum? In the case of User-Defined functions, the user should be defining it as Deterministic. Can we frame a set of guidelines, or may be some test procedure, which can declare a certain function as deterministic? I am just saying from the top of my mind. Even otherwise, if we can even restrict this indexing to only Built-in deterministic functions., don't you think it would help the cause of a majority? I have just made the proposal to create the index with snapshot a optional one.

 
Thanks,
Gokul.

pgsql-hackers by date:

Previous
From: "Gokulakannan Somasundaram"
Date:
Subject: Re: Including Snapshot Info with Indexes
Next
From: Simon Riggs
Date:
Subject: Re: PG on NFS may be just a bad idea