Re: Index optimization ? - Mailing list pgsql-general

From Bo Lorentsen
Subject Re: Index optimization ?
Date
Msg-id 41EBDDD7.9040501@netgroup.dk
Whole thread Raw
In response to Re: Index optimization ?  ("Florian G. Pflug" <fgp@phlo.org>)
Responses Re: Index optimization ?
List pgsql-general
Florian G. Pflug wrote:

> Lets say, you have an query "select * from table where field =
> function()". Now, according to the sql-spec, you would have to
> scan each row in "table", call the function "functio()", and compare the
> result. If the result of the call to "function()" matches the value in
> "field", the you return the row, otherwise you don't return it.

This far I follow :-)

> An index is a tree, where each node has two subnodes/children. Each node
> representents a row of your table (or, rather, references a row - it
> contails only the value of the field you indexed). Additionally,
> the value of the field of the "left" child (and the value of the field
> of its children, and so on) is always guaranteed to be
> smaller-or-equal to the value of field of the node itself, and the value
> of the field of the "right" child is always guaranteed to be
> greater-or-equal to the value of the field of the node.

What you say, is based on the result of the evaluation, the executer
will optimize or performe the index lookup, if the righthand is a
constant, but it will perform a seq scan if the value is not known on
the first query (volatile) ?

To me this sound like the indexes can't be performed per row, but only
per query ? Or, PG is not type aware when maching indexes ?

> Now, when doing an index lookup, you have to know which value to look
> for (lets say you look for 3). Then you look at the top node, and
> compare the value you are looking for to the value of the node. In our
> case, the node has a smaller value then the one we are looking for.
> Because we know that the left child of the toplevel node will have an
> even smaller value, we don't need to look at the left child at all. We
> just check the right child, and there we find our record with "field"=3.

But why can't we evaluate righthand and use this new row value (could
still be 3, but not 'foobar' :-)) as a key to the index ?

> _BUT_ this only works, because we knew for which value to look before we
> started searching. If we the value we look for is constantly changing,
> our index lookup would return bogus results. So, if the value is unknown
> _at_the_beginning_ of the search, you can't use the index, because the
> power of an index search comes from the idea to skip a whole subtree (in
> our case the left one), because we _know_ it can't contain the value
> we are looking for.

But the "value" is unknow yes ... but the type of the value is not, this
will not change from row to row only the value do this.

> Functions marked "immutable" or "stable" is _guaranteed_ to never
> change their value during a select statement, or at least not in an
> unpredictable way. Thats why you can use return values of "immutable"
> or "stable" functions for an index search.

I understand that, I just can't see why an index lookup can't be used on
"per row" basis.

> Lets say, you are doing the following query
> "select * from table where field1 = currval('seq')" and field2 =
> nextval('seq')
>
> Now, the value of "currval('seq')" changes with every row visited. In
> your example, the value of "currval" is actually stable - but postgres
> has no way to know this. To use an index scan for your query, postgres
> would need to know, that only a call to nextval can change the value
> of currval - but this, of course, is a quite difficult dependency for a
> database to track.

But why can't the integer index for "field1" be used, with the "currval"
result, even if it changes ? Are the index lookup only performed ones
(asked this before ) per query ?

/BL

pgsql-general by date:

Previous
From: "Laurent Marzullo"
Date:
Subject: Re: PQexecParams and CURSOR
Next
From: Tom Lane
Date:
Subject: Re: What is postgresql doing for "statement: commit;begin;"