Thread: Partial Indices vs. mixing columns and functions

Partial Indices vs. mixing columns and functions

From
Mike Mascari
Date:
Hello,

I have created table/view pairs like:

CREATE TABLE foo (
key integer not null,
value text not null,
active timestamp not null default now(),
deactive timestamp
);

CREATE VIEW v_foo AS
SELECT * FROM foo WHERE deactive IS NULL;

This allows the user-interface component of the application to query the
v_foo table for selecting "active" records while maintaining a history
of all records for reporting purposes. To enforce uniqueness because
deactive is NULL, I cannot just create an index like:

CREATE UNIQUE INDEX i_foo (value, deactive);

What I can do is create a function like:

CREATE FUNCTION f_foo(oid, timestamp) RETURNS int4 AS '
SELECT 0 WHERE $2 IS NULL
UNION
SELECT oid WHERE $2 IS NOT NULL;
' LANGUAGE 'SQL';

And then create a functional index on foo:

CREATE UNIQUE INDEX i_foo( f_foo(oid, deactive) );

To enforce uniqueness on "active" 'value' columns, I could rewrite the
function to something like:

CREATE FUNCTION f_foo(oid, text, timestamp) RETURNS text AS '
SELECT '0_'||$2 WHERE $3 IS NULL
UNION
SELECT oid::text||'_'||$2 WHERE $3 IS NOT NULL;
' LANGUAGE 'SQL';

but that seems like a real hack and would require a new function for
each table where the unique constraint varies in columns and types. I
could, of course, have 2 tables and a view - 1 for active objects, 1 for
deactive objects, and a view unionizing the 2 together for joins for
reporting purposes. But I humbly request a new feature instead: :-)

Allow for the intermixing of columns and functions in the index
specification so I could write something like:

CREATE UNIQUE INDEX i_foo(value, f_foo(oid, deactive));

Or will Martijn van Oosterhout's new Partial Indices work allow me to
create a unique index like:

CREATE UNIQUE INDEX i_foo ON foo(value)
WHERE deactive IS NULL;

??

That would solve all my problems and answer all my questions...

Mike Mascari
mascarm@mascari.com

Re: Partial Indices vs. mixing columns and functions

From
Martijn van Oosterhout
Date:
On Wed, Jul 11, 2001 at 04:09:51AM -0400, Mike Mascari wrote:
> Hello,
>
> I have created table/view pairs like:

[snip]

Yes, creating a unique partial index should be possible and will do what you
want I think, (I couldn't totally follow what you meant).

However, partial indicies will not support the IS NULL predicates, for the
same reason that IS NULL cannot use an index for lookups.

I'd love to fix that but that's going to be hard (or rather, I havn't
thought of an easy way to do it :).

Maybe someone has a better solution.

--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/
> It would be nice if someone came up with a certification system that
> actually separated those who can barely regurgitate what they crammed over
> the last few weeks from those who command secret ninja networking powers.

Re: Partial Indices vs. mixing columns and functions

From
Mike Mascari
Date:
Martijn van Oosterhout wrote:
>
> On Wed, Jul 11, 2001 at 04:09:51AM -0400, Mike Mascari wrote:
> > Hello,
> >
> > I have created table/view pairs like:
>
> [snip]
>
> Yes, creating a unique partial index should be possible and will do what you
> want I think, (I couldn't totally follow what you meant).
>
> However, partial indicies will not support the IS NULL predicates, for the
> same reason that IS NULL cannot use an index for lookups.
>
> I'd love to fix that but that's going to be hard (or rather, I havn't
> thought of an easy way to do it :).
>
> Maybe someone has a better solution.

What are the limits of the WHERE expression? Must they be composed of
constant expressions or can they be more complex?

For example,

would I be able to do:

CREATE TABLE foo (
key integer not null,
value text not null,
active timestamp not null default now(),
deactive timestamp not null default now()
);

CREATE VIEW v_foo
AS SELECT * FROM foo WHERE active = deactive;

CREATE UNIQUE INDEX i_foo ON foo(value)
WHERE active = deactive;

Or is the WHERE clauses limited to constant expressions like a check
condition:

CREATE UNIQUE INDEX i_foo ON foo(value)
WHERE value = 'Bar';

Thanks for any info,

Mike Mascari
mascarm@mascari.com

P.S.

I subitted the TRUNCATE TABLE patch back in 6.5. It literally truncates
the underlying relation and all indices using the smgr layer and then
rebuilds the indices. I remember seeing the partial indices work at the
time and attempted to preserve the functionality in the patch, but could
never test it since they had been deactivated well before then. I'm sure
Tom Lane has removed my cruft since then, but it might be a source of a
needed test.

Re: Partial Indices vs. mixing columns and functions

From
Mike Mascari
Date:
I just wrote:

> What are the limits of the WHERE expression? Must they be composed of
> constant expressions or can they be more complex?

[snip]

Sorry. I see the SGML in your patch:

"Each element can only consist of ATTR OP CONST and these can only be
joined by
AND and OR operators."

Mike Mascari
mascarm@mascari.com

Re: Partial Indices vs. mixing columns and functions

From
Tom Lane
Date:
Mike Mascari <mascarm@mascari.com> writes:
> To enforce uniqueness because
> deactive is NULL, I cannot just create an index like:

> CREATE UNIQUE INDEX i_foo (value, deactive);

It's not clear to me what you are really after here.  You *can* create a
unique index, even though 'deactive' is allowed to be NULL --- what will
happen is that rows containing NULL will never conflict with other
entries.  Is that what you want, or are you trying to say that you don't
want more than one row with 'deactive' NULL for any given 'value' value?

> Or will Martijn van Oosterhout's new Partial Indices work allow me to
> create a unique index like:

> CREATE UNIQUE INDEX i_foo ON foo(value)
> WHERE deactive IS NULL;

This would seem to imply that you want the latter.

As Martijn remarks elsewhere, the above would not be allowed by the
existing code for partial indexes.  But there is no good reason for
that.  The reason for the restriction is that the planner's code for
determining whether a partial index can be used in a query is pretty
limited (with good reason; we don't want to be letting loose a full-tilt
automated theorem prover on every query...).  But the above example
demonstrates that an index can be useful even if it's never used in
a query!

I would say that this example shows that we should rip out the
restrictions on the form of the predicate, and just ensure that the
planner code will give up cleanly if the predicate is not of a form
it can handle.

            regards, tom lane

Re: Partial Indices vs. mixing columns and functions

From
Mike Mascari
Date:
Tom Lane wrote:
>
> Mike Mascari <mascarm@mascari.com> writes:
> > To enforce uniqueness because
> > deactive is NULL, I cannot just create an index like:
>
> > CREATE UNIQUE INDEX i_foo (value, deactive);
>
> It's not clear to me what you are really after here.  You *can* create a
> unique index, even though 'deactive' is allowed to be NULL --- what will
> happen is that rows containing NULL will never conflict with other
> entries.  Is that what you want, or are you trying to say that you don't
> want more than one row with 'deactive' NULL for any given 'value' value?

The latter.

>
> > Or will Martijn van Oosterhout's new Partial Indices work allow me to
> > create a unique index like:
>
> > CREATE UNIQUE INDEX i_foo ON foo(value)
> > WHERE deactive IS NULL;
>
> This would seem to imply that you want the latter.

Yes.

>
> As Martijn remarks elsewhere, the above would not be allowed by the
> existing code for partial indexes.  But there is no good reason for
> that.  The reason for the restriction is that the planner's code for
> determining whether a partial index can be used in a query is pretty
> limited (with good reason; we don't want to be letting loose a full-tilt
> automated theorem prover on every query...).  But the above example
> demonstrates that an index can be useful even if it's never used in
> a query!
>
> I would say that this example shows that we should rip out the
> restrictions on the form of the predicate, and just ensure that the
> planner code will give up cleanly if the predicate is not of a form
> it can handle.

Fantastic! If left in its current state, I would have to use a fake
deactive value (some arbitrary date in the past), or add another column
to enforce uniqueness amongst 'active' records.

>
>                         regards, tom lane

Thanks,

Mike Mascari
mascarm@mascari.com

Re: Partial Indices vs. mixing columns and functions

From
Tom Lane
Date:
Mike Mascari <mascarm@mascari.com> writes:
>> Or will Martijn van Oosterhout's new Partial Indices work allow me to
>> create a unique index like:
>> CREATE UNIQUE INDEX i_foo ON foo(value)
>> WHERE deactive IS NULL;

Just for the record: with CVS tip,

regression=# create table foo(value int, deactive int);
CREATE
regression=# CREATE UNIQUE INDEX i_foo ON foo(value)
regression-#  WHERE deactive IS NULL;
CREATE
regression=# insert into foo values (1,2);
INSERT 146307 1
regression=# insert into foo values (1,2);
INSERT 146308 1
regression=# insert into foo values (1,null);
INSERT 146309 1
regression=# insert into foo values (1,null);
ERROR:  Cannot insert a duplicate key into unique index i_foo
regression=# insert into foo values (2,null);
INSERT 146311 1
regression=# insert into foo values (2,null);
ERROR:  Cannot insert a duplicate key into unique index i_foo
regression=#

            regards, tom lane

Re: Partial Indices vs. mixing columns and functions

From
Hiroshi Inoue
Date:
Tom Lane wrote:
>
> Mike Mascari <mascarm@mascari.com> writes:
> >> Or will Martijn van Oosterhout's new Partial Indices work allow me to
> >> create a unique index like:
> >> CREATE UNIQUE INDEX i_foo ON foo(value)
> >> WHERE deactive IS NULL;
>
> Just for the record: with CVS tip,
>

What kind of expression is allowed as the predicate now ?
(Sorry I could have had no cvs access for more than a week).
Both the functions used in functional indexes and the
predicates used in partial indexes must be deterministic.
Are users responsible for it ?

regards,
Hiroshi Inoue

Re: Partial Indices vs. mixing columns and functions

From
Tom Lane
Date:
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> What kind of expression is allowed as the predicate now ?

Anything that doesn't involve an aggregate or a subselect.

(But what the planner can recognize as matching a query's
WHERE clause is currently much more restricted, just ANDs
and ORs of simple VAR OP CONST clauses.  This could be
improved in the future.)

> Both the functions used in functional indexes and the
> predicates used in partial indexes must be deterministic.
> Are users responsible for it ?

Yes.  I don't see how the system could enforce that, without
prohibiting lots of useful cases.

            regards, tom lane

Re: Partial Indices vs. mixing columns and functions

From
Hiroshi Inoue
Date:
Tom Lane wrote:
>
> Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> > What kind of expression is allowed as the predicate now ?
>
> Anything that doesn't involve an aggregate or a subselect.
>
> (But what the planner can recognize as matching a query's
> WHERE clause is currently much more restricted, just ANDs
> and ORs of simple VAR OP CONST clauses.  This could be
> improved in the future.)
>
> > Both the functions used in functional indexes and the
> > predicates used in partial indexes must be deterministic.
> > Are users responsible for it ?
>
> Yes.  I don't see how the system could enforce that, without
> prohibiting lots of useful cases.
>

Shouldn't the functions be *cacheable* at least ?

regards,
Hiroshi Inoue

Re: Partial Indices vs. mixing columns and functions

From
Martijn van Oosterhout
Date:
On Mon, Jul 16, 2001 at 09:45:32PM -0400, Tom Lane wrote:
> (But what the planner can recognize as matching a query's
> WHERE clause is currently much more restricted, just ANDs
> and ORs of simple VAR OP CONST clauses.  This could be
> improved in the future.)

I've been think about adding IS (NOT) NULL type clauses and it doesn't look
too hard but I wonder if my logic is right.

> > Both the functions used in functional indexes and the
> > predicates used in partial indexes must be deterministic.
> > Are users responsible for it ?
>
> Yes.  I don't see how the system could enforce that, without
> prohibiting lots of useful cases.
>

Well, isn't there an attribute on a function that indicates whether the same
input gives the same output (cachable?).  I think that's all that would be
required and not restrict any useful cases.

--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/
> It would be nice if someone came up with a certification system that
> actually separated those who can barely regurgitate what they crammed over
> the last few weeks from those who command secret ninja networking powers.

Re: Partial Indices vs. mixing columns and functions

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> I've been think about adding IS (NOT) NULL type clauses and it doesn't look
> too hard but I wonder if my logic is right.

I think all you need is to add the correct implication rules to
indxpath.c.  Let's see:

    IS NULL => IS NULL

    IS NOT NULL => IS NOT NULL

    IS TRUE => IS TRUE

    IS TRUE => IS NOT FALSE  (but not vice versa)

    IS TRUE => IS NOT UNKNOWN  (but not vice versa)

    IS NOT TRUE => IS NOT TRUE

and similarly to the last four for IS FALSE and IS UNKNOWN.  No?

BTW, it might be a good idea to split out the implication code into a
new file, probably in optimizer/prep or optimizer/utils, rather than
letting it continue to grow where it is.  Doesn't seem like it belongs
in indxpath.c.

            regards, tom lane

Re: Partial Indices vs. mixing columns and functions

From
Tom Lane
Date:
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> Both the functions used in functional indexes and the
> predicates used in partial indexes must be deterministic.
> Are users responsible for it ?
>>
>> Yes.  I don't see how the system could enforce that, without
>> prohibiting lots of useful cases.

> Shouldn't the functions be *cacheable* at least ?

Hmm.  We could do that, although of course it's not a bulletproof check.

Any objections out there?

            regards, tom lane

Re: Partial Indices vs. mixing columns and functions

From
Martijn van Oosterhout
Date:
On Tue, Jul 17, 2001 at 11:10:13AM -0400, Tom Lane wrote:
> I think all you need is to add the correct implication rules to
> indxpath.c.  Let's see:

[snip]

They look fine. The IS NULL is a NullTest but I havn't looked into the
others yet. What confuses me about the pred_test group of functions is that
there's quite a bit of recursivness going on and two functions that are
almost the same. I can't see the point. And I don't understand the comments
at the beginnings of the functions :(

> BTW, it might be a good idea to split out the implication code into a
> new file, probably in optimizer/prep or optimizer/utils, rather than
> letting it continue to grow where it is.  Doesn't seem like it belongs
> in indxpath.c.

I keep wondering if there is not a better way of doing this. I prefer data
driven approaches yet this seems to produce more and more code as you add
new things.

--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/
> It would be nice if someone came up with a certification system that
> actually separated those who can barely regurgitate what they crammed over
> the last few weeks from those who command secret ninja networking powers.