Thread: GiST consistent function, expected arguments; multi-dimensional indexes
In GiST, for each new data type we support, we're expected to provide (among other things) a function to determine whether a query is consistent with a particular index entry (given an operator/strategy). I haven't been able to figure out when the query value being passed (arg 1 in the <datatype>_consistent function) is the actual value (say the value "6"), or a pointer to the value. In the btree_gist contrib examples, I see both cases (int4, a value; text, a pointer).

Does anyone know how to tell when the GiST consistent function should expect a pointer or a value? (Apparently this is set up in the scankey in a generic fashion before getting into the index-specific implementation, but I can't tell how exactly.) Does this depend on the storage clause in CREATE TYPE?

Also, a broader question. GiST is set up to evaluate each column in the index in the order it was specified (e.g. if the index has colX, colY, then if colX satisfies the search, colY is checked; otherwise colY is ignored), as in traditional btree indexes. I would like to set up a multi-dimensional index that doesn't require all data to be contained in a single column (like an rtree index where the X and Y coordinates would be in different columns rather than a single composite column).

I have taken the approach of hacking some of the GiST code to do this (basically, my code looks at each itup on a page and evaluates on multiple columns at once, depending on strategy and according to my algorithm). This bypasses some of the existing GiST code but takes advantage of a lot of it (recovery features, etc.), but it's a reluctant and probably temporary hack.

So my question: is this type of initiative already in the works in some project out there? I'd hate to duplicate effort. GiST takes you a long way, but not quite all the way, for these types of scenarios.

Thanks,
Eric
Re: GiST consistent function, expected arguments; multi-dimensional indexes
From
Martijn van Oosterhout
Date:
On Wed, Jun 27, 2007 at 09:32:13AM -0700, Eric wrote:
> In GiST, for each new data type we support we're expected to provide
> (among other things) a function to determine whether a query is
> consistent with a particular index entry (given an operator/
> strategy). I haven't been able to figure out when the query value
> being passed (arg 1 in the <datatype>_consistent function) is the
> actual value (say the value "6"), or a pointer to the value. In the
> btree_gist contrib examples, I see both cases (int4, a value; text, a
> pointer). Does anyone know how to tell when the gist consistent
> function should expect a pointer or a value (apparently this is setup
> in the scankey in a generic fashion before getting into the index-
> specific implementation but can't tell how exactly)? Does this depend
> on the storage clause in CREATE TYPE?

Everything is always passed as a Datum, so yes, it is determined by the storage clause in CREATE TYPE.

> Also, a broader question. GiST is setup to evaluate each column in
> the index in the order it was specified (e.g. if the index has colX,
> colY, then if colX satisfies the search, colX is checked, otherwise
> colY is ignored)...as in traditional btree indexes. I would like to
> set up a multi-dimensional index that doesn't require all data to be
> contained in a single column (like an rtree index where the X and Y
> coordinates would be in different columns rather than a single
> composite column).

The usual approach to this is to define the index on a composite of the values. For example, if you have a table with two points that you want to index together, you do:

    CREATE INDEX foo ON bar((box(point1,point2)));

i.e. a functional index on the result of combining the points. It does mean you need to use the same expression when writing the queries, but it works without modifying any internal code at all.

Given that you can use rowtypes more easily these days, it's quite possible you could build an operator class on a row type.
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
> > Everything is always passed as a Datum, so yes, it's is determined by
> > the storage clause in CREATE TYPE.

Still not sure what to do in some scenarios. One example is the GiST example code for btree (btree_gist). If you look at the int4 example consistent function, it gets an int32 value (param 1). For other data types, it would get a pointer to a value. Is the rule that anything <= 4 bytes is a value, and anything above that is a pointer? See the code below...

    Datum
    gbt_int4_consistent(PG_FUNCTION_ARGS)
    {
        GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
        int32       query = PG_GETARG_INT32(1);
        int32KEY   *kkk = (int32KEY *) DatumGetPointer(entry->key);
        ...

> > The usual approach to this is to define the index on a composite of
> > the values. For example, if you have a table with two points that you
> > want to index together, you do:
> >
> > CREATE INDEX foo ON bar((box(point1,point2)));
> >
> > i.e. a functional index on the result of combining the points. It does
> > mean you need to use the same syntax when doing the queries, but it
> > works with modifying any internal code at all...
> >
> > Given you can use rowtypes more easily these days, it's quite possible
> > you use build an operator class on a row type...

Thanks Martijn. I will consider that approach.
Re: GiST consistent function, expected arguments; multi-dimensional indexes
From
Martijn van Oosterhout
Date:
On Sun, Jul 01, 2007 at 07:20:08PM -0700, Eric wrote:
> > Everything is always passed as a Datum, so yes, it's is determined by
> > the storage clause in CREATE TYPE.
>
> Still not sure what to do in some scenarios. One example is the gist
> example code for btree (btree_gist). If you look at the int4 example
> consistent function, it gets an int32 value (param 1). For other
> data types, it would get a pointer to a value. Is the rule anything
> <= 4 bytes it's a value, above that it's a pointer? See the code
> below...

Why guess? You know the type ID of what you're manipulating, right? Then the function:

    get_typlenbyval(Oid typid, int16 *typlen, bool *typbyval);

returns the byval flag (true if passed by value, false if passed by reference). The typlen field will be either a positive integer, representing the number of bytes, or negative for variable length.

Have a nice day,
--
Martijn van Oosterhout
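[Editor's sketch] Once the two outputs of get_typlenbyval() are in hand, the decision Eric was asking about comes down to a small case analysis. The standalone C below shows it under stated assumptions: `ToyPassing` and `toy_classify_type` are hypothetical names, not PostgreSQL API, and the inputs are supplied by hand rather than looked up from the catalog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical labels for the three situations described above. */
typedef enum ToyPassing
{
    TOY_BY_VALUE,       /* the Datum holds the value itself */
    TOY_BY_REF_FIXED,   /* the Datum points at typlen bytes */
    TOY_BY_REF_VARLEN   /* the Datum points at a variable-length value */
} ToyPassing;

/*
 * Classify a type from its typlen/typbyval pair: typbyval decides value
 * versus pointer, and a negative typlen marks a variable-length type.
 */
ToyPassing toy_classify_type(int16_t typlen, bool typbyval)
{
    if (typbyval)
    {
        /* By-value types are always fixed length. */
        assert(typlen > 0);
        return TOY_BY_VALUE;
    }
    return (typlen > 0) ? TOY_BY_REF_FIXED : TOY_BY_REF_VARLEN;
}
```

For example, int4 would be described by (4, true) and text by (-1, false), which matches the two cases Eric observed in btree_gist: a plain value for int4 and a pointer for text.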
I guess you can also get this before writing code from

    select typbyval from pg_type where typname='mytype'

...thanks again.
Re: GiST consistent function, expected arguments; multi-dimensional indexes
From
Martijn van Oosterhout
Date:
On Mon, Jul 02, 2007 at 10:44:55AM -0700, Eric wrote:
> I guess you can also get this before writing code from
>
> select typbyval from pg_type where typname='mytype'

Note that the flag might not be constant. For example, int8 is not byval currently, whereas it could be on a 64-bit architecture. However, variable-length values are always byref.

Have a nice day,
--
Martijn van Oosterhout
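[Editor's sketch] The int8 remark can be read as a size constraint: a fixed-length value can only travel inside the Datum if it fits in one. The standalone C below models that constraint only. `toy_fits_by_value` is a hypothetical helper, and the set of permitted widths (1, 2, 4, 8 bytes) is an assumption of this sketch. Whether a particular build actually passes a given type by value is decided by that build and by the type's definition, so the typbyval flag remains the thing to check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Could a fixed-length type of type_size bytes be carried by value in a
 * Datum that is datum_size bytes wide? This only models the size limit;
 * it does not say what any particular server build chooses to do.
 */
bool toy_fits_by_value(size_t type_size, size_t datum_size)
{
    bool supported_width;

    /* This sketch only considers 4-byte and 8-byte Datum widths. */
    assert(datum_size == 4 || datum_size == 8);

    supported_width = (type_size == 1 || type_size == 2 ||
                       type_size == 4 || type_size == 8);
    return supported_width && type_size <= datum_size;
}
```

With a 4-byte Datum an 8-byte integer cannot fit and has to be passed by reference, while with an 8-byte Datum it could fit, which is why code should consult the flag instead of hard-coding a "<= 4 bytes" rule.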