Re: Array comparison & prefix search - Mailing list pgsql-general

From Merlin Moncure
Subject Re: Array comparison & prefix search
Date
Msg-id b42b73150912050654x55fca251i9a51979abe1ea290@mail.gmail.com
Whole thread Raw
In response to Re: Array comparison & prefix search  (Sam Mason <sam@samason.me.uk>)
Responses Re: Array comparison & prefix search  (Sam Mason <sam@samason.me.uk>)
List pgsql-general
On Sat, Dec 5, 2009 at 4:54 AM, Sam Mason <sam@samason.me.uk> wrote:
> On Sat, Dec 05, 2009 at 02:23:13AM +0100, Denes Daniel wrote:
>> 2009/12/4 Sam Mason <sam@samason.me.uk>
>> >  CREATE INDEX test_my_idx ON test (type,(ident[1]));
>>
>> Sorry, but this approach is no good, since I may search like:
>> SELECT * FROM test WHERE type = 'three' AND (ident[1] = 'foo' AND ident[2] =
>> 'bar');
>> or for the first 3 items in an array with 6 items, or any other prefix...
>
> Would a GIN index help?  You'd be able to ask if a 'foo' appears
> anywhere in the array (or some subset if you want).  You can then have a
> subsequent filter that actually expresses the clause you want.  Not sure
> what selectivity you're dealing with and if this would be a problem.

GIN is a pretty heavy price to pay for something that should be btree
indexable.  Also note he is using a multi column index with array as
second column...that would be pretty awkward with GIN.

>> The arrays are all the same length for a given type, but for type
>> 'twenty-three' they may be 23 items long, or even longer for another type,
>> so I can't create an index for all possible cases that way. And yet, all the
>> information needed is in the primary index, I just don't know how to get
>> PostgeSQL to use it.
>
> Arrays and PG (not sure how well other databases handle this case
> either) don't work too well.  Have you thought about normalising your
> schema a bit to give the database more help?

Normalizing the data loses the nice property of being able to order
the entire structure using a single index.  He's using the array as if
it was a string...it's basically an optimization.

>> And why is it this way when I'm using an ARRAY[], and the other way when
>> using ROW()?
>
> I'd say ROW is doing the wrong thing here, but I think other people may
> well disagree with me.  Composite/non-atomic types don't exist in the
> SQL spec much (AFAIK) hence their behavior is somewhat ad-hoc and tends
> to reflect the original use case rather than being too consistent.

yeah, pg composite type handling with nulls is all over the place.
you can get just about everything to work though.

merlin

pgsql-general by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: pg_attribute.attnum - wrong column ordinal?
Next
From: Denes Daniel
Date:
Subject: Re: Array comparison & prefix search