Re: GiST opclass and varlena - Mailing list pgsql-hackers

From Dragan Zubac
Subject Re: GiST opclass and varlena
Date
Msg-id 47E92EE7.7030704@vlayko.tv
Whole thread Raw
In response to GiST opclass and varlena  (Dimitri Fontaine <dfontaine@hi-media.com>)
Responses Re: GiST opclass and varlena
Re: GiST opclass and varlena
List pgsql-hackers
Hello

Not so familiar with all this math :) ,but here's my solution,which I 
must admit keep things going at the moment:

db=> \d prefix                         Table "public.prefix" Column  |  Type   |                      Modifiers
           
 
----------+---------+-----------------------------------------------------id       | bigint  | not null default
nextval('prefix_id_seq'::regclass)prefix  | text    |operator | integer |
 
Indexes:   "prefix_pkey" PRIMARY KEY, btree (id)   "prefix_index" UNIQUE, btree (prefix)


and we're using a procedure to match prefices (longest prefix 
match),with simething like:

...

-- CHECK PREFIX START

while tmp_length <= char_length(d_number) loop

-- take the number and try to find it in prefix table
-- if not found,decrease it by removing last number
-- and try again
       tmp_dest_number := substring (d_number from 1 for tmp_length);
       select into operator_temp operator from prefix       where prefix=tmp_dest_number;                
       if not found then            tmp_length := tmp_length + 1;       else

-- if we have a match with some prefix
-- take the operator from that row
               operatorfound := true;               operator_out := operator_temp;               exit;
            end if;     
 

end loop;

-- CHECK PREFIX STOP

....

only 'semantic' problem You might have with this approach is that number 
is like 16511xxxxx,which belongs to some Hawaii island operator :),but 
the problem is that all You have in Your prefix table is 16xxxxxx,which 
You mark to belong to operator USA Something. In that case,Your system 
will think of 16511xxxxxx number as it belongs to USA Something operator 
and not Hawaii island operator :( Only solution to this is to always 
have up-to-date prefix table,and populate it even with the prefices Your 
system does not support,because Your system then reject that number and 
it will have a good/precise reason why he did it :)

Some poorly  measurement showed some 60-80 matching/sec with this 
algorithm of matching prefices and a couple of concurrent database 
connections.

Sincerely

Dragan

Dimitri Fontaine wrote:
> Hi,
>
> I'm trying to code a GiST opclass to index prefix searches (select ... from t 
> where t.prefix @> query), now using a prefix_range datatype. This datatype is 
> a varlena one, and storing it to disk and indexing it with BTrees work ok, 
> but I'm failing to have my GiST opclass working, here's the problem:
>
> postgres=# create index idx_prefix on ranges using gist(prefix 
> gist_prefix_range_ops);
> NOTICE:  gpr_picksplit(): entryvec->n= 234 maxoff= 232 l= 176 r=  56 l+r= 232 
> unionL='01[0-7]' unionR='01[4-7]'
> NOTICE:  gpr_picksplit(): v->spl_ldatum='01[0-7]' v->spl_rdatum='01[4-7]'
> ERROR:  invalid memory alloc request size 3049878020
>
> The code is available at pgfoundry here:
>   http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/prefix/prefix/
>
> The previous support for prefixes as text is still there (but will get 
> deprecated soon --- or so I hope), and the new datatype and its usage not 
> well commented nor documented currenlty. If this show up as a requirement to 
> get your attention, please state it and I'll work on documenting prefix_range 
> first.
>
> I'm looking for some help on how to resolve the shown index creation problem, 
> which I think is related to how I give data to GiST in its spl_ldatum and 
> spl_rdatum from the user defined picksplit() method, lines 1101 and 1102 in 
> prefix.c (version 1.26).
>
> Regards,
>   



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: PG East and Washington DC tour
Next
From: Alvaro Herrera
Date:
Subject: Re: partial dump of patch queue to wiki