Re: WIP: SP-GiST, Space-Partitioned GiST - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: WIP: SP-GiST, Space-Partitioned GiST
Date
Msg-id CAPpHfdtKdjaXdY93=O+EYht4wFgE2A2B8Sf3AErN4=p7gn+FBw@mail.gmail.com
Whole thread Raw
In response to WIP: SP-GiST, Space-Partitioned GiST  (Oleg Bartunov <obartunov@gmail.com>)
List pgsql-hackers
Hi!

Ie expect some problems in support of comparison operators for text, because locale string comparison can have unexpected behaviour. 
Let's see the example. Create table with words and add extra leading space to some of them.

test=# create table dict(id serial, word text);
NOTICE:  CREATE TABLE will create implicit sequence "dict_id_seq" for serial column "dict.id"
CREATE TABLE
test=# \copy dict(word) from '/usr/share/dict/american-english';
test=# update dict set word = ' '||word where id%2=0;
UPDATE 49284

I use Ubuntu 11.04 with ru_RU.utf8 locale. So, comparison operators ignores leading spaces.

test=# select * from dict where word between 'cart' and 'cary';
  id   |      word      
-------+----------------
  3029 | Carter
  3031 | Cartesian
  3033 | Carthage's
  3035 | Cartier
  3037 | Cartwright
  3039 | Caruso
  3041 | Carver
 28419 | cart
 28421 | carted
 28423 | cartel's
 28425 | cartilage
 28427 | cartilages
 28429 | carting
 28431 | cartographer's
 28433 | cartography
 28435 | carton
 28437 | cartons
 28439 | cartoon's
 28441 | cartooning
 28443 | cartoonist's
 28445 | cartoons
 28447 | cartridge's
 28449 | carts
 28451 | cartwheel's
 28453 | cartwheeling
 28455 | carve
 28457 | carver
 28459 | carvers
 28461 | carving
 28463 | carvings
  3030 |  Carter's
  3032 |  Carthage
  3034 |  Carthaginian
  3036 |  Cartier's
  3038 |  Cartwright's
  3040 |  Caruso's
  3042 |  Carver's
 28420 |  cart's
 28422 |  cartel
 28424 |  cartels
 28426 |  cartilage's
 28428 |  cartilaginous
 28430 |  cartographer
 28432 |  cartographers
 28434 |  cartography's
 28436 |  carton's
 28438 |  cartoon
 28440 |  cartooned
 28442 |  cartoonist
 28444 |  cartoonists
 28446 |  cartridge
 28448 |  cartridges
 28450 |  cartwheel
 28452 |  cartwheeled
 28454 |  cartwheels
 28456 |  carved
 28458 |  carver's
 28460 |  carves
 28462 |  carving's
(59 rows)

But if I create spgist index query result differs.

test=# create index dict_idx on dict using spgist (word);
CREATE INDEX
test=# select * from dict where word between 'cart' and 'cary';
  id   |      word      
-------+----------------
 28419 | cart
 28421 | carted
 28423 | cartel's
 28425 | cartilage
 28427 | cartilages
 28429 | carting
 28431 | cartographer's
 28433 | cartography
 28435 | carton
 28437 | cartons
 28439 | cartoon's
 28441 | cartooning
 28443 | cartoonist's
 28445 | cartoons
 28447 | cartridge's
 28449 | carts
 28451 | cartwheel's
 28453 | cartwheeling
 28455 | carve
 28457 | carver
 28459 | carvers
 28461 | carving
 28463 | carvings
(23 rows)

------
With best regards,
Alexander Korotkov.

pgsql-hackers by date:

Previous
From: Oleg Bartunov
Date:
Subject: Re: WIP: SP-GiST, Space-Partitioned GiST
Next
From: Robert Haas
Date:
Subject: Re: [v9.1] sepgsql - userspace access vector cache