Re: contrib/ltree patches - Mailing list pgsql-hackers

From Dan Langille
Subject Re: contrib/ltree patches
Date
Msg-id 3DE49D45.28806.DD0519D0@localhost
Whole thread Raw
In response to Re: contrib/ltree patches  (Teodor Sigaev <teodor@stack.net>)
List pgsql-hackers
On 27 Nov 2002 at 12:16, Teodor Sigaev wrote:

> Dan Langille wrote:
> > I have been looking at contrib/ltree in the PostgreSQL repository. 
> > I've modified the code to allow / as a node delimiter instead of .
> > which is the default.

> What is the reason for changing delimiter?

My tree represents a file system.  Here are some entries:

# select id, pathname from element_pathnames order by pathname; 77024 | doc/de_DE.ISO8859-1 77028 |
doc/de_DE.ISO8859-1/books84590 | doc/de_DE.ISO8859-1/books/Makefile.inc 77029 | doc/de_DE.ISO8859-1/books/faq 84591 |
doc/de_DE.ISO8859-1/books/faq/Makefile77030 | doc/de_DE.ISO8859-1/books/faq/book.sgml 77691 |
doc/de_DE.ISO8859-1/books/handbook77704 | doc/de_DE.ISO8859-1/books/handbook/Makefile
 
110592 | doc/de_DE.ISO8859-1/books/handbook/advanced-networking

> > Below are the patches to make this change.  I have also moved the
> > delimiter to a DEFINE so that other customizations are easily done. 
> > This is a work in progress.

> It's good.

Thank you.  More patches will follow as I get closer to my objective.

> > -#define ISALNUM(x)    ( isalnum((unsigned int)(x)) || (x) == '_' )
> > +#define ISALNUM(x)    ( isalnum((unsigned int)(x)) || (x) == '_' ||
> > +#(x) == NODE_DELIMITER )

> It seems to me  that it's mistake. ISALNUM shoud define correct
> character in name of node (level).  Try to test with incorrect ltree
> value 'a..b'.

I just did some simple tests and I see what you mean:

ltree_test=# select * from tree;id |     pathname
----+------------------ 1 | /ports 2 | ports/security 2 | ports//security 2 | /ports//security 2 | a..b
(5 rows)

Then I removed NODE_DELIMITER from ISALNUM and tried again:

ltree_test=# insert into tree values (2, '/ports//security');
ERROR:  Syntax error in position 0 near '/'
ltree_test=# insert into tree values (2, 'ports//security');
ERROR:  Syntax error in position 6 near '/'
ltree_test=# insert into tree values (2, 'ports/security');
INSERT 29955201 1
ltree_test=# insert into tree values (2, 'ports/security/');
ERROR:  Unexpected end of line
ltree_test=# insert into tree values (2, 'ports/security/things');
INSERT 29955202 1

ltree_test=# select * from tree;id |       pathname
----+----------------------- 1 | /ports 2 | ports/security 2 | ports//security 2 | /ports//security 2 | a..b 2 |
ports/security2 | ports/security/things
 
(7 rows)

Removing NODE_DELIMITER from ISALNUM makes sense.  Thank you.  Here 
is the reason why NODE_DELIMITER was added. My initial data sample 
was of the form "/usr/local/" (i.e. it started with a 
NODE_DELIMITER).  I have since changed my data so it does not start 
with a leading / because queries were not working.

Based upon the sample data I was using (approximately 120,000 nodes 
as taken from a real file system), I had to change ISALNUM as I went 
along.  Here is the current definition for ISALNUM:

#define ISALNUM(x)   ( isalnum((unsigned int)(x)) || (x) == '_' || 
(x) == '-' || (x) == '.' || (x) == '+' || (x) == ':' || (x) == '~' || 
(x) == '%' || (x) == ',' || (x) == '#')

Given that I am trying to allow any valid filename, I think ISALNUM 
needs to allow any ASCII character.

I also think I will need to modify the parsing within lquery_in to 
allow escaping of characters it recognizes but which may be part of a 
file name (e.g. :%~ may be part of a file name, but these are special 
characters to lquery_in).  That I think will be the biggest change.

Thank you for your interest and help.

-- 
Dan Langille : http://www.langille.org/



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Boolean casting in 7.3 -> changed?
Next
From: Manfred Koizar
Date:
Subject: next value expression