[HACKERS] Re: LIMIT clause (and a question about indexing arrays) - Mailing list pgsql-hackers

From HANNU
Subject [HACKERS] Re: LIMIT clause (and a question about indexing arrays)
Date
Msg-id 38da4acb632bfabb74c9ffd2654ccc16
List pgsql-hackers
I think that we don't need a LIMIT clause.
What we could use instead would be a pseudo-column called ROW_NUM,
or something like that, so that we could do

 select a,b from tbl where row_num between 5000 and 5000+10;

or, if postgresql does not yet support 'between',

 select a,b from tbl where row_num >= 5000 and row_num < 5000+10;

to get the effect of

begin;
declare cur cursor for select a,b from tbl;
fetch 5000 in cur;
fetch 10 in cur;
end;
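The cursor equivalence above can be demonstrated concretely. A minimal sketch, using Python's built-in sqlite3 as a stand-in for PostgreSQL (an assumption; the table name tbl and columns a, b follow the example in the text):

```python
import sqlite3

# Emulate "skip 5000 rows, take 10" via a cursor, exactly as the
# declare/fetch sequence above does it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (a INTEGER, b TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)",
                 [(i, "row%d" % i) for i in range(6000)])

cur = conn.execute("SELECT a, b FROM tbl ORDER BY a")
cur.fetchmany(5000)          # fetch 5000 in cur -- discard the rows
window = cur.fetchmany(10)   # fetch 10 in cur -- the rows we want

print([row[0] for row in window])   # a-values 5000..5009
```

The drawback the post is pointing at is visible here: the first 5000 rows are still produced and transferred, which is why a ROW_NUM restriction would only pay off with special support in the optimiser.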

To be fast, this would also have to be implemented as a special case in the
optimiser.

BTW, do we have a way to introduce special cases into the optimiser,
like the one above, or like using indexes for aggregate functions (min, max,
count(*), ...)? Maybe we could even implement optimiser hints a la Oracle,
which uses specially formatted comments to influence the optimiser?

-------

Now for my second question(s), to anyone familiar with the implementation
of indexes and how they are interfaced to the optimiser (Vadim?).

Is there a way to have individual array elements indexed? I mean,
can a query use an index to find a record which has some word
as its 2nd array element:

select * from tbl where words[2]='god';

or even better, as any element:

select * from tbl where words[?]='god';

(my array referencing syntax may be completely wrong as I have not used
arrays in PG for a long time)
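Whatever the exact syntax, the semantics can be emulated with an ordinary index by flattening each row's word array into a side table. A hedged sketch, again using Python's sqlite3 as a stand-in; the names tbl and tbl_words are illustrative, not anything PostgreSQL provides:

```python
import sqlite3

# Emulate "words[2] = 'god'" (positional) and "words[?] = 'god'"
# (any element) by storing (row id, position, word) triples in a
# side table that carries an ordinary btree-style index on word.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("CREATE TABLE tbl_words (id INTEGER, pos INTEGER, word TEXT)")
conn.execute("CREATE INDEX tbl_words_word ON tbl_words (word)")

docs = {1: ["in", "god", "we", "trust"], 2: ["hello", "world"]}
for doc_id, words in docs.items():
    conn.execute("INSERT INTO tbl VALUES (?, ?)", (doc_id, " ".join(words)))
    conn.executemany("INSERT INTO tbl_words VALUES (?, ?, ?)",
                     [(doc_id, p, w) for p, w in enumerate(words, 1)])

# words[2] = 'god'  -> restrict on position as well as word
second = conn.execute(
    "SELECT id FROM tbl_words WHERE pos = 2 AND word = 'god'").fetchall()
# words[?] = 'god'  -> match any position
any_pos = conn.execute(
    "SELECT DISTINCT id FROM tbl_words WHERE word = 'god'").fetchall()
print(second, any_pos)
```

Both queries hit the index on word; native array-element indexing would let the same thing happen without maintaining the side table by hand.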

The reason I ask is that I want to implement a full text search (not
regex-based, but a complete word index) for the postgres text (and possibly a
new compressed text (ztext?) datatype). Having working array indexing
would enable me to do it by writing a function that converts text to an
array of words and then putting a functional index on that.

Currently I have a working full text index as a Python script using the BSD
DB library. It has worked well for about 1.5 years on a FreeBSD box and still
does a search of the type "text ~ '(a and b and c) and not(d or e)'" in less
than 3 sec on a database of 38,000 articles, 15,000,000 words, and 136,000,000
bytes; that's on a 166MHz Pentium, including the time to start the CGI scripts.
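A boolean word search of that shape reduces to set algebra over an inverted index (word mapped to the set of article ids containing it). A minimal sketch of the idea; the articles and words below are made-up illustrations, not the actual dataset:

```python
# Build an inverted index: word -> set of article ids containing it.
articles = {
    1: "the quick brown fox",
    2: "the lazy brown dog",
    3: "quick brown dogs bark",
}
index = {}
for art_id, text in articles.items():
    for word in text.split():
        index.setdefault(word, set()).add(art_id)

def docs(word):
    """Article ids containing the given word (empty set if absent)."""
    return index.get(word, set())

# Evaluate "(quick and brown) and not (dog or lazy)" as set operations:
# and -> intersection, or -> union, not -> difference.
hits = (docs("quick") & docs("brown")) - (docs("dog") | docs("lazy"))
print(sorted(hits))   # -> [1, 3]
```

Operators like "near" and "before"/"after" would additionally need word positions stored in the index, not just article ids.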

But I need to have a similar system on Linux, and BSD DB does not work well
on Linux :( it corrupts the database every now and then.

If I get going, I plan to implement a similar full text index for
postgres as well (supporting and, or, not, near, before, after, and grouping
with parentheses), but if making array element indexing work is a lot
easier, I'd do that first.

BTW, does anyone actually use the GiST indexes?

PS, I have a raw implementation of access to Postgres (currently tested on
6.0) in pure Python (no C). So if anyone is interested I can give it out
as is. I planned to make it more complete but have been unable to work on
it for more than a month now. The advantage of a pure Python client
is that you can use it from any platform supported by Python (and that is
almost anywhere). Currently it supports just the simple query interface
and no large objects. And the documentation is almost non-existent ;)
If you don't know what Python is, see http://www.python.org/ .

Cheers,
Hannu Krosing
