Re: Custom index structure and strange count problem - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Custom index structure and strange count problem |
Date | |
Msg-id | AANLkTiklX474n-w6zRG3vJ6c1Y8X2gdVNKSycp-G7iSU@mail.gmail.com Whole thread Raw |
In response to | Custom index structure and strange count problem (Carsten Kropf <ckropf2@fh-hof.de>) |
Responses |
Re: Custom index structure and strange count problem
|
List | pgsql-hackers |
On Wed, Jun 9, 2010 at 4:35 AM, Carsten Kropf <ckropf2@fh-hof.de> wrote: > Hi *, > during the last few months I've been building a new index structure as part of a research project. > Everything seems to work properly, however I have some strange issues with the count sql command. > I introduced some custom structures (mainly document and hybrid_query) with which my index access method is supposed towork. > There is an operator "&&" which is supposed to use my index structure (what also works properly). > The function that maps to the operator "&&" is called hybrid_index_query, which I use to compare my results given fromthe index with the real results that are supposed to appear in the final result set. > Having described the outer circumstances (in a very short way), I will now show the strange stuff that happens: > > test=# select id from documents where hybrid_index_query(to_document(words, points), row('radio pleas news'::tsvector,'[(-90,-180),(90, 180)]')) order by id; > id > ------ > 2137 > 2151 > 2168 > 2207 > 2208 > 2209 > 2210 > 2211 > 2266 > 2296 > (10 rows) > > This query takes a sequential scan and works properly (returning 10 rows). > > test=# select id from documents where to_document(words, points) && row('pleas radio news'::tsvector, '[(-90,-180),(90,180)]')order by id; > id > ------ > 2137 > 2151 > 2168 > 2207 > 2208 > 2209 > 2210 > 2211 > 2266 > 2296 > (10 rows) > > This query uses my index structure and returns the same result as in the sequential scan above. > Until here, everything seems to work fine. However, if I issue the same queries using the count aggregate function in SQL,there are some odd results: > test=# select count(*) from documents where hybrid_index_query(to_document(words, points), row('radio pleas news'::tsvector,'[(-90,-180),(90, 180)]')); > count > ------- > 10 > (1 row) > > Using the sequential scan, still, everything seems fine. > However, if I now do the index scan (my function will be called 11 times, returning false at the end), I get the followingresult: > test=# select count(*) from documents where to_document(words, points) && row('pleas radio news'::tsvector, '[(-90,-180),(90,180)]'); > count > ------- > 7 > (1 row) > > This seems strange, because the same query returned 10 rows (when I didn't use the aggregate). If I issue queries thatcount the id column, I receive the following: > test=# select count(id) from documents where hybrid_index_query(to_document(words, points), row('radio pleas news'::tsvector,'[(-90,-180),(90, 180)]')); > count > ------- > 10 > (1 row) > > test=# select count(id) from documents where to_document(words, points) && row('pleas radio news'::tsvector, '[(-90,-180),(90,180)]'); > count > ------- > 10 > (1 row) > > These two queries do again return the same results. > Thus, I don't know, what's wrong here, does anybody know about that behaviour, or is it my fault that the results are wrong,somehow? > Thanks in advance I am guessing this is a bug in your code - have you used EXPLAIN to verify that the second-to-last of the above queries is really hitting your code? If so, I'd recommend attaching with gdb and setting a breakpoint wherever you return the tuples, and then poke around... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
pgsql-hackers by date: