Re: Custom index structure and strange count problem - Mailing list pgsql-hackers
| From | Robert Haas |
|---|---|
| Subject | Re: Custom index structure and strange count problem |
| Date | |
| Msg-id | AANLkTiklX474n-w6zRG3vJ6c1Y8X2gdVNKSycp-G7iSU@mail.gmail.com Whole thread |
| In response to | Custom index structure and strange count problem (Carsten Kropf <ckropf2@fh-hof.de>) |
| Responses |
Re: Custom index structure and strange count problem
|
| List | pgsql-hackers |
On Wed, Jun 9, 2010 at 4:35 AM, Carsten Kropf <ckropf2@fh-hof.de> wrote:
> Hi *,
> during the last few months I've been building a new index structure as part of a research project.
> Everything seems to work properly, however I have some strange issues with the count sql command.
> I introduced some custom structures (mainly document and hybrid_query) with which my index access method is supposed
towork.
> There is an operator "&&" which is supposed to use my index structure (what also works properly).
> The function that maps to the operator "&&" is called hybrid_index_query, which I use to compare my results given
fromthe index with the real results that are supposed to appear in the final result set.
> Having described the outer circumstances (in a very short way), I will now show the strange stuff that happens:
>
> test=# select id from documents where hybrid_index_query(to_document(words, points), row('radio pleas
news'::tsvector,'[(-90,-180),(90, 180)]')) order by id;
> id
> ------
> 2137
> 2151
> 2168
> 2207
> 2208
> 2209
> 2210
> 2211
> 2266
> 2296
> (10 rows)
>
> This query takes a sequential scan and works properly (returning 10 rows).
>
> test=# select id from documents where to_document(words, points) && row('pleas radio news'::tsvector,
'[(-90,-180),(90,180)]')order by id;
> id
> ------
> 2137
> 2151
> 2168
> 2207
> 2208
> 2209
> 2210
> 2211
> 2266
> 2296
> (10 rows)
>
> This query uses my index structure and returns the same result as in the sequential scan above.
> Until here, everything seems to work fine. However, if I issue the same queries using the count aggregate function in
SQL,there are some odd results:
> test=# select count(*) from documents where hybrid_index_query(to_document(words, points), row('radio pleas
news'::tsvector,'[(-90,-180),(90, 180)]'));
> count
> -------
> 10
> (1 row)
>
> Using the sequential scan, still, everything seems fine.
> However, if I now do the index scan (my function will be called 11 times, returning false at the end), I get the
followingresult:
> test=# select count(*) from documents where to_document(words, points) && row('pleas radio news'::tsvector,
'[(-90,-180),(90,180)]');
> count
> -------
> 7
> (1 row)
>
> This seems strange, because the same query returned 10 rows (when I didn't use the aggregate). If I issue queries
thatcount the id column, I receive the following:
> test=# select count(id) from documents where hybrid_index_query(to_document(words, points), row('radio pleas
news'::tsvector,'[(-90,-180),(90, 180)]'));
> count
> -------
> 10
> (1 row)
>
> test=# select count(id) from documents where to_document(words, points) && row('pleas radio news'::tsvector,
'[(-90,-180),(90,180)]');
> count
> -------
> 10
> (1 row)
>
> These two queries do again return the same results.
> Thus, I don't know, what's wrong here, does anybody know about that behaviour, or is it my fault that the results are
wrong,somehow?
> Thanks in advance
I am guessing this is a bug in your code - have you used EXPLAIN to
verify that the second-to-last of the above queries is really hitting
your code? If so, I'd recommend attaching with gdb and setting a
breakpoint wherever you return the tuples, and then poke around...
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
pgsql-hackers by date: