Re: Custom index structure and strange count problem - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Custom index structure and strange count problem
Date
Msg-id AANLkTiklX474n-w6zRG3vJ6c1Y8X2gdVNKSycp-G7iSU@mail.gmail.com
Whole thread Raw
In response to Custom index structure and strange count problem  (Carsten Kropf <ckropf2@fh-hof.de>)
Responses Re: Custom index structure and strange count problem
List pgsql-hackers
On Wed, Jun 9, 2010 at 4:35 AM, Carsten Kropf <ckropf2@fh-hof.de> wrote:
> Hi *,
> during the last few months I've been building a new index structure as part of a research project.
> Everything seems to work properly, however I have some strange issues with the count sql command.
> I introduced some custom structures (mainly document and hybrid_query) with which my index access method is supposed
towork. 
> There is an operator "&&" which is supposed to use my index structure (what also works properly).
> The function that maps to the operator "&&" is called hybrid_index_query, which I use to compare my results given
fromthe index with the real results that are supposed to appear in the final result set. 
> Having described the outer circumstances (in a very short way), I will now show the strange stuff that happens:
>
> test=# select id from documents where hybrid_index_query(to_document(words, points), row('radio pleas
news'::tsvector,'[(-90,-180),(90, 180)]')) order by id; 
>  id
> ------
>  2137
>  2151
>  2168
>  2207
>  2208
>  2209
>  2210
>  2211
>  2266
>  2296
> (10 rows)
>
> This query takes a sequential scan and works properly (returning 10 rows).
>
> test=# select id from documents where to_document(words, points) && row('pleas radio news'::tsvector,
'[(-90,-180),(90,180)]')order by id; 
>  id
> ------
>  2137
>  2151
>  2168
>  2207
>  2208
>  2209
>  2210
>  2211
>  2266
>  2296
> (10 rows)
>
> This query uses my index structure and returns the same result as in the sequential scan above.
> Until here, everything seems to work fine. However, if I issue the same queries using the count aggregate function in
SQL,there are some odd results: 
> test=# select count(*) from documents where hybrid_index_query(to_document(words, points), row('radio pleas
news'::tsvector,'[(-90,-180),(90, 180)]')); 
>  count
> -------
>    10
> (1 row)
>
> Using the sequential scan, still, everything seems fine.
> However, if I now do the index scan (my function will be called 11 times, returning false at the end), I get the
followingresult: 
> test=# select count(*) from documents where to_document(words, points) && row('pleas radio news'::tsvector,
'[(-90,-180),(90,180)]');
>  count
> -------
>     7
> (1 row)
>
> This seems strange, because the same query returned 10 rows (when I didn't use the aggregate). If I issue queries
thatcount the id column, I receive the following: 
> test=# select count(id) from documents where hybrid_index_query(to_document(words, points), row('radio pleas
news'::tsvector,'[(-90,-180),(90, 180)]')); 
>  count
> -------
>    10
> (1 row)
>
> test=# select count(id) from documents where to_document(words, points) && row('pleas radio news'::tsvector,
'[(-90,-180),(90,180)]');
>  count
> -------
>    10
> (1 row)
>
> These two queries do again return the same results.
> Thus, I don't know, what's wrong here, does anybody know about that behaviour, or is it my fault that the results are
wrong,somehow? 
> Thanks in advance

I am guessing this is a bug in your code - have you used EXPLAIN to
verify that the second-to-last of the above queries is really hitting
your code?  If so, I'd recommend attaching with gdb and setting a
breakpoint wherever you return the tuples, and then poke around...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: hstore ==> and deprecate =>
Next
From: "David E. Wheeler"
Date:
Subject: Re: hstore ==> and deprecate =>