On Mon, Apr 14, 2008 at 1:34 PM, Chris Browne <cbbrowne@acm.org> wrote:
> "tosbalok@gmail.com" <tosbalok@gmail.com> writes:
> > Another test. In postgres I added an index to the userid column and
> > then counted distinct userids. The average run time over three
> > queries was 4666 seconds, or 78 minutes. Unbelievable.
> >
> > On SQL Server, with *no* index, the same query takes on average 414
> > seconds, or about 7 minutes. Ten times faster!
First, in general- use the EXPLAIN and EXPLAIN ANALYZE feature of
postgresql. That will tell you a lot about what your queries are doing
and why they're taking so long.
Second, make sure you've ANALYZE'd your table after creating it and
the index, which gives the planner the statistics necessary to make
intelligent choices.
For instance, your count of distinct userids is probably not using the
index you just created. If it still isn't using it after you ANALYZE
the table, try rewriting the query using group by (select count(*)
from (select userid from mytable group by userid) tmp). I recently had
a similar performance issue on a 75m row table, and the above helped.
VACUUM ANALYZE tables, and then remember that EXPLAIN and EXPLAIN
ANALYZE are your best friends.
--
- David T. Wilson
david.t.wilson@gmail.com