Re: Performance question 83 GB Table 150 million rows, distinct select - Mailing list pgsql-performance

From Alan Hodgson
Subject Re: Performance question 83 GB Table 150 million rows, distinct select
Msg-id 201111161527.57443.ahodgson@simkin.ca
In response to Performance question 83 GB Table 150 million rows, distinct select  (Tory M Blue <tmblue@gmail.com>)
Responses Re: Performance question 83 GB Table 150 million rows, distinct select  (Scott Marlowe <scott.marlowe@gmail.com>)
List pgsql-performance
On November 16, 2011 02:53:17 PM Tory M Blue wrote:
> We now have about 180mill records in that table. The database size is
> about 580GB and the userstats table which is the biggest one and the
> one we query the most is 83GB.
>
> Just a basic query takes 4 minutes:
>
> For e.g. select count(distinct uid) from userstats where log_date
> >'11/7/2011'
>
> Just not sure if this is what to expect, however there are many other
> DB's out there bigger than ours, so I'm curious what can I do?

That query should use an index on log_date if one exists, unless the planner
estimates that the date range covers too much of the table, in which case it
will fall back to a sequential scan.
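
To check which plan the planner picks, something like the following could be
run. This is only a sketch: the index name is made up, and it assumes log_date
is a date or timestamp column. The date is written in ISO form because
'11/7/2011' is interpreted according to the DateStyle setting.

```sql
-- Hypothetical index name; skip this step if an index on log_date already exists.
-- On a busy table, CREATE INDEX CONCURRENTLY avoids blocking writes during the build.
CREATE INDEX userstats_log_date_idx ON userstats (log_date);

-- Show the chosen plan without running the query.
-- An index or bitmap scan on log_date means the index is being used;
-- "Seq Scan on userstats" means the planner expects to read too much of the table.
EXPLAIN
SELECT count(DISTINCT uid)
FROM userstats
WHERE log_date > DATE '2011-11-07';
```

EXPLAIN ANALYZE would also execute the query and report actual row counts and
timings, which helps show whether the planner's estimates are off.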

Also, the normal approach to making large statistics tables more manageable is
to partition them by date range.
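
A minimal sketch of date-range partitioning using table inheritance and CHECK
constraints follows. The table name userstats_part, the column types, and the
monthly granularity are all assumptions for illustration; only uid and
log_date appear in the original question.

```sql
-- Hypothetical parent table; it holds no rows itself.
CREATE TABLE userstats_part (
    uid      bigint NOT NULL,  -- assumed type
    log_date date   NOT NULL   -- assumed type
);

-- One child table per month, each with a CHECK constraint describing its range.
CREATE TABLE userstats_part_2011_10 (
    CHECK (log_date >= DATE '2011-10-01' AND log_date < DATE '2011-11-01')
) INHERITS (userstats_part);

CREATE TABLE userstats_part_2011_11 (
    CHECK (log_date >= DATE '2011-11-01' AND log_date < DATE '2011-12-01')
) INHERITS (userstats_part);

-- Indexes are not inherited, so each child needs its own.
CREATE INDEX userstats_part_2011_10_log_date_idx ON userstats_part_2011_10 (log_date);
CREATE INDEX userstats_part_2011_11_log_date_idx ON userstats_part_2011_11 (log_date);

-- Let the planner skip children whose CHECK constraint rules out the WHERE clause.
SET constraint_exclusion = partition;

-- The plan should only touch userstats_part_2011_11 (plus the empty parent).
EXPLAIN
SELECT count(DISTINCT uid)
FROM userstats_part
WHERE log_date > DATE '2011-11-07';
```

With this layout, rows have to be inserted into the correct child directly or
routed there by a trigger on the parent, and old data can be removed by
dropping a child table rather than running a large DELETE.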
