From: Scott Marlowe
Subject: Re: Performance question 83 GB Table 150 million rows, distinct select
Date:
Msg-id: CAOR=d=3OLUfQgoTrMMAoAo-Hq2z13TpzTP7VuXcq6pvHFpKsCg@mail.gmail.com
In response to: Re: Performance question 83 GB Table 150 million rows, distinct select (Alan Hodgson <ahodgson@simkin.ca>)
List: pgsql-performance
On Wed, Nov 16, 2011 at 4:27 PM, Alan Hodgson <ahodgson@simkin.ca> wrote:
> On November 16, 2011 02:53:17 PM Tory M Blue wrote:
>> We now have about 180 million records in that table. The database
>> size is about 580GB, and the userstats table, which is the biggest
>> one and the one we query the most, is 83GB.
>>
>> Just a basic query takes 4 minutes:
>>
>> For example: select count(distinct uid) from userstats where log_date > '11/7/2011'
>>
>> Just not sure if this is what to expect; however, there are many
>> other DBs out there bigger than ours, so I'm curious what I can do.
>
> That query should use an index on log_date if one exists. Unless the planner
> thinks it would need to look at too much of the table.

Agreed.  We'd need to know how selective that where clause is.  Seeing
a plan with forced index usage versus a regular explain analyze would
be useful, e.g.:

set enable_seqscan = off;
explain analyze select count(distinct uid) from userstats where log_date > '11/7/2011';
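
If there's no index on log_date yet, creating one is the first step.
A minimal sketch (the index name is hypothetical), together with the
regular plan for comparison and a reset afterwards:

-- build the index without blocking writes; the name is illustrative
create index concurrently userstats_log_date_idx on userstats (log_date);

-- the regular plan, to compare against the forced one above
explain analyze select count(distinct uid) from userstats where log_date > '11/7/2011';

-- afterwards, restore the planner's default behavior
reset enable_seqscan;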

> Also, the normal approach to making large statistics tables more manageable is
> to partition them by date range.

If the OP is considering partitioning, they should really consider
upgrading to 9.1, which has much better performance for things like
aggregates against partitioned tables.
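
For reference, a minimal sketch of the inheritance-based partitioning
that 9.1 supports (the table, index, and date range below are
illustrative, and routing inserts to the right child still needs a
trigger or application-side logic, not shown here):

-- one child table per month, constrained to its date range
create table userstats_2011_11 (
    check (log_date >= date '2011-11-01' and log_date < date '2011-12-01')
) inherits (userstats);

-- index each partition on the column queries filter by
create index userstats_2011_11_log_date_idx
    on userstats_2011_11 (log_date);

-- let the planner skip partitions via their check constraints
-- (this is already the default setting in 9.1)
set constraint_exclusion = partition;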
