Re: Strategy for doing number-crunching - Mailing list pgsql-novice

From: Tom Lane
Subject: Re: Strategy for doing number-crunching
Date:
Msg-id: 22136.1325711481@sss.pgh.pa.us
In response to: Re: Strategy for doing number-crunching (Matthew Foster <matthew.foster@noaa.gov>)
Responses: Re: Strategy for doing number-crunching (Matthew Foster <matthew.foster@noaa.gov>)
List: pgsql-novice
Matthew Foster <matthew.foster@noaa.gov> writes:
> On Wed, Jan 4, 2012 at 10:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Matthew Foster <matthew.foster@noaa.gov> writes:
>>> We have a database with approximately 130M rows, and we need to produce
>>> statistics (e.g. mean, standard deviation, etc.) on the data.  Right now,
>>> we're generating these stats via a single SELECT, and it is extremely
>>> slow...like it can take hours to return results.

>> What datatype are the columns being averaged?  If "numeric", consider
>> casting to float8 before applying the aggregates.  You'll lose some
>> precision but it'll likely be orders of magnitude faster.

> The data are type double.
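
For illustration, a minimal sketch of the cast-to-float8 rewrite suggested
above; the table and column names are invented, since the original query was
not shown, and the cast only helps when the stored type is numeric rather
than double:

    -- Hypothetical table/column names. If the column were numeric,
    -- casting to float8 before aggregating avoids the overhead of
    -- arbitrary-precision arithmetic at some cost in precision.
    SELECT avg(obs_value::float8)    AS mean,
           stddev(obs_value::float8) AS std_dev
    FROM   observations;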

Hmm.  In that case I think you have some other problem that's hidden in
details you didn't show us.  It should not take "hours" to process only
130M rows.  This would best be taken up on pgsql-performance; please see
http://wiki.postgresql.org/wiki/Slow_Query_Questions
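
As a rough sketch of the kind of information that page asks for, a report
would normally include the query plan from EXPLAIN ANALYZE (placeholder
table and column names again):

    -- Placeholder names; BUFFERS is available in PostgreSQL 9.0 and later.
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT avg(obs_value), stddev(obs_value)
    FROM   observations;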

            regards, tom lane
