Re: wip: functions median and percentile - Mailing list pgsql-hackers

From Robert Haas
Subject Re: wip: functions median and percentile
Msg-id AANLkTinejbxAwrAsHxVWCOs0pyFdm-s5AXpJZ-GuWwY7@mail.gmail.com
In response to Re: wip: functions median and percentile  (Hitoshi Harada <umi.tanuki@gmail.com>)
List pgsql-hackers
On Fri, Oct 1, 2010 at 10:11 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
> 2010/10/1 Tom Lane <tgl@sss.pgh.pa.us>:
>> Hitoshi Harada <umi.tanuki@gmail.com> writes:
>>> 2010/9/26 Pavel Stehule <pavel.stehule@gmail.com>:
>>>> This patch needs a bit of work - it could share compare functionality
>>>> with tuplesort.c, but I would like to verify the concept now.
>>
>>> Sorry for delay. I read the patch and it seems the result is sane. For
>>> window function calls, I agree that the current tuplesort is not
>>> enough to implement median functions and the patch introduces its own
>>> memsort mechanism, although memsort has too much copied from
>>> tuplesort. It looks to me not so difficult to modify the existing
>>> tuplesort to guarantee staying in memory always if an option to do so
>>> is specified from caller. I think that option can be used by other
>>> cases in the core code.
>>
>> If this patch tries to force the entire sort to happen in memory,
>> it is not committable.  What will happen when you get a lot of data?
>> You need to be working on a variant that will work anyway, not working
>> on an unacceptable lobotomization of the main sort code.
>
> What about array_agg()? Doesn't it exceed memory even if the huge data come in?

So, if you have 512MB of RAM in the box and you build and return a 1GB
array, it's going to be a problem.  Period, full stop.  The interim
memory consumption cannot be less than the size of the final result.

If you have 512MB of RAM in the box and you want to aggregate 1GB of
data and return a 4-byte integer, it's only a problem if your
implementation is bad.
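
To make the second case concrete, here is a minimal sketch (not the
patch under discussion, and not tuplesort.c) of computing a median over
more data than fits in memory. It assumes the inputs are 32-bit signed
integers and uses a plain external merge sort: sorted runs are spilled
to temporary files and then merged lazily. The function names and the
max_in_memory parameter are hypothetical, chosen only for illustration.

```python
import heapq
import struct
import tempfile
from itertools import islice


def _spill_sorted_run(batch):
    """Sort one batch in place and write it to a temp file as 4-byte ints."""
    batch.sort()
    run = tempfile.TemporaryFile()
    for value in batch:
        run.write(struct.pack("<i", value))
    run.seek(0)
    return run


def _read_run(run):
    """Stream the integers of one run back from disk, one at a time."""
    while True:
        chunk = run.read(4)
        if not chunk:
            return
        yield struct.unpack("<i", chunk)[0]


def bounded_memory_median(values, max_in_memory=1000):
    """Return the lower median of an iterable of 32-bit ints, or None if empty.

    At most max_in_memory values are held in a batch at once; during the
    merge, one value per run is held in the heap.
    """
    runs = []
    count = 0
    it = iter(values)
    try:
        while True:
            batch = list(islice(it, max_in_memory))
            if not batch:
                break
            count += len(batch)
            runs.append(_spill_sorted_run(batch))
        if count == 0:
            return None
        merged = heapq.merge(*(_read_run(run) for run in runs))
        # The lower median sits at zero-based index (count - 1) // 2.
        return next(islice(merged, (count - 1) // 2, None))
    finally:
        for run in runs:
            run.close()
```

The result is a single integer no matter how large the input is; only
the batch size bounds the working memory. One caveat of this sketch is
that it keeps one open file per run, so a real implementation would
merge runs in several passes once their number gets large.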

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

