Re: Performance of aggregates over set-returning functions - Mailing list pgsql-performance

From Tom Lane
Subject Re: Performance of aggregates over set-returning functions
Date
Msg-id 25214.1199848043@sss.pgh.pa.us
Whole thread Raw
In response to Performance of aggregates over set-returning functions  ("John Smith" <sodgodofall@gmail.com>)
Responses Re: Performance of aggregates over set-returning functions
List pgsql-performance
"John Smith" <sodgodofall@gmail.com> writes:
> Consider the following query:
>          postgres=# select count(*) from generate_series(1,100000000000000000);
> A vmstat while this query was running seems to suggest that the
> generate_series() was being materialized to disk first and then the
> count computed

Yeah, that's what I'd expect given the way nodeFunctionscan.c works.

> A small variation in the query (i.e. placing the generate_series in
> the select list without a from clause) exhibited a different
> behaviour:
>          postgres=# select count(*) from (select
> generate_series(1,100000000000000000)) as A;
> This time, Postgres seemed to be trying to build the entire
> generate_series() result set in memory and eventually started to swap
> out until the swap space limit after which it crashed.

Hmm, at worst you should get an "out of memory" error --- that's what
I get.  I think you're suffering from the dreaded "OOM kill" disease.

> Interestingly though, when the range in the generate_series() was
> small enough to fit in 4 bytes of memory (e.g.
> generate_series(1,1000000000) ), the above query completed consuming
> only negligible amount of memory. So, it looked like the aggregate
> computation was being pipelined with the tuples returned from
> generate_series().

It's pipelined either way.  But int8 is a pass-by-reference data type,
and it sounds like we have a memory leak for this case.

> Third question: more generally, is there a way I can ask Postgres to
> report an Out-of-memory the moment it tries to consume greater than a
> certain percentage of memory (instead of letting it hit the swap and
> eventually die or thrash) ?

First, you want to turn off memory overcommit in your kernel;
second, you might find that running the postmaster under conservative
ulimit settings would make the behavior nicer.

            regards, tom lane

pgsql-performance by date:

Previous
From: "John Smith"
Date:
Subject: Performance of aggregates over set-returning functions
Next
From: "John Smith"
Date:
Subject: Re: Performance of aggregates over set-returning functions