Thread: work_mem greater than 2GB issue

work_mem greater than 2GB issue

From
wickro
Date:
Hi everyone,

I have a largish table (> 8GB). I'm doing a very simple single group
by on it. I am the only user of this database. If I set work_mem to
anything under 2GB (e.g. 1900MB), the postmaster process stops at that
value while it's performing its group by. There is only one hash
operation, so that is what I would expect. But anything larger and it
eats up all memory until it can't get any more (around 7.5GB on an 8GB
machine). Has anyone experienced anything of this sort before?

Cheers

Re: work_mem greater than 2GB issue

From
Tom Lane
Date:
wickro <robwickert@gmail.com> writes:
> I have a largish table (> 8GB). I'm doing a very simple single group
> by on it. I am the only user of this database. If I set work_mem to
> anything under 2GB (e.g. 1900MB), the postmaster process stops at that
> value while it's performing its group by. There is only one hash
> operation, so that is what I would expect. But anything larger and it
> eats up all memory until it can't get any more (around 7.5GB on an 8GB
> machine). Has anyone experienced anything of this sort before?

It's possible that you've found a bug, but you have not provided nearly
enough information to let anyone reproduce it for investigation.
What Postgres version is this exactly?  Is it a 32- or 64-bit build?
What is the exact query you're executing, and what does EXPLAIN show
as its plan?

            regards, tom lane

Re: work_mem greater than 2GB issue

From
Gregory Stark
Date:
wickro <robwickert@gmail.com> writes:

> Hi everyone,
>
> I have a largish table (> 8GB). I'm doing a very simple single group
> by on it. I am the only user of this database. If I set work_mem to
> anything under 2GB (e.g. 1900MB), the postmaster process stops at that
> value while it's performing its group by. There is only one hash
> operation, so that is what I would expect. But anything larger and it
> eats up all memory until it can't get any more (around 7.5GB on an 8GB
> machine). Has anyone experienced anything of this sort before?

What does EXPLAIN say for both cases? I suspect what's happening is that the
planner is estimating it will need 2G to hash all the values when in fact it
would need >8G. So for values under 2G it uses a sort and not a hash at all;
for values over 2G it's trying to use a hash and failing.
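
Something along these lines (with your actual query in place of the
placeholder) will show which plan the planner picks at each setting:

set work_mem = '1900MB';
explain <your group-by query>;

set work_mem = '2GB';
explain <your group-by query>;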

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!

Re: work_mem greater than 2GB issue

From
Henry
Date:
Quoting wickro <robwickert@gmail.com>:
> I have a largish table (> 8GB). I'm doing a very simple single group
> by on it.

This doesn't answer your question, but you might want to take
advantage of table partitioning:
http://www.postgresql.org/docs/8.3/interactive/ddl-partitioning.html

I've recently gone through this exercise (several tables were 10GB+,
some almost 30GB) and if your WHERE clauses qualify, then expect
significant performance gains with /much/ better memory consumption.

You only have one large table, so partitioning it should be painless
and not take too long (unlike our scenario).
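
As a very rough sketch of what that looks like on 8.3 (inheritance-based
partitioning with constraint exclusion; the partition key and child table
names here are purely illustrative):

create table partner_country_keywords_us (
    check (country = 'US')
) inherits (partner_country_keywords);

create table partner_country_keywords_de (
    check (country = 'DE')
) inherits (partner_country_keywords);

-- an INSERT trigger (or rules) on the parent routes new rows to the
-- right child, and constraint exclusion lets the planner skip partitions
-- whose CHECK constraint contradicts the WHERE clause
set constraint_exclusion = on;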

Cheers
Henry



Re: work_mem greater than 2GB issue

From
wickro
Date:
You're right. At a certain work_mem threshold it switches over to a
HashAggregate plan. When it does, it eats up a lot of memory. With
GroupAggregate it only ever uses up to work_mem.

I'm using PostgreSQL 8.3.3 64-bit on CentOS 5.
The query I'm running is:

select keyword, partner_id, sum(num_searches) as num_searches
    from partner_country_keywords
    group by partner_id, keyword

against

create table partner_country_keywords (
    keyword varchar,
    partner_id integer,
    country char(2),
    num_searches integer
);

The table partner_country_keywords is 8197 MB according to pg_class,
with 126,171,000 records.

EXPLAIN at work_mem = 2GB:

HashAggregate  (cost=3257160.40..3414874.00 rows=12617088 width=28)
  ->  Seq Scan on partner_country_keywords  (cost=0.00..2310878.80 rows=126170880 width=28)

This will continue to eat memory up to about 7.2GB or so (on an 8GB
machine).

EXPLAIN at work_mem = 1300MB:

"GroupAggregate  (cost=22306417.24..23725839.64 rows=12617088
width=28)"
"  ->  Sort  (cost=22306417.24..22621844.44 rows=126170880 width=28)"
"        Sort Key: partner_id, keyword"
"        ->  Seq Scan on partner_country_keyword
(cost=0.00..2310878.80 rows=126170880 width=28)"

So this is a planning mistake? Should a hash be allowed to grow larger
than work_mem before it starts to use the disk?

On May 14, 4:11 pm, st...@enterprisedb.com (Gregory Stark) wrote:
> wickro <robwick...@gmail.com> writes:
> > Hi everyone,
>
> > I have a largish table (> 8GB). I'm doing a very simple single group
> > by on it. I am the only user of this database. If I set work_mem to
> > anything under 2GB (e.g. 1900MB), the postmaster process stops at that
> > value while it's performing its group by. There is only one hash
> > operation, so that is what I would expect. But anything larger and it
> > eats up all memory until it can't get any more (around 7.5GB on an 8GB
> > machine). Has anyone experienced anything of this sort before?
>
> What does EXPLAIN say for both cases? I suspect what's happening is that the
> planner is estimating it will need 2G to hash all the values when in fact it
> would need >8G. So for values under 2G it uses a sort and not a hash at all;
> for values over 2G it's trying to use a hash and failing.
>
> --
>   Gregory Stark
>   EnterpriseDB          http://www.enterprisedb.com
>   Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!



Re: work_mem greater than 2GB issue

From
Tom Lane
Date:
wickro <robwickert@gmail.com> writes:
> So this is a planning mistake? Should a hash be allowed to grow larger
> than work_mem before it starts to use the disk?

HashAggregate doesn't have any ability to spill to disk.  The planner
will not select a HashAggregate if it thinks the required hash table
would be larger than work_mem.  What you've evidently got here is a
misestimate of the required hash table size, which most likely is
stemming from a bad estimate of the number of groups.  How does that
estimate (12617088 here) compare to reality?  Have you tried increasing
the statistics target for partner_id and keyword (or the whole table)?
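
For example (untested, and using the table and column names from
upthread), something like this would show how far off that estimate is
and raise the per-column statistics targets before re-analyzing:

-- actual number of groups, to compare against the 12617088 estimate
-- (this does the same big aggregation itself, so expect it to be slow)
select count(*)
  from (select partner_id, keyword
          from partner_country_keywords
         group by partner_id, keyword) g;

-- raise the stats targets for the GROUP BY columns and re-gather stats
alter table partner_country_keywords alter column keyword set statistics 1000;
alter table partner_country_keywords alter column partner_id set statistics 1000;
analyze partner_country_keywords;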

            regards, tom lane

Re: work_mem greater than 2GB issue

From
wickro
Date:
> HashAggregate doesn't have any ability to spill to disk.  The planner
> will not select a HashAggregate if it thinks the required hash table
> would be larger than work_mem.  What you've evidently got here is a
> misestimate of the required hash table size, which most likely is
> stemming from a bad estimate of the number of groups.  How does that
> estimate (12617088 here) compare to reality?  Have you tried increasing
> the statistics target for partner_id and keyword (or the whole table)?

Looking at pg_stats, the distinct-value estimate for one of the columns
(keyword) is seriously underestimated. By increasing the statistics
target to 1000 it gets a bit better, but it's still underestimated by a
factor of about 10. I guess that's why PostgreSQL thinks it can fit it
all into a hash.
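
For reference, this is roughly the sort of thing I was comparing
(n_distinct from pg_stats against an actual count):

-- planner's distinct-value estimates; a negative n_distinct is a
-- fraction of the row count rather than an absolute number
select attname, n_distinct
  from pg_stats
 where tablename = 'partner_country_keywords'
   and attname in ('partner_id', 'keyword');

-- actual number of distinct keywords, for comparison
select count(distinct keyword) from partner_country_keywords;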

Thanks for the help.

Cheers
Rob