Thread: work_mem greater than 2GB issue
Hi everyone,

I have a largish table (> 8GB) and I'm running a very simple single
GROUP BY on it. I am the only user of this database. If I set work_mem
to anything under 2GB (e.g. 1900MB), the postmaster process stays at
that value while it's performing its group by. There is only one hash
operation, so that is what I would expect. But with anything larger it
eats up all memory until it can't get any more (around 7.5GB on an 8GB
machine). Has anyone experienced anything of this sort before?

Cheers
wickro <robwickert@gmail.com> writes:
> I have a largish table (> 8GB). I'm doing a very simple single group
> by on. I am the only user of this database. If I set work mem to
> anything under 2GB (e.g. 1900MB) the postmaster process stops at that
> value while it's peforming it's group by. There is only one hash
> operation so that is what I would expect. But anything larger and it
> eats up all memory until it can't get anymore (around 7.5GB on a 8GB
> machine). Has anyone experienced anything of this sort before.

It's possible that you've found a bug, but you have not provided nearly
enough information to let anyone reproduce it for investigation. What
Postgres version is this exactly? Is it a 32- or 64-bit build? What is
the exact query you're executing, and what does EXPLAIN show as its
plan?

			regards, tom lane
wickro <robwickert@gmail.com> writes:
> Hi everyone,
>
> I have a largish table (> 8GB). I'm doing a very simple single group
> by on. I am the only user of this database. If I set work mem to
> anything under 2GB (e.g. 1900MB) the postmaster process stops at that
> value while it's peforming it's group by. There is only one hash
> operation so that is what I would expect. But anything larger and it
> eats up all memory until it can't get anymore (around 7.5GB on a 8GB
> machine). Has anyone experienced anything of this sort before.

What does EXPLAIN say for both cases? I suspect what's happening is that
the planner is estimating it will need 2G to hash all the values and in
fact it would need >8G. So for values under 2G it uses a sort and not a
hash at all, for values over 2G it's trying to use a hash and failing.

-- 
Gregory Stark
EnterpriseDB          http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
Quoting wickro <robwickert@gmail.com>:
> I have a largish table (> 8GB). I'm doing a very simple single group
> by on.

This doesn't answer your question, but you might want to take advantage
of table partitioning:
http://www.postgresql.org/docs/8.3/interactive/ddl-partitioning.html

I've recently gone through this exercise (several tables were 10GB+,
some almost 30GB) and if your WHERE clauses qualify, then expect
significant performance gains with /much/ better memory consumption.

You only have one large table, so partitioning it should be painless and
not take too long (unlike our scenario).

Cheers
Henry
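For anyone following along, a minimal sketch of what the linked 8.3
docs describe (inheritance-based partitioning with CHECK constraints;
the table/column names come from later in this thread, the partition
layout by country is purely illustrative):

```sql
-- Hypothetical sketch: partitioning the poster's table by country,
-- 8.3 style (parent table plus INHERITS children with CHECKs).
CREATE TABLE partner_country_keywords (
    keyword      varchar,
    partner_id   integer,
    country      char(2),
    num_searches integer
);

-- One child per country; the CHECK lets the planner skip partitions.
CREATE TABLE partner_country_keywords_us (
    CHECK (country = 'US')
) INHERITS (partner_country_keywords);

CREATE TABLE partner_country_keywords_de (
    CHECK (country = 'DE')
) INHERITS (partner_country_keywords);

-- constraint_exclusion must be enabled for the CHECKs to prune scans.
SET constraint_exclusion = on;

-- With a qualifying WHERE clause, only the matching child is scanned.
EXPLAIN SELECT sum(num_searches)
FROM partner_country_keywords
WHERE country = 'US';
```

As Henry notes, this only helps when queries carry a WHERE clause on
the partition key; a whole-table GROUP BY still scans every child.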
You're right. At a certain work_mem threshold it switches over to a
HashAggregate sort method. When it does, it eats up a lot of memory.
For GroupAggregate it only uses the max of work_mem.

I'm using PostgreSQL 8.3.3 64-bit on CentOS 5. The query I'm running is:

select keyword, partner_id, sum(num_searches) as num_searches
from partner_country_keywords
group by partner_id, keyword

against

create table partner_country_keywords (
    keyword varchar,
    partner_id integer,
    country char(2),
    num_searches integer
);

The table partner_country_keywords is 8197 MB large according to
pg_class, with 126,171,000 records.

EXPLAIN at work_mem = 2GB:

HashAggregate  (cost=3257160.40..3414874.00 rows=12617088 width=28)
  ->  Seq Scan on partner_country_keywords  (cost=0.00..2310878.80 rows=126170880 width=28)

This will continue to eat memory to about 7.2GB or so (on an 8GB machine).

EXPLAIN at work_mem = 1300MB:

"GroupAggregate  (cost=22306417.24..23725839.64 rows=12617088 width=28)"
"  ->  Sort  (cost=22306417.24..22621844.44 rows=126170880 width=28)"
"        Sort Key: partner_id, keyword"
"        ->  Seq Scan on partner_country_keyword  (cost=0.00..2310878.80 rows=126170880 width=28)"

So this is a planning mistake? Should a hash be allowed to grow larger
than work_mem before it starts to use the disk?

On May 14, 4:11 pm, st...@enterprisedb.com (Gregory Stark) wrote:
> What does EXPLAIN say for both cases? I suspect what's happening is that the
> planner is estimating it will need 2G to has all the values and in fact it
> would need >8G. So for values under 2G it uses a sort and not a hash at all,
> for values over 2G it's trying to use a hash and failing.
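(The plan flip above can be reproduced in a single session by toggling
work_mem before EXPLAIN; this is just the thread's own query and
settings gathered into one runnable sequence:)

```sql
-- Below the threshold the planner picks a sort-based plan...
SET work_mem = '1300MB';
EXPLAIN
SELECT keyword, partner_id, sum(num_searches) AS num_searches
FROM partner_country_keywords
GROUP BY partner_id, keyword;
-- GroupAggregate over a Sort: memory stays bounded by work_mem

-- ...above it, a hash-based plan, which cannot spill to disk.
SET work_mem = '2GB';
EXPLAIN
SELECT keyword, partner_id, sum(num_searches) AS num_searches
FROM partner_country_keywords
GROUP BY partner_id, keyword;
-- HashAggregate: memory grows with the true number of groups
```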
wickro <robwickert@gmail.com> writes:
> So this is a planning mistake? Should a hash be allowed to grow larger
> than work_mem before it starts to use the disk?

HashAggregate doesn't have any ability to spill to disk. The planner
will not select a HashAggregate if it thinks the required hash table
would be larger than work_mem. What you've evidently got here is a
misestimate of the required hash table size, which most likely is
stemming from a bad estimate of the number of groups. How does that
estimate (12617088 here) compare to reality? Have you tried increasing
the statistics target for partner_id and keyword (or the whole table)?

			regards, tom lane
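Concretely, raising the per-column statistics target as suggested looks
like this (8.3 syntax; 1000 is the maximum target in that release, and
a fresh ANALYZE is needed before the planner sees the new statistics):

```sql
-- Raise the statistics target for the two grouping columns.
ALTER TABLE partner_country_keywords
    ALTER COLUMN keyword SET STATISTICS 1000;
ALTER TABLE partner_country_keywords
    ALTER COLUMN partner_id SET STATISTICS 1000;

-- Recollect statistics so the new targets take effect.
ANALYZE partner_country_keywords;

-- Re-check the planner's group-count estimate in the row count of
-- the top aggregate node.
EXPLAIN
SELECT keyword, partner_id, sum(num_searches)
FROM partner_country_keywords
GROUP BY partner_id, keyword;
```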
> HashAggregate doesn't have any ability to spill to disk. The planner
> will not select a HashAggregate if it thinks the required hash table
> would be larger than work_mem. What you've evidently got here is a
> misestimate of the required hash table size, which most likely is
> stemming from a bad estimate of the number of groups. How does that
> estimate (12617088 here) compare to reality? Have you tried increasing
> the statistics target for partner_id and keyword (or the whole table)?

Looking at the pg_stats view, the distinct-value estimate for one of
the columns (keyword) is seriously underestimated. By increasing the
stats target to 1000 it gets a bit better, but it's still
underestimated by a factor of about 10. I guess that's why PostgreSQL
thinks it can fit it all into a hash.

Thanks for the help.

Cheers
Rob
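(For anyone wanting to do the same check: the estimate lives in
pg_stats.n_distinct, and the second query below gets the real numbers;
note it needs a full scan, so it is slow on a 126M-row table.)

```sql
-- Planner's view: n_distinct per column. A negative value means a
-- fraction of the total row count rather than an absolute count.
SELECT attname, n_distinct
FROM pg_stats
WHERE tablename = 'partner_country_keywords'
  AND attname IN ('keyword', 'partner_id');

-- Ground truth, for comparison (full table scan).
SELECT count(DISTINCT keyword)    AS distinct_keywords,
       count(DISTINCT partner_id) AS distinct_partners
FROM partner_country_keywords;
```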