Re: Postgres12 looking for possible HashAggregate issue workarounds? - Mailing list pgsql-performance

From David Rowley
Subject Re: Postgres12 looking for possible HashAggregate issue workarounds?
Date
Msg-id CAApHDvogcaqa3JU9u06SP3Eh155D0wGtzy+d-x6XfN07bJ270g@mail.gmail.com
Whole thread Raw
In response to RE: Postgres12 looking for possible HashAggregate issue workarounds?  (João Paulo Luís <joao.luis@pdmfc.com>)
Responses RE: Postgres12 looking for possible HashAggregate issue workarounds?
List pgsql-performance
On Sun, 18 Dec 2022 at 23:44, João Paulo Luís <joao.luis@pdmfc.com> wrote:
> Meanwhile, as a one-time workaround I've disabled the hashagg algorithm,

The way the query planner determines if Hash Aggregate's hash table
will fit in work_mem or not is based on the n_distinct estimate of the
columns being grouped on.  You may want to review what analyze set
n_distinct to on this table. That can be done by looking at:

select attname,n_distinct from pg_Stats where tablename =
'sentencesource' and attname = 'sentence';

If what that's set to does not seem realistic, then you can overwrite this with:

ALTER TABLE sentencesource ALTER COLUMN sentence SET (n_distinct = N);

Please see the paragraph in [1] about n_distinct.  Using an absolute
value is likely not a great idea if the table is going to grow. You
could maybe give it a better estimate about how many times values are
repeated by setting some negative value, as described in the
documents. You'll need to analyze the table again after changing this
setting.

David

[1] https://www.postgresql.org/docs/12/sql-altertable.html



pgsql-performance by date:

Previous
From: João Paulo Luís
Date:
Subject: RE: Postgres12 looking for possible HashAggregate issue workarounds?
Next
From: Frits Jalvingh
Date:
Subject: temp_file_limit?