Thread: Postgres12 looking for possible HashAggregate issue workarounds?


From
João Paulo Luís
Date:
Hi! Sorry to post to this mailing list, but I could not find many tips working around HashAggregate issues.

In a research project involving text repetition analysis (on top of public documents)
I have a VirtualMachine (CPU AMD Epyc 7502P, 128GB RAM, 12TB HDD, 2TB SSD),
running postgres 12.12 (Ubuntu 12.12-0ubuntu0.20.04.1)
and some tables with many rows:

nsoamt=> ANALYSE VERBOSE SentenceSource;
INFO:  analyzing "public.sentencesource"
INFO:  "sentencesource": scanned 30000 of 9028500 pages, containing 3811990 live rows and 268323 dead rows; 30000 rows in sample, 1147218391 estimated total rows
ANALYZE
nsoamt=> ANALYSE VERBOSE SentenceToolCheck;
INFO:  analyzing "public.sentencetoolcheck"
INFO:  "sentencetoolcheck": scanned 30000 of 33536425 pages, containing 498508 live rows and 25143 dead rows; 30000 rows in sample, 557272538 estimated total rows
ANALYZE
nsoamt=> ANALYZE VERBOSE Document;
INFO:  analyzing "public.document"
INFO:  "document": scanned 30000 of 34570 pages, containing 1371662 live rows and 30366 dead rows; 30000 rows in sample, 1580612 estimated total rows
ANALYZE

The estimates for the number of rows above are accurate.

I am running this query

        SELECT COUNT(*), COUNT(NULLIF(Stchk.haserrors,'f'))
            FROM SentenceToolCheck Stchk
            WHERE EXISTS (SELECT SSrc.sentence
                      FROM SentenceSource SSrc, Document Doc
                      WHERE SSrc.sentence = Stchk.id
                      AND Doc.id = SSrc.document
                      AND Doc.source ILIKE '/bigpostgres/misc/arxiv/arxiv/arxiv/pdf/%');

and I have 2 (related?) problems


1 - the query is making a postgresql process reach 76.7 GB of resident RAM usage.
With a WORK_MEM setting of 2GB (and "simple" COUNT() results),
that was not expected.
(I risk the oom-killer killing my postgres as soon as I run another concurrent
query.)

The memory settings are:

work_mem = 2GB
shared_buffers = 16GB
maintenance_work_mem = 1GB



2 - the query never finishes... (it has been running for over 3x24 hours by now,
and I have no idea how far from finishing it is).

The EXPLAIN plan is:

                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=28630195.79..28630195.80 rows=1 width=16)
   ->  Nested Loop  (cost=26397220.49..28628236.23 rows=261275 width=1)
         ->  HashAggregate  (cost=26397219.92..26399832.67 rows=261275 width=8)
               Group Key: ssrc.sentence
               ->  Hash Join  (cost=73253.21..23635527.52 rows=1104676957 width=8)
                     Hash Cond: (ssrc.document = doc.id)
                     ->  Seq Scan on sentencesource ssrc  (cost=0.00..20540394.02 rows=1151189402 width=16)
                     ->  Hash  (cost=54310.40..54310.40 rows=1515425 width=4)
                           ->  Seq Scan on document doc  (cost=0.00..54310.40 rows=1515425 width=4)
                                 Filter: (source ~~* '/bigpostgres/misc/arxiv/arxiv/arxiv/pdf/%'::text)
         ->  Index Scan using pk_sentencetoolcheck on sentencetoolcheck stchk  (cost=0.57..8.53 rows=1 width=9)
               Index Cond: (id = ssrc.sentence)
 JIT:
   Functions: 20
   Options: Inlining true, Optimization true, Expressions true, Deforming true
(15 rows)

The rows=1515425 estimate on Seq Scan on document doc  (cost=0.00..54310.40 rows=1515425 width=4) seems right.

The rows=1104676957 estimate on Hash Join  (cost=73253.21..23635527.52 rows=1104676957 width=8) also seems right.

The rows=261275 on HashAggregate  (cost=26397219.92..26399832.67 rows=261275 width=8) seems VERY WRONG!
I was expecting something like rows=1.0E+09 instead.


On a laptop (with just 80% of the rows and 32GB RAM, but all SSD disks),
the query finishes in a few hours (+/- 2 hours).

The EXPLAIN plan is different on the laptop:

                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=216688374.89..216688374.90 rows=1 width=16)
   ->  Nested Loop  (cost=211388557.47..216686210.27 rows=288616 width=1)
         ->  Unique  (cost=211388556.90..215889838.75 rows=288616 width=8)
               ->  Sort  (cost=211388556.90..213639197.82 rows=900256370 width=8)
                     Sort Key: ssrc.sentence
                     ->  Hash Join  (cost=56351.51..28261726.31 rows=900256370 width=8)
                           Hash Cond: (ssrc.document = doc.id)
                           ->  Seq Scan on sentencesource ssrc  (cost=0.00..16453055.44 rows=948142144 width=16)
                           ->  Hash  (cost=38565.65..38565.65 rows=1084069 width=4)
                                 ->  Seq Scan on document doc  (cost=0.00..38565.65 rows=1084069 width=4)
                                       Filter: (source ~~* '/bigpostgres/misc/arxiv/arxiv/arxiv/pdf/%'::text)
         ->  Index Scan using pk_sentencetoolcheck on sentencetoolcheck stchk  (cost=0.57..2.76 rows=1 width=9)
               Index Cond: (id = ssrc.sentence)
 JIT:
   Functions: 18
   Options: Inlining true, Optimization true, Expressions true, Deforming true

(The Unique rows estimation is also very wrong, but at least the query finishes).

I would guess that HashAggregate is behaving very badly (using far more RAM than WORK_MEM, and also badly estimating the #rows and taking forever...)

Any suggestions ?


João Luís

Senior Developer

joao.luis@pdmfc.com 

+351 210 337 700


Confidentiality

The information in this message is confidential and privileged. It is intended solely for the addressee. If you are not the intended recipient, any disclosure, copying, or distribution of the message, or any action or omission taken by you in reliance on it is prohibited. 

Please contact the sender immediately if you have received this message by mistake.

Thank you for your cooperation.

Re: Postgres12 looking for possible HashAggregate issue workarounds?

From
Justin Pryzby
Date:
On Fri, Dec 16, 2022 at 03:24:17PM +0000, João Paulo Luís wrote:
> Hi! Sorry to post to this mailing list, but I could not find many tips working around HashAggregate issues.
> 
> In a research project involving text repetition analysis (on top of public documents)
> I have a VirtualMachine (CPU AMD Epyc 7502P, 128GB RAM, 12TB HDD, 2TB SSD),
> running postgres 12.12 (Ubuntu 12.12-0ubuntu0.20.04.1)
> and some tables with many rows:

> 1 - the query is making a postgresql process reach 76.7 GB of resident RAM usage.
> Having a WORK_MEM setting of 2GB (and "simple" COUNT() results),
> that was not expected.
> (I risk oom-killer killing my postgres as soon as I run another concurrent
> query.)

> The rows=261275 on HashAggregate  (cost=26397219.92..26399832.67 rows=261275 width=8) seems VERY WRONG!
> I was expecting something like rows=1.0E+09 instead.

> I would guess that HashAggregate is behaving very badly (using far more RAM than WORK_MEM, and also badly estimating the #rows and taking forever...)


Huge memory use sounds like what was fixed in postgres 13.

https://www.postgresql.org/docs/13/release-13.html

Allow hash aggregation to use disk storage for large aggregation result
sets (Jeff Davis)

Previously, hash aggregation was avoided if it was expected to use more
than work_mem memory. Now, a hash aggregation plan can be chosen despite
that. The hash table will be spilled to disk if it exceeds work_mem
times hash_mem_multiplier.

This behavior is normally preferable to the old behavior, in which once
hash aggregation had been chosen, the hash table would be kept in memory
no matter how large it got — which could be very large if the planner
had misestimated. If necessary, behavior similar to that can be obtained
by increasing hash_mem_multiplier.
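
For example, on 13 or later the spill threshold is work_mem times hash_mem_multiplier, which could be tuned per session along these lines (a sketch; the exact values are workload-dependent):

```sql
-- PostgreSQL 13+ only: hash aggregation may use up to
-- work_mem * hash_mem_multiplier of memory before spilling to disk.
SET work_mem = '2GB';
SET hash_mem_multiplier = 2.0;  -- allow hash tables up to ~4GB in memory
```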

-- 
Justin



RE: Postgres12 looking for possible HashAggregate issue workarounds?

From
João Paulo Luís
Date:
Thank you. It seems it is precisely that problem.

(I will discuss upgrade possibilities with the rest of the team, as I guess this fix will never be backported to the version 12 bugfix releases.)

Meanwhile, as a one-time workaround I've disabled the hashagg algorithm,

SET enable_hashagg=off;

repeated the query, and it finished in 1h28m (and the RAM resident memory stayed just a little above the 16GB of shared_buffers).
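
For reference, the workaround can also be scoped to a single transaction with SET LOCAL, so concurrent queries keep using hash aggregation (a sketch using the same query as above):

```sql
BEGIN;
SET LOCAL enable_hashagg = off;  -- reverts automatically at COMMIT/ROLLBACK
SELECT COUNT(*), COUNT(NULLIF(Stchk.haserrors,'f'))
    FROM SentenceToolCheck Stchk
    WHERE EXISTS (SELECT SSrc.sentence
              FROM SentenceSource SSrc, Document Doc
              WHERE SSrc.sentence = Stchk.id
              AND Doc.id = SSrc.document
              AND Doc.source ILIKE '/bigpostgres/misc/arxiv/arxiv/arxiv/pdf/%');
COMMIT;
```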

Happy holidays!


João Luís

Senior Developer

joao.luis@pdmfc.com 

+351 210 337 700





Re: Postgres12 looking for possible HashAggregate issue workarounds?

From
David Rowley
Date:
On Sun, 18 Dec 2022 at 23:44, João Paulo Luís <joao.luis@pdmfc.com> wrote:
> Meanwhile, as a one-time workaround I've disabled the hashagg algorithm,

The way the query planner determines if Hash Aggregate's hash table
will fit in work_mem or not is based on the n_distinct estimate of the
columns being grouped on.  You may want to review what analyze set
n_distinct to on this table. That can be done by looking at:

select attname, n_distinct from pg_stats where tablename = 'sentencesource' and attname = 'sentence';

If what that's set to does not seem realistic, then you can override it with:

ALTER TABLE sentencesource ALTER COLUMN sentence SET (n_distinct = N);

Please see the paragraph in [1] about n_distinct.  Using an absolute
value is likely not a great idea if the table is going to grow. You
could give the planner a better idea of how often values are repeated
by setting some negative value, as described in the documentation.
You'll need to analyze the table again after changing this setting.
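
As a sketch, if roughly 80% of the rows were expected to carry distinct values, the ratio form would look like this (the -0.8 figure is illustrative, not measured):

```sql
-- A negative n_distinct is interpreted as a ratio: -0.8 means "distinct
-- values number about 80% of the rows", which keeps scaling as the table grows.
ALTER TABLE sentencesource ALTER COLUMN sentence SET (n_distinct = -0.8);
ANALYZE sentencesource;

-- To go back to letting ANALYZE estimate it:
-- ALTER TABLE sentencesource ALTER COLUMN sentence RESET (n_distinct);
```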

David

[1] https://www.postgresql.org/docs/12/sql-altertable.html



RE: Postgres12 looking for possible HashAggregate issue workarounds?

From
João Paulo Luís
Date:
Thank you David Rowley (best performance fix so far)!

nsoamt=> select attname,n_distinct from pg_Stats where tablename = 'sentencesource' and attname = 'sentence';
 attname  | n_distinct
----------+------------
 sentence |     255349
(1 row)

nsoamt=> select count(*), count(distinct sentence) from sentencesource;
   count    |   count
------------+-----------
 1150174041 | 885946963
(1 row)

-- Seems badly estimated to me.

-- I expect +/-80% of rows to have a distinct value. Manual says -1 is for all rows being distinct.
nsoamt=> ALTER TABLE sentencesource ALTER COLUMN sentence SET (n_distinct = -1);

nsoamt=> ANALYZE VERBOSE sentencesource ;
INFO:  analyzing "public.sentencesource"
INFO:  "sentencesource": scanned 30000 of 9028500 pages, containing 3819977 live rows and 260307 dead rows; 30000 rows in sample, 1149622078 estimated total rows
ANALYZE

nsoamt=> select attname,n_distinct from pg_Stats where tablename = 'sentencesource' and attname = 'sentence';
 attname  | n_distinct
----------+------------
 sentence |         -1
(1 row)


nsoamt=> EXPLAIN SELECT COUNT(*), COUNT(NULLIF(Stchk.haserrors,'f'))
            FROM SentenceToolCheck Stchk
            WHERE EXISTS (SELECT SSrc.sentence
                      FROM SentenceSource SSrc, Document Doc
                      WHERE SSrc.sentence = Stchk.id
                      AND Doc.id = SSrc.document
                      AND Doc.source ILIKE '/bigpostgres/misc/arxiv/arxiv/arxiv/pdf/%');
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=275199757.84..275199757.85 rows=1 width=16)
   ->  Gather  (cost=275199757.62..275199757.83 rows=2 width=16)
         Workers Planned: 2
         ->  Partial Aggregate  (cost=275198757.62..275198757.63 rows=1 width=16)
               ->  Hash Join  (cost=228004096.84..273527643.59 rows=222815204 width=1)
                     Hash Cond: (stchk.id = ssrc.sentence)
                     ->  Parallel Seq Scan on sentencetoolcheck stchk  (cost=0.00..35858393.80 rows=232196880 width=9)
                     ->  Hash  (cost=209905168.81..209905168.81 rows=1103172722 width=8)
                           ->  Unique  (cost=204389305.20..209905168.81 rows=1103172722 width=8)
                                 ->  Sort  (cost=204389305.20..207147237.01 rows=1103172722 width=8)
                                       Sort Key: ssrc.sentence
                                       ->  Hash Join  (cost=73287.01..23615773.05 rows=1103172722 width=8)
                                             Hash Cond: (ssrc.document = doc.id)
                                             ->  Seq Scan on sentencesource ssrc  (cost=0.00..20524720.16 rows=1149622016 width=16)
                                             ->  Hash  (cost=54327.65..54327.65 rows=1516749 width=4)
                                                   ->  Seq Scan on document doc  (cost=0.00..54327.65 rows=1516749 width=4)
                                                         Filter: (source ~~* '/bigpostgres/misc/arxiv/arxiv/arxiv/pdf/%'::text)
 JIT:
   Functions: 25
   Options: Inlining true, Optimization true, Expressions true, Deforming true
(20 rows)

And the query finished in Time: 2637891.352 ms (43:57.891), the best performance so far (although sentencesource fits in RAM :-)


Curious: now that I've manually set it to -1, who/what will change that setting in the future (not ANALYZE?)?
Will it stay that way until someone (a human user) changes it? (How do I set it back to "automatic"?)

I hope this poor estimation gets fixed in future releases...

João Luís

Senior Developer

joao.luis@pdmfc.com 

+351 210 337 700



