Home > mailing lists

Re: Single query uses all system memory - Mailing list pgsql-general

From	Fabio Pardi
Subject	Re: Single query uses all system memory
Date	June 26, 2018 16:51:42
Msg-id	5f160fdf-7fa3-f840-1d5d-b827141dbd45@portavita.eu Whole thread Raw
In response to	Re: Single query uses all system memory (Magnus Hagander <magnus@hagander.net>)
List	pgsql-general

Tree view

Hi Ivar,

I agree with Magnus. As addition, also i would change:

effective_cache_size
and

maintenance_work_mem

Then disable OOM killer, change logging in order to log files on disk to see if your work_mem is too low and spills on disk, rethink your data structure and, overall, I think that you can have a look to this page, which I find a good starting point for a proper tuning:

https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

regards,

fabio pardi

On 26/06/18 12:13, Magnus Hagander wrote:

On Tue, Jun 26, 2018 at 11:40 AM, Ivar Fredriksen <ivar.fredriksen@gmail.com> wrote:
A single large query is able to spend all the system memory (as reported by top), and the oom_reaper kills postgres. See bottom of email for an example query and logs.

Expected behavior would be that postgres is not killed and the query is limited to the shared_buffer memory, potentially failing, but not killing the postgres process.

Then your expectations are completely wrong. shared_buffers have nothing to do with limiting the memory of a query. shared_buffers set the size of the PostgreSQL cache, not the working set. That's controlled by work_mem (see below).

Do I have some fundamental flaw in my understanding of this, or is there some sort of configuration that should be in place that is missing? The offending sample query has been
rewritten to not use joins, and will no longer crash postgres. I am not asking anyone to spend a lot of time analyzing the query itself, it is just provided as an example for when the problem will occur.
My question is more in a broader sense why one query is eating through all system memory, and is there anything I can do to prevent this from happening?
We have set shared_buffers to 256MB on a test-system that has 1GB memory. Production machines have more resources, but the errors are happening in exactly the same way, so I assume (perhaps wrongly) that using the test
system specs and logs might give me the answers I need to figure out what is happening.

Technical details are provided below, a big thanks to anyone who can provide any help or insight to this.

Regards,
Ivar Fredriksen

PostgreSQL version number:
PostgreSQL 10.4 (Debian 10.4-2.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit
Installed with the debian packages for postgres found at: deb http://apt.postgresql.org/pub/repos/apt/

Changes made to the settings in the postgresql.conf file:
             name             |             current_setting             |        source
------------------------------+-----------------------------------------+----------------------
application_name             | psql                                    | client
checkpoint_completion_target | 0.9                                     | configuration file
client_encoding              | UTF8                                    | client
cluster_name                 | 10/main                                 | configuration file
DateStyle                    | ISO, MDY                                | configuration file
default_statistics_target    | 100                                     | configuration file
default_text_search_config   | pg_catalog.english                      | configuration file
dynamic_shared_memory_type   | posix                                   | configuration file
effective_cache_size         | 1536MB                                  | configuration file
external_pid_file            | /var/run/postgresql/10-main.pid         | configuration file
lc_messages                  | C                                       | configuration file
lc_monetary                  | C                                       | configuration file
lc_numeric                   | en_US.UTF-8                             | configuration file
lc_time                      | C                                       | configuration file
listen_addresses             | *                                       | configuration file
log_line_prefix              | %m [%p] %q%u@%d                         | configuration file
log_timezone                 | UTC                                     | configuration file
maintenance_work_mem         | 128MB                                   | configuration file
max_connections              | 100                                     | configuration file
max_stack_depth              | 2MB                                     | environment variable
max_wal_size                 | 2GB                                     | configuration file
min_wal_size                 | 1GB                                     | configuration file
port                         | 5432                                    | configuration file
search_path                  | "$user", public, pg_catalog             | configuration file
shared_buffers               | 256MB                                   | configuration file
ssl                          | on                                      | configuration file
ssl_cert_file                | /etc/ssl/certs/ssl-cert-snakeoil.pem    | configuration file
ssl_key_file                 | /etc/ssl/private/ssl-cert-snakeoil.key | configuration file
standard_conforming_strings | off                                     | configuration file
stats_temp_directory        | /var/run/postgresql/10-main.pg_stat_tmp | configuration file
tcp_keepalives_idle          | 0                                       | configuration file
TimeZone                     | UTC                                     | configuration file
unix_socket_directories      | /var/run/postgresql                     | configuration file
wal_buffers                  | 16MB                                    | configuration file
work_mem                     | 1310kB                                  | configuration file
(35 rows)

Operating system and version:
Debian GNU/Linux 9.4 (stretch)
Linux ip-172-31-30-104 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1 (2018-04-29) x86_64 GNU/Linux

Machine specs:
CPU: Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz
Memory: 1GB
HDD: General purpose SSD from amazon ec2, only one disk

With only 1GB of memory, your value for work_mem is probably too high for a query like yours.

Even with 1MB, that's 1MB per join. You seem to have about 350 joins? That would be 350Mb right there. Plus a lot of more overhead.

In general, what's up with 350 joins?! That seems to be, ahem, quite insane.

For a query like that you probably have to reduce everything to bare minimums -- I'd try with shared_buffers at maybe 64MB and work_mem at 100kB and see if it passes. (it'll probably still be very slow, but that's to be expected from a query like that)

But I'd sum it up as (1) don't do 350 joins. Just don't. And (2) if you have to, then a machine with just 1Gb is simply too small.

//Magnus

pgsql-general by date:

From: Andreas Kretschmer
Date: 26 June 2018, 16:32:44
Subject: Re: We find few queries running three times simultaneously with sameparameters on postgres db

From: Saurabh Agrawal
Date: 26 June 2018, 17:49:01
Subject: Re: We find few queries running three times simultaneously with sameparameters on postgres db

Re: Single query uses all system memory - Mailing list pgsql-general

Previous

Next