Re: Analyze against a table with geometry columns runs out of memory - Mailing list pgsql-general

From Paul Ramsey
Subject Re: Analyze against a table with geometry columns runs out of memory
Date
Msg-id etPan.536b657b.6b68079a.9d4e@Butterfly.local
In response to Analyze against a table with geometry columns runs out of memory  (Roxanne Reid-Bennett <rox@tara-lu.com>)
Responses Re: Analyze against a table with geometry columns runs out of memory  (Roxanne Reid-Bennett <rox@tara-lu.com>)
List pgsql-general
Roxanne, you seem to have isolated the problem to a particular geometry column, which speaks to this being a PostGIS problem. Since the analyze code was rewritten in 2.1, and your issue is coming up in a 2.0-to-2.1 upgrade, that further points to the issue potentially being a PostGIS problem. Unless the same data works in a PgSQL 9.2/PostGIS 2.1 combo, it seems clear that PgSQL 9.3 is not the culprit here. (Though I would love to be reassured that the 9.2/2.1 combo also does not work, since that would eliminate a bad interaction between 9.3 and 2.1 as the issue.)
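For the record, the exact combination under test on any given box can be confirmed in one place (standard PostgreSQL and PostGIS reporting functions):

```sql
-- Report the exact PostgreSQL build and the PostGIS/GEOS/PROJ versions
-- it was compiled against, so the 9.2/2.1 vs 9.3/2.1 comparison is unambiguous.
SELECT version();
SELECT postgis_full_version();
```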

At some point in order to debug I’ll probably need a copy of the data, or access to a system that has the data and a dev environment. Please do file a ticket at http://trac.osgeo.org/postgis on this issue.

P

-- 
Paul Ramsey
http://cleverelephant.ca
http://postgis.net

On May 7, 2014 at 11:15:10 PM, Roxanne Reid-Bennett (rox@tara-lu.com) wrote:

Hello,

We are working out the upgrade of our servers from Postgres 9.1 and
PostGIS 2.0 to Postgres 9.3 and PostGIS 2.1. After building the base
stack, the system admin restored the database from a backup. [I'll ask
for more details if you need them.]

I have 3 tables with geometry columns that run the system out of memory
when they are autovacuumed, vacuumed, or analyzed. For one of the
tables, I have isolated the problem to a geometry column. Other tables
in the system that are much larger on disk and also have geometry
columns vacuum and analyze just fine, so it isn't simply that they have
geometry columns. Two of the affected tables are related to each other;
the third is a load of government-supplied data and completely separate
from them in both content and concept.

Using the smallest table, we looked at maintenance_work_mem and tried
several runs with varying values [16MB, 64MB, 256MB, and 500MB]. A larger
maintenance_work_mem allows the process to run longer before it starts
gobbling up swap, but the process still spends most of its time in the
"uninterruptible sleep (usually IO)" state and just eats up swap
until all of the memory is gone.
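The per-session experiment described above can be sketched like this (table name from the tests below; the SET only affects the current connection, so no restart is needed between runs):

```sql
-- Override maintenance_work_mem for this session only, then analyze.
-- 256MB is one of the values tried: 16MB, 64MB, 256MB, 500MB.
SET maintenance_work_mem = '256MB';
ANALYZE VERBOSE temp.region;
RESET maintenance_work_mem;
```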

Smallest table definition, config and log file entries, etc follow
below. If I have failed to provide necessary or desired information,
just ask.

We have noted that the memory management was changed going into 9.3 -
but we haven't been able to find anything that would indicate any known
issues ... This problem caused us to take a hard look at the stack
again, and we will be building a new stack anyway because we need a
newer GEOS - but we are seriously considering dropping Postgres back to
9.2.

I am out of ideas on what else to try after maintenance_work_mem ...
Does anybody have any suggestions/questions/observations for me?

Thank you.

Roxanne
------------------

VirtualBox: 4.1.24, Intel Xeon 2.13 GHz (8), 48 GB RAM
VirtualBox instance: 64-bit, 4 processors, base memory: 12 GB

running Ubuntu 12.04.1 LTS
Linux 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 20:39:51 UTC 2012
x86_64 x86_64 x86_64 GNU/Linux

Postgres: PostgreSQL 9.3.4 on x86_64-unknown-linux-gnu, compiled by gcc
(Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit
PostGIS: POSTGIS="2.1.2 r12389" GEOS="3.3.3-CAPI-1.7.4" PROJ="Rel.
4.8.0, 6 March 2012" GDAL="GDAL 1.9.2, released 2012/10/08"
LIBXML="2.7.8" LIBJSON="UNKNOWN" TOPOLOGY RASTER

Postgres.conf entries (of probable interest - if I didn't list it, it's
probably defaulted):

max_connections = 100
shared_buffers = 4089196kB
work_mem = 128MB
maintenance_work_mem = 64MB
checkpoint_segments = 64
checkpoint_timeout = 30min
checkpoint_completion_target = 0.75
effective_cache_size = 4089196kB
default_statistics_target = 200
autovacuum_max_workers = 1 [this is normally set to 3]
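The values the server is actually running with (including anything left at its default) can be double-checked from the catalog rather than the conf file:

```sql
-- Show the live values of the settings listed above, with their units,
-- in case a conf-file edit didn't take effect or was overridden.
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('shared_buffers', 'work_mem', 'maintenance_work_mem',
               'effective_cache_size', 'default_statistics_target',
               'autovacuum_max_workers');
```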

Analyzing the original table "activity" failed. Using a copy of the
original table with no indexes, no foreign keys, and no constraints also
failed. However, dropping one of the two geometry columns ("region")
from the copy allowed it to succeed. Taking a copy of just "region"
(which contains (Multi)Polygons) and the primary key via "CREATE TABLE
... AS (SELECT ...)" from the original table "activity" to create
temp.region, analyze again runs out of memory. The following were run
against temp.region.
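The isolation steps above amount to the following (table and column names as given in the post; CREATE TABLE ... AS carries over no indexes, constraints, or foreign keys):

```sql
-- Copy only the primary key and the suspect geometry column into a
-- bare table, then analyze the copy in isolation.
CREATE TABLE temp.region AS
  SELECT activity_id, region FROM activity;

ANALYZE VERBOSE temp.region;  -- this alone reproduces the out-of-memory failure
```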

smallest/shortest table definition from \d:

           Table "temp.region"
   Column    |          Type           | Modifiers
-------------+-------------------------+-----------
 activity_id | integer                 |
 region      | geometry(Geometry,4326) |

<shell>
HQ4_Staging=# analyze verbose temp.region;
INFO: 00000: analyzing "temp.region"
LOCATION: do_analyze_rel, analyze.c:335
INFO: 00000: "region": scanned 1022 of 1022 pages, containing 52990
live rows and 0 dead rows; 52990 rows in sample, 52990 estimated total rows
LOCATION: acquire_sample_rows, analyze.c:1299
The connection to the server was lost. Attempting reset: Failed.
</shell>

Duration of the above was approximately 1.25 hrs.

The Log files show:
<postgres>
2014-05-07 16:56:56 EDT|2054| LOG: server process (PID 6663) was
terminated by signal 9: Killed
2014-05-07 16:56:56 EDT|2054| DETAIL: Failed process was running:
analyze verbose temp.region;
2014-05-07 16:56:56 EDT|2054| LOG: terminating any other active server
processes
</postgres>
<syslog>
May 7 16:56:55 hq4-staging-database kernel: [458605.351369] postgres
invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0

... [I have the full stack if anyone wants it]

May 7 16:56:55 hq4-staging-database kernel: [458605.408021] Out of
memory: Kill process 6663 (postgres) score 920 or sacrifice child
May 7 16:56:55 hq4-staging-database kernel: [458605.412287] Killed
process 6663 (postgres) total-vm:20269840kB, anon-rss:8625876kB,
file-rss:3082472kB
</syslog>
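One way to watch the backend's memory and swap growth while ANALYZE runs, rather than waiting for the oom-killer (a rough sketch, assuming Linux /proc; substitute the analyzing backend's PID for $$, e.g. 6663 from the log above):

```shell
# Snapshot a process's virtual size, resident set, and swap usage
# from /proc; run it repeatedly (e.g. under watch) during the ANALYZE.
pid=$$   # hypothetical: replace with the backend PID being observed
grep -E 'VmSize|VmRSS|VmSwap' "/proc/$pid/status"
```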


This is a test box, which we know is much slower/smaller than our
production box, but normally sufficient. As a test box, we had no other
major activity going on. We do have a cron job that looks for reports
to be run, but no reports were run during these tests and the cron job
doesn't interact with this table.

I did find bug #2267, "Server crash from analyze table," from last April
- its test case does NOT crash this stack [which is good, because it
shouldn't].


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
