Thread: Would like to know how analyze works technically
I have loaded data for a database conversion I am working on, and I have
created the indexes I believe I will need. My current design uses a schema
for each client that will access the database. I am using schemas because
there is one set of global tables that I need to query against the tables
within the different schemas. I currently have 2,000 active schemas, but
because of the overhead of creating a schema and its associated tables, I
have pre-created empty schemas for 8,000 clients to cover the next several
years of growth. Each schema has 130 tables.

I am currently just in the testing phase, running this on a dual-core Xeon
machine with 8GB of RAM and a 500GB SSD holding the data. The current
database size is 200GB.

Running "analyze verbose;" and watching top, the system starts out using no
swap and about 4GB of cached memory and about 1GB of used memory. As it
runs, the amount of used RAM climbs, and eventually the used swap reaches
100%; after staying at that level for a couple of minutes, the analyze run
crashes and indicates "server closed the connection unexpectedly." Also, as
it progresses, the scroll of tables being analyzed slows.

Is it normal for used memory to keep growing while analyze runs? Since it
is a table-by-table operation, I would expect it to release memory when
moving on to the next table. Is there anything I can do to get analyze to
finish?
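In case it helps narrow things down, this is the sort of per-schema loop I
could run instead of one database-wide ANALYZE (a rough sketch; the schema
name is just an example):

DO $$
DECLARE
    t record;
BEGIN
    -- analyze one schema's tables individually, so memory growth can be
    -- tied to a specific table rather than to the run as a whole
    FOR t IN
        SELECT schemaname, tablename
        FROM pg_tables
        WHERE schemaname = 'client_0001'  -- example schema name
    LOOP
        EXECUTE format('ANALYZE VERBOSE %I.%I', t.schemaname, t.tablename);
    END LOOP;
END
$$;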
TonyS <tony@exquisiteimages.com> writes:

> Running "analyze verbose;" and watching top, the system starts out using no
> swap data and about 4GB of cached memory and about 1GB of used memory. As it
> runs, the amount of used RAM climbs, and eventually the used swap memory
> increases to 100% and after being at that level for a couple of minutes, the
> analyze function crashes and indicates "server closed the connection
> unexpectedly."

ANALYZE is supposed to work in a constrained amount of memory, though
that amount depends quite a bit on what the data is and what you've got
the statistics target set to.

We've heard reports that there are memory leaks in (some versions of?)
PostGIS's analyze support. Whether that would apply to you would depend
on whether you're using PostGIS.

Hard to say more without a lot more concrete info about what your data
is, what PG version you're using, etc.

			regards, tom lane
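P.S. For instance, the target can be checked and lowered for a single
session to see whether it changes the memory profile (illustrative only,
not a recommended permanent setting):

SHOW default_statistics_target;
SET default_statistics_target = 10;  -- session-local; a lower target means a smaller sample
ANALYZE VERBOSE;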
On Wed, April 1, 2015 12:18 am, Tom Lane-2 [via PostgreSQL] wrote:

> ANALYZE is supposed to work in a constrained amount of memory, though
> that amount depends quite a bit on what the data is and what you've
> got the statistics target set to.
Thanks for the response, Tom.

I am not using PostGIS. The data in my system is mostly along the lines of
what you would see in an accounts payable, accounts receivable, and
billing situation: names and addresses of individuals, information about
billing, payments received, payments sent, etc.
All of my indexes are b-tree indexes.
Currently, the largest individual table is 1.8GB.
select version() returns:
PostgreSQL 9.3.6 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu
4.8.2-19ubuntu1) 4.8.2, 64-bit
OS: Ubuntu 14.04.1 LTS
Physical memory: 8GB
The postgresql log has these entries at the crash point:
2015-04-01 06:24:37 EDT LOG: server process (PID 1384) was terminated by
signal 9: Killed
2015-04-01 06:24:38 EDT DETAIL: Failed process was running: analyze verbose;
2015-04-01 06:24:38 EDT LOG: terminating any other active server processes
I started this process at 11PM, so it ran for about 7.5 hours before
crashing.
Is there anything else that would be helpful?
On Wed, 1 Apr 2015 04:33:07 -0700 (MST) TonyS <tony@exquisiteimages.com> wrote:

> I started this process at 11PM, so it ran for about 7.5 hours before
> crashing.
>
> Is there anything else that would be helpful?

Don't know if I'm on the right track with this, but what is
maintenance_work_mem set to on this system?

--
Bill Moran
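P.S. That is, the output of

SHOW maintenance_work_mem;

is the quickest way to check the effective value.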
It sounds like your system has crashed several times.

My first suggestion would be to ensure that your tables and indexes are
not corrupted.

My second suggestion is to ensure that your indexes tightly represent the
data you are accessing. The tighter they are, the faster the response
time, and the less memory and CPU they use.

Of course these are basics for any good DB, but they are essential before
moving on to more complex issues.
On Wed, April 1, 2015 8:48 am, Bill Moran [via PostgreSQL] wrote:

> Don't know if I'm on the right track with this, but what is
> maintenance_work_mem set to on this system?
Hello Bill,
maintenance_work_mem is set to 480MB. I haven't changed that from what
pgtune suggested.
TonyS <tony@exquisiteimages.com> wrote:

> The postgresql log has these entries at the crash point:
> 2015-04-01 06:24:37 EDT LOG: server process (PID 1384) was terminated by signal 9: Killed
> 2015-04-01 06:24:38 EDT DETAIL: Failed process was running: analyze verbose;

That was almost certainly the action of the OS's Out Of Memory Killer
process.

> Is there anything else that would be helpful?

Unfortunately, leaving the OOM killer enabled causes the best evidence
to be destroyed. If you disable the OOM killer and run this again, when
memory is exhausted the database process attempting to allocate memory
will dump a map of where its memory was allocated. That should give us
something to work with regarding the cause. Try:

vm.overcommit_memory = 2
vm.overcommit_ratio = 80

Also, it would be useful to see the output of this:

SELECT name, current_setting(name), source
FROM pg_settings
WHERE source NOT IN ('default', 'override');

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
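P.S. Those are kernel sysctl settings; on Ubuntu they would normally go in
/etc/sysctl.conf, and "sysctl -p" (run as root) loads them without a reboot.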
On Wed, 1 Apr 2015 06:26:36 -0700 (MST) TonyS <tony@exquisiteimages.com> wrote:

> maintenance_work_mem is set to 480MB. I haven't changed that from what
> pgtune suggested.

Doesn't seem unreasonable, so my guess isn't right.

--
Bill Moran
On Wed, April 1, 2015 9:25 am, FarjadFarid(ChkNet) [via PostgreSQL] wrote:

> My first suggestion would be to ensure that your tables and indexes are
> not corrupted.
Thanks for the suggestion. What command/tool do you use to check a
PostgreSQL database for corruption?
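The only basic check I know of is forcing a full read of every row with
something like

pg_dump mydb > /dev/null

(the database name is just a placeholder) and watching for errors. Is
there a better tool for this?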
Because your system has crashed, I would check everything from the bottom
up. The duration of the operation and the memory usage do suggest it may
be hitting a recurring problem.

First, do a simple check on your hard disk, making sure it is healthy and
that there are no hanging indexes.

Then rebuild the whole database, this time ensuring *your indexing is much
tighter than before*, also as a test to be less dependent on memory. I
personally use SSDs for my DBs. It saves a lot of time. The other
advantage of SSDs is *the relative* lack of performance degradation due to
fragmentation of data.

Once you are happy with this setup, then try to optimise the performance.
You might find that a few good SSDs more than pay for themselves in terms
of the cost of your team's time.

Otherwise indexing shouldn't take so long.

Hope this helps.

Good luck.

Best Regards

Farjad
On Wed, April 1, 2015 9:35 am, Kevin Grittner-5 [via PostgreSQL] wrote:

> Unfortunately, leaving the OOM killer enabled causes the best evidence
> to be destroyed. If you disable the OOM killer and run this again, when
> memory is exhausted the database process attempting to allocate memory
> will dump a map of where its memory was allocated. That should give us
> something to work with regarding the cause. Try:
>
> vm.overcommit_memory = 2
> vm.overcommit_ratio = 80
>
> Also, it would be useful to see the output of this:
>
> SELECT name, current_setting(name), source
> FROM pg_settings
> WHERE source NOT IN ('default', 'override');
Thanks for the follow-up, Kevin.
I made the changes to overcommit and started running the analyze operation
again about 2 hours ago, so if it stays true to form, it should be
crashing in about 5-1/2 hours.
The output from the query you suggested is:
name,current_setting,source
autovacuum,off,configuration file
checkpoint_completion_target,0.9,configuration file
checkpoint_segments,16,configuration file
checkpoint_timeout,1h,configuration file
client_encoding,WIN1252,session
default_statistics_target,10,configuration file
default_text_search_config,pg_catalog.english,configuration file
effective_cache_size,5132MB,configuration file
external_pid_file,/var/run/postgresql/9.3-main.pid,configuration file
fsync,off,configuration file
lc_messages,en_US.UTF-8,configuration file
lc_monetary,en_US.UTF-8,configuration file
lc_numeric,en_US.UTF-8,configuration file
lc_time,en_US.UTF-8,configuration file
listen_addresses,*,configuration file
log_line_prefix,%t ,configuration file
log_timezone,localtime,configuration file
maintenance_work_mem,480MB,configuration file
max_connections,5,configuration file
max_locks_per_transaction,512,configuration file
max_stack_depth,2MB,environment variable
port,5432,configuration file
shared_buffers,1920MB,configuration file
ssl,on,configuration file
ssl_cert_file,/etc/ssl/certs/ssl-cert-snakeoil.pem,configuration file
ssl_key_file,/etc/ssl/private/ssl-cert-snakeoil.key,configuration file
synchronous_commit,off,configuration file
TimeZone,localtime,configuration file
unix_socket_directories,/var/run/postgresql,configuration file
wal_buffers,8MB,configuration file
work_mem,1536MB,configuration file
From: TonyS
Sent: Wednesday, April 01, 2015 12:15 PM

> work_mem,1536MB,configuration file
IIRC, your RAM is 8GB. Your work_mem is too high; the actual memory used
for sorting, etc. could be multiples of the work_mem setting.

That could be the reason for your memory problems. I'd suggest setting it
to 16MB and seeing if you can avoid "on disk" sorting. If not, gradually
increase work_mem.
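To illustrate (the table and query are made up), you can watch the sort
method in a plan while adjusting work_mem:

SET work_mem = '16MB';
EXPLAIN ANALYZE
SELECT client_id, sum(amount)
FROM payments            -- hypothetical table
GROUP BY client_id
ORDER BY client_id;
-- if the plan shows "Sort Method: external merge  Disk: ...", the sort
-- spilled to disk; raise work_mem gradually until it sorts in memory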
Regards,
Igor Neyman
Tony,

Before assuming that this is a bug/problem with PostgreSQL, and therefore
out of your control, it is best to ensure there is no problem with your
setup. It will save you a bundle of time.

I have several development systems, but I regularly use PostgreSQL DBs
with an SSD on a portable (slow CPU) machine, with several databases
*each* holding over 6 million records in just one table and several
hundred tables, and I still don't get any problems.

It would be good to read your post on the final solution.
On Wed, April 1, 2015 12:30 pm, Igor Neyman [via PostgreSQL] wrote:

> IIRC, your RAM is 8GB. Your work_mem is too high; the actual memory used
> for sorting, etc. could be multiples of the work_mem setting.
>
> That could be the reason for your memory problems. I'd suggest setting it
> to 16MB and seeing if you can avoid "on disk" sorting. If not, gradually
> increase work_mem.
Thanks, Igor,
I will try changing that. I pretty much just let pgtune set all of those
values for me.
On Wed, 1 Apr 2015 10:47:32 -0700 (MST) TonyS <tony@exquisiteimages.com> wrote:

> I will try changing that. I pretty much just let pgtune set all of those
> values for me.

If pgtune set 1.5G of work_mem, then someone should file a bug report.

--
Bill Moran
> If you disable the OOM killer and run this again, when memory is
> exhausted the database process attempting to allocate memory will dump a
> map of where its memory was allocated. That should give us something to
> work with regarding the cause. Try:
>
> vm.overcommit_memory = 2
> vm.overcommit_ratio = 80
The analyze function has crashed again while the overcommit entries were
as above. The last bit of the PostgreSQL log shows:
MdSmgr: 41934848 total in 14 blocks; 639936 free (0 chunks); 41294912 used
ident parser context: 0 total in 0 blocks; 0 free (0 chunks); 0 used
hba parser context: 7168 total in 3 blocks; 2288 free (1 chunks); 4880 used
LOCALLOCK hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
Timezones: 83472 total in 2 blocks; 3744 free (0 chunks); 79728 used
ErrorContext: 8192 total in 1 blocks; 8160 free (6 chunks); 32 used
2015-04-01 14:23:27 EDT ERROR: out of memory
2015-04-01 14:23:27 EDT DETAIL: Failed on request of size 80.
2015-04-01 14:23:27 EDT STATEMENT: analyze verbose;
From syslog leading up to the crash:

Apr 1 06:24:37 postgrestest kernel: [26692.691936] [ 1434] 1000 1434 26408 70 53 197 0 sshd
Apr 1 06:24:37 postgrestest kernel: [26692.691939] [ 1435] 1000 1435 5593 0 16 452 0 bash
Apr 1 06:24:37 postgrestest kernel: [26692.691943] [ 1449] 1000 1449 6204 81 17 31 0 top
Apr 1 06:24:37 postgrestest kernel: [26692.691947] Out of memory: Kill process 1384 (postgres) score 790 or sacrifice child
Apr 1 06:24:37 postgrestest kernel: [26692.693667] Killed process 1384 (postgres) total-vm:10044764kB, anon-rss:6963780kB, file-rss:202916kB
Apr 1 06:25:02 postgrestest CRON[1755]: (root) CMD (test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily ))
TonyS <tony@exquisiteimages.com> writes:

> The analyze function has crashed again while the overcommit entries were
> as above. The last bit of the PostgreSQL log shows:
> [...]
> 2015-04-01 14:23:27 EDT ERROR: out of memory
> 2015-04-01 14:23:27 EDT DETAIL: Failed on request of size 80.
> 2015-04-01 14:23:27 EDT STATEMENT: analyze verbose;

We need to see all of that memory map, not just the last six lines of it.

			regards, tom lane
On Wed, April 1, 2015 5:50 pm, Tom Lane-2 [via PostgreSQL] wrote:

> We need to see all of that memory map, not just the last six lines of it.
I have used the procedures from this web page to try to get a core dump:
https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD
If I follow the procedure and kill the postmaster pid while psql is
connected to it, it does generate a core dump; however, no core dump is
generated when the error I have been experiencing occurs.
I guess at this point I am just going to rebuild from the Linux
installation up. I also tried changing the work_mem to 16MB, but that
didn't seem to make a difference.
On 4/2/15 2:18 PM, TonyS wrote:

> If I follow the procedure and kill the postmaster pid while psql is
> connected to it, it does generate a core dump; however, no core dump is
> generated when the error I have been experiencing occurs.
>
> I guess at this point I am just going to rebuild from the Linux
> installation up. I also tried changing the work_mem to 16MB, but that
> didn't seem to make a difference.

I don't know that a core dump will be helpful here. What Tom was talking
about were all those lines in your log file, talking about

blah context: xxx total in xxx blocks;...

That's diagnostics about where PG has used all its memory. That's what we
need here.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
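P.S. In the server log, the full dump should start at the line beginning
"TopMemoryContext:" and run down through "ErrorContext:", immediately
before the "out of memory" ERROR line; that whole block is what to post.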