Thread: postgresql 8 abort with signal 10
Hi list,

I'm running PostgreSQL 8.0.1 on FreeBSD 4.11-STABLE. The machine is an AMD Sempron 2.2 with 1 GB of RAM.

I use PostgreSQL as the database for dspam, a spam classification program. The database sees moderate use: on average 10 simultaneous connections executing relatively big queries that use the "in" clause.

Watching the PostgreSQL logs, I see the following messages many times a day:

May 3 06:58:44 e-filter postgres[250]: [21-1] LOG: server process (PID 59608) was terminated by signal 10
May 3 06:58:44 e-filter postgres[250]: [22-1] LOG: terminating any other active server processes
May 3 06:58:44 e-filter postgres[59605]: [21-1] WARNING: terminating connection because of crash of another server process
May 3 06:58:44 e-filter postgres[59605]: [21-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
May 3 06:58:44 e-filter postgres[59605]: [21-3] process exited abnormally and possibly corrupted shared memory.
May 3 06:58:44 e-filter postgres[59605]: [21-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
May 3 06:58:44 e-filter postgres[59607]: [21-1] WARNING: terminating connection because of crash of another server process
May 3 06:58:44 e-filter postgres[59607]: [21-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
May 3 06:58:44 e-filter postgres[59607]: [21-3] process exited abnormally and possibly corrupted shared memory.
May 3 06:58:44 e-filter postgres[59607]: [21-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
May 3 06:58:44 e-filter postgres[59606]: [21-1] WARNING: terminating connection because of crash of another server process
May 3 06:58:44 e-filter postgres[59606]: [21-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
May 3 06:58:44 e-filter postgres[59606]: [21-3] process exited abnormally and possibly corrupted shared memory.
May 3 06:58:44 e-filter postgres[59606]: [21-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
May 3 06:58:44 e-filter postgres[59626]: [21-1] WARNING: terminating connection because of crash of another server process
May 3 06:58:44 e-filter postgres[59626]: [21-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
May 3 06:58:44 e-filter postgres[59626]: [21-3] process exited abnormally and possibly corrupted shared memory.
May 3 06:58:44 e-filter postgres[59626]: [21-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
May 3 06:58:44 e-filter postgres[59628]: [21-1] WARNING: terminating connection because of crash of another server process
May 3 06:58:44 e-filter postgres[59629]: [21-1] WARNING: terminating connection because of crash of another server process
May 3 06:58:44 e-filter postgres[59629]: [21-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
May 3 06:58:44 e-filter postgres[59629]: [21-3] process exited abnormally and possibly corrupted shared memory.
May 3 06:58:44 e-filter postgres[59629]: [21-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
May 3 06:58:44 e-filter postgres[59628]: [21-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
May 3 06:58:44 e-filter postgres[59628]: [21-3] process exited abnormally and possibly corrupted shared memory.
May 3 06:58:44 e-filter postgres[59628]: [21-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
May 3 06:58:44 e-filter postgres[59609]: [21-1] WARNING: terminating connection because of crash of another server process
May 3 06:58:44 e-filter postgres[59609]: [21-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
May 3 06:58:44 e-filter postgres[59609]: [21-3] process exited abnormally and possibly corrupted shared memory.
May 3 06:58:44 e-filter postgres[59609]: [21-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
May 3 06:58:44 e-filter postgres[59627]: [21-1] WARNING: terminating connection because of crash of another server process
May 3 06:58:44 e-filter postgres[59627]: [21-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
May 3 06:58:44 e-filter postgres[59627]: [21-3] process exited abnormally and possibly corrupted shared memory.
May 3 06:58:44 e-filter postgres[59627]: [21-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
May 3 06:58:45 e-filter postgres[69093]: [23-1] WARNING: terminating connection because of crash of another server process
May 3 06:58:45 e-filter postgres[69093]: [23-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
May 3 06:58:45 e-filter postgres[69093]: [23-3] process exited abnormally and possibly corrupted shared memory.
May 3 06:58:45 e-filter postgres[69093]: [23-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
May 3 06:58:45 e-filter postgres[59620]: [21-1] WARNING: terminating connection because of crash of another server process
May 3 06:58:46 e-filter postgres[59620]: [21-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
May 3 06:58:46 e-filter postgres[59620]: [21-3] process exited abnormally and possibly corrupted shared memory.
May 3 06:58:46 e-filter postgres[59620]: [21-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
May 3 06:58:46 e-filter postgres[59619]: [21-1] WARNING: terminating connection because of crash of another server process
May 3 06:58:46 e-filter postgres[59619]: [21-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
May 3 06:58:46 e-filter postgres[59619]: [21-3] process exited abnormally and possibly corrupted shared memory.
May 3 06:58:46 e-filter postgres[59619]: [21-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
May 3 06:58:46 e-filter postgres[59624]: [21-1] WARNING: terminating connection because of crash of another server process
May 3 06:58:46 e-filter postgres[59624]: [21-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
May 3 06:58:46 e-filter postgres[59624]: [21-3] process exited abnormally and possibly corrupted shared memory.
May 3 06:58:46 e-filter postgres[59624]: [21-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
May 3 06:58:46 e-filter postgres[59623]: [21-1] WARNING: terminating connection because of crash of another server process
May 3 06:58:46 e-filter postgres[59623]: [21-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
May 3 06:58:46 e-filter postgres[59623]: [21-3] process exited abnormally and possibly corrupted shared memory.
May 3 06:58:46 e-filter postgres[59623]: [21-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
May 3 06:58:46 e-filter postgres[59625]: [21-1] WARNING: terminating connection because of crash of another server process
May 3 06:58:46 e-filter postgres[59625]: [21-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
May 3 06:58:46 e-filter postgres[59625]: [21-3] process exited abnormally and possibly corrupted shared memory.
May 3 06:58:46 e-filter postgres[59625]: [21-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
May 3 06:58:46 e-filter postgres[59622]: [21-1] WARNING: terminating connection because of crash of another server process
May 3 06:58:46 e-filter postgres[59622]: [21-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
May 3 06:58:46 e-filter postgres[59622]: [21-3] process exited abnormally and possibly corrupted shared memory.
May 3 06:58:46 e-filter postgres[59622]: [21-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
May 3 06:58:46 e-filter postgres[59621]: [22-1] WARNING: terminating connection because of crash of another server process
May 3 06:58:49 e-filter postgres[250]: [23-1] LOG: all server processes terminated; reinitializing
May 3 06:58:51 e-filter postgres[13478]: [24-1] LOG: database system was interrupted at 2005-05-03 06:58:16 EST
May 3 06:58:51 e-filter postgres[13478]: [25-1] LOG: checkpoint record is at 14/99F69378
May 3 06:58:51 e-filter postgres[13478]: [26-1] LOG: redo record is at 14/99F69378; undo record is at 0/0; shutdown FALSE
May 3 06:58:51 e-filter postgres[13478]: [27-1] LOG: next transaction ID: 3639687; next OID: 388415
May 3 06:58:51 e-filter postgres[13478]: [28-1] LOG: database system was not properly shut down; automatic recovery in progress
May 3 06:58:51 e-filter postgres[13478]: [29-1] LOG: redo starts at 14/99F693B4
May 3 06:58:53 e-filter postgres[13478]: [30-1] LOG: record with zero length at 14/9AE223F0
May 3 06:58:53 e-filter postgres[13478]: [31-1] LOG: redo done at 14/9AE223C8
May 3 06:58:54 e-filter postgres[13484]: [24-1] FATAL: the database system is starting up
May 3 06:58:54 e-filter postgres[13485]: [24-1] FATAL: the database system is starting up
May 3 06:58:55 e-filter postgres[13488]: [24-1] FATAL: the database system is starting up
May 3 06:58:57 e-filter postgres[13478]: [32-1] LOG: database system is ready

Some time later it occurs again:

May 3 09:59:38 e-filter postgres[250]: [24-1] LOG: server process (PID 34743) was terminated by signal 10
May 3 09:59:38 e-filter postgres[250]: [25-1] LOG: terminating any other active server processes
May 3 09:59:38 e-filter postgres[35215]: [24-1] WARNING: terminating connection because of crash of another server process
May 3 09:59:38 e-filter postgres[35215]: [24-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
May 3 09:59:38 e-filter postgres[35215]: [24-3] process exited abnormally and possibly corrupted shared memory.
May 3 09:59:38 e-filter postgres[35215]: [24-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
May 3 09:59:38 e-filter postgres[34744]: [24-1] WARNING: terminating connection because of crash of another server process
May 3 09:59:38 e-filter postgres[34744]: [24-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
May 3 09:59:38 e-filter postgres[33592]: [24-1] WARNING: terminating connection because of crash of another server process
May 3 09:59:38 e-filter postgres[34744]: [24-3] process exited abnormally and possibly corrupted shared memory.

This is my postgresql.conf:

max_connections = 70
superuser_reserved_connections = 2
shared_buffers = 81920
work_mem = 10240
maintenance_work_mem = 51200
fsync = true
checkpoint_segments = 8
effective_cache_size = 100000
log_destination = 'syslog'
silent_mode = true
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'

and the shared memory configuration:

kern.ipc.shmmax: 700000000
kern.ipc.shmmin: 1
kern.ipc.shmmni: 192
kern.ipc.shmseg: 256
kern.ipc.shmall: 700000000

Do I have a configuration error that could cause this kind of problem?

Any ideas? Any thoughts?

Best Regards,
Alexandre
Alexandre Biancalana <biancalana@gmail.com> writes:
> Watching postgresql logs I see the following messages occur a lot of
> times in a day:
> May 3 06:58:44 e-filter postgres[250]: [21-1] LOG: server process
> (PID 59608) was terminated by signal 10

You need to find out what's triggering that. Turning on query logging would be a good way of investigating.

regards, tom lane
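Once statement logging is on (in 8.0 this is normally done with the log_statement setting in postgresql.conf), the crashing backend's PID from the "terminated by signal" line can be matched against the statements that PID logged earlier. The following is only a rough sketch of that idea: the helper name `last_statements_before_crash` is made up for illustration, and the exact shape of the "LOG: statement:" lines in syslog is an assumption based on the log format shown in the original post.

```python
import re

# Syslog-style PostgreSQL line, as in the original post, e.g.
#   May 3 06:58:44 e-filter postgres[250]: [21-1] LOG: server process (PID 59608) was terminated by signal 10
LINE_RE = re.compile(r"postgres\[(?P<pid>\d+)\]:\s+\[\d+-\d+\]\s+(?P<msg>.*)")

# Postmaster message announcing which backend died and with which signal.
CRASH_RE = re.compile(
    r"server process \(PID (?P<pid>\d+)\) was terminated by signal (?P<sig>\d+)"
)

# Assumed format of a logged statement when statement logging is enabled.
STATEMENT_RE = re.compile(r"LOG:\s+statement:\s+(?P<sql>.*)")


def last_statements_before_crash(lines):
    """Return a list of (crashed_pid, signal_number, last_statement_or_None).

    Only the first syslog chunk of each statement is kept; long statements
    that syslog splits into several numbered chunks are not reassembled.
    """
    last_statement = {}  # backend PID -> most recent statement seen for it
    crashes = []
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        pid = int(m.group("pid"))
        msg = m.group("msg")

        stmt = STATEMENT_RE.match(msg)
        if stmt:
            last_statement[pid] = stmt.group("sql")
            continue

        crash = CRASH_RE.search(msg)
        if crash:
            dead_pid = int(crash.group("pid"))
            crashes.append(
                (dead_pid, int(crash.group("sig")), last_statement.get(dead_pid))
            )
    return crashes
```

Feeding it the syslog file line by line (for example `last_statements_before_crash(open(path))`, with `path` pointing at wherever syslog writes the postgres messages) gives one tuple per crash; a `None` in the last position means no statement from that PID was seen before it died.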
On Tue, 2005-05-03 at 08:39, Alexandre Biancalana wrote:
> Hi list,
>
> I'm running postgresql 8.0.1 on FreeBSD 4.11-STABLE, the machine is
> an AMD Sempron 2.2, 1GB Ram..
>
> I use postgresql as database for dspam, a spam classification
> program. This database has moderate use, on average 10
> simultaneous connections executing relatively big queries using "in"
> clause.
>
> Watching postgresql logs I see the following messages occur a lot of
> times in a day:
>
> May 3 06:58:44 e-filter postgres[250]: [21-1] LOG: server process
> (PID 59608) was terminated by signal 10
> May 3 06:58:44 e-filter postgres[250]: [22-1] LOG: terminating any
> other active server processes

SNIP

> This is my postgresql.conf
>
> max_connections = 70
> superuser_reserved_connections = 2
> shared_buffers = 81920

That is a rather large shared_buffers setting for a machine with only 1 GB of RAM. With 640 MB given to shared buffers, the kernel is basically double buffering everything. Have you tested with smaller settings and found this setting to be the best?

You might want to look in your signal man page on BSD and see what signal 10 means. On Solaris it's a bus error. I don't have a clue what it is on FreeBSD myself, though.

> work_mem = 10240
> maintenance_work_mem = 51200
> fsync = true
> checkpoint_segments = 8
> effective_cache_size = 100000
> log_destination = 'syslog'
> silent_mode = true
> lc_messages = 'C'
> lc_monetary = 'C'
> lc_numeric = 'C'
> lc_time = 'C'
>
>
> and the shared memory configuration:
>
> kern.ipc.shmmax: 700000000
> kern.ipc.shmmin: 1
> kern.ipc.shmmni: 192
> kern.ipc.shmseg: 256
> kern.ipc.shmall: 700000000
>
>
> I have some configuration error that could result in this kind of problem ?
>
> Any ideas ? Any thoughts ?
>
> Best Regards,
> Alexandre
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
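The 640 MB figure follows from the posted settings if each shared buffer is one 8 kB block. A small sketch of that arithmetic is below; the 8192-byte block size is an assumption (it is the usual PostgreSQL default, but the thread does not state how this server was built), "1GB Ram" is taken to mean 1 GiB, and the function names are invented for the example.

```python
BLOCK_SIZE = 8192  # assumption: default PostgreSQL block size in bytes


def shared_buffers_bytes(n_buffers, block_size=BLOCK_SIZE):
    """Size in bytes of a shared_buffers setting given in buffers (8.0 style)."""
    return n_buffers * block_size


def share_of_ram(n_bytes, ram_bytes):
    """Fraction of physical RAM taken by n_bytes."""
    return n_bytes / ram_bytes


shared_buffers = 81920      # from the posted postgresql.conf
shmmax = 700000000          # kern.ipc.shmmax from the post, in bytes
ram = 1024 ** 3             # "1GB Ram", taken as 1 GiB

size = shared_buffers_bytes(shared_buffers)
print(size // 2 ** 20, "MiB of shared buffers")              # 640 MiB
print(f"{share_of_ram(size, ram):.1%} of RAM")               # 62.5%
print(shmmax - size, "bytes left under kern.ipc.shmmax")     # 28911360
```

Under those assumptions the buffer pool alone takes well over half of physical memory, which is the point behind the double-buffering remark above.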
Alexandre,

I have seen reports (and observed the problem myself) of all sorts of different software suffering from signal 11 under FreeBSD (more often seen on 5-STABLE). So far the collection is: Apache 1.3 (myself), MySQL (recent discussion on the freebsd-stable list) and now PostgreSQL... The hardware is not the point of failure here. Try posting this to freebsd-stable; perhaps an additional problem report will help them find the cause.

P.S. Here is the last one I see in my Apache error log:

[Wed Mar 9 17:50:45 2005] [notice] child pid 95642 exit signal Segmentation fault (11)

-- Vlad
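Signal numbers are easy to mix up, and they are not the same on every operating system. As a minimal sketch, the lookup below covers only the two numbers that come up in this thread, using the values from FreeBSD's signal list; the table and the `describe_signal` helper are illustrative, not taken from any PostgreSQL or FreeBSD tool.

```python
# Only the two signals discussed in this thread, with their FreeBSD numbers.
# Other systems may differ (on Linux/x86, for instance, 10 is SIGUSR1).
FREEBSD_SIGNALS = {
    10: "SIGBUS (bus error)",
    11: "SIGSEGV (segmentation fault)",
}


def describe_signal(number, table=FREEBSD_SIGNALS):
    """Return a readable name for a signal number, or a fallback string."""
    return table.get(number, f"unknown signal {number}")


print(describe_signal(10))  # the signal in the PostgreSQL log
print(describe_signal(11))  # the signal in the Apache error log
```

On the affected machine itself, the signal man page remains the authoritative source for the full list.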
Oops... you were writing about signal 10, not signal 11. My bad, sorry.

On 5/3/05, Vlad <marchenko@gmail.com> wrote:
> Alexandre,
>
> I saw reports (and observed the problem myself) that all sorts of
> different software suffer from signal 11 under FreeBSD (more often
> seen on 5-STABLE). So far the collection is: Apache 1.3 (myself),
> Mysql (recent discussion on freebsd-stable list) and now postgresql...
> The hardware is not the point of failure here. Try to post this into
> freebsd-stable - perhaps additional problem report will help them find
> the cause.
>
> p.s. here is the last one I see in my apache error log:
> [Wed Mar 9 17:50:45 2005] [notice] child pid 95642 exit signal
> Segmentation fault (11)
>
> On 5/3/05, Alexandre Biancalana <biancalana@gmail.com> wrote:
> > Hi list,
> >
> > I'm running postgresql 8.0.1 on FreeBSD 4.11-STABLE, the machine is
> > an AMD Sempron 2.2, 1GB Ram..
> >
> > I use postgresql as database for dspam, a spam classification
> > program. This database has moderate use, on average 10
> > simultaneous connections executing relatively big queries using "in"
> > clause.
> >
> > Watching postgresql logs I see the following messages occur a lot of
> > times in a day:
> >
> > May 3 06:58:44 e-filter postgres[250]: [21-1] LOG: server process
> > (PID 59608) was terminated by signal 10
> > May 3 06:58:44 e-filter postgres[250]: [22-1] LOG: terminating any
> > other active server processes
> > May 3 06:58:44 e-filter postgres[59605]: [21-1] WARNING: terminating
> > connection because of crash of another server process
> > May 3 06:58:44 e-filter postgres[59605]: [21-2] DETAIL: The
> > postmaster has commanded this server process to roll back the current
> > transaction and exit, because another server
> > May 3 06:58:44 e-filter postgres[59605]: [21-3] process exited
> > abnormally and possibly corrupted shared memory.
> > May 3 06:58:46 e-filter postgres[59625]: [21-1] WARNING: terminating > > connection because of crash of another server process > > May 3 06:58:46 e-filter postgres[59625]: [21-2] DETAIL: The > > postmaster has commanded this server process to roll back the current > > transaction and exit, because another server > > May 3 06:58:46 e-filter postgres[59625]: [21-3] process exited > > abnormally and possibly corrupted shared memory. > > May 3 06:58:46 e-filter postgres[59625]: [21-4] HINT: In a moment > > you should be able to reconnect to the database and repeat your > > command. > > May 3 06:58:46 e-filter postgres[59622]: [21-1] WARNING: terminating > > connection because of crash of another server process > > May 3 06:58:46 e-filter postgres[59622]: [21-2] DETAIL: The > > postmaster has commanded this server process to roll back the current > > transaction and exit, because another server > > May 3 06:58:46 e-filter postgres[59622]: [21-3] process exited > > abnormally and possibly corrupted shared memory. > > May 3 06:58:46 e-filter postgres[59622]: [21-4] HINT: In a moment > > you should be able to reconnect to the database and repeat your > > command. 
> > May 3 06:58:46 e-filter postgres[59621]: [22-1] WARNING: terminating > > connection because of crash of another server process > > May 3 06:58:49 e-filter postgres[250]: [23-1] LOG: all server > > processes terminated; reinitializing > > May 3 06:58:51 e-filter postgres[13478]: [24-1] LOG: database system > > was interrupted at 2005-05-03 06:58:16 EST > > May 3 06:58:51 e-filter postgres[13478]: [25-1] LOG: checkpoint > > record is at 14/99F69378 > > May 3 06:58:51 e-filter postgres[13478]: [26-1] LOG: redo record is > > at 14/99F69378; undo record is at 0/0; shutdown FALSE > > May 3 06:58:51 e-filter postgres[13478]: [27-1] LOG: next > > transaction ID: 3639687; next OID: 388415 > > May 3 06:58:51 e-filter postgres[13478]: [28-1] LOG: database system > > was not properly shut down; automatic recovery in progress > > May 3 06:58:51 e-filter postgres[13478]: [29-1] LOG: redo starts at > > 14/99F693B4 > > May 3 06:58:53 e-filter postgres[13478]: [30-1] LOG: record with > > zero length at 14/9AE223F0 > > May 3 06:58:53 e-filter postgres[13478]: [31-1] LOG: redo done at 14/9AE223C8 > > May 3 06:58:54 e-filter postgres[13484]: [24-1] FATAL: the database > > system is starting up > > May 3 06:58:54 e-filter postgres[13485]: [24-1] FATAL: the database > > system is starting up > > May 3 06:58:55 e-filter postgres[13488]: [24-1] FATAL: the database > > system is starting up > > May 3 06:58:57 e-filter postgres[13478]: [32-1] LOG: database system is ready > > > > and some time latter its ocur again: > > May 3 09:59:38 e-filter postgres[250]: [24-1] LOG: server process > > (PID 34743) was terminated by signal 10 > > May 3 09:59:38 e-filter postgres[250]: [25-1] LOG: terminating any > > other active server processes > > May 3 09:59:38 e-filter postgres[35215]: [24-1] WARNING: terminating > > connection because of crash of another server process > > May 3 09:59:38 e-filter postgres[35215]: [24-2] DETAIL: The > > postmaster has commanded this server process to roll back the 
current transaction and exit, because another server
> > May 3 09:59:38 e-filter postgres[35215]: [24-3] process exited abnormally and possibly corrupted shared memory.
> > May 3 09:59:38 e-filter postgres[35215]: [24-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
> > May 3 09:59:38 e-filter postgres[34744]: [24-1] WARNING: terminating connection because of crash of another server process
> > May 3 09:59:38 e-filter postgres[34744]: [24-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
> > May 3 09:59:38 e-filter postgres[33592]: [24-1] WARNING: terminating connection because of crash of another server process
> > May 3 09:59:38 e-filter postgres[34744]: [24-3] process exited abnormally and possibly corrupted shared memory.
> >
> > This is my postgresql.conf
> >
> > max_connections = 70
> > superuser_reserved_connections = 2
> > shared_buffers = 81920
> > work_mem = 10240
> > maintenance_work_mem = 51200
> > fsync = true
> > checkpoint_segments = 8
> > effective_cache_size = 100000
> > log_destination = 'syslog'
> > silent_mode = true
> > lc_messages = 'C'
> > lc_monetary = 'C'
> > lc_numeric = 'C'
> > lc_time = 'C'
> >
> > and the shared memory configuration:
> >
> > kern.ipc.shmmax: 700000000
> > kern.ipc.shmmin: 1
> > kern.ipc.shmmni: 192
> > kern.ipc.shmseg: 256
> > kern.ipc.shmall: 700000000
> >
> > I have some configuration error that could result in this kind of problem ?
> >
> > Any ideas ? Any thoughts ?
> >
> > Best Regards,
> > Alexandre
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 4: Don't 'kill -9' the postmaster
> >
> > --
> > Vlad
> -- Vlad
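The crash reports in the quoted syslog output can be pulled out mechanically, which helps when lining up crash times against query logs. What follows is a minimal sketch that assumes the log format shown in this thread; the name `find_crashes` and the regular expression are illustrative, not part of PostgreSQL.

```python
import re

# Matches the postmaster line that reports a crashed backend, for example:
# "LOG: server process (PID 59608) was terminated by signal 10"
CRASH_RE = re.compile(
    r"^(?P<when>\w{3}\s+\d+\s+[\d:]{8}).*"
    r"server process \(PID (?P<pid>\d+)\) was terminated by signal (?P<sig>\d+)"
)

def find_crashes(lines):
    """Return (timestamp, pid, signal) for every crash report in the log lines."""
    crashes = []
    for line in lines:
        m = CRASH_RE.search(line)
        if m:
            crashes.append((m.group("when"), int(m.group("pid")), int(m.group("sig"))))
    return crashes

# Three lines taken from the log excerpts in this thread.
sample = [
    "May 3 06:58:44 e-filter postgres[250]: [21-1] LOG: server process (PID 59608) was terminated by signal 10",
    "May 3 06:58:44 e-filter postgres[250]: [22-1] LOG: terminating any other active server processes",
    "May 3 09:59:38 e-filter postgres[250]: [24-1] LOG: server process (PID 34743) was terminated by signal 10",
]

for when, pid, sig in find_crashes(sample):
    print(f"{when}: backend {pid} died with signal {sig}")
```

Only the postmaster's own "terminated by signal" line is matched; the WARNING/DETAIL/HINT lines from the other backends are consequences of the crash and are skipped.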
>>You need to find out what's triggering that. Turning on query logging
>>would be a good way of investigating.

Which directives can I use to enable this?
debug_print_parse? debug_print_rewritten? debug_print_plan?
debug_pretty_print?

>>Rather large, shared buffers for a machine with only 1 gig of ram. 640
>>Meg of RAM means the kernel is basically double buffering everything.
>>have you tested with smaller settings and this setting was the best?

I had 256 MB of RAM, then I increased it to 1 GB, thinking this could be
an out-of-memory problem or faulty memory. After this "upgrade" I
increased the number of shared buffers, etc.

It's important to note that the maximum memory usage reaches only 80%.

What values do you suggest?

>>You might want to look in your signal man page on BSD and see what
>>signal 10 means. On solaris it's a bus error. Not a clue what it is in
>>FreeBSD myself though.

The FreeBSD man page says: 10 SIGBUS

The system does not generate a core dump file for this error.

Regards,
On Tue, May 03, 2005 at 09:54:03AM -0500, Scott Marlowe wrote:
>
> You might want to look in your signal man page on BSD and see what
> signal 10 means. On solaris it's a bus error. Not a clue what it is in
> FreeBSD myself though.

Signal 10 is SIGBUS (bus error) on FreeBSD 4.11. Somewhere under
$PGDATA there might be a core dump named postmaster.core (or, more
specifically, with a file name based on the kern.corefile sysctl
setting) -- if there is, then a debugger like gdb might be able to
show where the problem happened, especially if the postmaster was
built with debugging info.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
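Signal numbers are not portable across operating systems, which is why the number in the log has to be looked up per platform. Here is a small sketch of that lookup. The table is hand-written for illustration: the FreeBSD and Solaris entries are the ones stated in this thread, the Linux/x86 entries are added here for contrast, and `describe_signal` and `local_signal_name` are hypothetical helper names.

```python
import signal

# Hand-written lookup for illustration only. The FreeBSD and Solaris entries
# are the ones stated in the thread; the Linux/x86 entries are an addition.
SIGNAL_NAMES = {
    "freebsd": {10: "SIGBUS"},
    "solaris": {10: "SIGBUS"},
    "linux-x86": {7: "SIGBUS", 10: "SIGUSR1"},
}

def describe_signal(platform, number):
    """Name signal `number` on `platform`, or 'unknown' if it is not in the table."""
    return SIGNAL_NAMES.get(platform, {}).get(number, "unknown")

def local_signal_name(number):
    """Ask the running interpreter what `number` means on this machine."""
    try:
        return signal.Signals(number).name
    except ValueError:
        return "unknown"

print(describe_signal("freebsd", 10))    # SIGBUS
print(describe_signal("linux-x86", 10))  # SIGUSR1
print(local_signal_name(10))             # depends on the host OS
```

The practical point is that the authoritative answer comes from the signal man page on the machine that produced the log, as was done above for FreeBSD 4.11.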
On Tue, May 03, 2005 at 01:36:13PM -0300, Alexandre Biancalana wrote:
>
> The system does not generate core dump file for this error.....

Are you sure? Where did you look and what file name did you look
for? Unless you've changed the kern.corefile sysctl setting, the
file should be named "postgres.core", not just "core", and it should
be somewhere under $PGDATA. Whether a core file is produced is
also affected by the kern.coredump sysctl setting and the coredumpsize
resource limit.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
On Tue, May 03, 2005 at 10:37:03AM -0600, Michael Fuhr wrote:
>
> Signal 10 is SIGBUS (bus error) on FreeBSD 4.11. Somewhere under
> $PGDATA there might be a core dump named postmaster.core

Correction: the core dump should be named postgres.core (at least
it is on my FreeBSD 4.11-STABLE system if I send the backend a
signal 10).

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
On Tue, 2005-05-03 at 11:36, Alexandre Biancalana wrote:
> >>You need to find out what's triggering that. Turning on query logging
> >>would be a good way of investigating.
>
> Which directives can I use to enable this ?
> debug_print_parse ? debug_print_rewritten ? debug_print_plan ?
> debug_pretty_print ?
>
>
> >>Rather large, shared buffers for a machine with only 1 gig of ram. 640
> >>Meg of RAM means the kernel is basically double buffering everything.
> >>have you tested with smaller settings and this setting was the best?
>
> I had 256 of RAM then I increase to 1GB thinking this could be a
> problem of out of memory or a buggy memory...... After this "upgrade"
> I increase the numbers of shared buffers,etc
>
> It's important to say that the max memory usage reach to only 80%.
>
> What values do you suggest ?

Generally 25% of the memory or 256 megs, whichever is less. In your
case, they're the same. The reasoning is that the kernel caches data,
while postgresql only holds onto data as long as it needs it and then
frees it. A really huge buffer space lets postgresql flush the kernel
cache, so the next access, after postgresql has freed the memory that
was holding the data, has to go to disk.

The kernel is generally a lot better at caching than most apps.

So, 32768 is about as big as I'd normally go, and even that may be more
than you really need. Note that there's overhead in managing such a
large buffer as well. With pgsql 8.x and the new caching algorithms in
place, such overhead may be lower, and larger buffer settings may be in
order. But if testing hasn't shown them to be faster, I'd avoid them
for now and see if your signal 10 errors start going away.

If they do, then you've likely got a kernel bug in there somewhere. If
they don't, I'd suspect bad hardware.

> >>You might want to look in your signal man page on BSD and see what
> >>signal 10 means. On solaris it's a bus error. Not a clue what it is in
> >>FreeBSD myself though.
>
> FreeBSD man page say: 10 SIGBUS
>
> The system does not generate core dump file for this error.....
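The rule of thumb in this reply (25% of RAM or 256 MB, whichever is less) can be turned into a shared_buffers value, which is counted in 8 kB pages. This is only a sketch of the arithmetic: `suggested_shared_buffers` is a hypothetical helper, and the 8 kB page size is the default block size assumed here.

```python
PAGE_KB = 8  # shared_buffers is counted in 8 kB pages (default block size)

def suggested_shared_buffers(ram_mb, cap_mb=256, fraction=0.25):
    """Return (megabytes, pages) for min(fraction of RAM, cap)."""
    target_mb = min(ram_mb * fraction, cap_mb)
    pages = int(target_mb * 1024 // PAGE_KB)
    return target_mb, pages

for ram_mb in (256, 1024, 4096):
    mb, pages = suggested_shared_buffers(ram_mb)
    print(f"{ram_mb} MB RAM -> shared_buffers = {pages} ({mb:.0f} MB)")
```

For the 1 GB machine discussed here both limits coincide at 256 MB, which is where the figure 32768 comes from.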
On 5/3/05, Scott Marlowe <smarlowe@g2switchworks.com> wrote:
> On Tue, 2005-05-03 at 11:36, Alexandre Biancalana wrote:
> > What values do you suggest ?
>
> So, 32768 is about as big as i'd normally go, and even that may be more
> than you really need.
>
> > FreeBSD man page say: 10 SIGBUS
> >
> > The system does not generate core dump file for this error.....

Hi Michael,

Here is my /etc/sysctl.conf:

kern.corefile="/var/coredumps/%N.%P.core"
kern.sugid_coredump=1

As I said before, there is no core file in /var/coredumps. I should
add that this setup for storing core files works; I have used it a
lot in the past.

Thanks Scott, I will lower shared_buffers to 32768 and try again, but
what about work_mem, maintenance_work_mem and effective_cache_size?
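The kern.corefile value above is a template: per FreeBSD's core(5), %N expands to the process name and %P to the PID. Below is a sketch of that expansion, useful for knowing which file name to look for after a crash. `expand_corefile` is a hypothetical helper that handles only the %N, %P and %U specifiers; the default template is assumed to be `%N.core` (consistent with the postgres.core name mentioned earlier), and `/path/to/pgdata` is a placeholder for $PGDATA.

```python
import posixpath

def expand_corefile(template, name, pid, uid=0, cwd="."):
    """Expand a kern.corefile-style template into the expected core file path.

    Handles only %N (process name), %P (PID) and %U (UID); anything else is
    copied through unchanged. A relative result is resolved against `cwd`,
    the working directory of the crashed process.
    """
    substitutions = {"N": name, "P": str(pid), "U": str(uid)}
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "%" and i + 1 < len(template) and template[i + 1] in substitutions:
            out.append(substitutions[template[i + 1]])
            i += 2
        else:
            out.append(ch)
            i += 1
    # posixpath.join returns the second argument unchanged when it is absolute.
    return posixpath.join(cwd, "".join(out))

# Assumed default template: the core lands in the backend's working directory.
print(expand_corefile("%N.core", "postgres", 59608, cwd="/path/to/pgdata"))
# Template from the sysctl.conf above, with the PID of the first crashed backend.
print(expand_corefile("/var/coredumps/%N.%P.core", "postgres", 59608))
```

With the template from this message, a dump for backend 59608 would be expected at /var/coredumps/postgres.59608.core, provided the directory is writable by the user the server runs as and core dumps are not disabled by kern.coredump or the coredumpsize limit.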
On Tue, 2005-05-03 at 12:25, Alexandre Biancalana wrote:
> Thanks Scott I will lower shared_buffers to 32768 and try again, but
> how about work_mem, maintenance_work_mem, effective_cache_size ??

work_mem is how much memory things like sorts can allocate. The right
value depends on the kind of parallel load you expect to handle. If
you'll never have more than a dozen or so open connections that could
be doing sorts (select distinct, order by, union, etc.), then having it
be 10 to 20 meg is fine. If you're going to handle hundreds or even
thousands of connections, you have to be careful that it's not big
enough to run your machine out of memory, or you'll start getting swap
storms.

maintenance_work_mem is used by processes like vacuum, which tend to be
run one at a time, so having it be fairly large, like 32 to 64 meg, is
no big issue.

Note that you can set either of these settings higher for one-shot
things, like nightly maintenance, if you need to keep them lower during
the day to ensure proper operation.

effective_cache_size is a setting that simply tells the query planner
about how much of your data set the kernel / OS is caching. Generally
the cached value shown in top or some other system monitor on a
dedicated machine is about right.
work_mem and maintenance_work_mem are in 1k increments, while the other
two (shared_buffers and effective_cache_size) are in 8k increments, btw.
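Applying those units to the postgresql.conf values from the original post makes the sizes easier to compare against the machine's 1 GB of RAM. This is a rough sketch; the worst-case figure assumes exactly one sort per connection, which is a simplification introduced here (a single query can run several sorts at once).

```python
KB_PER_MB = 1024

# Values from the postgresql.conf posted at the start of the thread.
settings = {
    "shared_buffers": 81920,         # 8 kB pages
    "effective_cache_size": 100000,  # 8 kB pages
    "work_mem": 10240,               # kB
    "maintenance_work_mem": 51200,   # kB
}
# Size in kB of one unit of each setting, as described above.
UNIT_KB = {
    "shared_buffers": 8,
    "effective_cache_size": 8,
    "work_mem": 1,
    "maintenance_work_mem": 1,
}
max_connections = 70

def to_mb(name, value):
    """Convert a raw setting value to megabytes using its unit size."""
    return value * UNIT_KB[name] / KB_PER_MB

for name, value in settings.items():
    print(f"{name} = {value} -> {to_mb(name, value):.2f} MB")

# Simplified worst case: every allowed connection runs one sort at once.
worst_case_sorts_mb = max_connections * to_mb("work_mem", settings["work_mem"])
print(f"worst-case sort memory: {worst_case_sorts_mb:.0f} MB")
```

Under these assumptions the original configuration asks for 640 MB of shared buffers and could, in principle, add up to 700 MB of sort memory on top, which is more than the machine has.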
Thank you for the detailed explanation, Scott, it is very handy!

I reduced shared_buffers to 32768, but the problem still occurs.

Any other ideas?
On Tue, 2005-05-03 at 15:04, Alexandre Biancalana wrote:
> Thank you for the detailed explanation Scott, they are very handy !!
>
> I reduced the shared_buffers to 32768, but the problem still occurs.....
>
> Any other idea ??

Yeah, I had a sneaking suspicion that shared_buffers wasn't really
causing the issue.

Sounds like either a hardware fault or a BSD bug. I'd check the BSD
mailing lists for mention of such a bug, and see if you can grab a
spare drive, install the last stable version of FreeBSD 4.x on it, and
check whether that fixes the problem.

If you decide to try linux, avoid the 2.6 kernel, it's still got
issues... 2.4 is pretty stable.

I really doubt it's a problem in postgresql itself though.
Ohhh god :(

This FreeBSD is the latest STABLE version. I can try to change some
hardware; I already changed the memory. What can I try now? The
processor? The motherboard?

On 5/3/05, Scott Marlowe <smarlowe@g2switchworks.com> wrote:
> On Tue, 2005-05-03 at 15:04, Alexandre Biancalana wrote:
> > Thank you for the detailed explanation Scott, they are very handy !!
> >
> > I reduced the shared_buffers to 32768, but the problem still occurs.....
> >
> > Any other idea ??
>
> Yeah, I had a sneaking suspicion that shared_buffers wasn't causing the
> issue really.
>
> Sounds like either a hardware fault, or a BSD bug. I'd check the BSD
> mailing lists for mention of said bug, and see if you can grab a spare
> drive and install the last stable version of FreeBSD 4.x and if that
> fixes the problem.
>
> If you decide to try linux, avoid the 2.6 kernel, it's still got
> issues... 2.4 is pretty stable.
>
> I really doubt it's a problem in postgresql itself though.
>
On Tue, 2005-05-03 at 15:56, Alexandre Biancalana wrote:
> Ohhh god :(
>
> The FreeBSD is the last STABLE version..... I can try to change some
> hardware, I already changed memory, what can I try now ? the processor
> ? motherboard ??

You're running FreeBSD 5, right? I'd try to find the last version of 4,
put it on a spare drive, and see if that works or has the same
problem. If you're running 4, then I'd try a spare machine to see if
the problem follows BSD or the hardware.

If the error really is a bus error, then this problem is way out of the
realm of what I'm familiar with, especially with regard to BSD.
# biancalana@gmail.com / 2005-05-03 17:56:53 -0300:
> The FreeBSD is the last STABLE version..... I can try to change some
> hardware, I already changed memory, what can I try now ? the processor
> ? motherboard ??

> On 5/3/05, Scott Marlowe <smarlowe@g2switchworks.com> wrote:
> > On Tue, 2005-05-03 at 15:04, Alexandre Biancalana wrote:
> > > Thank you for the detailed explanation Scott, they are very handy !!
> > >
> > > I reduced the shared_buffers to 32768, but the problem still occurs.....
> > >
> > > Any other idea ??
> >
> > Yeah, I had a sneaking suspicion that shared_buffers wasn't causing the
> > issue really.
> >
> > Sounds like either a hardware fault, or a BSD bug. I'd check the BSD
> > mailing lists for mention of said bug, and see if you can grab a spare
> > drive and install the last stable version of FreeBSD 4.x and if that
> > fixes the problem.
> >
> > If you decide to try linux, avoid the 2.6 kernel, it's still got
> > issues... 2.4 is pretty stable.
> >
> > I really doubt it's a problem in postgresql itself though.

For the sake of archives, what was causing the SIGBUSes?

--
How many Vietnam vets does it take to screw in a light bulb?
You don't know, man. You don't KNOW.
Cause you weren't THERE. http://bash.org/?255991
I switched from postgresql to mysql and everything is great now ;)

Same machine, same OS, etc...

On 6/2/05, Roman Neuhauser <neuhauser@sigpipe.cz> wrote:
> For the sake of archives, what was causing the SIGBUSes?