Thread: Degradation of postgres 7.4.5 on FreeBSD/CygWin
Hi All,

I'm really desperate about this. The problem has occurred at both of my customer sites, first with Cygwin and now with FreeBSD 5.3. After 2 months, postgres performance starts to degrade: simple queries that should run in 100ms now take about 15 secs. Also, the data keeps growing far too much, for no apparent reason, as the comparison below shows. So, to solve the problem (for the 5th time now), I made a backup, dropped the entire database, recreated it and reimported the dump. A friend of mine reported the same problem on Linux and went back to 7.3.x; with me it has now happened 5 times.

The old data directory had these sizes:

$ du -ks * | sort -nr
1379872 base
131202  pg_xlog
390     global
336     serverlog
74      pg_clog
8       postgresql.conf
4       pg_hba.conf
2       postmaster.opts
2       pg_ident.conf
2       PG_VERSION

The reimported database has these sizes:

$ du -ks * | sort -nr
916496  base
131202  pg_xlog
134     global
14      serverlog
10      pg_clog
8       postgresql.conf
4       pg_hba.conf
2       postmaster.pid
2       postmaster.opts
2       pg_ident.conf
2       PG_VERSION

After the reimport this procedure takes 100 ms, but before it took about 15 secs; a process that handles 1000 items took about 4 hours to finish, and after the reimport only 5 minutes. The bottleneck is this recursive procedure, which is called as part of other procedures but contains only a simple query:

CREATE OR REPLACE FUNCTION Produt_Repos(numeric, double precision, integer, integer)
  RETURNS double precision AS '
DECLARE
    xCodPro ALIAS FOR $1;
    xPesFor ALIAS FOR $2;
    xAno    ALIAS FOR $3;
    xMes    ALIAS FOR $4;
    oMatPro RECORD;
    xPreRep DOUBLE PRECISION;
    nPreRep DOUBLE PRECISION;
    xQtdKgs DOUBLE PRECISION;
    xPreCus DOUBLE PRECISION;
BEGIN
    xPreRep := 0;
    IF xPesFor <> 0 THEN
        FOR oMatPro IN SELECT a.qtdpro, a.codmat, b.pesfor
                         FROM matpro a, produt b
                        WHERE a.codpro = xCodPro
                          AND b.codpro = a.codmat
        LOOP
            xQtdKgs := oMatPro.qtdpro / xPesFor;

            -- recurse into the component product
            nPreRep := Produt_Repos(oMatPro.codmat,
                                    coalesce(oMatPro.pesfor, 0.0),
                                    xAno, xMes);
            xPreRep := xPreRep + (nPreRep * xQtdKgs);

            -- if recursion yields nothing, fall back to the stored cost for that month
            IF nPreRep = 0 THEN
                SELECT coalesce(precus, 0.0) INTO xPreCus
                  FROM produt_fecha
                 WHERE codpro = oMatPro.codmat
                   AND ano = xAno
                   AND mes = xMes
                 LIMIT 1;
                xPreRep := xPreRep + (xPreCus * xQtdKgs);
            END IF;
        END LOOP;
    END IF;
    RETURN xPreRep;
END;
' LANGUAGE 'plpgsql';

These are my kernel settings:

msginfo:
    msgmax: 16384      (max characters in a message)
    msgmni: 40         (# of message queues)
    msgmnb: 2048       (max characters in a message queue)
    msgtql: 40         (max # of messages in system)
    msgssz: 8          (size of a message segment)
    msgseg: 2048       (# of message segments in system)
shminfo:
    shmmax: 163840000  (max shared memory segment size)
    shmmin: 1          (min shared memory segment size)
    shmmni: 4000       (max number of shared memory identifiers)
    shmseg: 128        (max shared memory segments per process)
    shmall: 40000      (max amount of shared memory in pages)
seminfo:
    semmap: 30         (# of entries in semaphore map)
    semmni: 40961      (# of semaphore identifiers)
    semmns: 16380      (# of semaphores in system)
    semmnu: 30         (# of undo structures in system)
    semmsl: 16380      (max # of semaphores per id)
    semopm: 100        (max # of operations per semop call)
    semume: 10         (max # of undo entries per process)
    semusz: 92         (size in bytes of undo structure)
    semvmx: 32767      (semaphore maximum value)
    semaem: 16384      (adjust on exit max value)

And my changed postgresql.conf settings:

max_connections = 30
shared_buffers = 8192        # min 16, at least max_connections*2, 8KB each
sort_mem = 32768             # min 64, size in KB
vacuum_mem = 32768           # min 1024, size in KB
max_fsm_pages = 40000        # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 2000     # min 100, ~50 bytes each

These are my crontab activities:

$ crontab -l
00 13 * * 1-5 /bin/sh /home/postgres/backup.sh >/dev/null 2>&1
00 19 * * 1-5 /bin/sh /home/postgres/backup.sh >/dev/null 2>&1
00 23 * * 1-5 /usr/local/pgsql/bin/psql supre -c "vacuum analyze;" >>/dev/null 2>&1
00 23 * * 6 /usr/local/pgsql/bin/psql supre -c "reindex database supre;" >>/dev/null 2>&1
00 23 * * 7 /usr/local/pgsql/bin/psql supre -c "vacuum full analyze;" >>/dev/null 2>&1

So guys, I'm really desperate about this issue, and I think I'm doing everything right. Please help me. If I tell my customer that he is having the same problem as with the Cygwin version, after spending money to move from Windows to FreeBSD, upgrading the server, etc., he will probably kill me. :)

Best Regards
Rodrigo Moreno
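P.S. If it helps, I can capture the plan of the query inside the loop by running it by hand; something like this (the product code 1234 here is just a placeholder, not real data):

EXPLAIN ANALYZE
SELECT a.qtdpro, a.codmat, b.pesfor
  FROM matpro a, produt b
 WHERE a.codpro = 1234
   AND b.codpro = a.codmat;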
On Fri, Feb 18, 2005 at 11:54:34AM -0300, Rodrigo Moreno wrote:
> 00 23 * * 1-5 /usr/local/pgsql/bin/psql supre -c "vacuum analyze;" >>/dev/null 2>&1

Isn't vacuum once a day a bit too little with heavy activity? You should probably consider autovacuum.

> 00 23 * * 6 /usr/local/pgsql/bin/psql supre -c "reindex database supre;" >>/dev/null 2>&1

REINDEX DATABASE (AFAIK) only rebuilds the indexes on the system tables in the database.

/* Steinar */
--
Homepage: http://www.sesse.net/
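If you want your user tables' indexes rebuilt as well, you'd have to reindex them explicitly; a sketch, using the table names from the function you posted (I'm only assuming these are the hot ones):

REINDEX TABLE matpro;
REINDEX TABLE produt;
REINDEX TABLE produt_fecha;

(Note that on 7.4 "autovacuum" means the contrib/pg_autovacuum daemon; it is not built into the server yet.)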
"Rodrigo Moreno" <rodrigo.miguel@terra.com.br> writes: > After 2 months, postgres start get down the performance, and simple queries > that should run in 100ms now tooks about 15 secs. > Another behaviour, the data is growing to much, with no reason, just like > the comparision. Are you vacuuming on a regular basis? Do you have the FSM settings high enough to cover the database? regards, tom lane
On Fri, Feb 18, 2005 at 09:32:25AM -0500, Tom Lane wrote:
> Are you vacuuming on a regular basis? Do you have the FSM settings high
> enough to cover the database?

He posted his cron settings ;-)

/* Steinar */
--
Homepage: http://www.sesse.net/
Hi,

This is a maximum of 15 concurrent connections, and it is not a heavily loaded database, so I don't think it needs to be vacuumed more than once a day.

The other customer has only 5 users and the database is 300 MB -- a small database -- and it shows the same behaviour (postgresql.conf was never modified there). In my first installation I didn't change anything in postgresql.conf, but on this new server (FreeBSD) I have changed some parameters.

As shown in my crontab list, I think this is enough:

00 13 * * 1-5 /bin/sh /home/postgres/backup.sh >/dev/null 2>&1
00 19 * * 1-5 /bin/sh /home/postgres/backup.sh >/dev/null 2>&1
00 23 * * 1-5 /usr/local/pgsql/bin/psql supre -c "vacuum analyze;" >>/dev/null 2>&1
00 23 * * 6 /usr/local/pgsql/bin/psql supre -c "reindex database supre;" >>/dev/null 2>&1
00 23 * * 7 /usr/local/pgsql/bin/psql supre -c "vacuum full analyze;" >>/dev/null 2>&1

These are my changed settings in postgresql.conf:

max_connections = 30
shared_buffers = 8192
sort_mem = 32768
vacuum_mem = 32768
max_fsm_pages = 40000
max_fsm_relations = 2000

But why, after 2 months, is the database 1.3 GB, while after a reimport it is only 900 MB? Both customers have small databases, and one of them holds 8 years of data -- that's the reason for the 900 MB size. These really are small databases.

Regards
Rodrigo Moreno
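P.S. To see which relations are taking up the extra space, I can run this (relpages is counted in 8 kB blocks and is only as fresh as the last VACUUM or ANALYZE):

SELECT relname, relkind, relpages
  FROM pg_class
 ORDER BY relpages DESC
 LIMIT 10;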
> This is a maximum of 15 concurrent connections, and it is not a heavily
> loaded database, so I don't think it needs to be vacuumed more than once
> a day.
>
> The other customer has only 5 users and the database is 300 MB -- a small
> database -- and it shows the same behaviour (postgresql.conf was never
> modified there). In my first installation I didn't change anything in
> postgresql.conf, but on this new server (FreeBSD) I have changed some
> parameters.
>
> As shown in my crontab list, I think this is enough:
> 00 13 * * 1-5 /bin/sh /home/postgres/backup.sh >/dev/null 2>&1
> 00 19 * * 1-5 /bin/sh /home/postgres/backup.sh >/dev/null 2>&1
> 00 23 * * 1-5 /usr/local/pgsql/bin/psql supre -c "vacuum analyze;"

We just told you - it's nowhere near enough. Vacuum once an hour. The size of the database is not what matters; the size of the changes is.

Chris
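For example, reusing the paths from your own crontab, an hourly vacuum could look like this:

0 * * * * /usr/local/pgsql/bin/psql supre -c "vacuum analyze;" >/dev/null 2>&1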
> 00 23 * * 1-5 /usr/local/pgsql/bin/psql supre -c "vacuum analyze;"

Also, this is bad - you are not vacuuming all your databases, which will one day cause you data loss through transaction ID wraparound. Use the vacuumdb utility that comes with PostgreSQL instead.

Chris
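For example, your nightly job could become the following (vacuumdb's --all covers every database in the cluster, including template1):

00 23 * * * /usr/local/pgsql/bin/vacuumdb --all --analyze >/dev/null 2>&1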
"Rodrigo Moreno" <rodrigo.miguel@terra.com.br> writes: > max_fsm_pages = 40000 > max_fsm_relations = 2000 > But why after 2 months the database has 1.3gb and after reimport on 900mb ? 40k pages = 320M bytes = 1/3rd of your database. Perhaps you need a larger setting for max_fsm_pages. However, 30% bloat of the database doesn't particularly bother me, especially when you are using infrequent vacuums. Bear in mind that, for example, the steady-state fill factor of a b-tree index is usually estimated at less than 70%. A certain amount of wasted space is not only intended, but essential for reasonable performance. What you need is to take a more detailed look at the behavior of that function that's getting so slow. Are the query plans changing? Is the loop iterating over many more rows than before? You haven't told us anything that would account for 100x slowdown. regards, tom lane
Thanks to all. At this moment I can't stop the database and put the old one back, but tonight I will run more analysis on both the old database and the reimported one, and I will post the results here.

Thanks a lot
Rodrigo

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Friday, February 18, 2005 12:00
To: Rodrigo Moreno
Cc: pgsql-performance@postgresql.org
Subject: Re: RES: [PERFORM] Degradation of postgres 7.4.5 on FreeBSD/CygWin

[snip]
Hi all,

I got further improvements using the vacuumdb utility, and the size of my database decreased from 1.3 GB to 900 MB.

Only one thing is not right yet. My procedure calls 7 other subprocedures, and with the reimported database it takes about 5 minutes to complete. With the old, vacuumed database the same process takes 20 minutes. That's much better than the 4 hours before, but there is still a gap.

I have now scheduled vacuumdb --analyze once a day and vacuumdb --analyze --all --full once a week; I think this is enough.

Next I'll look at reindexing the tables and analyze each query in the procedure. When I have more results, I'll post them here.

Thanks a Lot
Rodrigo Moreno
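P.S. For the per-query analysis, I plan to time the function directly from psql (\timing is a psql feature; the argument values below are just placeholders):

\timing
SELECT Produt_Repos(1234, 1.0, 2005, 2);

and then run EXPLAIN ANALYZE on each SELECT inside the subprocedures to see whether any plan has changed.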