Thread: ERROR: invalid memory alloc request size 4294967293
Hello,

I currently use PostgreSQL version 8.4.6-0ubuntu10.04 on a Bacula server to store information about Bacula backups. Two days ago, PostgreSQL started returning an error when I run pg_dump. This is the error:

====================
02-Jan 06:32 bacula-dir JobId 31005: BeforeJob: pg_dump: SQL command failed
02-Jan 06:32 bacula-dir JobId 31005: BeforeJob: pg_dump: Error message from server: ERROR: invalid memory alloc request size 4294967293
02-Jan 06:32 bacula-dir JobId 31005: BeforeJob: pg_dump: The command was: COPY public.file (fileid, fileindex, jobid, pathid, filenameid, markid, lstat, md5) TO stdout;
02-Jan 06:32 bacula-dir JobId 31005: BeforeJob: pg_dumpall: pg_dump failed on database "bacula", exiting
====================

The PostgreSQL logs show the same message:

====================
2011-01-02 06:31:07 CLST ERROR: table "delcandidates" does not exist
2011-01-02 06:31:07 CLST STATEMENT: DROP TABLE DelCandidates
2011-01-02 06:32:15 CLST ERROR: invalid memory alloc request size 4294967293
2011-01-02 06:32:15 CLST STATEMENT: COPY public.file (fileid, fileindex, jobid, pathid, filenameid, markid, lstat, md5) TO stdout;
2011-01-02 06:32:15 CLST LOG: could not receive data from client: Connection reset by peer
2011-01-02 06:32:15 CLST LOG: unexpected EOF on client connection
====================

I searched the internet for a long time and found many similar messages and cases, but no solution. :(

So far I have tried:

- Checked the RAM: OK. I ran memtest 12 times with no errors.
- Checked the disk: OK. I ran fsck and badblocks with no errors.
- Backed up the data: I renamed /var/lib/postgresql to /var/lib/postgresql-old and then copied the contents of /var/lib/postgresql-old back to /var/lib/postgresql, with no errors.
- Checked the logs: I see no significant error messages (only the one above).
- Changed the server: I moved the current data to a new server and have the same problem (with this test, I rule out a hardware problem).

If I run this command in the console:

su - postgres -c "psql bacula -c 'COPY public.file (fileid, fileindex, jobid, pathid, filenameid, markid, lstat, md5) TO stdout'" | wc -l

the first time I get a total of 1417599 lines. If I run it a second time or more, I get only 114. But if I restart the server and run the command again, I get 1417599 again.

=========
su - postgres -c "psql bacula -c 'COPY public.file (fileid, fileindex, jobid, pathid, filenameid, markid, lstat, md5) TO stdout'" | wc
ERROR: invalid memory alloc request size 4294967293
1417599 32604777 173355793

su - postgres -c "psql bacula -c 'COPY public.file (fileid, fileindex, jobid, pathid, filenameid, markid, lstat, md5) TO stdout'" | wc
ERROR: invalid memory alloc request size 4294967293
114 2622 14192

su - postgres -c "psql bacula -c 'COPY public.file (fileid, fileindex, jobid, pathid, filenameid, markid, lstat, md5) TO stdout'" | wc
ERROR: invalid memory alloc request size 4294967293
114 2622 14192

su - postgres -c "psql bacula -c 'COPY public.file (fileid, fileindex, jobid, pathid, filenameid, markid, lstat, md5) TO stdout'" | wc
ERROR: invalid memory alloc request size 4294967293
1417599 32604777 173355793

su - postgres -c "psql bacula -c 'COPY public.file (fileid, fileindex, jobid, pathid, filenameid, markid, lstat, md5) TO stdout'" | wc
ERROR: invalid memory alloc request size 4294967293
114 2622 14192
=========

This server is in a datacenter together with many other servers and is on a UPS. Its uptime was ~60 days (I restarted it yesterday). It is a server dedicated to PostgreSQL and Bacula, with 8 GB of RAM (ECC and redundant), and I have run the database dump every day; the error first appeared only two days ago.

So, based on this, I am writing to ask whether anyone has any idea about this problem.

Thanks and regards.
-- -- Victor Hugo dos Santos Linux Counter #224399
On Tue, Jan 4, 2011 at 10:59 AM, Bob Lunney <bob_lunney@yahoo.com> wrote:
> Run ulimit -a and verify the max memory size allowed for the postgres account. (I assume you are running postmaster under the postgres account, right?) The allowed size should be large enough for the postmaster plus shared buffers and several other GUCs that require memory.

Hello,

===============================
$ whoami
postgres
$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 20
file size               (blocks, -f) unlimited
pending signals                 (-i) 16382
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
===============================

Regards.

--
--
Victor Hugo dos Santos
Linux Counter #224399
Run ulimit -a and verify the max memory size allowed for the postgres account. (I assume you are running postmaster under the postgres account, right?) The allowed size should be large enough for the postmaster plus shared buffers and several other GUCs that require memory.

Bob Lunney

--- On Tue, 1/4/11, Victor Hugo dos Santos <listas.vhs@gmail.com> wrote:
> From: Victor Hugo dos Santos <listas.vhs@gmail.com>
> Subject: [ADMIN] ERROR: invalid memory alloc request size 4294967293
> To: pgsql-admin@postgresql.org
> Date: Tuesday, January 4, 2011, 8:48 AM
>
> Hello,
>
> Actually I use postgresql version 8.4.6-0ubuntu10.04 in bacula server
> to save information about backups from bacula.
> But, 2 days ago, the postgresql make a error when I run the command
> pg_dump. This is the error:
> [...]
Hi,

It looks like a 32-bit version, so 4294967293 is too much (4 GB). Use a 64-bit version.

Regards,
Thomas

On 04.01.2011 14:48, Victor Hugo dos Santos wrote:
> Hello,
>
> Actually I use postgresql version 8.4.6-0ubuntu10.04 in bacula server
> to save information about backups from bacula.
> But, 2 days ago, the postgresql make a error when I run the command
> pg_dump. This is the error:
>
> ====================
> 02-Jan 06:32 bacula-dir JobId 31005: BeforeJob: pg_dump: SQL command failed
> 02-Jan 06:32 bacula-dir JobId 31005: BeforeJob: pg_dump: Error message
> from server: ERROR: invalid memory alloc request size 4294967293
> 02-Jan 06:32 bacula-dir JobId 31005: BeforeJob: pg_dump: The command
> was: COPY public.file (fileid, fileindex, jobid, pathid, filenameid,
> markid, lstat, md5) TO stdout;
> 02-Jan 06:32 bacula-dir JobId 31005: BeforeJob: pg_dumpall: pg_dump
> failed on database "bacula", exiting
> ====================
On Tue, Jan 4, 2011 at 11:07 AM, Thomas Markus <t.markus@proventis.net> wrote:
> hi,
>
> looks like a 32bit version so 4294967293 is too much (4GB). Use a 64bit
> version

Hmm, it is a small table (look at the size of the last dump):

=============
$ ls -lh /tmp/public-files.sql
-rw-r----- 1 root root 298M Jan 4 11:59 /tmp/public-files.sql
$ sudo cat /tmp/public-files.sql | wc
2580840 59359163 311685729
=============

But I am downloading the 64-bit image now, and afterwards I will install it in a virtual machine to try it.

Thanks; I look forward to other comments.

--
--
Victor Hugo dos Santos
Linux Counter #224399
On Tue, Jan 4, 2011 at 12:02 PM, Victor Hugo dos Santos <listas.vhs@gmail.com> wrote:
[...]
> but, I'm dowloading the 64bit image now and after I install it in a
> Virtual Machine to try.

How stupid of me! :( I can't restore the same DB from 32-bit on 64-bit! For that, I would first have to export it on 32-bit (pg_dump) and then import it on 64-bit. But in my case pg_dump doesn't work! :(

Any ideas?

Thanks

--
--
Victor Hugo dos Santos
Linux Counter #224399
Victor Hugo dos Santos <listas.vhs@gmail.com> writes:
> any idea ???

It looks like a corrupted-data problem from here. You need to isolate and delete the bad row(s).

regards, tom lane
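A short arithmetic check may help show why the number in the error points at bad data rather than at a real memory shortage. The sketch below is not part of the original thread; `to_int32` is a hypothetical helper name. It relies on two facts: 4294967293 is 2^32 - 3, and PostgreSQL rejects any single allocation request above MaxAllocSize (0x3fffffff, i.e. 1 GB - 1). A request of this size is therefore refused regardless of how much RAM is installed, and the value is more plausibly a garbage length field than a genuine request.

```shell
#!/bin/sh
# PostgreSQL's per-request allocation limit, MaxAllocSize (0x3fffffff = 1 GB - 1).
MAX_ALLOC_SIZE=1073741823

# to_int32 N: reinterpret an unsigned 32-bit value as a signed 32-bit integer.
# (Hypothetical helper for illustration; assumes a shell with 64-bit arithmetic.)
to_int32() {
    if [ "$1" -ge 2147483648 ]; then
        echo $(( $1 - 4294967296 ))
    else
        echo "$1"
    fi
}

size=4294967293
printf 'hex:      0x%X\n' "$size"
printf 'as int32: %s\n' "$(to_int32 "$size")"   # prints -3
if [ "$size" -gt "$MAX_ALLOC_SIZE" ]; then
    echo "exceeds MaxAllocSize ($MAX_ALLOC_SIZE bytes)"
fi
```

Read as a signed 32-bit integer the requested size is -3, which is not a size any healthy row could ask for.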
On Tue, Jan 4, 2011 at 1:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Victor Hugo dos Santos <listas.vhs@gmail.com> writes:
>> any idea ???
>
> It looks like a corrupted-data problem from here. You need to isolate
> and delete the bad row(s).

Hello,

Any suggestions on how to find and delete the bad row(s)?

Thanks

--
--
Victor Hugo dos Santos
Linux Counter #224399
On Tue, Jan 4, 2011 at 1:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Victor Hugo dos Santos <listas.vhs@gmail.com> writes:
>> any idea ???
>
> It looks like a corrupted-data problem from here. You need to isolate
> and delete the bad row(s).

Hello again,

This is very, very strange! :D (Sorry for the long message, but I am trying to send all the information.)

I was trying to isolate the "bad rows" based on this article:
http://archives.postgresql.org/pgsql-admin/2003-06/msg00204.php

First I created a function that runs a SELECT on the column "md5" row by row (md5 and lstat both have errors). After a while, I discovered that the (first) problem row is "1417610". So now I run the command manually:

=========================
# su - postgres -c "psql bacula -c 'SELECT md5 from public.file OFFSET 1417610 LIMIT 1'"
             md5
-----------------------------
 WPC7vlHBLbGDA5XL6bsuBVsVVEM
(1 row)

# su - postgres -c "psql bacula -c 'SELECT md5 from public.file OFFSET 1417610 LIMIT 1'"
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
connection to server was lost

# su - postgres -c "psql bacula -c 'SELECT md5 from public.file OFFSET 1417610 LIMIT 1'"
             md5
-----------------------------
 WPC7vlHBLbGDA5XL6bsuBVsVVEM
(1 row)

# su - postgres -c "psql bacula -c 'SELECT md5 from public.file OFFSET 1417610 LIMIT 1'"
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
connection to server was lost
====================

In other words: if I run the same command twice, I get an error, and if I rerun the same command, it works!!!

In the PostgreSQL logs I then found:

=========================
2011-01-04 16:40:35 CLST STATEMENT: SELECT md5 from public.file OFFSET 800000 LIMIT 1
2011-01-04 16:40:36 CLST DEBUG: reaping dead processes
2011-01-04 16:40:36 CLST DEBUG: server process (PID 15542) was terminated by signal 11: Segmentation fault
2011-01-04 16:40:36 CLST LOG: server process (PID 15542) was terminated by signal 11: Segmentation fault
2011-01-04 16:40:36 CLST LOG: terminating any other active server processes
2011-01-04 16:40:36 CLST DEBUG: sending SIGQUIT to process 15533
2011-01-04 16:40:36 CLST DEBUG: sending SIGQUIT to process 15534
2011-01-04 16:40:36 CLST DEBUG: sending SIGQUIT to process 15535
2011-01-04 16:40:36 CLST DEBUG: sending SIGQUIT to process 15536
2011-01-04 16:40:36 CLST DEBUG: shmem_exit(-1): 0 callbacks to make
2011-01-04 16:40:36 CLST DEBUG: proc_exit(-1): 0 callbacks to make
2011-01-04 16:40:36 CLST DEBUG: shmem_exit(-1): 0 callbacks to make
2011-01-04 16:40:36 CLST DEBUG: proc_exit(-1): 0 callbacks to make
2011-01-04 16:40:36 CLST DEBUG: shmem_exit(-1): 0 callbacks to make
2011-01-04 16:40:36 CLST DEBUG: proc_exit(-1): 0 callbacks to make
2011-01-04 16:40:36 CLST DEBUG: shmem_exit(-1): 0 callbacks to make
2011-01-04 16:40:36 CLST DEBUG: proc_exit(-1): 0 callbacks to make
2011-01-04 16:40:36 CLST DEBUG: reaping dead processes
2011-01-04 16:40:36 CLST LOG: all server processes terminated; reinitializing
2011-01-04 16:40:36 CLST DEBUG: shmem_exit(1): 3 callbacks to make
2011-01-04 16:40:36 CLST DEBUG: invoking IpcMemoryCreate(size=32595968)
2011-01-04 16:40:36 CLST LOG: database system was interrupted; last known up at 2011-01-04 16:40:28 CLST
2011-01-04 16:40:36 CLST DEBUG: checkpoint record is at 6/83C484E4
2011-01-04 16:40:36 CLST DEBUG: redo record is at 6/83C484E4; shutdown TRUE
2011-01-04 16:40:36 CLST DEBUG: next transaction ID: 0/416024; next OID: 156073
2011-01-04 16:40:36 CLST DEBUG: next MultiXactId: 1; next MultiXactOffset: 0
2011-01-04 16:40:36 CLST LOG: database system was not properly shut down; automatic recovery in progress
2011-01-04 16:40:36 CLST LOG: record with zero length at 6/83C48528
2011-01-04 16:40:36 CLST LOG: redo is not required
2011-01-04 16:40:36 CLST DEBUG: transaction ID wrap limit is 2147484295, limited by database "template1"
2011-01-04 16:40:36 CLST DEBUG: shmem_exit(0): 3 callbacks to make
2011-01-04 16:40:36 CLST DEBUG: proc_exit(0): 2 callbacks to make
2011-01-04 16:40:36 CLST DEBUG: exit(0)
2011-01-04 16:40:36 CLST DEBUG: shmem_exit(-1): 0 callbacks to make
2011-01-04 16:40:36 CLST DEBUG: proc_exit(-1): 0 callbacks to make
2011-01-04 16:40:36 CLST DEBUG: reaping dead processes
2011-01-04 16:40:36 CLST LOG: database system is ready to accept connections
2011-01-04 16:40:36 CLST LOG: autovacuum launcher started
=========================

Maybe the problem is at "6/83C48528", but I have no idea how to fix it.

Thanks.

--
--
Victor Hugo dos Santos
Linux Counter #224399
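The row-by-row search described above can be shortened with a bisection. The following is a sketch only, not the function used in the thread: `rows_readable` and `first_bad_row` are hypothetical names, and the database, table and column names are taken from the messages above. It assumes that nothing else writes to the table during the search, that the scan order is the same on every run (which on 8.3 and later requires `synchronize_seqscans` to be off, set here through the standard `PGOPTIONS` environment variable), and that psql exits non-zero when the statement fails or the backend crashes.

```shell
#!/bin/sh
# Sketch under stated assumptions; function names are hypothetical.

# rows_readable N: succeed if the first N rows (in sequential-scan order)
# can be written out without an error or a backend crash.
rows_readable() {
    # After a crash the server restarts; wait until it accepts connections
    # again, otherwise a refused connection would look like a bad row.
    until psql -X -q -d bacula -c 'SELECT 1' >/dev/null 2>&1; do sleep 1; done
    PGOPTIONS='-c synchronize_seqscans=off' \
        psql -X -q -d bacula -v ON_ERROR_STOP=1 \
        -c "COPY (SELECT lstat, md5 FROM public.file LIMIT $1) TO STDOUT" \
        >/dev/null 2>&1
}

# first_bad_row LO HI PROBE: given that "PROBE LO" succeeds and "PROBE HI"
# fails, print the smallest N in (LO, HI] for which PROBE fails.
first_bad_row() {
    lo=$1; hi=$2; probe=$3
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        if "$probe" "$mid"; then lo=$mid; else hi=$mid; fi
    done
    echo "$hi"
}

# Usage (as the postgres user), with HI at least the table's row count:
#   first_bad_row 0 3000000 rows_readable
```

If this prints N, the damaged row is the N-th in scan order, i.e. the one returned with OFFSET N-1 LIMIT 1. Selecting only its ctid usually still works because that does not read the damaged column, and the row can then be removed with DELETE ... WHERE ctid = '(page,item)'. Taking a file-level copy of the data directory before deleting anything is a sensible precaution, and the search has to be repeated until the whole table dumps cleanly, since there may be more than one bad row.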
Victor Hugo dos Santos <listas.vhs@gmail.com> writes:
> On Tue, Jan 4, 2011 at 1:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It looks like a corrupted-data problem from here. You need to isolate
>> and delete the bad row(s).

> # su - postgres -c "psql bacula -c 'SELECT md5 from public.file OFFSET
> 1417610 LIMIT 1'"
> server closed the connection unexpectedly

So you've got some rows that are corrupted badly enough to crash the backend :-(.

> In others words.. if I run the same command two times, I get a error,
> and if I rerun the same command, work !!!

When you're running long seqscans like these, you need to turn off "synchronize_seqscans" to get reproducible results. With that flag turned on, scans may start from somewhere in the middle of the table instead of always starting from the beginning.

regards, tom lane
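One way to apply this advice for a single session is to put the SET in front of the query. This is a minimal sketch: `with_sync_scans_off` is a hypothetical helper, and it assumes psql sends a multi-statement `-c` string to the server as one request, so the setting takes effect before the query runs.

```shell
#!/bin/sh
# Hypothetical helper: prefix a query with the session-level setting so that
# the sequential scan always starts at the beginning of the table.
with_sync_scans_off() {
    printf 'SET synchronize_seqscans = off; %s\n' "$1"
}

# Print the command text that would be sent to the server:
with_sync_scans_off 'SELECT md5 FROM public.file OFFSET 1417610 LIMIT 1;'

# To run it (as the postgres user; database name taken from the thread):
#   psql bacula -c "$(with_sync_scans_off 'SELECT md5 FROM public.file OFFSET 1417610 LIMIT 1;')"
```

With the scan start fixed, repeated runs of the same OFFSET query should hit the same row each time. Scans that start at different places would also be consistent with the alternating line counts (1417599 and 114) reported for the COPY command at the top of the thread.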