Re: pg_dump crashes - Mailing list pgsql-general
From:            Adrian Klaver <adrian.klaver@aklaver.com>
Subject:         Re: pg_dump crashes
Date:
Msg-id:          5655d83a-85ff-e8c0-0547-828aca12db39@aklaver.com
In response to:  Re: pg_dump crashes (Nico De Ranter <nico.deranter@esaturnus.com>)
Responses:       Re: pg_dump crashes
List:            pgsql-general
On 5/22/20 6:40 AM, Nico De Ranter wrote:
> I was just trying that. It's always the same (huge) table that crashes
> the pg_dump. Running a dump excluding that one table goes fine; running
> a dump of only that one table crashes.
>
> In the system logs I always see a segfault:
>
> May 22 15:22:14 core4 kernel: [337837.874618] postgres[1311]: segfault
> at 7f778008ed0d ip 000055f197ccc008 sp 00007ffdd1fc15a8 error 4 in
> postgres[55f1977c0000+727000]
>
> It doesn't seem to be an out-of-memory thing (at least not on the OS
> level). The database is currently installed on a dedicated server with
> 32GB RAM. I tried tweaking some of the memory parameters for postgres,
> but the crash always happens at the exact same spot (if I run pg_dump
> for that one table with and without memory tweaks, the resulting files
> are identical).
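A stack trace from that backend crash would show where in the COPY code
path it is dying. Roughly (a sketch only; the binary and core-file paths
below are assumptions for a stock Ubuntu 11/main cluster, and how you
raise the core limit depends on the service manager, e.g.
LimitCORE=infinity under systemd):

    # allow the backend to write a core file, then reproduce the crash
    ulimit -c unlimited
    # backends run with the data directory as their working directory,
    # so the core file normally lands there
    gdb /usr/lib/postgresql/11/bin/postgres /var/lib/postgresql/11/main/core
    (gdb) bt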
> One thing I just noticed looking at the dump file: at around the end of
> the file I see this:

So the below is the output from?:

pg_dumpall --cluster 11/main --file=dump.sql

> 2087983804 516130 37989 2218636 3079067 0 0 P4B BcISC IGk L BOT BOP A jC
> BAA I BeMj/b BceUl6 BehUAn 0Ms A C I4p9CBfUiSeAPU4eDuipKQ
> *4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191
> \N \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1145127487 1413694803 21071 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1145127487 1413694803 21071 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 6071772946555290175 1056985679 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N
> \N ??????????????????????????????*
> 2087983833 554418 37989 5405605 14507502 0 0 P4B Bb8c/ IGk L BOS BOP A
> Lfh BAA Bg BeMj+2 Bd1LVN BehUAl rlx ABA TOR
>
> It looks suspicious, however there are about 837 more lines before the
> output stops.
>
> Nico
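It does look suspicious, and if I'm decoding them right those repeated
integers aren't random garbage, they are runs of ASCII bytes. A quick
check in psql (plain arithmetic, nothing here touches the table):

    -- 4557430888798830399 is 0x3f3f3f3f3f3f3f3f (eight ASCII '?' bytes),
    -- 1061109567 is 0x3f3f3f3f and 16191 is 0x3f3f: the same bytes that
    -- show up as the ?????? strings at the end of each row
    SELECT to_hex(4557430888798830399), to_hex(1061109567), to_hex(16191);
    -- the occasional outliers decode to ASCII too, reading the bytes
    -- little-endian:
    --   1145127487 = 0x4441423f -> 3f 42 41 44 = '?BAD'
    --   1413694803 = 0x54434553 -> 53 45 43 54 = 'SECT'
    --        21071 = 0x524f     -> 4f 52       = 'OR'
    SELECT to_hex(1145127487), to_hex(1413694803), to_hex(21071);

If that reading is right, something below Postgres stamped a
'????...BAD SECTOR...' pattern over those blocks, which would fit the
earlier disk-full episode and point at the storage layer rather than at
a Postgres bug.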
> On Fri, May 22, 2020 at 3:27 PM Adrian Klaver
> <adrian.klaver@aklaver.com> wrote:
>
> > On 5/22/20 5:37 AM, Nico De Ranter wrote:
> > > Hi all,
> > >
> > > Postgres version: 9.5
> > > OS: Ubuntu 18.04.4
> > >
> > > I have a 144GB Bacula database that crashes the postgres daemon
> > > when I try to do a pg_dump.
> > > At some point the server ran out of diskspace for the database
> > > storage. I expanded the lvm and rebooted the server. It seemed to
> > > work fine, however when I try to dump the bacula database the
> > > postgres daemon dies after about 37GB.
> > >
> > > I tried copying the database to another machine and upgrading
> > > postgres to 11 using pg_upgrade. The upgrade seems to work but I
> > > still get exactly the same problem when trying to dump the
> > > database.
> > >
> > > postgres@core4:~$ pg_dumpall --cluster 11/main --file=dump.sql
> > > pg_dump: Dumping the contents of table "file" failed:
> > > PQgetCopyData() failed.
> > > pg_dump: Error message from server: server closed the connection
> > > unexpectedly
> > > This probably means the server terminated abnormally
> > > before or while processing the request.
> > > pg_dump: The command was: COPY public.file (fileid, fileindex,
> > > jobid, pathid, filenameid, deltaseq, markid, lstat, md5) TO stdout;
> > > pg_dumpall: pg_dump failed on database "bacula", exiting
> >
> > What happens if you try to dump just this table?
> >
> > Something along the lines of:
> >
> > pg_dump -t file -d some_db -U some_user
> >
> > Have you looked at the system logs to see if it is the OS killing
> > the process?
> >
> > > In the logs I see:
> > >
> > > 2020-05-22 14:23:30.649 CEST [12768] LOG: server process (PID 534)
> > > was terminated by signal 11: Segmentation fault
> > > 2020-05-22 14:23:30.649 CEST [12768] DETAIL: Failed process was
> > > running: COPY public.file (fileid, fileindex, jobid, pathid,
> > > filenameid, deltaseq, markid, lstat, md5) TO stdout;
> > > 2020-05-22 14:23:30.651 CEST [12768] LOG: terminating any other
> > > active server processes
> > > 2020-05-22 14:23:30.651 CEST [482] WARNING: terminating connection
> > > because of crash of another server process
> > > 2020-05-22 14:23:30.651 CEST [482] DETAIL: The postmaster has
> > > commanded this server process to roll back the current transaction
> > > and exit, because another server process exited abnormally and
> > > possibly corrupted shared memory.
> > > 2020-05-22 14:23:30.651 CEST [482] HINT: In a moment you should be
> > > able to reconnect to the database and repeat your command.
> > > 2020-05-22 14:23:30.652 CEST [12768] LOG: all server processes
> > > terminated; reinitializing
> > > 2020-05-22 14:23:30.671 CEST [578] LOG: database system was
> > > interrupted; last known up at 2020-05-22 14:15:19 CEST
> > > 2020-05-22 14:23:30.809 CEST [578] LOG: database system was not
> > > properly shut down; automatic recovery in progress
> > > 2020-05-22 14:23:30.819 CEST [578] LOG: redo starts at 197/D605EA18
> > > 2020-05-22 14:23:30.819 CEST [578] LOG: invalid record length at
> > > 197/D605EA50: wanted 24, got 0
> > > 2020-05-22 14:23:30.819 CEST [578] LOG: redo done at 197/D605EA18
> > > 2020-05-22 14:23:30.876 CEST [12768] LOG: database system is ready
> > > to accept connections
> > > 2020-05-22 14:29:07.511 CEST [12768] LOG: received fast shutdown
> > > request
> > >
> > > Any ideas how to fix or debug this?
> > >
> > > Nico
> > >
> > > --
> > > Nico De Ranter
> > > Operations Engineer
> > > T. +32 16 38 72 10
> > >
> > > eSATURNUS
> > > Philipssite 5, D, box 28
> > > 3001 Leuven – Belgium
> > >
> > > T. +32 16 40 12 82
> > > F. +32 16 40 84 77
> > > www.esaturnus.com
> > >
> > > For Service & Support:
> > > Support Line Belgium: +32 2 2009897
> > > Support Line International: +44 12 56 68 38 78
> > > Or via email: medical.services.eu@sony.com
> >
> > --
> > Adrian Klaver
> > adrian.klaver@aklaver.com
>
> --
> Nico De Ranter
> Operations Engineer
> T. +32 16 38 72 10
>
> eSATURNUS
> Philipssite 5, D, box 28
> 3001 Leuven – Belgium
>
> T. +32 16 40 12 82
> F. +32 16 40 84 77
> www.esaturnus.com
>
> For Service & Support:
> Support Line Belgium: +32 2 2009897
> Support Line International: +44 12 56 68 38 78
> Or via email: medical.services.eu@sony.com

--
Adrian Klaver
adrian.klaver@aklaver.com
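P.S. If the immediate goal is to salvage what is still readable from
public.file, one rough approach (an untested sketch; it assumes fileid
is usable as a range key, per the COPY column list above, and uses
psql's \copy so no server-side file access is needed):

    -- dump the table in fileid ranges and bisect: a range that crashes
    -- the backend brackets the damaged block(s), everything else is
    -- salvageable; adjust the range width to taste
    \copy (SELECT * FROM public.file WHERE fileid BETWEEN 1 AND 50000000) TO 'file_part1.copy'
    \copy (SELECT * FROM public.file WHERE fileid BETWEEN 50000001 AND 100000000) TO 'file_part2.copy'
    -- halve any range that crashes until the bad rows are isolated

Doing this against a file-level copy of the cluster is safer, since
every crash forces a restart and recovery cycle on the server.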