Re: pg_dump crashes - Mailing list pgsql-general

From Adrian Klaver
Subject Re: pg_dump crashes
Date
Msg-id 5655d83a-85ff-e8c0-0547-828aca12db39@aklaver.com
In response to Re: pg_dump crashes  (Nico De Ranter <nico.deranter@esaturnus.com>)
Responses Re: pg_dump crashes
List pgsql-general
On 5/22/20 6:40 AM, Nico De Ranter wrote:
> I was just trying that.  It's always the same (huge) table that crashes 
> the pg_dump.   Running a dump excluding that one table goes fine, 
> running a dump of only that one table crashes.
> In the system logs I always see a segfault
> 
> May 22 15:22:14 core4 kernel: [337837.874618] postgres[1311]: segfault at 7f778008ed0d ip 000055f197ccc008 sp 00007ffdd1fc15a8 error 4 in postgres[55f1977c0000+727000]
> 
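
For what it is worth, the kernel line can be picked apart a little. Below is a minimal Python sketch, assuming the usual Linux x86-64 segfault message layout with all numbers in hexadecimal, and the x86 page-fault error-code bits (bit 0 = page present, bit 1 = write access, bit 2 = user mode). The pattern and the name parse_segfault are made up for illustration and are written against the message quoted above only.

```python
import re

# Pattern written against the kernel message quoted above (an assumption,
# not taken from kernel documentation). All numbers are treated as hex.
SEGFAULT_RE = re.compile(
    r"segfault\s+at\s+(?P<addr>[0-9a-f]+)\s+ip\s+(?P<ip>[0-9a-f]+)\s+"
    r"sp\s+(?P<sp>[0-9a-f]+)\s+error\s+(?P<err>[0-9a-f]+)\s+in\s+"
    r"(?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Return a dict describing the fault, or None if the line does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    err = int(m["err"], 16)
    return {
        "object": m["obj"],
        "fault_addr": int(m["addr"], 16),
        # Offset of the faulting instruction inside the mapped binary.
        "offset": ip - base,
        "in_mapping": 0 <= ip - base < size,
        # Assumed x86 page-fault error-code bits.
        "page_present": bool(err & 1),
        "write": bool(err & 2),
        "user_mode": bool(err & 4),
    }


LINE = ("postgres[1311]: segfault at 7f778008ed0d ip 000055f197ccc008 "
        "sp 00007ffdd1fc15a8 error 4 in postgres[55f1977c0000+727000]")

info = parse_segfault(LINE)
print(info["object"], hex(info["offset"]), info["in_mapping"])
print(info["page_present"], info["write"], info["user_mode"])
```

Under those assumptions the faulting instruction sits at offset 0x50c008 inside the postgres binary, and error 4 reads as a user-mode read of a page that is not mapped. The offset only becomes useful together with a postgres build that has debug symbols, which may or may not be installed here.
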
> It doesn't seem to be an Out-of-memory thing (at least not on the OS level).
> The database is currently installed on a dedicated server with 32GB 
> RAM.   I tried tweaking some of the memory parameters for postgres, but 
> the crash always happens at the exact same spot (if I run pg_dump for 
> that one table with and without memory tweaks the resulting files are 
> identical).
> 
> One thing I just noticed looking at the dump file: at around the end of 
> the file I see this:

So the output below is from this command?

pg_dumpall --cluster 11/main --file=dump.sql

> 
> 2087983804 516130 37989 2218636 3079067 0 0 P4B BcISC IGk L BOT BOP A jC 
> BAA I BeMj/b BceUl6 BehUAn 0Ms A C I4p9CBfUiSeAPU4eDuipKQ
> *4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 
> \N \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1145127487 1413694803 21071 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1145127487 1413694803 21071 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 6071772946555290175 1056985679 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????
> 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N 
> \N ??????????????????????????????*
> 2087983833 554418 37989 5405605 14507502 0 0 P4B Bb8c/ IGk L BOS BOP A 
> Lfh BAA Bg BeMj+2 Bd1LVN BehUAl rlx ABA TOR
> 
> It looks suspicious however there are about 837 more lines before the 
> output stops.
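
Those repeating numbers are worth decoding. Below is a minimal Python sketch, assuming the integer columns are stored little-endian with fileid as an 8-byte integer, the next four columns as 4-byte integers and deltaseq as a 2-byte integer. The widths are my guess from the size of the values, not something taken from the Bacula schema, and the helper names are made up for illustration.

```python
# Assumed column widths in bytes: fileid (8), fileindex, jobid, pathid,
# filenameid (4 each), deltaseq (2). Little-endian byte order is assumed too.
WIDTHS = (8, 4, 4, 4, 4, 2)


def as_text(value, width):
    """Show an integer's little-endian bytes as ASCII, '.' for non-printables."""
    raw = value.to_bytes(width, "little")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


def decode_row(values):
    """Concatenate the decoded text of the six integer columns of one row."""
    return "".join(as_text(v, w) for v, w in zip(values, WIDTHS))


# Three of the rows from the dump output quoted above.
typical = (4557430888798830399, 1061109567, 1061109567,
           1061109567, 1061109567, 16191)
odd_a = (4557430888798830399, 1061109567, 1061109567,
         1145127487, 1413694803, 21071)
odd_b = (6071772946555290175, 1056985679, 1061109567,
         1061109567, 1061109567, 16191)

for name, row in (("typical", typical), ("odd_a", odd_a), ("odd_b", odd_b)):
    print(name, decode_row(row))
```

If the width assumptions hold, the typical rows are nothing but 0x3F ('?') bytes, the same character that fills the md5 column, and the two odd rows contain the text BADSECTOR. That would suggest the underlying blocks were overwritten or substituted with a filler pattern somewhere below Postgres, though what wrote that pattern is a guess on my part.
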
> 
> Nico
> 
> On Fri, May 22, 2020 at 3:27 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> 
>     On 5/22/20 5:37 AM, Nico De Ranter wrote:
>      > Hi all,
>      >
>      > Postgres version: 9.5
>      > OS: Ubuntu 18.04.4
>      >
>      > I have a 144GB Bacula database that crashes the postgres daemon when I
>      > try to do a pg_dump.
>      > At some point the server ran out of diskspace for the database storage.
>      > I expanded the lvm and rebooted the server. It seemed to work fine,
>      > however when I try to dump the bacula database the postgres daemon dies
>      > after about 37GB.
>      >
>      > I tried copying the database to another machine and upgrading postgres
>      > to 11 using pg_upgrade.  The upgrade seems to work but I still get
>      > exactly the same problem when trying to dump the database.
>      >
>      > postgres@core4:~$ pg_dumpall --cluster 11/main --file=dump.sql
>      > pg_dump: Dumping the contents of table "file" failed: PQgetCopyData() failed.
>      > pg_dump: Error message from server: server closed the connection unexpectedly
>      > This probably means the server terminated abnormally
>      > before or while processing the request.
>      > pg_dump: The command was: COPY public.file (fileid, fileindex, jobid, pathid, filenameid, deltaseq, markid, lstat, md5) TO stdout;
>      > pg_dumpall: pg_dump failed on database "bacula", exiting
> 
>     What happens if you try to dump just this table?
> 
>     Something along lines of:
> 
>     pg_dump -t file -d some_db -U some_user
> 
>     Have you looked at the system logs to see if it is the OS killing the
>     process?
> 
> 
>      >
>      > In the logs I see:
>      >
>      > 2020-05-22 14:23:30.649 CEST [12768] LOG:  server process (PID 534) was terminated by signal 11: Segmentation fault
>      > 2020-05-22 14:23:30.649 CEST [12768] DETAIL:  Failed process was running: COPY public.file (fileid, fileindex, jobid, pathid, filenameid, deltaseq, markid, lstat, md5) TO stdout;
>      > 2020-05-22 14:23:30.651 CEST [12768] LOG:  terminating any other active server processes
>      > 2020-05-22 14:23:30.651 CEST [482] WARNING:  terminating connection because of crash of another server process
>      > 2020-05-22 14:23:30.651 CEST [482] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
>      > 2020-05-22 14:23:30.651 CEST [482] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
>      > 2020-05-22 14:23:30.652 CEST [12768] LOG:  all server processes terminated; reinitializing
>      > 2020-05-22 14:23:30.671 CEST [578] LOG:  database system was interrupted; last known up at 2020-05-22 14:15:19 CEST
>      > 2020-05-22 14:23:30.809 CEST [578] LOG:  database system was not properly shut down; automatic recovery in progress
>      > 2020-05-22 14:23:30.819 CEST [578] LOG:  redo starts at 197/D605EA18
>      > 2020-05-22 14:23:30.819 CEST [578] LOG:  invalid record length at 197/D605EA50: wanted 24, got 0
>      > 2020-05-22 14:23:30.819 CEST [578] LOG:  redo done at 197/D605EA18
>      > 2020-05-22 14:23:30.876 CEST [12768] LOG:  database system is ready to accept connections
>      > 2020-05-22 14:29:07.511 CEST [12768] LOG:  received fast shutdown request
>      >
>      >
>      > Any ideas how to fix or debug this?
>      >
>      > Nico
>      >
>      > --
>      >
>      > Nico De Ranter
>      >
>      > Operations Engineer
>      >
>      > T. +32 16 38 72 10
>      >
>      > eSATURNUS
>      > Philipssite 5, D, box 28
>      > 3001 Leuven – Belgium
>      >
>      > T. +32 16 40 12 82
>      > F. +32 16 40 84 77
>      > www.esaturnus.com <http://www.esaturnus.com>
>      >
>      > *For Service & Support :*
>      >
>      > Support Line Belgium: +32 2 2009897
>      >
>      > Support Line International: +44 12 56 68 38 78
>      >
>      > Or via email : medical.services.eu@sony.com
>      >
>      >
> 
> 
>     -- 
>     Adrian Klaver
>     adrian.klaver@aklaver.com
> 
> 
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com


