Thread: Trying to pg_restore a 7.1.3 db into 7.3.4 - stalling at 100% CPU w/ no disk access
Trying to pg_restore a 7.1.3 db into 7.3.4 - stalling at 100% CPU w/ no disk access
From
me@shanewright.co.uk (Shane Wright)
Date:
Hi, I'm trying to upgrade our 25Gb database from 7.1.3 to 7.3.4 - pg_dump worked fine, although piping through split to get a set of 1Gb files. But, after a few attempts on using pg_restore to get the data into the new installation I'm having a few problems; basically it restores the first few tables fine (big tables too), but now it's just hanging on one table; using 100% CPU but hardly touching the disk at all (vmstat reports about 50kb every few minutes). pg_dump command was something like... pg_dump --user=xxx --superuser=xxx --format=t | split -b 1000000000 pg_restore commands are these: cat xaa xab xac | pg_restore --dbname=xx --superuser=xx --username=xx --schema-only cat xaa xab xac | pg_restore --dbname=xx --superuser=xx --username=xx --data-only --rearrange --disable-triggers--verbose (dropped a few indices between the two pg_restore's to make things quicker). Anyway, it ploughed through the first bit (including a 25 million row table) in under an hour, but now it's been stalled on one table for 5 hours. Dunno if it's coincidence or not but this table is the only one with a BYTEA column, it has a few million rows I think. Any ideas what might be causing it? I'm hoping its not some bug in pg_restore causing an infinite loop or something. if it helps, the CPU usage (according to top) is all in pg_restore (not a postmaster process), and (according to vmstat) its about 40% user and 60% system. Both installs are (I think, ISP did it), clean source from PostgreSQL's FTP site, both are running on RH Linux, 7.1.3 on a dual PIII, 7.3.4 on a P4 Xeon. Total size of dumped database is 3Gb (across 3 files). Any help appreciated, Thanks, Shane
Shane Wright wrote: > Hi, > > I'm trying to upgrade our 25Gb database from 7.1.3 to 7.3.4 - pg_dump > worked fine, although piping through split to get a set of 1Gb files. > > But, after a few attempts on using pg_restore to get the data into the > new installation I'm having a few problems; basically it restores the > first few tables fine (big tables too), but now it's just hanging on > one table; using 100% CPU but hardly touching the disk at all (vmstat > reports about 50kb every few minutes). what show an strace on that process ? Regards Gaetano Mendola
Gaetano, er, shedloads of this.... read(0, "", 4096) = 0 read(0, "", 4096) = 0 read(0, "", 4096) = 0 read(0, "", 4096) = 0 read(0, "", 4096) = 0 read(0, "", 4096) = 0 read(0, "", 4096) = 0 read(0, "", 4096) = 0 read(0, "", 4096) = 0 read(0, "", 4096) = 0 read(0, "", 4096) = 0 methinks thats not a good sign? Shane Gaetano Mendola wrote: > Shane Wright wrote: > >> Hi, >> >> I'm trying to upgrade our 25Gb database from 7.1.3 to 7.3.4 - pg_dump >> worked fine, although piping through split to get a set of 1Gb files. >> >> But, after a few attempts on using pg_restore to get the data into the >> new installation I'm having a few problems; basically it restores the >> first few tables fine (big tables too), but now it's just hanging on >> one table; using 100% CPU but hardly touching the disk at all (vmstat >> reports about 50kb every few minutes). > > > what show an strace on that process ? > > > Regards > Gaetano Mendola >
Gaetano, er, shedloads of this.... read(0, "", 4096) = 0 read(0, "", 4096) = 0 read(0, "", 4096) = 0 read(0, "", 4096) = 0 read(0, "", 4096) = 0 read(0, "", 4096) = 0 read(0, "", 4096) = 0 read(0, "", 4096) = 0 read(0, "", 4096) = 0 read(0, "", 4096) = 0 read(0, "", 4096) = 0 methinks thats not a good sign? Shane Gaetano Mendola wrote: > Shane Wright wrote: > >> Hi, >> >> I'm trying to upgrade our 25Gb database from 7.1.3 to 7.3.4 - pg_dump >> worked fine, although piping through split to get a set of 1Gb files. >> >> But, after a few attempts on using pg_restore to get the data into the >> new installation I'm having a few problems; basically it restores the >> first few tables fine (big tables too), but now it's just hanging on >> one table; using 100% CPU but hardly touching the disk at all (vmstat >> reports about 50kb every few minutes). > > > what show an strace on that process ? > > > Regards > Gaetano Mendola >
Shane Wright wrote: > Gaetano, > > er, shedloads of this.... > > read(0, "", 4096) = 0 > read(0, "", 4096) = 0 > read(0, "", 4096) = 0 > read(0, "", 4096) = 0 > read(0, "", 4096) = 0 > read(0, "", 4096) = 0 > read(0, "", 4096) = 0 > read(0, "", 4096) = 0 > read(0, "", 4096) = 0 > read(0, "", 4096) = 0 > read(0, "", 4096) = 0 > > > methinks thats not a good sign? I a non sense, I don't have clue why is reading block of 4K from a NULL file descriptor ! Anyone any idea ? Regards Gaetano Mendola
> read(0, "", 4096) = 0 > read(0, "", 4096) = 0 > read(0, "", 4096) = 0 > read(0, "", 4096) = 0 > read(0, "", 4096) = 0 > read(0, "", 4096) = 0 > read(0, "", 4096) = 0 > read(0, "", 4096) = 0 > fd 0 is usually stdin, unless the program disconnects stdin. Maybe pg_restore is waiting for input, perhaps a password? -- Jeff Trout <jeff@jefftrout.com> http://www.jefftrout.com/ http://www.stuarthamm.net/
Hi, > fd 0 is usually stdin, unless the program disconnects stdin. > Maybe pg_restore is waiting for input, perhaps a password? certainly shouldn't be - the table where the problem happens is no different to any of the others, but I will try doing just that table later today and see if that makes any difference. Wouldn't it have outputted something though; like a prompt for input or something? Regards Shane On 11 Nov 2003, at 19:52, Jeff wrote: >> read(0, "", 4096) = 0 >> read(0, "", 4096) = 0 >> read(0, "", 4096) = 0 >> read(0, "", 4096) = 0 >> read(0, "", 4096) = 0 >> read(0, "", 4096) = 0 >> read(0, "", 4096) = 0 >> read(0, "", 4096) = 0 >> > > fd 0 is usually stdin, unless the program disconnects stdin. > Maybe pg_restore is waiting for input, perhaps a password? > > -- > Jeff Trout <jeff@jefftrout.com> > http://www.jefftrout.com/ > http://www.stuarthamm.net/ > > ---------------------------(end of > broadcast)--------------------------- > TIP 8: explain analyze is your friend >
Hi, I've used this command to dump just the table concerned, and I get a zero length file!?! No errors or anything, just a silent exit. pg_dump -U xx-S xx -F t -a -t tablename mydatabase > ~/backup/mytable Any ideas? I'm thinking the zero-sized file could be what was causing the read()s to fail, but as to why it's happening in the first place I can't think.. Cheers Shane On 11 Nov 2003, at 19:52, Jeff wrote: >> read(0, "", 4096) = 0 >> read(0, "", 4096) = 0 >> read(0, "", 4096) = 0 >> read(0, "", 4096) = 0 >> read(0, "", 4096) = 0 >> read(0, "", 4096) = 0 >> read(0, "", 4096) = 0 >> read(0, "", 4096) = 0 >> > > fd 0 is usually stdin, unless the program disconnects stdin. > Maybe pg_restore is waiting for input, perhaps a password? > > -- > Jeff Trout <jeff@jefftrout.com> > http://www.jefftrout.com/ > http://www.stuarthamm.net/ >