Thread: BUG #5603: pg_tblspc and pg_twoface directories get deleted when starting up service
BUG #5603: pg_tblspc and pg_twoface directories get deleted when starting up service
From
"Nacho Mezzadra"
Date:
The following bug has been logged online:

Bug reference: 5603
Logged by: Nacho Mezzadra
Email address: nachomezzadra@gmail.com
PostgreSQL version: 8.3.11
Operating system: Red Hat Enterprise 5.3
Description: pg_tblspc and pg_twoface directories get deleted when starting up service
Details:

This issue happened not very frequently, but it happened to me 3 times, in 3 different Red Hat servers.

The thing is that when stopping the Postgresql service with the "/sbin/service postgresql-8.3 stop" command, and after that starting it with the "/sbin/service postgresql-8.3 start" command (haven't tried with the restart one though), a few times both pg_tblspc and pg_twoface directories (inside data directory) get somehow deleted and hence the start service command fails. Looking in the log files I find the following error:

2010-07-19 16:54:55 ISTFATAL: could not open directory "pg_tblspc": No such file or directory

So I manually create the "pg_tblspc" directory, and then try to start again the service unsuccessfully, getting this time a similar error, but saying that pg_twoface directory doesn't exist.

After creating the pg_twoface directory, service can be successfully started.

Please note that all these always happened running the service command as root.
All 3 linux boxes are running over a VMWare host.
Re: BUG #5603: pg_tblspc and pg_twoface directories get deleted when starting up service
From
Robert Haas
Date:
On Thu, Aug 5, 2010 at 2:46 PM, Nacho Mezzadra <nachomezzadra@gmail.com> wrote:
>
> The following bug has been logged online:
>
> Bug reference: 5603
> Logged by: Nacho Mezzadra
> Email address: nachomezzadra@gmail.com
> PostgreSQL version: 8.3.11
> Operating system: Red Hat Enterprise 5.3
> Description: pg_tblspc and pg_twoface directories get deleted when
> starting up service
> Details:
>
> This issue happened not very frequently, but it happened to me 3 times, in 3
> different Red Hat servers.
> The thing is that when stopping the Postgresql service with the
> "/sbin/service postgresql-8.3 stop" command, and after that starting it with
> the "/sbin/service postgresql-8.3 start" command (haven't tried with the
> restart one though), a few times both pg_tblspc and pg_twoface directories
> (inside data directory) get somehow deleted and hence the start service
> command fails. Looking in the log files I find the following error:
>
> 2010-07-19 16:54:55 ISTFATAL: could not open directory "pg_tblspc": No such
> file or directory
>
> So I manually create the "pg_tblspc" directory, and then try to start again
> the service unsuccessfully, getting this time a similar error, but saying
> that pg_twoface directory doesn't exist.
>
> After creating the pg_twoface directory, service can be successfully
> started.
>
> Please note that all these always happened running the service command as
> root.
> All 3 linux boxes are running over a VMWare host.

This is pretty scary, but it's a little hard to believe that Red Hat would ship a script which had even the faintest chance of obliterating two critical directories. Especially since the guy who does the packaging of PostgreSQL over thereabouts is our most knowledgeable, experienced, and prolific committer. So I suspect you've a (broken) custom script, or a cron job that's doing something evil, or some other weirdness that is specific to your installations, but you haven't provided enough details to speculate in detail (for example, perhaps you could reply to the list and post a copy of the script you think is doing this).

Also, I'm pretty sure that we don't have a directory called pg_twoface, though it would be pretty funny if we did. It's fairly obvious what this is meant to say, but it doesn't.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: BUG #5603: pg_tblspc and pg_twoface directories get deleted when starting up service
From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Aug 5, 2010 at 2:46 PM, Nacho Mezzadra <nachomezzadra@gmail.com> wrote:
>> PostgreSQL version: 8.3.11
>> Operating system: Red Hat Enterprise 5.3
>> Description: pg_tblspc and pg_twoface directories get deleted when
>> starting up service

> This is pretty scary, but it's a little hard to believe that Red Hat
> would ship a script which had even the faintest chance of obliterating
> two critical directories. Especially since the guy who does the
> packaging of PostgreSQL over thereabouts is our most knowledgeable,
> experienced, and prolific committer. So I suspect you've a (broken)
> custom script, or a cron job that's doing something evil, or some
> other weirdness that is specific to your installations, but you
> haven't provided enough details to speculate in detail (for example,
> perhaps you could reply to the list and post a copy of the script you
> think is doing this).

Well, I have to disclaim credit/blame for this, because Red Hat has never shipped PG 8.3.anything for RHEL-5. Possibly the OP is running Devrim's or Command Prompt's RPMs. That said, the initscript Devrim uses looks just about like mine, and there's no chance whatever that it would selectively delete portions of what's under $PGDATA. I have to think that there's a loose cannon somewhere else in the OP's system. We have for example seen some very unfortunate behavior in the past when the data directory was located on a slow-to-mount NFS server. (I have no reason to think that that's exactly what this problem is; I just cite it to illustrate the kind of thing to be looking for.)

regards, tom lane
Re: BUG #5603: pg_tblspc and pg_twoface directories get deleted when starting up service
From
Nacho Mezzadra
Date:
Tom, Robert,

Sorry I am coming back to you after a while, but we still have the same issue. This has been happening in our environments, and now it is also happening in customers' environments, which we do not set up. All environments are always Red Hat Enterprise 5.3.

As reported in the issue, when starting the service using /sbin/service postgresql-8.3 start, sometimes the directories data/pg_tblspc and data/pg_twophase get deleted and the PostgreSQL engine won't start up. As a workaround, we recreate both directories and PostgreSQL can be started again, but we need to know why this is happening and whether it could ever harm our data in any way.

Please let me know if you need any more info, or whatever.

Thanks a lot in advance,
Nacho.-

On Tue, Aug 10, 2010 at 01:11, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Thu, Aug 5, 2010 at 2:46 PM, Nacho Mezzadra <nachomezzadra@gmail.com> wrote:
> >> PostgreSQL version: 8.3.11
> >> Operating system: Red Hat Enterprise 5.3
> >> Description: pg_tblspc and pg_twoface directories get deleted when
> >> starting up service
>
> > This is pretty scary, but it's a little hard to believe that Red Hat
> > would ship a script which had even the faintest chance of obliterating
> > two critical directories. Especially since the guy who does the
> > packaging of PostgreSQL over thereabouts is our most knowledgeable,
> > experienced, and prolific committer. So I suspect you've a (broken)
> > custom script, or a cron job that's doing something evil, or some
> > other weirdness that is specific to your installations, but you
> > haven't provided enough details to speculate in detail (for example,
> > perhaps you could reply to the list and post a copy of the script you
> > think is doing this).
>
> Well, I have to disclaim credit/blame for this, because Red Hat has
> never shipped PG 8.3.anything for RHEL-5. Possibly the OP is running
> Devrim's or Command Prompt's RPMs. That said, the initscript Devrim
> uses looks just about like mine, and there's no chance whatever that it
> would selectively delete portions of what's under $PGDATA. I have to
> think that there's a loose cannon somewhere else in the OP's system.
> We have for example seen some very unfortunate behavior in the past
> when the data directory was located on a slow-to-mount NFS server.
> (I have no reason to think that that's exactly what this problem is;
> I just cite it to illustrate the kind of thing to be looking for.)
>
> regards, tom lane
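
The workaround described above (noticing the failure only when the service refuses to start, then recreating the directories by hand) could be preceded by a simple pre-start check. What follows is a minimal sketch in Python and was not posted in the thread. The function name `missing_subdirs` is hypothetical, and the default list contains only the two directories named in this report. It only reports what is missing; it does not recreate anything, since the correct ownership and permissions depend on the installation.

```python
import os


def missing_subdirs(data_dir, required=("pg_tblspc", "pg_twophase")):
    """Return the names in `required` that are not directories under `data_dir`.

    `required` defaults to the two directories mentioned in this report;
    it is an illustrative list, not a complete description of a data directory.
    """
    return [
        name
        for name in required
        if not os.path.isdir(os.path.join(data_dir, name))
    ]
```

Called with the path of the data directory before running the start command, an empty list means both directories are present. Running it periodically (for example from cron) and logging the result would also narrow down when the directories disappear, rather than only discovering it at the next start.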
Re: BUG #5603: pg_tblspc and pg_twoface directories get deleted when starting up service
From
Tom Lane
Date:
Nacho Mezzadra <nachomezzadra@gmail.com> writes:
> Tom, Robert, sorry I am coming back to you after a while, but we still
> have the same issue. This has been happening in our environments, but
> now it is also happening in customers' environments -which we do not
> set up- and it is also happening. All environments are always Red Hat
> Enterprise 5.3.

You still haven't given any reason to think this is a Postgres bug, nor indeed any information beyond what you said originally.

One thing that strikes me is that both pg_tblspc and pg_twophase are empty and unused during normal operation (if you're not using the relevant features). They are scanned during postmaster startup though, which is why you're getting failures then. I suspect that these subdirectories are not in fact getting removed during PG shutdown or restart, but were deleted some time before that. In particular I wonder if somebody's loosed an overaggressive tmp-file-cleaning script on your whole filesystem. Something that was removing empty directories that hadn't been accessed in awhile could explain this.

regards, tom lane
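
The hypothesis above can be examined by listing the directories such a cleaner would pick: empty ones whose access time is older than some threshold. The following is a minimal Python sketch, not something posted in the thread. The name `stale_empty_dirs` and the 30-day default are assumptions chosen for illustration, and whether access times are meaningful at all depends on how the filesystem is mounted.

```python
import os
import time


def stale_empty_dirs(root, max_age_days=30, now=None):
    """Return empty directories below `root` whose atime is older than the cutoff.

    The atime of each directory is read from its parent's listing BEFORE the
    directory itself is opened, because listing a directory can update its
    access time. Running this check may therefore change later results.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    hits = []
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as it:
                entries = list(it)
        except OSError:
            continue  # unreadable directory: skip it
        for entry in entries:
            if not entry.is_dir(follow_symlinks=False):
                continue
            atime = entry.stat(follow_symlinks=False).st_atime
            try:
                children = os.listdir(entry.path)
            except OSError:
                continue
            if children:
                stack.append(entry.path)  # not empty: descend into it later
            elif atime < cutoff:
                hits.append(entry.path)
    return sorted(hits)
```

If pg_tblspc and pg_twophase show up in the result for the data directory while they still exist, they match the profile of what an age-based cleaner removes, which would be consistent with (though not proof of) the explanation suggested above.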