Thread: Problem with dropping a tablespace
Hello, I'm unable to drop a recently created tablespace (the partition on which the tablespace relies on, run out of space - postgresql seems to have a problem, now): template1=# drop tablespace disk1; ERROR: tablespace "disk1" is not empty Huh? Okay...lets dig into... template1=# select oid from pg_tablespace where spcname = 'disk1'; oid ----------- 594611987 (1 row) template1=# select * from pg_tablespace_databases(594611987); pg_tablespace_databases ------------------------- 595675173 (1 row) Aha...so there seems to be a database associated to this tablespace - lets see which one - template1=# select * from pg_database where oid = 595675173; datname | datdba | encoding | datistemplate | datallowconn | datlastsysoid | datvacuumxid | datfrozenxid | dattablespace | datconfig | datacl ---------+--------+----------+---------------+--------------+---------------+--------------+--------------+---------------+-----------+-------- (0 rows) Mhh. Nothing. How may I delete the tablespace manually? Best regards Oliver
On Mon, Aug 01, 2005 at 09:28:07PM +0200, Oliver Siegmar wrote: > > template1=# select * from pg_tablespace_databases(594611987); > pg_tablespace_databases > ------------------------- > 595675173 > (1 row) > > Aha...so there seems to be a database associated to this tablespace - lets see > which one - > > template1=# select * from pg_database where oid = 595675173; > datname | datdba | encoding | datistemplate | datallowconn | datlastsysoid | > datvacuumxid | datfrozenxid | dattablespace | datconfig | datacl > ---------+--------+----------+---------------+--------------+---------------+--------------+--------------+---------------+-----------+-------- > (0 rows) > > Mhh. Nothing. > > How may I delete the tablespace manually? Perhaps a better question to ask is: why does pg_tablespace_databases() think that database 595675173 has objects in the tablespace? What platform are you using? If some flavor of Unix, what's the output of the following command? ls -alR /path/to/tablespace/directory (Replace the path with the tablespace's directory; you'll probably need to be the directory owner or root to run "ls" without getting a "Permission denied" error). -- Michael Fuhr http://www.fuhr.org/~mfuhr/
On Monday 01 August 2005 22:15, Michael Fuhr wrote: > On Mon, Aug 01, 2005 at 09:28:07PM +0200, Oliver Siegmar wrote: > > template1=# select * from pg_tablespace_databases(594611987); > > pg_tablespace_databases > > ------------------------- > > 595675173 > > (1 row) > > > > Aha...so there seems to be a database associated to this tablespace - > > lets see which one - > > > > template1=# select * from pg_database where oid = 595675173; > > datname | datdba | encoding | datistemplate | datallowconn | > > datlastsysoid | datvacuumxid | datfrozenxid | dattablespace | datconfig | > > datacl > > ---------+--------+----------+---------------+--------------+------------ > >---+--------------+--------------+---------------+-----------+-------- (0 > > rows) > > > > Mhh. Nothing. > > > > How may I delete the tablespace manually? > > Perhaps a better question to ask is: why does pg_tablespace_databases() > think that database 595675173 has objects in the tablespace? What > platform are you using? If some flavor of Unix, what's the output > of the following command? > > ls -alR /path/to/tablespace/directory This directory indeed contains a subdirectory named 595675173 (the ghost's database oid ;-)) Is it save to shutdown the postmaster, remove this directory and restart the postmaster again? Are there any system tables that need updates after removing the database manually? Best Oliver
On Tue, Aug 02, 2005 at 08:00:28AM +0200, Oliver Siegmar wrote: > On Monday 01 August 2005 22:15, Michael Fuhr wrote: > > On Mon, Aug 01, 2005 at 09:28:07PM +0200, Oliver Siegmar wrote: > > > How may I delete the tablespace manually? > > > > Perhaps a better question to ask is: why does pg_tablespace_databases() > > think that database 595675173 has objects in the tablespace? What > > platform are you using? If some flavor of Unix, what's the output > > of the following command? > > > > ls -alR /path/to/tablespace/directory > > This directory indeed contains a subdirectory named 595675173 (the ghost's > database oid ;-)) Does that subdirectory contain anything? That's part of why I asked for the "ls -alR" output; I was also curious to see if there were any surprises in ownership and permissions. > Is it save to shutdown the postmaster, remove this directory and restart the > postmaster again? Are there any system tables that need updates after > removing the database manually? I'll defer to the developers to answer that, at least until I've done more research. But the question that needs answering is: why does that directory still exist if the database has been dropped? Maybe one of the developers will have some ideas, especially if we can see the directory's contents. Please post the output of the "ls" command shown above, i.e., a recursive long listing of all files under the tablespace directory. -- Michael Fuhr http://www.fuhr.org/~mfuhr/
On Tuesday 02 August 2005 15:16, Michael Fuhr wrote: > > This directory indeed contains a subdirectory named 595675173 (the > > ghost's database oid ;-)) > > Does that subdirectory contain anything? That's part of why I asked > for the "ls -alR" output; I was also curious to see if there were > any surprises in ownership and permissions. Yeah. There are a lot of files in it. Owner and group is postgres (the postmaster user). Permissions are set to 0600 - the directory itself has 0700. > we can see the directory's contents. Please post the output of the > "ls" command shown above, i.e., a recursive long listing of all files > under the tablespace directory. I didn't want to "spam" the list... See the attatched file... Thanks Oliver
Attachment
On Tue, Aug 02, 2005 at 03:25:53PM +0200, Oliver Siegmar wrote: > On Tuesday 02 August 2005 15:16, Michael Fuhr wrote: > > > This directory indeed contains a subdirectory named 595675173 (the > > > ghost's database oid ;-)) > > > > Does that subdirectory contain anything? That's part of why I asked > > for the "ls -alR" output; I was also curious to see if there were > > any surprises in ownership and permissions. > > Yeah. There are a lot of files in it. Owner and group is postgres (the > postmaster user). Permissions are set to 0600 - the directory itself has > 0700. Hmmm...based on the file names, it looks like the directory contains only system tables. Do you know what database this was? Did you explicitly drop it, or is its disappearance (or appearance) a mystery? Didn't you say that the tablespace's filesystem ran out of space? When did that happen in relation to this database's existence (during create, during drop, etc.)? Are there any unusual messages in the PostgreSQL logs? If any of the developers are following this thread, could the tablespace's filesystem running out of space have resulted in this phantom directory, perhaps due to a create or drop that didn't complete? Is there any danger in Oliver removing the directory if pg_database doesn't know about the database oid? Should he shut down the postmaster before removing it, or shouldn't that matter? -- Michael Fuhr http://www.fuhr.org/~mfuhr/
On Tuesday 02 August 2005 17:01, Michael Fuhr wrote: > Hmmm...based on the file names, it looks like the directory contains > only system tables. Do you know what database this was? Yes, a business database - not a system one. The database were created using template0. > Did you > explicitly drop it, or is its disappearance (or appearance) a > mystery? I dropped the database with 'DROP DATABASE xxx;' without any problems (after the tablespace run out of space). BTW: The 'default' tablespace (pg_default and pg_global) where all the system stuff relies on, didn't ran out of space - only a temporarily created tablespace on another partition. > Didn't you say that the tablespace's filesystem ran out > of space? When did that happen in relation to this database's > existence (during create, during drop, etc.)? It happened while performing a 'pg_dump -F p business_db | pg_restore -d disk1'. > Are there any unusual > messages in the PostgreSQL logs? Well, the only unusual messages were when the space ran out - tons of entries like this: ERROR: could not write block 13632 of relation 594611987/597873915/736358396: No space left on device Thanks Oliver
Oliver Siegmar <o.siegmar@vitrado.de> writes: > On Tuesday 02 August 2005 17:01, Michael Fuhr wrote: >> Hmmm...based on the file names, it looks like the directory contains >> only system tables. Do you know what database this was? > Yes, a business database - not a system one. The database were created using > template0. >> Did you explicitly drop it, or is its disappearance (or appearance) a >> mystery? > I dropped the database with 'DROP DATABASE xxx;' without any problems (after > the tablespace run out of space). How exactly do you know that OID 595675173 is the database you dropped, and not that of some other DB? I'm theorizing that the scenario went like this: CREATE DATABASE starts to create a database, for which it assigns the OID 595675173. Copying the template database goes fine. (If we'd run out of space in this step, we'd have removed the partially copied directories before reporting failure.) While trying to make the pg_database entry for the new database, we run out of space and fail. There isn't any provision for removing the copied data if we fail at late stages of the CREATE process :-(. I'm not sure it's possible to plug this hole completely, but we could at least catch any error up to the end of createdb() by wrapping all that code in a PG_TRY/PG_RECOVER block. With the recent shared dependency patch, our exposure to an out-of-space failure at this stage has gone up by orders of magnitude because so many more catalog rows need to be added. regards, tom lane
Michael Fuhr <mike@fuhr.org> writes: > ... Is there any danger in Oliver removing the directory if > pg_database doesn't know about the database oid? No, but I'd counsel not doing so until we're certain we can't learn any more about what happened. > Should he shut down the postmaster before removing it, or shouldn't > that matter? Shouldn't be necessary. regards, tom lane
On Tuesday 02 August 2005 18:42, Tom Lane wrote: > > I dropped the database with 'DROP DATABASE xxx;' without any problems > > (after the tablespace run out of space). > > How exactly do you know that OID 595675173 is the database you dropped, > and not that of some other DB? I don't know that for sure, but I can't remember having created and dropped an other database within this tablespace. > I'm theorizing that the scenario went like this: > > CREATE DATABASE starts to create a database, for which it > assigns the OID 595675173. > > Copying the template database goes fine. (If we'd run out of > space in this step, we'd have removed the partially copied > directories before reporting failure.) > > While trying to make the pg_database entry for the new database, > we run out of space and fail. At which stage this pg_database entry gets created? I'm very sure, that there was plenty of free disk space at the time of 'CREATE TABLE' statement. The space ran out gigabytes after starting pg_restore. I did a hexdump on the files within the tabelspace directory...no business data at all, only postgres internals (I saw a lot of function names and datatypes). Let me know if I shall dig more information...otherwise I'm happy if I can safely remove the directory. Best Oliver
Oliver Siegmar <o.siegmar@vitrado.de> writes: > I did a hexdump on the files within the tabelspace directory...no business > data at all, only postgres internals (I saw a lot of function names and > datatypes). Would you look at pg_class in particular (file 1259) and confirm that it contains only names of Postgres system catalogs and indexes, no tables of your own? If so, I think the evidence is pretty strong that this database was in process of being created. A DB that had actually gotten used at all would have more entries in pg_class. regards, tom lane
On Tuesday 02 August 2005 19:49, Tom Lane wrote: > Oliver Siegmar <o.siegmar@vitrado.de> writes: > > I did a hexdump on the files within the tabelspace directory...no > > business data at all, only postgres internals (I saw a lot of function > > names and datatypes). > > Would you look at pg_class in particular (file 1259) and confirm that it > contains only names of Postgres system catalogs and indexes, no tables > of your own? Everything in this file seems to be postgres related. Not a single table of our own. I removed the database directory and dropped the tablespace (by DROP TABLESPACE) successfully. Michael, Tom - thanks a bunch your help! Best Oliver
Oliver Siegmar <o.siegmar@vitrado.de> writes: > On Tuesday 02 August 2005 19:49, Tom Lane wrote: >> Would you look at pg_class in particular (file 1259) and confirm that it >> contains only names of Postgres system catalogs and indexes, no tables >> of your own? > Everything in this file seems to be postgres related. Not a single table of > our own. OK. I think that's sufficient evidence for my hypothesis (CREATE DATABASE failed partway through). I'm working on cleaning that code up. regards, tom lane