Thread: Detach/attach database
Hi,

I don't know if this has been discussed before, but would it be feasible to introduce the ability to detach and attach databases? (If you're thinking "stop right there", skip to the end.) What I had in mind would be to do something like the following:

SELECT pg_detach_database('my_database');

 pg_detach_database
--------------------
 base/16384
(1 row)

Then this database would no longer be accessible. The system tables would be updated to reflect the removal of this database, and some kind of manifest file would be produced in that database directory. The database represented by that returned directory could then be moved elsewhere (or deleted if the user so wished, though that isn't the real benefit).

But then, if another cluster were running with the same version and architecture as the cluster we just detached our database from, we could copy that directory into the base directory of that other cluster (assuming the OID of the database didn't conflict with an existing one), then run:

SELECT pg_attach_database('16384', optional_tablespace_name_here);

The usual version/architecture/platform checks would then happen, plus a reading of the manifest to populate the system tables and a check for conflicts, and the database would be brought online.

And if I haven't already asked too much, something like this could be run:

SELECT pg_start_clone('my_database');

 pg_start_clone
----------------
 base/16384
(1 row)

You may now be able to infer where this notion came from: someone asked whether you can clone databases without kicking users off. However, this isn't a schema-only copy, but naturally contains data as well. So the function above would be like pg_start_backup(), except that we wouldn't prevent writes to other stable databases, just the candidate one. I personally don't see how this could work with the way we currently replay WAL files.

But the idea is that it would create the manifest file as pg_detach_database() would, except it wouldn't detach the database, and users could continue to use it. Then the user could copy the database directory elsewhere, then call:

SELECT pg_stop_clone('my_database');

And in theory the user could rename the copy of the directory, move it back, and reattach it as an identical copy (or at least identical as of the time of copying).

The most obvious problems I see are related to permissions. All the object-level permissions would have to exist in the destination database (which is fine if you're copying it to the same cluster), but even if they do exist, I suspect the OIDs of those roles would need to be the same. Then there are extensions and collations: if either of those is unavailable on the destination cluster, it couldn't work either. But some kind of error message communicating the missing component could be returned when attempting to attach. Also, I don't know how you could pause WAL replay for an individual database.

Use cases for this would be:

- Backing up select databases for quick re-attachment without lengthy restores
- A user running out of disk space who wants to move large databases to another server quickly
- A large static database wanted on another server
- A copy of a large database wanted on the same cluster

It's just a vague idea, and I'm kind of expecting responses to begin with "Well, for a start, this couldn't possibly begin to work because..." but that's par for the course. ;)

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935
EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sun, Nov 13, 2011 at 1:13 PM, Thom Brown <thom@linux.com> wrote:

> I don't know if this has been discussed before, but would it be
> feasible to introduce the ability to detach and attach databases? (if
> you're thinking "stop right there" skip to the end) What I had in
> mind would be to do something like the following:

That would be better done at the tablespace level, and then the feature becomes "transportable tablespaces". Which seems like a good and useful idea to me.

> You may now be able to infer where this notion came from, when someone
> asked if you can clone databases without kicking users off. However,
> this isn't a schema-only copy, but naturally contains data as well.

The OP wanted to do this without freezing activity on the database, which is not easy...

OTOH we can do a backup of just a single database and then filter recovery at the database level to produce just a single copy of another database on its own server, if anyone wanted that.

--
Simon Riggs
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Thom Brown <thom@linux.com> writes:

> I don't know if this has been discussed before, but would it be
> feasible to introduce the ability to detach and attach databases? (if
> you're thinking "stop right there" skip to the end)

... skipping ...

> It's just a vague idea, and I'm kind of expecting responses to begin
> with "Well for a start, this couldn't possible begin to work
> because..." but that's par for the course. ;)

The main reason this doesn't work is XID management.

It's barely possible you could make it work if you first locked all other sessions out of the DB and then froze every XID in the database, but that's a sufficiently heavyweight operation to make it of dubious value.

You'd also have to think of some way to ensure that page LSNs in the database are lower than the current WAL endpoint in the receiver.

The other thing I'd be concerned about is inconsistency with the global system catalogs in the receiving installation. Consider roles for example: the receiver might not have the same set of roles, probably wouldn't have the same OIDs for those roles, and definitely would be missing the pg_shdepend entries that describe which objects in the transported database are owned by which roles.

			regards, tom lane
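[Editor's note: both of Tom's concerns can be seen directly in the catalogs. The queries below are illustrative only and assume a running server with sufficient privileges; the first shows that XID tracking is per database even though the counter itself is cluster-wide, and the second shows the shared pg_shdepend catalog recording which shared roles own per-database objects.]

```sql
-- Each database's oldest unfrozen XID; all XIDs are drawn from one
-- cluster-wide counter, so values are only meaningful within a cluster:
SELECT datname, datfrozenxid FROM pg_database;

-- Shared catalog recording role ownership ('o') of per-database objects;
-- these entries would simply be missing on a receiving cluster:
SELECT dbid, classid, objid, refobjid
FROM pg_shdepend
WHERE deptype = 'o';
```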
On Sunday, November 13, 2011 13:13:11 Thom Brown wrote:

> Hi,
>
> I don't know if this has been discussed before, but would it be
> feasible to introduce the ability to detach and attach databases? (if
> you're thinking "stop right there" skip to the end)
>
> It's just a vague idea, and I'm kind of expecting responses to begin
> with "Well for a start, this couldn't possible begin to work
> because..." but that's par for the course. ;)

You would have to do quite some surgery because of OIDs from shared tables. I don't think that's easily doable.

Andres
On 13 November 2011 16:42, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Thom Brown <thom@linux.com> writes:
>> I don't know if this has been discussed before, but would it be
>> feasible to introduce the ability to detach and attach databases? (if
>> you're thinking "stop right there" skip to the end)
>
> ... skipping ...
>
>> It's just a vague idea, and I'm kind of expecting responses to begin
>> with "Well for a start, this couldn't possible begin to work
>> because..." but that's par for the course. ;)
>
> The main reason this doesn't work is XID management.
>
> It's barely possible you could make it work if you first locked all
> other sessions out of the DB and then froze every XID in the database,
> but that's a sufficiently heavyweight operation to make it of dubious
> value.
>
> You'd also have to think of some way to ensure that page LSNs in the
> database are lower than the current WAL endpoint in the receiver.
>
> The other thing I'd be concerned about is inconsistency with the global
> system catalogs in the receiving installation. Consider roles for
> example: the receiver might not have the same set of roles, probably
> wouldn't have the same OIDs for those roles, and definitely would be
> missing the pg_shdepend entries that describe which objects in the
> transported database are owned by which roles.

I feared such a non-traversable terrain would prevent it being possible. Oh well. Thanks for the explanation though.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935
EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 13 November 2011 15:26, Simon Riggs <simon@2ndquadrant.com> wrote:

> On Sun, Nov 13, 2011 at 1:13 PM, Thom Brown <thom@linux.com> wrote:
>
>> I don't know if this has been discussed before, but would it be
>> feasible to introduce the ability to detach and attach databases? (if
>> you're thinking "stop right there" skip to the end) What I had in
>> mind would be to do something like the following:
>
> That would be better done at the tablespace level, and then the
> feature becomes "transportable tablespaces". Which seems like a good
> and useful idea to me.

I've been trying to think why the tablespace equivalent would be better, but I can't see it. The reason for detaching a database would be that you want to do something with an entire related set of data. A tablespace can contain just indexes, or a few tables from several databases.

>> You may now be able to infer where this notion came from, when someone
>> asked if you can clone databases without kicking users off. However,
>> this isn't a schema-only copy, but naturally contains data as well.
>
> The OP wanted to do this without freezing activity on the database,
> which is not easy...
>
> OTOH we can do a backup of just a single database and then filter
> recovery at database level to produce just a single copy of another
> database on its own server, if anyone wanted that.

Filtering recovery sounds very tricky to me. And it's the global-objects part which makes things extra difficult. The whole idea I was proposing sounds riddled with holes anyway, but I'm glad I can at least put it from my mind.

So can I humbly request we completely re-architect the whole of PostgreSQL to fit this feature? Thanks.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935
EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Nov 14, 2011 at 4:55 AM, Thom Brown <thom@linux.com> wrote:

> So can I humbly request we completely re-architect the whole of
> PostgreSQL to fit this feature? Thanks.

Heh. I have to admit I've thought about this from time to time, and it would be pretty cool.

I was initially thinking that it wouldn't be that difficult to do this on a per-database level, because if you slurp up a whole database then by definition you're also including the system catalogs, which means that you have the pg_class, pg_attribute, and pg_type entries that are necessary to interpret the table contents. If you do anything more fine-grained (per-tablespace, per-table, or whatever) then things get much more complex, but at the database level you only need to worry about interactions with other globals: tablespace and role definitions. And we could probably write code to grovel through the system catalogs for a newly "mounted" database and do search and replace on the appropriate columns, to map from the old OIDs to the new ones. It wouldn't be simple, but I think it could be done.

But Tom's point about XIDs and LSNs seems like it kind of puts a bullet through the heart of the whole idea. Now, before you can move the database (or table, or whatever) between clusters, you've got to rewrite all the data files to freeze XIDs and, I don't know, zero out LSNs, or something. And if you're going to rewrite all the data, then you've pretty much lost all the benefit of doing this in the first place. In fact, it might end up being *slower* than a dump and restore; even an uncompressed dump will be smaller than the on-disk footprint of the original database, and many dumps compress quite well.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
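[Editor's note: the "search and replace on the appropriate columns" Robert describes is needed because per-database catalogs reference shared objects by OID, not by name. An illustrative query (assumes a running server) showing the role OIDs a database's tables embed, which would generally not match the OIDs of like-named roles on a different cluster:]

```sql
-- Tables record their owning role by OID (pg_class.relowner); on another
-- cluster the same role name would usually carry a different OID, so a
-- "mount" operation would have to rewrite these columns:
SELECT c.relname, c.relowner, r.rolname
FROM pg_class c
JOIN pg_roles r ON r.oid = c.relowner
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public';
```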
On 14 November 2011 13:32, Robert Haas <robertmhaas@gmail.com> wrote:

> But Tom's point about XIDs and LSNs seems like it kind of puts a
> bullet through the heart of the whole idea.

What about having database-level XIDs rather than cluster-level? Is that remotely feasible?

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935
EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Nov 14, 2011 at 10:05 AM, Thom Brown <thom@linux.com> wrote:

> On 14 November 2011 13:32, Robert Haas <robertmhaas@gmail.com> wrote:
>>
>> But Tom's point about XIDs and LSNs seems like it kind of puts a
>> bullet through the heart of the whole idea.
>
> What about having database-level XIDs rather than cluster-level? Is that
> remotely feasible?

Maybe. You'd need a separate set for shared catalogs, too. It seems like a heck of a lot of work, though, especially since (IME, anyway) most people only really run one database per cluster.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 14 November 2011 15:07, Robert Haas <robertmhaas@gmail.com> wrote:

> On Mon, Nov 14, 2011 at 10:05 AM, Thom Brown <thom@linux.com> wrote:
>> On 14 November 2011 13:32, Robert Haas <robertmhaas@gmail.com> wrote:
>>>
>>> But Tom's point about XIDs and LSNs seems like it kind of puts a
>>> bullet through the heart of the whole idea.
>>
>> What about having database-level XIDs rather than cluster-level? Is that
>> remotely feasible?
>
> Maybe. You'd need a separate set for shared catalogs, too. It
> seems like a heck of a lot of work, though, especially since (IME,
> anyway) most people only really run one database per cluster.
I thought it would be a lot of work. Well, one benefit I could potentially see is paving the way for per-database replication. But I'll let this dream go, as it's clearly not something to realistically pursue.
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935
EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas wrote:

> But Tom's point about XIDs and LSNs seems like it kind of puts a
> bullet through the heart of the whole idea. Now, before you can move
> the database (or table, or whatever) between clusters, you've got to
> rewrite all the data files to freeze XIDs and, I don't know, zero out
> LSNs, or something. And if you're going to rewrite all the data, then
> you've pretty much lost all the benefit of doing this in the first
> place. In fact, it might end up being *slower* than a dump and
> restore; even an uncompressed dump will be smaller than the on-disk
> footprint of the original database, and many dumps compress quite
> well.

These are the same limitations pg_upgrade has, except that it freezes the system tables of the new cluster (very small) and then moves the clog files from the old cluster to the new cluster to match the user files. There is no way to really merge two different clusters' clog files.

--
Bruce Momjian <bruce@momjian.us>        http://momjian.us
EnterpriseDB                            http://enterprisedb.com

+ It's impossible for everything to be true. +
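[Editor's note: Bruce's point that clog files can't be merged follows from how transaction status is stored: the clog keeps 2 status bits per XID in 8 kB pages, 32 pages per segment file, so an XID's status lives at a fixed position determined by the cluster-wide XID value. A sketch of that arithmetic as plain illustrative SQL (the 2-bits-per-XID, 8 kB-page, and 32-pages-per-segment figures match PostgreSQL's clog layout):]

```sql
-- Where the status of XID 123456 lives in the clog:
-- 4 XIDs per byte, 8192 bytes per page, 32 pages per segment file.
SELECT 123456 / (8192 * 4 * 32)     AS segment_file,
       (123456 / (8192 * 4)) % 32   AS page_in_segment,
       (123456 % (8192 * 4)) / 4    AS byte_in_page;
```

Because two clusters assign XIDs independently, the same position in these files describes a different transaction in each cluster, which is why their clog files cannot be combined.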