Thread: Scanning pg_tablespace from walsender
I'm working on completing Heikki's patch for streaming base backups, and have run into a problem: In order to dump all tablespaces properly, I have to know where they are (d'uh). In order to do that, I need to scan pg_tablespace. However, scanning that from walsender gives me: FATAL: cannot read pg_class without having selected a database Which means I somehow have to get pg_tablespace into the cache without reading pg_class, I guess. Similar to how we do for pg_database for example. Can someone throw me a pointer or two on how to actually do that? :-) Am I correct in assuming I need to add it to RelationCacheInitializePhase2(), and to do that, need to figure out how to define a TableSpaceRelation_Rowtype_Id in the headers? Or is there an easier way I'm missing? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes: > I'm working on completing Heikki's patch for streaming base backups, > and have run into a problem: > In order to dump all tablespaces properly, I have to know where they > are (d'uh). In order to do that, I need to scan pg_tablespace. Wait a minute. Isn't this problem about to metastasize into "I need to read *every* global catalog from walsender"? If not, why not? If so, I think we need another answer. regards, tom lane
On Mon, Jan 3, 2011 at 10:25 AM, Magnus Hagander <magnus@hagander.net> wrote: > I'm working on completing Heikki's patch for streaming base backups, > and have run into a problem: > > In order to dump all tablespaces properly, I have to know where they > are (d'uh). Can you get that directly from the filesystem layout? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Jan 3, 2011 at 16:29, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Magnus Hagander <magnus@hagander.net> writes: >> I'm working on completing Heikki's patch for streaming base backups, >> and have run into a problem: > >> In order to dump all tablespaces properly, I have to know where they >> are (d'uh). In order to do that, I need to scan pg_tablespace. > > Wait a minute. Isn't this problem about to metastasize into "I need to > read *every* global catalog from walsender"? If not, why not? If so, > I think we need another answer. Um, why would I need that? I need to be able to find all files, which means I need to find all tablespaces. I don't see how that would turn into "every global catalog"? (It already needs pg_authid and similar for the login, but that's shared with all others) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Mon, Jan 3, 2011 at 16:34, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Jan 3, 2011 at 10:25 AM, Magnus Hagander <magnus@hagander.net> wrote: >> I'm working on completing Heikki's patch for streaming base backups, >> and have run into a problem: >> >> In order to dump all tablespaces properly, I have to know where they >> are (d'uh). > > Can you get that directly from the filesystem layout? Hmm. I guess we could enumerate the pg_tblspc directory, and call readlink() on all the symlinks in there. Assuming all platforms can do readlink() (we'd obviously need a special windows implementation, but that's doable I guess). I just figured it'd be a lot cleaner to read it from our own catalogs... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Mon, Jan 3, 2011 at 10:37 AM, Magnus Hagander <magnus@hagander.net> wrote: > On Mon, Jan 3, 2011 at 16:34, Robert Haas <robertmhaas@gmail.com> wrote: >> On Mon, Jan 3, 2011 at 10:25 AM, Magnus Hagander <magnus@hagander.net> wrote: >>> I'm working on completing Heikki's patch for streaming base backups, >>> and have run into a problem: >>> >>> In order to dump all tablespaces properly, I have to know where they >>> are (d'uh). >> >> Can you get that directly from the filesystem layout? > > Hmm. I guess we could enumerate the pg_tblspc directory, and call > readlink() on all the symlinks in there. Assuming all platforms can do > readlink() (we'd obviously need a special windows implementation, but > that's doable I guess). > > I just figured it'd be a lot cleaner to read it from our own catalogs... I don't even see why you'd need readlink. Can't you just traverse the symlinks? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Jan 3, 2011 at 16:40, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Jan 3, 2011 at 10:37 AM, Magnus Hagander <magnus@hagander.net> wrote: >> On Mon, Jan 3, 2011 at 16:34, Robert Haas <robertmhaas@gmail.com> wrote: >>> On Mon, Jan 3, 2011 at 10:25 AM, Magnus Hagander <magnus@hagander.net> wrote: >>>> I'm working on completing Heikki's patch for streaming base backups, >>>> and have run into a problem: >>>> >>>> In order to dump all tablespaces properly, I have to know where they >>>> are (d'uh). >>> >>> Can you get that directly from the filesystem layout? >> >> Hmm. I guess we could enumerate the pg_tblspc directory, and call >> readlink() on all the symlinks in there. Assuming all platforms can do >> readlink() (we'd obviously need a special windows implementation, but >> that's doable I guess). >> >> I just figured it'd be a lot cleaner to read it from our own catalogs... > > I don't even see why you'd need readlink. Can't you just traverse the symlinks? Well, they need to be put back in the same location on the other machine (slave in case of replication, tarball otherwise). If I just traverse the symlinks, they'll just appears as a subdirectory of pg_tblspc on the other machine, won't they? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Excerpts from Magnus Hagander's message of lun ene 03 12:25:28 -0300 2011: > Can someone throw me a pointer or two on how to actually do that? :-) > Am I correct in assuming I need to add it to > RelationCacheInitializePhase2(), and to do that, need to figure out > how to define a TableSpaceRelation_Rowtype_Id in the headers? Or is > there an easier way I'm missing? I think you just need to add BKI_ROWTYPE_OID and BKI_SCHEMA_MACRO to pg_tablespace.h (yes, and pick a suitable OID for the rowtype). Then figure out formrdesc. HTH -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Magnus Hagander <magnus@hagander.net> writes: > On Mon, Jan 3, 2011 at 16:29, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Wait a minute. �Isn't this problem about to metastasize into "I need to >> read *every* global catalog from walsender"? �If not, why not? �If so, >> I think we need another answer. > Um, why would I need that? I need to be able to find all files, which > means I need to find all tablespaces. I don't see how that would turn > into "every global catalog"? Well, if you just need to find all the files, scan the symlinks in $PGDATA/pg_tblspc/. Don't turn a filesystem problem into a catalog problem. regards, tom lane
On Mon, Jan 3, 2011 at 10:42 AM, Magnus Hagander <magnus@hagander.net> wrote: > Well, they need to be put back in the same location on the other > machine (slave in case of replication, tarball otherwise). If I just > traverse the symlinks, they'll just appears as a subdirectory of > pg_tblspc on the other machine, won't they? Sure, I guess you'd need to read the links if you want it to work that way. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Jan 3, 2011 at 10:42 AM, Magnus Hagander <magnus@hagander.net> wrote: >> Well, they need to be put back in the same location on the other >> machine (slave in case of replication, tarball otherwise). If I just >> traverse the symlinks, they'll just appears as a subdirectory of >> pg_tblspc on the other machine, won't they? > Sure, I guess you'd need to read the links if you want it to work that way. Well, you're quietly ignoring a whole bunch of issues there, like whether the tablespaces *should* be in the identical locations on the other machine and how you'll deal with it if not. Eventually there's going to need to be some sort of "tablespace mapping" option for replication. But anyway, taking a base backup is fundamentally defined as "scan the filesystem, paying no attention to catalogs" and ISTM that it obviously should be the same way for tablespaces. regards, tom lane
On Mon, Jan 3, 2011 at 17:14, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Mon, Jan 3, 2011 at 10:42 AM, Magnus Hagander <magnus@hagander.net> wrote: >>> Well, they need to be put back in the same location on the other >>> machine (slave in case of replication, tarball otherwise). If I just >>> traverse the symlinks, they'll just appears as a subdirectory of >>> pg_tblspc on the other machine, won't they? > >> Sure, I guess you'd need to read the links if you want it to work that way. > > Well, you're quietly ignoring a whole bunch of issues there, like > whether the tablespaces *should* be in the identical locations on the > other machine and how you'll deal with it if not. Eventually there's > going to need to be some sort of "tablespace mapping" option for > replication. But anyway, taking a base backup is fundamentally defined > as "scan the filesystem, paying no attention to catalogs" and ISTM that > it obviously should be the same way for tablespaces. I'm doing that now, and it works fine on my linux box. Haven't looked at a win32 implementation yet, but that can certainly be done. As for relocating tablespaces - yes, that would be very useful. But at this point, we *do* require them to be at the same place on the box you restore to (whether it's a backup or a slave). That said, it seems we don't actually ever *care* - from my quick grep of the source, it seems we never ever read the location from the catalog - we just store it there for reference. So in theory, we should be able to relocate a tablespace by just changing the symlink. But that would leave pg_tablespace and the filesystem out of sync, so we probably shouldn't do that. Either way, relocating tablespaces is for the future, let's start with being able to do streaming base backups at all. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
* Robert Haas (robertmhaas@gmail.com) wrote: > On Mon, Jan 3, 2011 at 10:42 AM, Magnus Hagander <magnus@hagander.net> wrote: > > Well, they need to be put back in the same location on the other > > machine (slave in case of replication, tarball otherwise). If I just > > traverse the symlinks, they'll just appears as a subdirectory of > > pg_tblspc on the other machine, won't they? > > Sure, I guess you'd need to read the links if you want it to work that way. Have to admit, I'm not entirely sure if this is really the behavior that makes the most sense. My gut reaction to this is that it'd make more sense for them to end up as directories rather than symlinks to places that might not exist on the slave, or that might not be writable by PG on the slave. I can see arguments either way though and so I really don't like the idea of it being forced one way or the other. Here's my 2c- make it optional on the slave side and then don't complain if the symlink already exists (even if it goes somewhere else). My thinking is that if someone needs to have the tablespaces reside somewhere else on the slave, they could say "don't create the symlinks" in the recovery config, and then manually create the symlinks where they need them to go. Thanks, Stephen
On Mon, Jan 3, 2011 at 17:17, Stephen Frost <sfrost@snowman.net> wrote: > * Robert Haas (robertmhaas@gmail.com) wrote: >> On Mon, Jan 3, 2011 at 10:42 AM, Magnus Hagander <magnus@hagander.net> wrote: >> > Well, they need to be put back in the same location on the other >> > machine (slave in case of replication, tarball otherwise). If I just >> > traverse the symlinks, they'll just appears as a subdirectory of >> > pg_tblspc on the other machine, won't they? >> >> Sure, I guess you'd need to read the links if you want it to work that way. > > Have to admit, I'm not entirely sure if this is really the behavior that > makes the most sense. My gut reaction to this is that it'd make more > sense for them to end up as directories rather than symlinks to places > that might not exist on the slave, or that might not be writable by PG > on the slave. I can see arguments either way though and so I really > don't like the idea of it being forced one way or the other. > > Here's my 2c- make it optional on the slave side and then don't complain > if the symlink already exists (even if it goes somewhere else). My > thinking is that if someone needs to have the tablespaces reside > somewhere else on the slave, they could say "don't create the symlinks" > in the recovery config, and then manually create the symlinks where they > need them to go. It's already a basic requirement that they need to be the same - replicating over a CREATE TABLESPACE is going to fail otherwise.. Being able to deal with that in a nice way would be good, of course, but we don't have that today. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
* Magnus Hagander (magnus@hagander.net) wrote: > It's already a basic requirement that they need to be the same - > replicating over a CREATE TABLESPACE is going to fail otherwise.. > Being able to deal with that in a nice way would be good, of course, > but we don't have that today. If CREATE TABLESPACE replication also looked at the flag I was proposing, it could work. :) Of course, the admin wouldn't be able to move the directory/change the symlink to where they actually want it to be w/o taking the replication server down, but I'm not sure that's a show-stopper... It's certainly not the cleanest/nicest approach, don't get me wrong, but I really hate the idea of forcing people to have an identical filesystem layout on the slave that they have on the master. Stephen