Thread: Re: [pgsql-advocacy] Unlogged vs. In-Memory
[ moving to -hacker s] On Thu, Sep 22, 2011 at 9:26 PM, Thom Brown <thom@linux.com> wrote: > On 22 September 2011 17:38, Josh Berkus <josh@agliodbs.com> wrote: >> >>> So are there any plans to allow swappable drive/volatile storage >>> unlogged tables? >> >> Be our guest. ;-) > > Oh it can't be that difficult. On first glance it looks like it's a > case of piggy-backing mdopen and getting it to treat > RELPERSISTENCE_TEMP relations in the same way as it would for > relations during the bootstrap script (i.e. create it if it doesn't > exist)... then telling it not to try reading anything from the > relation... or something like this. But I don't know C so... > *puppy-dog eyes* I don't think that's it. It seems to me that what we really need to do is put the _init forks in a different directory than all the other forks (and then fix pg_upgrade to move them if upgrading from 9.1) - so the core code change is actually in relpathbackend(). The question is how you want to expose that to the user. I mean, you could just put them in another directory under the rug, and call it good. Users who know about the magic under the hood can fiddle with their mount points, and everyone else will be none the wiser. I'm not sure whether that's acceptable from a usability standpoint, however. If you want to expose an SQL-level interface, this gets to be a harder problem... starting with defining what exactly that interface should look like. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 23 September 2011 15:12, Robert Haas <robertmhaas@gmail.com> wrote: > [ moving to -hacker s] > > On Thu, Sep 22, 2011 at 9:26 PM, Thom Brown <thom@linux.com> wrote: >> On 22 September 2011 17:38, Josh Berkus <josh@agliodbs.com> wrote: >>> >>>> So are there any plans to allow swappable drive/volatile storage >>>> unlogged tables? >>> >>> Be our guest. ;-) >> >> Oh it can't be that difficult. On first glance it looks like it's a >> case of piggy-backing mdopen and getting it to treat >> RELPERSISTENCE_TEMP relations in the same way as it would for >> relations during the bootstrap script (i.e. create it if it doesn't >> exist)... then telling it not to try reading anything from the >> relation... or something like this. But I don't know C so... >> *puppy-dog eyes* > > I don't think that's it. It seems to me that what we really need to > do is put the _init forks in a different directory than all the other > forks (and then fix pg_upgrade to move them if upgrading from 9.1) - > so the core code change is actually in relpathbackend(). The question > is how you want to expose that to the user. I mean, you could just > put them in another directory under the rug, and call it good. Users > who know about the magic under the hood can fiddle with their mount > points, and everyone else will be none the wiser. > > I'm not sure whether that's acceptable from a usability standpoint, > however. If you want to expose an SQL-level interface, this gets to > be a harder problem... starting with defining what exactly that > interface should look like. Couldn't this come under tablespace changes then? After all the use-case stated would require a separate tablespace, and you could do something like: CREATE VOLATILE TABLESPACE drive_made_of_wax_left_in_the_sun LOCATION '/mnt/ramdisk'; All objects then created or reassigned therein would <insert magic stuff here>. In theory it would be independent of UNLOGGEDness, but I can see this would be problematic because such tables wouldn't be allowed foreign key references to tables within a stable tablespace and vice-versa, since the wonky tablespace could collapse any minute and integrity with it. -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935 EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Sep 23, 2011 at 10:37 AM, Thom Brown <thom@linux.com> wrote: > Couldn't this come under tablespace changes then? After all the > use-case stated would require a separate tablespace, and you could do > something like: > > CREATE VOLATILE TABLESPACE drive_made_of_wax_left_in_the_sun LOCATION > '/mnt/ramdisk'; > > All objects then created or reassigned therein would <insert magic > stuff here>. In theory it would be independent of UNLOGGEDness, but I > can see this would be problematic because such tables wouldn't be > allowed foreign key references to tables within a stable tablespace > and vice-versa, since the wonky tablespace could collapse any minute > and integrity with it. I don't get it. It would certainly be possible to create a VOLATILE TABLESPACE in which only TEMPORARY tables could be created. We would just disallow the creation of anything other than a temporary table within that tablespace, and if the contents of the tablespace get wiped out, WDC. (Mind you, I think we'd likely want to insist that the pg-version directory manufactured by CREATE TABLESPACE would stick around... or else we'd need some provision for recreating it on every startup.) However, if you want a VOLATILE TABLESPACE to allow not only TEMPORARY but also UNLOGGED objects, it's not so simple, because the _init forks of an unlogged relation are not disposable. Those are not allowed to disappear, or you're going to be in trouble. So the issue still comes down to this: where are we gonna put those _init forks? I guess we could do something like this: CREATE TABLESPACE now_you_see_me_now_you_dont LOCATION '/mnt/highly_reliable_san' VOLATILE LOCATION '/mnt/ramdisk'; All forks of temporary relations, and all non-_init forks of non-temporary relations, could be stored in the VOLATILE LOCATION, while everything else could be stored in the regular LOCATION. Hmm... actually, I kind of like that. Thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Sep 23, 2011 at 10:54 AM, Robert Haas <robertmhaas@gmail.com> wrote: > CREATE TABLESPACE now_you_see_me_now_you_dont LOCATION > '/mnt/highly_reliable_san' VOLATILE LOCATION '/mnt/ramdisk'; > > All forks of temporary relations, and all non-_init forks of > non-temporary relations, could be stored in the VOLATILE LOCATION, > while everything else could be stored in the regular LOCATION. > > Hmm... actually, I kind of like that. Thoughts? Gah. I mean, all forks of temporary relations, and all non-_init forks of *unlogged* relations, could be stored in the VOLATILE LOCATION. Permanent tables would have all forks in the regular LOCATION, along with _init forks of unlogged tables. Of course, that would have the problem that relpathbackend() would need to know the relpersistence value in order to compute the pathname, which I think is going to be ugly, come to think of it. Hmm... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 23 September 2011 15:56, Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Sep 23, 2011 at 10:54 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> CREATE TABLESPACE now_you_see_me_now_you_dont LOCATION >> '/mnt/highly_reliable_san' VOLATILE LOCATION '/mnt/ramdisk'; >> >> All forks of temporary relations, and all non-_init forks of >> non-temporary relations, could be stored in the VOLATILE LOCATION, >> while everything else could be stored in the regular LOCATION. >> >> Hmm... actually, I kind of like that. Thoughts? > > Gah. I mean, all forks of temporary relations, and all non-_init > forks of *unlogged* relations, could be stored in the VOLATILE > LOCATION. Permanent tables would have all forks in the regular > LOCATION, along with _init forks of unlogged tables. > > Of course, that would have the problem that relpathbackend() would > need to know the relpersistence value in order to compute the > pathname, which I think is going to be ugly, come to think of it. I doubt I understand the whole _init forks thing correctly, but can't the main tablespace provide sanctuary to such volatile supporting meta data (pg_version, _init and whatever else you're worried about) except the actual relation (and its vm/fsm)? Anything you can't afford to lose you get the main tablespace to look after. And instead of having a dir linked to the location in pg_tblspc, an actual dir could exist, containing items directly linked to items in the volatile location. Hmm... it doesn't sound quite right to me either. -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935 EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Sep 23, 2011 at 11:36 AM, Thom Brown <thom@linux.com> wrote: > On 23 September 2011 15:56, Robert Haas <robertmhaas@gmail.com> wrote: >> On Fri, Sep 23, 2011 at 10:54 AM, Robert Haas <robertmhaas@gmail.com> wrote: >>> CREATE TABLESPACE now_you_see_me_now_you_dont LOCATION >>> '/mnt/highly_reliable_san' VOLATILE LOCATION '/mnt/ramdisk'; >>> >>> All forks of temporary relations, and all non-_init forks of >>> non-temporary relations, could be stored in the VOLATILE LOCATION, >>> while everything else could be stored in the regular LOCATION. >>> >>> Hmm... actually, I kind of like that. Thoughts? >> >> Gah. I mean, all forks of temporary relations, and all non-_init >> forks of *unlogged* relations, could be stored in the VOLATILE >> LOCATION. Permanent tables would have all forks in the regular >> LOCATION, along with _init forks of unlogged tables. >> >> Of course, that would have the problem that relpathbackend() would >> need to know the relpersistence value in order to compute the >> pathname, which I think is going to be ugly, come to think of it. > > I doubt I understand the whole _init forks thing correctly, Basically, for every unlogged table, you get an empty _init fork, and for every index of an unlogged table, you get an _init fork initialized to an empty index. The _init forks are copied over the main forks by the startup process before entering normal running. > but can't > the main tablespace provide sanctuary to such volatile supporting meta > data (pg_version, _init and whatever else you're worried about) except > the actual relation (and its vm/fsm)? Anything you can't afford to > lose you get the main tablespace to look after. And instead of having > a dir linked to the location in pg_tblspc, an actual dir could exist, > containing items directly linked to items in the volatile location. > > Hmm... it doesn't sound quite right to me either. Well, we could certainly Decree From On High that the _init forks are all going to be stored under $PGDATA rather than in the tablespace directories. That would make things simple. Of course, it also means that if you want the _init forks stored somewhere, you are out of luck. Now maybe that is an unlikely scenario. Off the top of my head, the only case I can think of would be if the storage space or inode consumption requirements were problematic - and even then you could stick a symlink in there someplace to make it work, if you're the sort of person who knows how to do that. So maybe it's OK. But it makes me a little uneasy. When people ask to put stuff in a tablespace, I tend to think they want it to show up in that tablespace. Hmm, hmm... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > Basically, for every unlogged table, you get an empty _init fork, and > for every index of an unlogged table, you get an _init fork > initialized to an empty index. The _init forks are copied over the > main forks by the startup process before entering normal running. Let's call that metadata. > Well, we could certainly Decree From On High that the _init forks are > all going to be stored under $PGDATA rather than in the tablespace > directories. That would make things simple. Of course, it also means And now you need to associate a non volatile tablespace where to store the metadata of your volatile tablespace where you want to store unlogged data. And we already have a default tablespace, of course. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support