Thread: How about a proper TEMPORARY TABLESPACE?
Hi Hackers,
I was facing a situation were we wanted to set temp_tablespaces to a tablespace on a ephemeral disk (yes, it is AWS ephemeral disk), and I know many users have faced the same situation. Although it seems safe to create a tablespace on ephemeral disks if you use it to store only temporary files (either created by queries or temp tables), PostgreSQL does not have such information, and can't safely prevent a careless user of creating a non-temporary relation on this tablespace.[1] https://github.com/matheusoliveira/postgres/compare/my_temptablespace
Best regards,
--
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres
Attachment
* Matheus de Oliveira (matioli.matheus@gmail.com) wrote: > Hi Hackers, Having read only the subject- +1 from me on the idea. Maybe +1000. > I was facing a situation were we wanted to set temp_tablespaces to a > tablespace on a ephemeral disk (yes, it is AWS ephemeral disk), and I know > many users have faced the same situation. Although it seems safe to create > a tablespace on ephemeral disks if you use it to store only temporary files > (either created by queries or temp tables), PostgreSQL does not have such > information, and can't safely prevent a careless user of creating a > non-temporary relation on this tablespace. Right. > So, what you guys think about letting PG know somehow that a tablespace is > temporary? PG would need to enforce that it's only used for temporary objects as well, of course.. Or at least, that was my thinking on this. > I have took some small time to make a PoC just to see if that is doable. > And so I did a new syntax like: > > CREATE TABLESPACE spcname [TEMP | TEMPORARY] LOCATION ... > > So, if TEMP or TEMPORARY is present, I mark a new column at pg_tablespace > as true. On every table creation or moving to a new tablespace, I just > check this, and fails if the tablespace is "temporary" but the > "relpersistence" says the table is not. Not sure about that specific syntax (don't we have SET options now?) but I do like the general idea. > The other part is, every time some query or relation is asked to be stored > on this tablespace, I create (on-demand) the PG_TEMP_FILES_DIR inside of it > (also if it is temporary). > > The attached patch (and also on my Github, [1]), shows the PoC. For now, > I'm not worried about the code quality, there are yet a lot of work to be > done there (like ALTER TABLESPACE, better testing, use relcache, etc...), > and it is my first hacking on PG (so I'm a newbie). But I'd like to hear > from you guys if such feature is desirable and if I could starting working > on that for real. Also some thoughts about better way of implementing it. Yeah, +1 here and it should go into an appropriate commitfest to get a proper review. This would be *really* nice to have. Thanks, Stephen
<div dir="ltr"><div class="gmail_extra"><br />On Wed, Jun 18, 2014 at 9:00 AM, Stephen Frost <<a href="mailto:sfrost@snowman.net">sfrost@snowman.net</a>>wrote:<br />><br />> > I have took some small time tomake a PoC just to see if that is doable.<br /> > > And so I did a new syntax like:<br />> ><br />> > CREATE TABLESPACE spcname [TEMP | TEMPORARY] LOCATION ...<br />> ><br />> > So, if TEMP or TEMPORARYis present, I mark a new column at pg_tablespace<br /> > > as true. On every table creation or moving to anew tablespace, I just<br />> > check this, and fails if the tablespace is "temporary" but the<br />> > "relpersistence"says the table is not.<br /> ><br />> Not sure about that specific syntax (don't we have SET optionsnow?) but<br />> I do like the general idea.<br />><br /><br /></div><div class="gmail_extra">Maybe somethinglike that:<br /><br />CREATE TABLESPACE spcname LOCATION '/foo/bar' WITH (only_temp_relations = true);<br /></div><divclass="gmail_extra"><br /></div><div class="gmail_extra">Have in mind you must take care if you use ALTER TABLESPACEspcname SET (...) to guarantee that exists only temp objects stored in the target tablespace, and if exists a regularobject you must throw an exception.<br /><br /></div><div class="gmail_extra">Makes sense?<br /></div><div class="gmail_extra"><br/></div><div class="gmail_extra">Regards,<br /><br /></div><div class="gmail_extra">--<br />Fabríziode Royes Mello<br />Consultoria/Coaching PostgreSQL<br /> >> Timbira: <a href="http://www.timbira.com.br">http://www.timbira.com.br</a><br/>>> Blog sobre TI: <a href="http://fabriziomello.blogspot.com">http://fabriziomello.blogspot.com</a><br/>>> Perfil Linkedin: <a href="http://br.linkedin.com/in/fabriziomello">http://br.linkedin.com/in/fabriziomello</a><br/> >> Twitter: <a href="http://twitter.com/fabriziomello">http://twitter.com/fabriziomello</a></div></div>
* Fabrízio de Royes Mello (fabriziomello@gmail.com) wrote: > On Wed, Jun 18, 2014 at 9:00 AM, Stephen Frost <sfrost@snowman.net> wrote: > > Not sure about that specific syntax (don't we have SET options now?) but > > I do like the general idea. > > Maybe something like that: > > CREATE TABLESPACE spcname LOCATION '/foo/bar' WITH (only_temp_relations = > true); Yeah, that's more along the lines of what I was thinking. > Have in mind you must take care if you use ALTER TABLESPACE spcname SET > (...) to guarantee that exists only temp objects stored in the target > tablespace, and if exists a regular object you must throw an exception. > > Makes sense? Yes. I'd definitely like to see an ALTER TABLESPACE option, with an ERROR that lists out all of the non-temporary objects which are found (and lists any other databases which have objects in those tablespaces..). That would allow administrators who have existing notionally temporary-only tablespaces to go clean things up to make them actually temporary-only. Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > Yes. I'd definitely like to see an ALTER TABLESPACE option, with an > ERROR that lists out all of the non-temporary objects which are found > (and lists any other databases which have objects in those > tablespaces..). That would allow administrators who have existing > notionally temporary-only tablespaces to go clean things up to make them > actually temporary-only. That seems just about impossible from a concurrency standpoint (ie, what if someone is creating a new table in the tablespace concurrently with your check? Possibly in a different database?) I would certainly suggest that the first version of the patch not undertake to allow this property to be ALTERed; the cost-benefit ratio isn't there IMO. regards, tom lane
<div dir="ltr"><div class="gmail_extra">On Wed, Jun 18, 2014 at 1:59 PM, Tom Lane <<a href="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>>wrote:<br />><br />> Stephen Frost <<a href="mailto:sfrost@snowman.net">sfrost@snowman.net</a>>writes:<br /> > > Yes. I'd definitely like to see an ALTERTABLESPACE option, with an<br />> > ERROR that lists out all of the non-temporary objects which are found<br />>> (and lists any other databases which have objects in those<br /> > > tablespaces..). That would allow administratorswho have existing<br />> > notionally temporary-only tablespaces to go clean things up to make them<br/>> > actually temporary-only.<br />><br />> That seems just about impossible from a concurrency standpoint<br/> > (ie, what if someone is creating a new table in the tablespace<br />> concurrently with your check? Possibly in a different database?)<br />><br /><br /></div><div class="gmail_extra">You are correct, I completelyforgot that CREATE TABLESPACE is not transaction safe.<br /></div><div class="gmail_extra"><br /><br />> I wouldcertainly suggest that the first version of the patch not<br />> undertake to allow this property to be ALTERed;the cost-benefit<br />> ratio isn't there IMO.<br /> ><br /><br />+1<br /><br /></div><div class="gmail_extra">Regards,<br/></div><div class="gmail_extra"><br />--<br />Fabrízio de Royes Mello<br />Consultoria/CoachingPostgreSQL<br />>> Timbira: <a href="http://www.timbira.com.br">http://www.timbira.com.br</a><br/> >> Blog sobre TI: <a href="http://fabriziomello.blogspot.com">http://fabriziomello.blogspot.com</a><br/>>> Perfil Linkedin: <a href="http://br.linkedin.com/in/fabriziomello">http://br.linkedin.com/in/fabriziomello</a><br/> >> Twitter: <a href="http://twitter.com/fabriziomello">http://twitter.com/fabriziomello</a></div></div>
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > Yes. I'd definitely like to see an ALTER TABLESPACE option, with an > > ERROR that lists out all of the non-temporary objects which are found > > (and lists any other databases which have objects in those > > tablespaces..). That would allow administrators who have existing > > notionally temporary-only tablespaces to go clean things up to make them > > actually temporary-only. > > That seems just about impossible from a concurrency standpoint > (ie, what if someone is creating a new table in the tablespace > concurrently with your check? Possibly in a different database?) Yeah, that's definitely an annoying complexity. > I would certainly suggest that the first version of the patch not > undertake to allow this property to be ALTERed; the cost-benefit > ratio isn't there IMO. I suppose scheduling downtime to do the check manually across all databases, then drop and recreate the tablespace, would work. As someone who's working with a couple of these cases, it'd be awful nice if there was a way PG would handle it for me. Thanks, Stephen
Stephen Frost wrote: > * Tom Lane (tgl@sss.pgh.pa.us) wrote: > > Stephen Frost <sfrost@snowman.net> writes: > > > Yes. I'd definitely like to see an ALTER TABLESPACE option, with an > > > ERROR that lists out all of the non-temporary objects which are found > > > (and lists any other databases which have objects in those > > > tablespaces..). That would allow administrators who have existing > > > notionally temporary-only tablespaces to go clean things up to make them > > > actually temporary-only. > > I would certainly suggest that the first version of the patch not > > undertake to allow this property to be ALTERed; the cost-benefit > > ratio isn't there IMO. > > I suppose scheduling downtime to do the check manually across all > databases, then drop and recreate the tablespace, would work. As > someone who's working with a couple of these cases, it'd be awful nice > if there was a way PG would handle it for me. I wonder if some form of NOT VALID marking could be useful here. Of course, this is not a constraint. But a mechanism of a similar spirit seems apropos. It seems reasonable to leave such a thing for future improvement. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
<div dir="ltr"><div class="gmail_extra"><br />On Wed, Jun 18, 2014 at 4:53 PM, Alvaro Herrera <<a href="mailto:alvherre@2ndquadrant.com">alvherre@2ndquadrant.com</a>>wrote:<br />><br />> Stephen Frost wrote:<br/>> > * Tom Lane (<a href="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>) wrote:<br /> > > > StephenFrost <<a href="mailto:sfrost@snowman.net">sfrost@snowman.net</a>> writes:<br />> > > > Yes. I'ddefinitely like to see an ALTER TABLESPACE option, with an<br />> > > > ERROR that lists out all of the non-temporaryobjects which are found<br /> > > > > (and lists any other databases which have objects in those<br/>> > > > tablespaces..). That would allow administrators who have existing<br />> > > >notionally temporary-only tablespaces to go clean things up to make them<br /> > > > > actually temporary-only.<br/>><br />> > > I would certainly suggest that the first version of the patch not<br />>> > undertake to allow this property to be ALTERed; the cost-benefit<br /> > > > ratio isn't there IMO.<br/>> ><br />> > I suppose scheduling downtime to do the check manually across all<br />> > databases,then drop and recreate the tablespace, would work. As<br />> > someone who's working with a couple of thesecases, it'd be awful nice<br /> > > if there was a way PG would handle it for me.<br />><br />> I wonderif some form of NOT VALID marking could be useful here. Of<br />> course, this is not a constraint. But a mechanismof a similar spirit<br /> > seems apropos. It seems reasonable to leave such a thing for future<br />> improvement.<br/>><br /><br />+1<br /><br /></div><div class="gmail_extra">Then, to summarize Matheus must do:<br /></div><divclass="gmail_extra"> * use an option instead of change the syntax and catalog to indicate that a tablespace isused to store temp objects<br /></div><div class="gmail_extra">* throw an exception if we try ALTER the option "only_temp_relations"<br/></div><div class="gmail_extra">* add regression tests<br /></div><div class="gmail_extra">* adddocumentation<br /></div><div class="gmail_extra"><br />And, of course, register to the next open commitfest [1] to getdetailed feedback and review.<br /><br />Regards,<br /><br />[1] <a href="https://commitfest.postgresql.org/">https://commitfest.postgresql.org/</a><br/></div><div class="gmail_extra"><br />--<br/>Fabrízio de Royes Mello<br />Consultoria/Coaching PostgreSQL<br />>> Timbira: <a href="http://www.timbira.com.br">http://www.timbira.com.br</a><br/> >> Blog sobre TI: <a href="http://fabriziomello.blogspot.com">http://fabriziomello.blogspot.com</a><br/>>> Perfil Linkedin: <a href="http://br.linkedin.com/in/fabriziomello">http://br.linkedin.com/in/fabriziomello</a><br/> >> Twitter: <a href="http://twitter.com/fabriziomello">http://twitter.com/fabriziomello</a></div></div>
I was going to answer each message, but Fabrízio summarized everything so far really nicely. Thanks Fabrízio.
On Wed, Jun 18, 2014 at 5:25 PM, Fabrízio de Royes Mello <fabriziomello@gmail.com> wrote:
Then, to summarize Matheus must do:* use an option instead of change the syntax and catalog to indicate that a tablespace is used to store temp objects
Yes. I myself wasn't sure TEMPORARY syntax would be good, but I would just like to hear more about. Storing as an options seems nice to me.
* throw an exception if we try ALTER the option "only_temp_relations"
I think we should postpone the ALTER option, perhaps as another patch. Wasn't that what was decided?
I'll investigate the options better before going this route. Let me work on CREATE first.
* add regression tests* add documentation
And, of course, register to the next open commitfest [1] to get detailed feedback and review.
Noted. It will be on the next commitfest. Just knowing some people do want this make me more comfortable on working better.
Best regards,
--
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres
--
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres
<div dir="ltr"><div class="gmail_extra">On Wed, Jun 18, 2014 at 10:39 PM, Matheus de Oliveira <<a href="mailto:matioli.matheus@gmail.com">matioli.matheus@gmail.com</a>>wrote:<br />><br />> I was going to answereach message, but Fabrízio summarized everything so far really nicely. Thanks Fabrízio.<br /> ><br /><br />You'rewelcome. :-)<br /><br /> <br />><br />> On Wed, Jun 18, 2014 at 5:25 PM, Fabrízio de Royes Mello <<a href="mailto:fabriziomello@gmail.com">fabriziomello@gmail.com</a>>wrote:<br />>><br />>> Then, to summarizeMatheus must do:<br /> >> * use an option instead of change the syntax and catalog to indicate that a tablespaceis used to store temp objects<br />><br />><br />> Yes. I myself wasn't sure TEMPORARY syntax would begood, but I would just like to hear more about. Storing as an options seems nice to me.<br /> > <br /><br /></div><divclass="gmail_extra">Are there somebody that reject this idea? Thoughts?<br /></div><div class="gmail_extra"><br/><br />>><br />>> * throw an exception if we try ALTER the option "only_temp_relations"<br/> ><br />><br />> I think we should postpone the ALTER option, perhaps as another patch.Wasn't that what was decided?<br />><br /><br />Don't worry about that, we can help you. ;-)<br /><br /> <br />><br/>> I'll investigate the options better before going this route. Let me work on CREATE first.<br /> ><br /><br/>See this files:<br />- src/include/commands/tablespace.h (TableSpaceOpts struct at line 35)<br />- src/backend/access/common/reloptions.c(tablespace_reloptions at line 1333)<br /><br /> <br />>><br />>> * addregression tests<br /> >> * add documentation<br />>><br />>> And, of course, register to the next opencommitfest [1] to get detailed feedback and review.<br />><br />><br />> Noted. It will be on the next commitfest.Just knowing some people do want this make me more comfortable on working better.<br /> ><br /><br />Nice...I would like to be a reviewer.<br /><br />Att,<br /><br />--<br />Fabrízio de Royes Mello<br />Consultoria/CoachingPostgreSQL<br />>> Timbira: <a href="http://www.timbira.com.br">http://www.timbira.com.br</a><br/> >> Blog sobre TI: <a href="http://fabriziomello.blogspot.com">http://fabriziomello.blogspot.com</a><br/>>> Perfil Linkedin: <a href="http://br.linkedin.com/in/fabriziomello">http://br.linkedin.com/in/fabriziomello</a><br/> >> Twitter: <a href="http://twitter.com/fabriziomello">http://twitter.com/fabriziomello</a></div></div>
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 06/18/2014 08:00 PM, Stephen Frost wrote: > PG would need to enforce that it's only used for temporary objects > as well, of course.. Or at least, that was my thinking on this. A way to put UNLOGGED objects in such a space and have them recovered if they vanish would also be valuable, IMO. Not necessarily in the same patch, I'd just rather keep it in mind so any chosen design doesn't preclude adding that later. - -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBAgAGBQJTpmsOAAoJELBXNkqjr+S2SNUH/A1zor2WUbQg3PTyD+6wyxlM v6iMpH9zaeMWYyavIHM9N8a+h7hl9ajKDPMuK0vYLMs83W88LXI4T/t1iJlgZ3Po WhyjjdNonChn5v1NEJncb4KgqmRQxyWY43rzgEgofSJQgjFbo8UAwoD8Xa32ghDu 4dx2ijYqYsx/hC1Gb1u/HbQ9rX+OkTA0DwljdUgPqPorW/mn5RVscz9Qo5p/W+XH tdvpu9CKRi4AFPkshZOij+Mu3My863K1+k7sHUpRcsRGJT/AIO1V0wUQE2PuWTAV 5InU5vw/1C6CsYpH4v++XvpeTOnQ4QrMnllh99UTxV91UXN+jx/ENv1SYhBfYGE= =JGYF -----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 06/22/2014 01:35 PM, Craig Ringer wrote: > A way to put UNLOGGED objects in such a space and have them > recovered if they vanish would also be valuable, IMO. By which I mean "re-created empty, as if after a crash". - -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBAgAGBQJTpmuwAAoJELBXNkqjr+S284oIALPSZVCkqv/A6Oriu48n9LOm KI0cnyiv+kJ8Kor8kDIbweM6DdaUaP0msQOj3vRh8t1GGldF1hmEo83IH/t+sx4t 3B2SUPmHQX5QHa27zmzCk8qAhYKYyUy/HOI0u2DxLm5fGtxpzCjyojo4k8+YkSgM b17eDGrAzlxXJF9BiRj4SBITOxekdgml0yF6J0UPWk98N4cTaXXsMv5r/QBlyXOm /3bdZ0rx0extLcuH+x9iUo7aXC4qCgFJ1ZEzp3xSVGVxU7VA3vQl1O1OR7GHRKJb E3SqS6RiHpWqT2JxD1su/a5XblpF/d5/EfXNxSY7etdNBBarHF5Q6xCNDNard8Y= =30iz -----END PGP SIGNATURE-----
* Craig Ringer (craig@2ndquadrant.com) wrote: > On 06/18/2014 08:00 PM, Stephen Frost wrote: > > PG would need to enforce that it's only used for temporary objects > > as well, of course.. Or at least, that was my thinking on this. > > A way to put UNLOGGED objects in such a space and have them recovered > if they vanish would also be valuable, IMO. Interesting idea. > Not necessarily in the same patch, I'd just rather keep it in mind so > any chosen design doesn't preclude adding that later. Perhaps we need a more complex definition than just "temporary tablespace", as in, we should have a way for users to say "this tablespace is allowed to have objects of type X, Y, Z"? I can see a couple of ways to do that and I don't think it'd require much in the way of changes to the current patch... Thanks, Stephen
On Wed, Jun 18, 2014 at 9:39 PM, Matheus de Oliveira <matioli.matheus@gmail.com> wrote: >> Then, to summarize Matheus must do: >> * use an option instead of change the syntax and catalog to indicate that >> a tablespace is used to store temp objects > > Yes. I myself wasn't sure TEMPORARY syntax would be good, but I would just > like to hear more about. Storing as an options seems nice to me. I kind of like the TEMPORARY syntax, by analogy with what we do for tables. On the other hand, this situation isn't quite analogous: the tablespace itself is permanent; it's only the objects inside the tablespace that are temporary. Hmm. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
<div dir="ltr"><div class="gmail_extra"><br />On Mon, Jun 23, 2014 at 1:17 PM, Robert Haas <<a href="mailto:robertmhaas@gmail.com">robertmhaas@gmail.com</a>>wrote:<br />><br />> On Wed, Jun 18, 2014 at 9:39PM, Matheus de Oliveira<br /> > <<a href="mailto:matioli.matheus@gmail.com">matioli.matheus@gmail.com</a>> wrote:<br/>> >> Then, to summarize Matheus must do:<br />> >> * use an option instead of change the syntaxand catalog to indicate that<br /> > >> a tablespace is used to store temp objects<br />> ><br />>> Yes. I myself wasn't sure TEMPORARY syntax would be good, but I would just<br />> > like to hear more about.Storing as an options seems nice to me.<br /> ><br />> I kind of like the TEMPORARY syntax, by analogy with whatwe do for<br />> tables. On the other hand, this situation isn't quite analogous: the<br />> tablespace itselfis permanent; it's only the objects inside the<br /> > tablespace that are temporary. Hmm.<br />><br /><br />Thesuggested syntax is about "TEMP LOCATION" and not "TEMP TABLESPACE"...<br /><br /></div><div class="gmail_extra">Regards,<br/><br /></div><div class="gmail_extra"> --<br />Fabrízio de Royes Mello<br />Consultoria/CoachingPostgreSQL<br />>> Timbira: <a href="http://www.timbira.com.br">http://www.timbira.com.br</a><br/>>> Blog sobre TI: <a href="http://fabriziomello.blogspot.com">http://fabriziomello.blogspot.com</a><br/> >> Perfil Linkedin: <a href="http://br.linkedin.com/in/fabriziomello">http://br.linkedin.com/in/fabriziomello</a><br/>>> Twitter: <a href="http://twitter.com/fabriziomello">http://twitter.com/fabriziomello</a></div></div>
<div dir="ltr"><div class="gmail_extra"><br /><div class="gmail_quote">On Sun, Jun 22, 2014 at 2:35 AM, Craig Ringer <spandir="ltr"><<a href="mailto:craig@2ndquadrant.com" target="_blank">craig@2ndquadrant.com</a>></span> wrote:<br/><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="a3s"id=":1x2" style="overflow:hidden">A way to put UNLOGGED objects in such a <span class="il">space</span> and havethem recovered<br /> if they vanish would also be valuable, IMO.<br /><br /> Not necessarily in the same patch, I'd justrather keep it in mind so<br /> any chosen design doesn't preclude adding that later.</div></blockquote></div><br /></div><divclass="gmail_extra">The idea is nice, but I think you should think more about it. Were would we put the "init"files in this case? It surely can't be in the tablespace.<br /><br /></div><div class="gmail_extra">Best regards,<br/></div><div class="gmail_extra">-- <br />Matheus de Oliveira<br />Analista de Banco de Dados<br />Dextra Sistemas- MPS.Br nível F!<br /><a href="http://www.dextra.com.br/postgres/" target="_blank">www.dextra.com.br/postgres</a><br/><br /></div></div>
Hi Hackers,
I have worked on that patch a little more. So now I have functional patch (although still WIP) attached. The feature works as following:
- Added a boolean parameter "only_temp_files" to pg_tablespace.spcoptions;I have worked on that patch a little more. So now I have functional patch (although still WIP) attached. The feature works as following:
I still haven't change documentation, as I think I need some insights about the changes. I have some more thoughts about the syntax and I still think that "TEMP LOCATION" syntax is better suited for this patch. First because of the nature of the changes I made, it seems more suitable to a column on pg_tablespace rather than an option. Second because no ALTER is available (so far) and I think it is odd to have an option that can't be changed. Third, I think "TEMP" keyword is more clear and users can be more used to it.
Thoughts?
Regards,
--
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres
Attachment
On 06/29/2014 02:19 AM, Matheus de Oliveira wrote: > Hi Hackers, > > I have worked on that patch a little more. So now I have functional > patch (although still WIP) attached. The feature works as following: > > - Added a boolean parameter "only_temp_files" to pg_tablespace.spcoptions; > - This parameter can be set to true only during CREATE TABLESPACE, not > on ALTER TABLESPACE (I have thought of ways of implementing the latter, > and I'd like to discuss it more latter); > - On the creation of relations, it is checked if it is a > temporary-tablespace, and an error occurs when it is and the relation is > not temporary (temp table or index on a temp table); > - When a temporary file (either relation file or sort/agg file) is > created inside a temporary-tablespace, the entire directories structure > is created on-demand (e.g. if > pg_tblspc/<oid>/<TABLESPACE_VERSION_DIRECTORY> is missing, it is created > on demand) it is done on OpenTemporaryFileInTablespace, at fd.c (I > wonder if shouldn't we do that for any tablespace) and on > TablespaceCreateDbspace, at tablespace.c. Right now PostgreSQL appears to rely on the absence of the tablespace directory as a flag to tell it "don't start up, something's badly wrong here". If the user creates the tablespace directory, it figures they at least know what they're doing and carries on starting up with the empty tablespace. It's treated as an admin statement - "I know it's gone, it's not coming back, just carry on as best you can". If Pg were to auto-create the directory, then if (say) a mount of a tablespace dir failed at boot, Pg would still happily start up and might create files in the tablespace, despite there being inaccessible tables/indexes/etc on the not-mounted volume that's *supposed* to be the tablespace storage. That'd create plenty of mess. So no, I don't think Pg should auto-create the tablespace directory if it's missing for non-temporary tablespaces. Not unless the current (less than friendly) behaviour around lost tablespaces is replaced with something like an `ignore_missing_tablespaces` or `drop_missing_tablespaces` GUC - akin to our existing `zero_missing_pages` recovery GUC. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Jun 30, 2014 at 5:14 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
Right now PostgreSQL appears to rely on the absence of the tablespace
directory as a flag to tell it "don't start up, something's badly wrong
here".
Well, in fact the behavior is just to give a LOG message:
LOG: could not open tablespace directory "pg_tblspc/100607/PG_9.3_201306121": No such file or directory
LOG: could not open tablespace directory "pg_tblspc/100607/PG_9.3_201306121": No such file or directory
But it starts fine. No?
I ask because I'm relying on that. My patch does not on startup, so in the absence of the tablespace directory, the above LOG is shown even for "temporary-tablespace", and it tries to create the directory when any process tries to use it. As we don't have catalog access on startup, it will require me to use some kind of _init fork for tablespace if we want it to be re-created during startup. Should we bother about this?
If the user creates the tablespace directory, it figures they at least
know what they're doing and carries on starting up with the empty
tablespace. It's treated as an admin statement - "I know it's gone, it's
not coming back, just carry on as best you can".
Well, I think for a tablespace like that it is okay to create it on-demand (note that I have done this only for "temp", not ordinary ones).
If Pg were to auto-create the directory, then if (say) a mount of a
tablespace dir failed at boot, Pg would still happily start up and might
create files in the tablespace, despite there being inaccessible
tables/indexes/etc on the not-mounted volume that's *supposed* to be the
tablespace storage. That'd create plenty of mess.
So no, I don't think Pg should auto-create the tablespace directory if
it's missing for non-temporary tablespaces.
Ok. I agree. Let's create only for temporary ones (as done by the patch now).
Not unless the current (less
than friendly) behaviour around lost tablespaces is replaced with
something like an `ignore_missing_tablespaces` or
`drop_missing_tablespaces` GUC - akin to our existing
`zero_missing_pages` recovery GUC.
You meant `zero_damaged_pages`, I think. I don't think that is really necessary in this case.
Anyway, I was thinking about some way to this tablespace to also allow creation of non-temporary indexes, and after a crash we could mark such indexes as `NOT VALID` (perhaps `zero_damaged_pages` could be of help here?!). That should be part of another patch thought, and would require more thoughts. But I think that would be a great improvement for some environments, like those on AWS with their ephemeral SSDs.
Regards,
--
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres
On Sun, Jun 29, 2014 at 3:19 AM, Matheus de Oliveira <matioli.matheus@gmail.com> wrote: > Hi Hackers, > > I have worked on that patch a little more. So now I have functional patch > (although still WIP) attached. The feature works as following: > > - Added a boolean parameter "only_temp_files" to pg_tablespace.spcoptions; > - This parameter can be set to true only during CREATE TABLESPACE, not on > ALTER TABLESPACE (I have thought of ways of implementing the latter, and I'd > like to discuss it more latter); > - On the creation of relations, it is checked if it is a > temporary-tablespace, and an error occurs when it is and the relation is not > temporary (temp table or index on a temp table); > - When a temporary file (either relation file or sort/agg file) is created > inside a temporary-tablespace, the entire directories structure is created > on-demand (e.g. if pg_tblspc/<oid>/<TABLESPACE_VERSION_DIRECTORY> is > missing, it is created on demand) it is done on > OpenTemporaryFileInTablespace, at fd.c (I wonder if shouldn't we do that for > any tablespace) and on TablespaceCreateDbspace, at tablespace.c. > > I still haven't change documentation, as I think I need some insights about > the changes. I have some more thoughts about the syntax and I still think > that "TEMP LOCATION" syntax is better suited for this patch. First because > of the nature of the changes I made, it seems more suitable to a column on > pg_tablespace rather than an option. Second because no ALTER is available > (so far) and I think it is odd to have an option that can't be changed. > Third, I think "TEMP" keyword is more clear and users can be more used to > it. > > Thoughts? > > I'm going to add the CF app entry next. Could I get some review now or after > discussion about how things are going (remember I'm a newbie on this, so I'm > a little lost)? I got the following warning messages at the compilation. dbcommands.c:420: warning: implicit declaration of function 'is_tablespace_temp_only' indexcmds.c:437: warning: implicit declaration of function 'is_tablespace_temp_only' tablecmds.c:527: warning: implicit declaration of function 'is_tablespace_temp_only' When I created temporary tablespace and executed pg_dumpall, it generated the SQL "ALTER TABLESPACE hoge SET (only_temp_files=on);". This is clearly a bug because that SQL is not allowed to be executed. > ERROR: this tablespace only allows temporary files I think that this ERROR message can be improved, for example, for the database creation case, what about something like the following? ERROR: database cannot be created on temporary tablespace HINT: temporary tablespace is allowed to store only temporaryfiles. Regards, -- Fujii Masao
Hi, I also tried this. This looks nice but seems a bit difficult to find a rasonable behavior. > I have worked on that patch a little more. So now I have functional patch > (although still WIP) attached. The feature works as following: > > - Added a boolean parameter "only_temp_files" to pg_tablespace.spcoptions; > - This parameter can be set to true only during CREATE TABLESPACE, not on > ALTER TABLESPACE (I have thought of ways of implementing the latter, and > I'd like to discuss it more latter); > - On the creation of relations, it is checked if it is a > temporary-tablespace, and an error occurs when it is and the relation is > not temporary (temp table or index on a temp table); > - When a temporary file (either relation file or sort/agg file) is created > inside a temporary-tablespace, the entire directories structure is created > on-demand (e.g. if pg_tblspc/<oid>/<TABLESPACE_VERSION_DIRECTORY> is > missing, it is created on demand) it is done on > OpenTemporaryFileInTablespace, at fd.c (I wonder if shouldn't we do that > for any tablespace) and on TablespaceCreateDbspace, at tablespace.c. > > I still haven't change documentation, as I think I need some insights about > the changes. I have some more thoughts about the syntax and I still think > that "TEMP LOCATION" syntax is better suited for this patch. First because > of the nature of the changes I made, it seems more suitable to a column on > pg_tablespace rather than an option. Second because no ALTER is available > (so far) and I think it is odd to have an option that can't be changed. > Third, I think "TEMP" keyword is more clear and users can be more used to > it. > > Thoughts? > > I'm going to add the CF app entry next. Could I get some review now or > after discussion about how things are going (remember I'm a newbie on this, > so I'm a little lost)? Here is some random comments. 1. I think some users may want to store the temp tablespace in specially created subdirectory, like this. | =# CREATE TABLESPACE hoge LOCATION '/mount_point_of_nonpersist_device/temptblspc1' | WITH (only_temp_files = true); I saw the following message for create table after restarting after "rm -r /mount...ice/*". | =# create temp table thoge (a int) tablespace hoge; | ERROR: could not create directory "pg_tblspc/16435/PG_9.5_201408162": No such file or directory Multiple-depth mkdir would be needed. 2. Creating a temporary table in a tablespace with (only_temp_files = false) after erasing the directory then restarting the server failed showing me the following message only for the first time, | =# create temp table thoge (a int) tablespace hoge; | ERROR: could not create directory "pg_tblspc/16435/PG_9.5_201408162/13004": Success Unpatched head seems always showing 'No such file or directory' from the first time for the case. 3. I saw the following error message during startup after deleting the tablespace directory for the only-temp tablespace. | $ postgres | LOG: database system was shut down at 2014-09-02 16:54:39 JST *| LOG: could not open tablespace directory "pg_tblspc/16435/PG_9.5_201408162": No such file or directory| LOG: autovacuumlauncher started| LOG: database system is ready to accept connections I think the server should refrain from showing this message for laking of the directories for only-temp teblespaces. 4. You inhibited the option only_temp_files from ALTER'ing from false to true but pg_tablesspace.spcoptions unfortunately can be changed directly. Other reloptions for all objects seems not so harmful. | =# update pg_tablespace set spcoptions = '{only_temp_files=true}' where spcname = 'hoge'; Are we allowed to store such a kind of option as reoptions? Or a result from such a bogus operation should be ignored? Or do we ought to protect spcoptions from direct modification? Or ... Any Thoughts? regards, -- Kyotaro Horiguchi NTT Open Source Software Center