Thread: Reviewing temp_tablespaces GUC patch

Reviewing temp_tablespaces GUC patch

From
Bernd Helmle
Date:
While looking at Jaime's last temp_tablespaces GUC patch, i've got some
concerns about it's current implementation:

1)

The code claims that OIDs of temp tablespaces couldn't be cached, therefore 
it is parsing and
re-reading the GUCs in GetTempTablespace() with SplitIdentifierNames() over 
and over again. Because GetTempTablespace() is likely to be called many 
times in queries with a good amount of search operations, i believe this 
could be done better by allocating a list of OIDs in permanent storage 
(TopMemoryContext) and use this OID list to re-check them in 
GetTempTablespace() (i have modified the patch and it seems to work). This 
would save us to split the GUC every time.

2)

It's possible that someone could drop a temporary tablespace between 
subsequent usage of GetTempTablespace() when they are empty. This leads to 
strange NOTICEs like

NOTICE:  could not create temporary file 
"pg_tblspc/16387/pgsql_tmp/pgsql_tmp19942.0"

during query execution. However, the code is save enough and switches back 
to base/pgsql_tmp then, but this looks a little bit ugly to me. The silent 
mechanism to drop a tablespace during temporary usage makes me a little bit 
uncomfortable about its robustness.

Comments?

--  Thanks
                   Bernd


Re: Reviewing temp_tablespaces GUC patch

From
Alvaro Herrera
Date:
Bernd Helmle escribió:

> It's possible that someone could drop a temporary tablespace between 
> subsequent usage of GetTempTablespace() when they are empty. This leads to 
> strange NOTICEs like
> 
> NOTICE:  could not create temporary file 
> "pg_tblspc/16387/pgsql_tmp/pgsql_tmp19942.0"
> 
> during query execution. However, the code is save enough and switches back 
> to base/pgsql_tmp then, but this looks a little bit ugly to me. The silent 
> mechanism to drop a tablespace during temporary usage makes me a little bit 
> uncomfortable about its robustness.

What happens if you create a cursor that stores something (sort
intermediate results?) in a temp tablespace, FETCH some from it, then
someone else drops the tablespace and FETCH some more?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Reviewing temp_tablespaces GUC patch

From
"Jaime Casanova"
Date:
On 5/24/07, Bernd Helmle <mailings@oopsware.de> wrote:
> While looking at Jaime's last temp_tablespaces GUC patch, i've got some
> concerns about it's current implementation:
>

the original patch is from Albert Cervera =)

> 1)
>
> The code claims that OIDs of temp tablespaces couldn't be cached, therefore
> it is parsing and
> re-reading the GUCs in GetTempTablespace() with SplitIdentifierNames() over
> and over again. Because GetTempTablespace() is likely to be called many
> times in queries with a good amount of search operations, i believe this
> could be done better by allocating a list of OIDs in permanent storage
> (TopMemoryContext) and use this OID list to re-check them in
> GetTempTablespace() (i have modified the patch and it seems to work). This
> would save us to split the GUC every time.
>

sounds good. can we see the new patch?

> 2)
>
> It's possible that someone could drop a temporary tablespace between
> subsequent usage of GetTempTablespace() when they are empty. This leads to
> strange NOTICEs like
>
> NOTICE:  could not create temporary file
> "pg_tblspc/16387/pgsql_tmp/pgsql_tmp19942.0"
>
> during query execution. However, the code is save enough and switches back
> to base/pgsql_tmp then, but this looks a little bit ugly to me.

the reason for those messages is that the tablespace can get full or
can be dropped before use, so we throw the message for the dba to take
actions. if no one thinks is a good idea the message can be removed.

> The silent
> mechanism to drop a tablespace during temporary usage makes me a little bit
> uncomfortable about its robustness.
>

maybe using the list you put in TopMemoryContext we can deny the
ability to drop the tablespace until it's removed from the list of
temp tablespaces.

-- 
regards,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."                                      Richard Cook


Re: Reviewing temp_tablespaces GUC patch

From
"Jaime Casanova"
Date:
On 5/24/07, Alvaro Herrera <alvherre@commandprompt.com> wrote:
> Bernd Helmle escribió:
>
> > It's possible that someone could drop a temporary tablespace between
> > subsequent usage of GetTempTablespace() when they are empty. This leads to
> > strange NOTICEs like
> >
> > NOTICE:  could not create temporary file
> > "pg_tblspc/16387/pgsql_tmp/pgsql_tmp19942.0"
> >
> > during query execution. However, the code is save enough and switches back
> > to base/pgsql_tmp then, but this looks a little bit ugly to me. The silent
> > mechanism to drop a tablespace during temporary usage makes me a little bit
> > uncomfortable about its robustness.
>
> What happens if you create a cursor that stores something (sort
> intermediate results?) in a temp tablespace, FETCH some from it, then
> someone else drops the tablespace and FETCH some more?
>

you can't drop a tablespace that is not empty.

--
regards,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."                                      Richard Cook


Re: Reviewing temp_tablespaces GUC patch

From
Tom Lane
Date:
"Jaime Casanova" <systemguards@gmail.com> writes:
> On 5/24/07, Alvaro Herrera <alvherre@commandprompt.com> wrote:
>> What happens if you create a cursor that stores something (sort
>> intermediate results?) in a temp tablespace, FETCH some from it, then
>> someone else drops the tablespace and FETCH some more?

> you can't drop a tablespace that is not empty.

So a temp file left over by a crashed backend would indefinitely prevent
dropping the tablespace, until someone manually cleaned it up?
        regards, tom lane


Re: Reviewing temp_tablespaces GUC patch

From
"Jaime Casanova"
Date:
On 5/25/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Jaime Casanova" <systemguards@gmail.com> writes:
> > On 5/24/07, Alvaro Herrera <alvherre@commandprompt.com> wrote:
> >> What happens if you create a cursor that stores something (sort
> >> intermediate results?) in a temp tablespace, FETCH some from it, then
> >> someone else drops the tablespace and FETCH some more?
>
> > you can't drop a tablespace that is not empty.
>
> So a temp file left over by a crashed backend would indefinitely prevent
> dropping the tablespace, until someone manually cleaned it up?
>

No, because the RemovePgTempFiles() call in PostmasterMain() will
remove all tmp files at startup.

-- 
regards,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."                                      Richard Cook


Re: Reviewing temp_tablespaces GUC patch

From
Bernd Helmle
Date:
--On Freitag, Mai 25, 2007 10:49:29 +0000 Jaime Casanova 
<systemguards@gmail.com> wrote:

> No, because the RemovePgTempFiles() call in PostmasterMain() will
> remove all tmp files at startup.

Hmm isn't RemovePgTempFiles() called on postmaster startup only? What will 
happen if a temp tablespace is out of disk space, and the backend leaves 
all previously created temp files there? Under these assumption we'll need 
to restart the postmaster to get a clean tablespace ready to drop...

--  Thanks
                   Bernd


Re: Reviewing temp_tablespaces GUC patch

From
Bernd Helmle
Date:
--On Freitag, Mai 25, 2007 00:02:06 +0000 Jaime Casanova
<systemguards@gmail.com> wrote:


> the original patch is from Albert Cervera =)

Ah, missed that, thanks ;)


>>
>
> sounds good. can we see the new patch?

Attached tablespace.c.diff shows my current changes to use an OID  lookup
list.

>
> the reason for those messages is that the tablespace can get full or
> can be dropped before use, so we throw the message for the dba to take
> actions. if no one thinks is a good idea the message can be removed.
>

I could imagine that this could irritate DBA's (at least, that is what
happened to me during testing). It's okay that someone could drop a
tablespace concurrently to other transactions, but i have concerns that
with temp_tablespaces this could happen during _queries_. Do queries
delete/recreate temp files during execution, maybe within sorts so that the
used temp tablespace looks empty for a certain period of time?

>> The silent
>> mechanism to drop a tablespace during temporary usage makes me a little
>> bit uncomfortable about its robustness.
>>
>
> maybe using the list you put in TopMemoryContext we can deny the
> ability to drop the tablespace until it's removed from the list of
> temp tablespaces.

That would mean we have to share this information between backends. This
looks complicated since every user could have its own temp_tablespaces
GUC....


--
  Thanks

                    Bernd
Attachment

Re: Reviewing temp_tablespaces GUC patch

From
Tom Lane
Date:
Bernd Helmle <mailings@oopsware.de> writes:
> --On Freitag, Mai 25, 2007 10:49:29 +0000 Jaime Casanova 
> <systemguards@gmail.com> wrote:
>> No, because the RemovePgTempFiles() call in PostmasterMain() will
>> remove all tmp files at startup.

> Hmm isn't RemovePgTempFiles() called on postmaster startup only? What will 
> happen if a temp tablespace is out of disk space, and the backend leaves 
> all previously created temp files there? Under these assumption we'll need 
> to restart the postmaster to get a clean tablespace ready to drop...

Theoretically, a backend will always remove its temp files during
transaction abort, so the only case that is really of concern is a
backend crashing before it can get around to doing that.  However, I
believe we do not call RemovePgTempFiles during a crash recovery cycle;
this is intentional on the theory that the temp files might contain
useful debugging clues.  So there is a potential problem there.
Not sure how important it really is though --- neither crashes nor
tablespace drops ought to be so common that we need a really nice
solution.
        regards, tom lane


Re: Reviewing temp_tablespaces GUC patch

From
"Jaime Casanova"
Date:
On 5/25/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Bernd Helmle <mailings@oopsware.de> writes:
> > --On Freitag, Mai 25, 2007 10:49:29 +0000 Jaime Casanova
> > <systemguards@gmail.com> wrote:
> >> No, because the RemovePgTempFiles() call in PostmasterMain() will
> >> remove all tmp files at startup.
>
> I believe we do not call RemovePgTempFiles during a crash recovery
> cycle; this is intentional on the theory that the temp files might contain
> useful debugging clues.

ah, i forgot that

>  So there is a potential problem there.
> Not sure how important it really is though --- neither crashes nor
> tablespace drops ought to be so common that we need a really nice
> solution.
>

the only semi-sane solution i can think of, is to have a superuser
only function that acts as a wrapper for RemovePgTempFiles(), but
still exists a chance for shoot yourself on the foot...

-- 
regards,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."                                      Richard Cook


Re: Reviewing temp_tablespaces GUC patch

From
"Jaime Casanova"
Date:
On 5/25/07, Bernd Helmle <mailings@oopsware.de> wrote:
> --On Freitag, Mai 25, 2007 00:02:06 +0000 Jaime Casanova
> <systemguards@gmail.com> wrote:
> >>
> >
> > sounds good. can we see the new patch?
>
> Attached tablespace.c.diff shows my current changes to use an OID  lookup
> list.
>
> >

+       if (source >= PGC_S_INTERACTIVE && IsTransactionState())
+       {
+           /*
+            * Verify that all the names are valid tablespace names
+            * We do not check for USAGE rights should we?
+            */
+           Oid cur_tblspc = get_tablespace_oid(curname);
+           if (cur_tblspc == InvalidOid)
+           {
+               ereport((source == PGC_S_TEST) ? NOTICE : ERROR,
+                       (errcode(ERRCODE_UNDEFINED_OBJECT),
+                        errmsg("tablespace \"%s\" does not exist", curname)));
+           }
+           else
+           {
+               /*
+                * Append new OID to temporary list. We can't
+                * use the lookup table directly, because there could
+                * be an ereport() in subsequent loops.
+                */
+               oidlist = lappend_oid(oidlist, cur_tblspc);
+           }
+       }


the list of oid's is only filled when you execute
SET temp_tablespaces = 'somelist'

but if you use the GUC in postgresql.conf at startup then not, so the
temp_tablespaces are not used even if they are setted

can you do that outside
+       if (source >= PGC_S_INTERACTIVE && IsTransactionState())

-- 
regards,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."                                      Richard Cook


Re: Reviewing temp_tablespaces GUC patch

From
"Jaime Casanova"
Date:
On 5/25/07, Bernd Helmle <mailings@oopsware.de> wrote:
> --On Freitag, Mai 25, 2007 00:02:06 +0000 Jaime Casanova
> <systemguards@gmail.com> wrote:
> > sounds good. can we see the new patch?
>
> Attached tablespace.c.diff shows my current changes to use an OID  lookup
> list.
>

on second thought, what happens if someone drops an empty tablespace,
that already is in the temp_tablespace GUC, and recreate it (one
scenario for this is if you want to move the tablespace to a newer
better/faster location). then will have an invalid oid until at least
you execute a new SET temp_tablespaces. And we know some "DBA's"
doesn't read the manual, so maybe this behaviour will be unexpected
for them...

> >
> > the reason for those messages is that the tablespace can get full or
> > can be dropped before use, so we throw the message for the dba to take
> > actions. if no one thinks is a good idea the message can be removed.
> >
>
> I could imagine that this could irritate DBA's (at least, that is what
> happened to me during testing).

Well at least with this message they will be alerted, but still seems
silly to me... (make a SET with the same list just for updating cached
OID's)

-- 
regards,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."                                      Richard Cook


Re: Reviewing temp_tablespaces GUC patch

From
Robert Treat
Date:
On Friday 25 May 2007 12:39, Jaime Casanova wrote:
> On 5/25/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Bernd Helmle <mailings@oopsware.de> writes:
> > > --On Freitag, Mai 25, 2007 10:49:29 +0000 Jaime Casanova
> > >
> > > <systemguards@gmail.com> wrote:
> > >> No, because the RemovePgTempFiles() call in PostmasterMain() will
> > >> remove all tmp files at startup.
> >
> > I believe we do not call RemovePgTempFiles during a crash recovery
> > cycle; this is intentional on the theory that the temp files might
> > contain useful debugging clues.
>
> ah, i forgot that
>
> >  So there is a potential problem there.
> > Not sure how important it really is though --- neither crashes nor
> > tablespace drops ought to be so common that we need a really nice
> > solution.
>
> the only semi-sane solution i can think of, is to have a superuser
> only function that acts as a wrapper for RemovePgTempFiles(), but
> still exists a chance for shoot yourself on the foot...

If there was a way for DBA's to know they could safely delete the left-over 
files (maybe the files timestamp is older than postmaster start; though not 
sure how you measure that), then I think this would be enough to give them a 
way out.  Of course maybe that level of smarts could be put into drop 
tablespace itself?  

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL


Re: Reviewing temp_tablespaces GUC patch

From
Jim Nasby
Date:
On May 27, 2007, at 1:19 PM, Robert Treat wrote:
> On Friday 25 May 2007 12:39, Jaime Casanova wrote:
>> On 5/25/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Bernd Helmle <mailings@oopsware.de> writes:
>>>>> No, because the RemovePgTempFiles() call in PostmasterMain() will
>>>>> remove all tmp files at startup.
>>> I believe we do not call RemovePgTempFiles during a crash recovery
>>> cycle; this is intentional on the theory that the temp files might
>>> contain useful debugging clues.
>>>  So there is a potential problem there.
>>> Not sure how important it really is though --- neither crashes nor
>>> tablespace drops ought to be so common that we need a really nice
>>> solution.

While tablespace operations won't normally be a day-to-day thing,  
it's not good if users can't delete a tablespace because it's not  
empty, but can't find out what exactly is in the tablespace.

>> the only semi-sane solution i can think of, is to have a superuser
>> only function that acts as a wrapper for RemovePgTempFiles(), but
>> still exists a chance for shoot yourself on the foot...
> If there was a way for DBA's to know they could safely delete the  
> left-over
> files (maybe the files timestamp is older than postmaster start;  
> though not
> sure how you measure that), then I think this would be enough to  
> give them a
> way out.  Of course maybe that level of smarts could be put into drop
> tablespace itself?

If we're saving data on crashes with the intention that it might be  
useful then we need to be able to get a list of what those files are,  
and possibly be able to delete them easily.
--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)




Re: Reviewing temp_tablespaces GUC patch

From
"Jaime Casanova"
Date:
On 5/27/07, Robert Treat <xzilla@users.sourceforge.net> wrote:
> On Friday 25 May 2007 12:39, Jaime Casanova wrote:
> > On 5/25/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > Bernd Helmle <mailings@oopsware.de> writes:
> > > > --On Freitag, Mai 25, 2007 10:49:29 +0000 Jaime Casanova
> > > >
> > > > <systemguards@gmail.com> wrote:
> > > >> No, because the RemovePgTempFiles() call in PostmasterMain() will
> > > >> remove all tmp files at startup.
> > >
> > > I believe we do not call RemovePgTempFiles during a crash recovery
> > > cycle; this is intentional on the theory that the temp files might
> > > contain useful debugging clues.
> >
> > ah, i forgot that
> >
> > >  So there is a potential problem there.
> > > Not sure how important it really is though --- neither crashes nor
> > > tablespace drops ought to be so common that we need a really nice
> > > solution.
> >
> > the only semi-sane solution i can think of, is to have a superuser
> > only function that acts as a wrapper for RemovePgTempFiles(), but
> > still exists a chance for shoot yourself on the foot...
>
> If there was a way for DBA's to know they could safely delete the left-over
> files (maybe the files timestamp is older than postmaster start; though not
> sure how you measure that), then I think this would be enough to give them a
> way out.  Of course maybe that level of smarts could be put into drop
> tablespace itself?
>

i don't think silently delete the files is a good idea, specially if
the files are left there intencionally...

but what, exactly, we want to do? delete the files or maybe sending an
HINT just after the error so we can inform the DBA about the temp
files and let him decide.

comments?

BTW, postmaster startup is in PgStartTime, right?

-- 
regards,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."                                      Richard Cook


Re: Reviewing temp_tablespaces GUC patch

From
Robert Treat
Date:
On Tuesday 29 May 2007 20:06, Jaime Casanova wrote:
> On 5/27/07, Robert Treat <xzilla@users.sourceforge.net> wrote:
> > On Friday 25 May 2007 12:39, Jaime Casanova wrote:
> > > On 5/25/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > > Bernd Helmle <mailings@oopsware.de> writes:
> > > > > --On Freitag, Mai 25, 2007 10:49:29 +0000 Jaime Casanova
> > > > >
> > > > > <systemguards@gmail.com> wrote:
> > > > >> No, because the RemovePgTempFiles() call in PostmasterMain() will
> > > > >> remove all tmp files at startup.
> > > >
> > > > I believe we do not call RemovePgTempFiles during a crash recovery
> > > > cycle; this is intentional on the theory that the temp files might
> > > > contain useful debugging clues.
> > >
> > > ah, i forgot that
> > >
> > > >  So there is a potential problem there.
> > > > Not sure how important it really is though --- neither crashes nor
> > > > tablespace drops ought to be so common that we need a really nice
> > > > solution.
> > >
> > > the only semi-sane solution i can think of, is to have a superuser
> > > only function that acts as a wrapper for RemovePgTempFiles(), but
> > > still exists a chance for shoot yourself on the foot...
> >
> > If there was a way for DBA's to know they could safely delete the
> > left-over files (maybe the files timestamp is older than postmaster
> > start; though not sure how you measure that), then I think this would be
> > enough to give them a way out.  Of course maybe that level of smarts
> > could be put into drop tablespace itself?
>
> i don't think silently delete the files is a good idea, specially if
> the files are left there intencionally...
>
> but what, exactly, we want to do? delete the files or maybe sending an
> HINT just after the error so we can inform the DBA about the temp
> files and let him decide.
>

I guess the thing to do is error that you cannot drop a non-empty tablespace, 
with a hint that there are files older than pg start time in the tablespace 
directory. 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL