Thread: Re: [COMMITTERS] pgsql: Create a GUC parameter temp_tablespaces that allows selection of
Re: [COMMITTERS] pgsql: Create a GUC parameter temp_tablespaces that allows selection of
From
"Simon Riggs"
Date:
On Mon, 2007-06-04 at 14:41 -0400, Tom Lane wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
> > One of the main reasons for the implementation was to allow larger
> > queries to work faster by utilising multiple temp tablespaces for the
> > same query.
>
> > The original ideal implementation was to use round-robin/cyclic
> > selection, which allows much better usage in the above case.
>
> Really? What if multiple backends are all hitting the same tablespaces
> in the same order? A random selection seems much less likely to risk
> having any self-synchronizing behavior.

I'd like a single backend to never reuse a temp tablespace that is
actively being used so that large queries won't randomly conflict with
themselves. That's pretty certain to draw complaints, IMHO.

We can do this two ways
- cycle thru temp tablespaces, as originally suggested (not by me...)
- pick a random tablespace **other than ones already in active use**

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com
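The second option in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name pick_unused_tablespace and the in_use flag array are hypothetical and do not come from the patch, and rand() stands in for whatever random source a backend would actually use. The sketch returns -1 when every tablespace is already busy, a case the option as stated does not cover.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: pick a random index among the tablespaces whose
 * in_use flag is false. Returns -1 when every tablespace is already in
 * active use by this backend.
 */
int
pick_unused_tablespace(const bool *in_use, int count)
{
    int free_count = 0;
    int target;
    int i;

    assert(count > 0);

    /* Count the candidates that are not currently in use. */
    for (i = 0; i < count; i++)
        if (!in_use[i])
            free_count++;

    if (free_count == 0)
        return -1;

    /* Choose the target-th free entry, counting from zero. */
    target = rand() % free_count;
    for (i = 0; i < count; i++)
    {
        if (in_use[i])
            continue;
        if (target == 0)
            return i;
        target--;
    }
    return -1;                  /* not reached */
}
```

Note that the caller has to maintain the in_use flags itself, which is bookkeeping on top of the selection step.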
Re: [COMMITTERS] pgsql: Create a GUC parameter temp_tablespaces that allows selection of
From
Tom Lane
Date:
"Simon Riggs" <simon@2ndquadrant.com> writes: > On Mon, 2007-06-04 at 14:41 -0400, Tom Lane wrote: >> "Simon Riggs" <simon@2ndquadrant.com> writes: >>> The original ideal implementation was to use round-robin/cyclic >>> selection, which allows much better usage in the above case. >> >> Really? What if multiple backends are all hitting the same tablespaces >> in the same order? A random selection seems much less likely to risk >> having any self-synchronizing behavior. > I'd like a single backend to never reuse a temp tablespace that is > actively being used so that large queries won't randomly conflict with > themselves. That's pretty certain to draw complaints, IMHO. > We can do this two ways > - cycle thru temp tablespaces, as originally suggested (not by me...) > - pick a random tablespace **other than ones already in active use** Idea 2 fails as soon as you have more temp files than tablespaces, and also requires tracking which tablespaces are currently in use, a bit of complexity we do not have in there. Perhaps a reasonable compromise could work like this: at the first point in a transaction where a temp file is created, choose a random list element, and thereafter advance cyclically for the duration of that transaction. This ensures within-transaction spread-out while still having some randomness across backends. The reason I'm thinking per-transaction is that we could tie this to setting up a cached list of tablespace OIDs, which would avoid the overhead of repeat parsing and tablespace validity checking. We had rejected using a long-lived cache because of the problem of tablespaces getting dropped, but I think one that lasts only across a transaction would be OK. And the reason I'm thinking a cache is important is that if you really want to get any win from this idea, you need to spread the temp files across tablespaces *per file*, which is not the way it works now. As committed, the code selects one temp tablespace per sort or hashjoin. 
The submitted patch already did it that way for sorts, and I forced the same for hashjoins, because I wanted to be sure to minimize the number of executions of aforesaid parsing/checking. So really that patch is entirely wrong, and selection of the tablespace for a temp file needs to be pushed much further down. Assuming, that is, that you think this point is important enough to drive the whole design; which I find rather questionable in view of the fact that the submitted patch contained no mention whatever of any such consideration. Or is this just another way in which its documentation was not up to snuff? regards, tom lane
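A minimal sketch of the compromise proposed in this message, written as standalone C rather than backend code. Everything named here is an assumption made for illustration: the TempTablespaceCache struct, the two helper functions, and the Oid typedef are hypothetical stand-ins, and rand() replaces whatever random source the backend would really use. The list of OIDs is handed over once per transaction, the starting element is chosen at random on first use, and each later temp file takes the next element cyclically.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned int Oid;       /* stand-in for PostgreSQL's Oid type */

/* Hypothetical per-transaction cache of temp tablespace OIDs. */
typedef struct TempTablespaceCache
{
    const Oid  *oids;           /* parsed and validated once per transaction */
    int         count;          /* number of entries in oids */
    int         next;           /* next index to hand out, -1 if not chosen yet */
} TempTablespaceCache;

/* Install the already-validated OID list at the start of a transaction. */
void
cache_begin_transaction(TempTablespaceCache *cache, const Oid *oids, int count)
{
    cache->oids = oids;
    cache->count = count;
    cache->next = -1;           /* defer the random choice until first use */
}

/* Return the tablespace for the next temp file created in this transaction. */
Oid
cache_next_tablespace(TempTablespaceCache *cache)
{
    Oid         result;

    assert(cache->count > 0);

    /* First temp file in the transaction: pick a random starting element. */
    if (cache->next < 0)
        cache->next = rand() % cache->count;

    result = cache->oids[cache->next];

    /* Advance cyclically so consecutive files land on different entries. */
    cache->next = (cache->next + 1) % cache->count;
    return result;
}
```

Calling cache_next_tablespace() once per temp file, rather than once per sort or hashjoin, is what would give the per-file spread described above; the parsing and validity checking would happen only when the list is built.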
Re: [COMMITTERS] pgsql: Create a GUC parameter temp_tablespaces that allows selection of
From
"Simon Riggs"
Date:
On Mon, 2007-06-04 at 15:34 -0400, Tom Lane wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
> > On Mon, 2007-06-04 at 14:41 -0400, Tom Lane wrote:
> >> "Simon Riggs" <simon@2ndquadrant.com> writes:
> >>> The original ideal implementation was to use round-robin/cyclic
> >>> selection, which allows much better usage in the above case.
> >>
> >> Really? What if multiple backends are all hitting the same tablespaces
> >> in the same order? A random selection seems much less likely to risk
> >> having any self-synchronizing behavior.
>
> > I'd like a single backend to never reuse a temp tablespace that is
> > actively being used so that large queries won't randomly conflict with
> > themselves. That's pretty certain to draw complaints, IMHO.
>
> > We can do this two ways
> > - cycle thru temp tablespaces, as originally suggested (not by me...)
> > - pick a random tablespace **other than ones already in active use**
>
> Idea 2 fails as soon as you have more temp files than tablespaces, and
> also requires tracking which tablespaces are currently in use, a bit of
> complexity we do not have in there.
>
> Perhaps a reasonable compromise could work like this: at the first point
> in a transaction where a temp file is created, choose a random list
> element, and thereafter advance cyclically for the duration of that
> transaction. This ensures within-transaction spread-out while still
> having some randomness across backends.

Works for me.

> The reason I'm thinking per-transaction is that we could tie this to
> setting up a cached list of tablespace OIDs, which would avoid the
> overhead of repeat parsing and tablespace validity checking. We had
> rejected using a long-lived cache because of the problem of tablespaces
> getting dropped, but I think one that lasts only across a transaction
> would be OK.

No problem with that.

> And the reason I'm thinking a cache is important is that if you really
> want to get any win from this idea, you need to spread the temp files
> across tablespaces *per file*, which is not the way it works now.
> As committed, the code selects one temp tablespace per sort or hashjoin.
> The submitted patch already did it that way for sorts, and I forced the
> same for hashjoins, because I wanted to be sure to minimize the number
> of executions of aforesaid parsing/checking. So really that patch is
> entirely wrong, and selection of the tablespace for a temp file needs
> to be pushed much further down.

Well, I was looking to achieve poor man's parallelism. If you have a
query with two or more temp files active then you will be reading from
one while writing to another. That could then allow you to rely on OS
file writers to give you asynch I/O like behaviour.

I can see what you're thinking though and it sounds even better, but I'm
guessing that's a much larger change anyway.

> Assuming, that is, that you think this point is important enough to
> drive the whole design; which I find rather questionable in view of the
> fact that the submitted patch contained no mention whatever of any such
> consideration. Or is this just another way in which its documentation
> was not up to snuff?

Well, it was listed in the TODO, but I guess that was lost somewhere
along the line. Oh well.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com
Re: [COMMITTERS] pgsql: Create a GUC parameter temp_tablespaces that allows selection of
From
Bruce Momjian
Date:
Simon Riggs wrote:
> > Assuming, that is, that you think this point is important enough to
> > drive the whole design; which I find rather questionable in view of the
> > fact that the submitted patch contained no mention whatever of any such
> > consideration. Or is this just another way in which its documentation
> > was not up to snuff?
>
> Well, it was listed in the TODO, but I guess that was lost somewhere
> along the line. Oh well.

The TODO description was removed once the item was complete because
sometimes the description doesn't match the implementation. The
description was:

        It could start with a random tablespace from a supplied list and
        cycle through the list.

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Re: [COMMITTERS] pgsql: Create a GUC parameter temp_tablespaces that allows selection of
From
"Jaime Casanova"
Date:
On 6/5/07, Bruce Momjian <bruce@momjian.us> wrote:
> Simon Riggs wrote:
> > > Assuming, that is, that you think this point is important enough to
> > > drive the whole design; which I find rather questionable in view of the
> > > fact that the submitted patch contained no mention whatever of any such
> > > consideration. Or is this just another way in which its documentation
> > > was not up to snuff?
> >
> > Well, it was listed in the TODO, but I guess that was lost somewhere
> > along the line. Oh well.
>
> The TODO description was removed once the item was complete because
> sometimes the description doesn't match the implementation. The
> description was:
>
>         It could start with a random tablespace from a supplied list and
>         cycle through the list.
>

that is what the patch did but it did it one tablespace per BufFile
(not per file)

of course, it parses the GUC on every GetTempTablespaces() call :(

--
regards,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."
                                       Richard Cook
Re: [COMMITTERS] pgsql: Create a GUC parameter temp_tablespaces that allows selection of
From
Bernd Helmle
Date:
--On Monday, June 04, 2007 15:34:14 -0400 Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The reason I'm thinking per-transaction is that we could tie this to
> setting up a cached list of tablespace OIDs, which would avoid the
> overhead of repeat parsing and tablespace validity checking. We had
> rejected using a long-lived cache because of the problem of tablespaces
> getting dropped, but I think one that lasts only across a transaction
> would be OK.

Hmm, I tried an allocated OID list in TopMemoryContext per backend, but I
didn't find any issue with that... What's the reason we cannot work with a
long-lived cache during backend lifetime? Dropping a tablespace caused
get_tablespace_name() to return an InvalidOid and the tablespace selection
code to switch to $PGDATA/pgsql_tmp...

--
  Thanks

                    Bernd
Re: [COMMITTERS] pgsql: Create a GUC parameter temp_tablespaces that allows selection of
From
Bernd Helmle
Date:
--On Monday, June 04, 2007 15:34:14 -0400 Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Perhaps a reasonable compromise could work like this: at the first point
> in a transaction where a temp file is created, choose a random list
> element, and thereafter advance cyclically for the duration of that
> transaction. This ensures within-transaction spread-out while still
> having some randomness across backends.

Doing this at transaction level looks pretty nice; the original code chose
the random element at backend startup (or every time you call SET).

Btw, I saw you've removed the random selection implemented by
MyProcId % num_temp_tablespaces. I liked this idea, because the PID should
be pretty random on many OSes?

--
  Thanks

                    Bernd
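For reference, the removed selection rule mentioned above amounts to something like the sketch below. MyProcId and num_temp_tablespaces are the names used in this message; the wrapper function pick_by_pid is hypothetical and exists only to make the expression testable outside a backend.

```c
#include <assert.h>

/*
 * Hypothetical wrapper around the expression
 *     MyProcId % num_temp_tablespaces
 * Takes the backend's process ID (assumed non-negative) and the length of
 * the temp tablespace list, and returns an index into that list.
 */
int
pick_by_pid(int pid, int num_temp_tablespaces)
{
    assert(pid >= 0);
    assert(num_temp_tablespaces > 0);
    return pid % num_temp_tablespaces;
}
```

How random this is depends on how the OS hands out PIDs. On a system that allocates them sequentially, backends started back to back would land on neighbouring list entries, which behaves more like round-robin across backends than like a random pick.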
Re: [COMMITTERS] pgsql: Create a GUC parameter temp_tablespaces that allows selection of
From
"Jaime Casanova"
Date:
On 6/4/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Perhaps a reasonable compromise could work like this: at the first point
> in a transaction where a temp file is created, choose a random list
> element, and thereafter advance cyclically for the duration of that
> transaction. This ensures within-transaction spread-out while still
> having some randomness across backends.
>
> The reason I'm thinking per-transaction is that we could tie this to
> setting up a cached list of tablespace OIDs, which would avoid the
> overhead of repeat parsing and tablespace validity checking. We had
> rejected using a long-lived cache because of the problem of tablespaces
> getting dropped, but I think one that lasts only across a transaction
> would be OK.
>
> And the reason I'm thinking a cache is important is that if you really
> want to get any win from this idea, you need to spread the temp files
> across tablespaces *per file*, which is not the way it works now.

ok. are you doing this? or can i prepare a patch that implements this?
i guess we can allocate the memory for the list in TopTransactionContext.

--
regards,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."
                                       Richard Cook
Re: [COMMITTERS] pgsql: Create a GUC parameter temp_tablespaces that allows selection of
From
Tom Lane
Date:
"Jaime Casanova" <systemguards@gmail.com> writes: > On 6/4/07, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Perhaps a reasonable compromise could work like this: at the first point >> in a transaction where a temp file is created, choose a random list >> element, and thereafter advance cyclically for the duration of that >> transaction. > ok. are you doing this? or can i prepare a patch that implements this? > i guess we can allocate the memory for the list in TopTransactionContext. I'll work on it ... I want to rejigger the API between fd.c and tablespace.c anyway. (fd.c still shouldn't be calling tablespace.c ...) regards, tom lane