Thread: looping over a small record set over and over in a function
I have a record set over which I have to loop several times.
The function should return the same record set + one more computed
field. Something that could look like:

  foreach(row) {
    // compute stuff
    if(...) {
    }
    // place stuff in field[N+1] of the row
  }
  if(some condition) {
    //
  }
  foreach(row) {
    // compute stuff
    if(...) {
    }
    // place stuff in field[N+1] of the row
  }
  if(some condition) {
    //
  }
  ...

and actually return row + computed field.

This is in plpgsql, and each loop depends on the result of the
previous one. The dataset is very small.

If I could easily load the whole dataset into an array, loop through
it and then just update the computed field, that would be nice... but
how?

Is it really worth loading the whole record set into an array,
looping over it, etc., rather than e.g. building a temp table with
the same structure as the input record set + 1 field and looping over
that table? What about all the UPDATEs involved to change field N+1
of the temp table? Will they be expensive?

thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
On Jun 19, 2009, at 8:23 PM, Ivan Sergio Borgonovo wrote:

> I've a record set on which I have to loop several times.
> The function should return the same record set + one more computed
> field.
> Something that could look like:
>
> foreach(row) {
>   // compute stuff
>   if(...) {
>   }
>   // place stuff in field[N+1] of the row
> }
> if(some condition) {
>   //
> }
> foreach(row) {
>   // compute stuff
>   if(...) {
>   }
>   // place stuff in field[N+1] of the row
> }
> if(some condition) {
>   //
> }
> ...
>
> actually return row + computed field.
>
> in plpgsql
>
> where each loop depends on the result of the previous.
> The dataset is very small.
>
> If I could easily load all the dataset into an array, loop through
> it and then just update the computed field it would be nice... but
> how?

You could add a column to your query as a placeholder for the
computed value. For example: SELECT *, 0 AS computed_value FROM table.

If you use a scrollable cursor (possible in PL/pgSQL these days,
although it still has some limitations) you could just loop through
its results and rewind the cursor until you're done.

It does look like you're implementing some kind of aggregate function
though. Maybe you can implement it as an actual aggregate function,
perhaps in combination with the windowing stuff in 8.4? Then you
could just add the aggregate to your query instead of using your
query results in a function. That's about the inverse of what you're
attempting now. You seem to be implementing something that would look
like SELECT sum(SELECT * FROM table), while you may be better off
aiming for SELECT sum(value) FROM table.

Considering you want to loop over the results multiple times that may
not be possible; only you can tell.

> Is it really worth loading the whole record set into an array,
> looping over it, etc., rather than e.g. building a temp table with
> the same structure as the input record set + 1 field and looping
> over that table? What about all the UPDATEs involved to change
> field N+1 of the temp table? Will they be expensive?

Neither of those looks like a convenient approach. With the array you
lose all the meta-data the record set provided (types, column names,
etc), while the temp table approach looks like it will be kind of
inefficient.

Is the result of that function volatile, or could you store it after
computing it once and fire a trigger to update the computed value
when the underlying data changes?

I wonder why you need to re-iterate over your result set multiple
times? It's kind of rare that once isn't enough.

And lastly, if your current approach really is the only way to
compute what you're after, then maybe PL/pgSQL isn't the right match
for the problem; it looks like you'd be better served by a language
that can work with arrays of typed structures. As I'm not familiar
with the other PL languages I can't tell whether they would be
suitable in that respect, but I suspect Python or Java would be able
to handle this better.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.
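[A minimal sketch of the placeholder-column + scrollable-cursor idea
described above. The table "sometable", its column "somefield" and the
final computation are invented for illustration, and it assumes the
FETCH/MOVE direction support PL/pgSQL gained in 8.3:]

  CREATE OR REPLACE FUNCTION compute_values()
  RETURNS SETOF record AS $$
  DECLARE
      curs SCROLL CURSOR FOR
          SELECT t.*, 0::numeric AS computed_value FROM sometable t;
      rec  record;
      agg  numeric := 0;
  BEGIN
      OPEN curs;
      -- first pass: accumulate an aggregate over the whole set
      LOOP
          FETCH curs INTO rec;
          EXIT WHEN NOT FOUND;
          agg := agg + rec.somefield;
      END LOOP;
      -- rewind, then make a second pass that uses the first pass's result
      MOVE ABSOLUTE 0 FROM curs;
      LOOP
          FETCH curs INTO rec;
          EXIT WHEN NOT FOUND;
          rec.computed_value := rec.somefield / NULLIF(agg, 0);
          RETURN NEXT rec;
      END LOOP;
      CLOSE curs;
  END;
  $$ LANGUAGE plpgsql;

[Since it returns SETOF record, the caller has to supply a column
definition list, e.g.
SELECT * FROM compute_values() AS f(id int, somefield numeric, computed_value numeric);]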
On Sat, 20 Jun 2009 12:30:42 +0200
Alban Hertroys <dalroi@solfertje.student.utwente.nl> wrote:

> You could add a column to your query as a placeholder for the
> computed value.
> For example, SELECT *, 0 AS computed_value FROM table.
> If you use a scrollable cursor (possible in PL/pgSQL these days,
> although it still has some limitations) you could just loop
> through its results and rewind the cursor until you're done.

> It does look like you're implementing some kind of aggregate
> function though. Maybe you can implement it as an actual aggregate
> function, maybe in combination with the windowing stuff in 8.4?
> Then you could just add the aggregate to your query instead of
> using your query results in a function. That's about the inverse
> of what you're attempting now.
> You seem to be implementing something that would look like SELECT
> sum(SELECT * FROM table), while you may be better off aiming for
> SELECT sum(value) FROM table.
>
> Considering you want to loop over the results multiple times that
> may not be possible; only you can tell.

> > Is it really worth loading the whole record set into an array,
> > looping over it, etc., rather than e.g. building a temp table
> > with the same structure as the input record set + 1 field and
> > looping over that table? What about all the UPDATEs involved to
> > change field N+1 of the temp table? Will they be expensive?
>
> Neither of those looks like a convenient approach. With the array
> you lose all the meta-data the record set provided (types, column
> names, etc), while the temp table approach looks like it will be
> kind of inefficient.

> Is the result of that function volatile, or could you store it
> after computing it once and fire a trigger to update the computed
> value when the underlying data changes?

The function is volatile. What I was thinking about was to fire a
trigger to wipe the temp table if the table on which the computation
is based changes.

Considering I can't make the loops run in parallel, aggregates may be
a way to go... but somehow they look like they'd make the
implementation a bit hard to manage.

Anyway it still has to be seen whether I could actually implement the
loops with aggregates, since every loop has two "side effects":
computing some aggregates for the whole record set and computing an
extra field for each row.

> I wonder why you need to re-iterate over your result set multiple
> times? It's kind of rare that once isn't enough.

> And lastly, if your current approach really is the only way to
> compute what you're after, then maybe PL/pgSQL isn't the right
> match for the problem; it looks like you'd be better served by a

Yeah. I had a look at Python but I don't want to add one more
language to the mix.
I enjoy the strict type checking of plpgsql, even if a bit of
syntactic sugar would help make it more pleasing, and I think it is
the most lightweight of the options.
Still I don't know how easy it would be with e.g. Python to load an
array with a result set, change it and place it back into the table
it came from.

> language that can work with arrays of typed structures. As I'm not
> familiar with the other PL languages I can't tell whether they
> would be suitable in that respect, but I suspect Python or Java
> would be able to handle this better.

Your suggestion about cursors could be the way... but I don't know
enough about cursor internals to understand whether updating a field
through a cursor will cause disk writes.

Currently my main concern is making these things readable and
extensible.
I'm not planning to optimise yet. The record set the computations are
made on is pretty small; it is just extracted from a large one. But I
don't want to cut off my path to optimisation.

So one way could be:
- select the record set into a temp table;
- build a trigger that will wipe the temp table if the tables the
  record set is built from change (this may not even be necessary,
  since everything related to that record set is going to happen in
  one session);
- loop several times over the temp table.

Since every loop should actually correspond to a function... I wonder
if I could build up the cursor and pass it along to several
functions. Every function would MOVE FIRST, update the rows of the
record set and return some computed fields that will be used by the
next function, etc. If this is possible, this approach has the
advantage of being extensible.

I wonder if it is efficient, since I don't know whether an
UPDATE ... WHERE CURRENT OF cursor on a temp table is going to incur
any disk writes.

thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
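[A minimal sketch of the temp-table part of that plan. The names
"big_table", "workset", the category filter and the myaggregate1()/
myfunc1() placeholders (borrowed from the pseudo-code later in the
thread) are all invented; ON COMMIT DROP limits the table to the
transaction:]

  BEGIN;

  CREATE TEMP TABLE workset ON COMMIT DROP AS
      SELECT t.*, NULL::numeric AS result
      FROM   big_table t
      WHERE  t.category = 42;       -- the small extracted record set

  -- each "loop" becomes an UPDATE guarded by an aggregate check
  UPDATE workset
     SET result = myfunc1(a, b, c)
   WHERE (SELECT myaggregate1(field) FROM workset) > 10;

  -- further passes can read the "result" written by the previous pass

  COMMIT;   -- workset disappears here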
On Jun 20, 2009, at 8:35 AM, Ivan Sergio Borgonovo wrote:

>> And lastly, if your current approach really is the only way to
>> compute what you're after, then maybe PL/pgSQL isn't the right
>> match for the problem; it looks like you'd be better served by a
>
> Yeah. I had a look at Python but I don't want to add one more
> language to the mix.
> I enjoy the strict type checking of plpgsql, even if a bit of
> syntactic sugar would help make it more pleasing, and I think it is
> the most lightweight of the options.
> Still I don't know how easy it would be with e.g. Python to load an
> array with a result set, change it and place it back into the table
> it came from.
>
>> language that can work with arrays of typed structures. As I'm not
>> familiar with the other PL languages I can't tell whether they
>> would be suitable in that respect, but I suspect Python or Java
>> would be able to handle this better.
>
> Your suggestion about cursors could be the way... but I don't know
> enough about cursor internals to understand whether updating a
> field through a cursor will cause disk writes.

I have not tried this, but the documentation says arrays can be
created for "any built-in or user-defined base type, enum type, or
composite type." So maybe you could define a composite type and stuff
those into a single array?

John DeSoi, Ph.D.
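[An untested sketch of that idea. The "work_row" type, the "item"
table and the doubling computation are invented, and array_agg /
array_length assume 8.4 (on 8.3, ARRAY(SELECT ...) and array_upper
would do the same job):]

  CREATE TYPE work_row AS (id int, field numeric, computed numeric);

  CREATE OR REPLACE FUNCTION demo_array_pass() RETURNS numeric AS $$
  DECLARE
      rows work_row[];
      agg  numeric := 0;
      i    int;
  BEGIN
      -- load the whole (small) record set into one array of composites
      SELECT array_agg(ROW(id, field, NULL::numeric)::work_row)
        INTO rows
        FROM item;

      -- loop in memory; later passes can read what earlier passes wrote
      FOR i IN 1 .. array_length(rows, 1) LOOP
          rows[i] := ROW((rows[i]).id,
                         (rows[i]).field,
                         (rows[i]).field * 2)::work_row;
          agg := agg + (rows[i]).computed;
      END LOOP;
      RETURN agg;
  END;
  $$ LANGUAGE plpgsql;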
On Fri, 2009-06-19 at 20:23 +0200, Ivan Sergio Borgonovo wrote:

> If I could easily load all the dataset into an array, loop through
> it and then just update the computed field it would be nice... but
> how?

Are you sure you _really_ need a PL/PgSQL function for this? Can't
you:

  SELECT x.*, computefield(x, p)
  FROM smalltable x, otherparams p;

or something along those lines?

--
Craig Ringer
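[For what it's worth, a hypothetical computefield() for that pattern
could be a plain SQL function over the two row types; the columns
"field" and "factor" are assumptions, not anything from the thread:]

  CREATE OR REPLACE FUNCTION computefield(x smalltable, p otherparams)
  RETURNS numeric AS $$
      SELECT $1.field * $2.factor;
  $$ LANGUAGE sql STABLE;

  SELECT x.*, computefield(x, p) AS computed_value
  FROM   smalltable x, otherparams p;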
cursors, temp tables, dynamic sql technique (was and is: looping over a small record set over and over in a function)
From: Ivan Sergio Borgonovo
On Sun, 21 Jun 2009 10:57:51 +0800
Craig Ringer <craig@postnewspapers.com.au> wrote:

> On Fri, 2009-06-19 at 20:23 +0200, Ivan Sergio Borgonovo wrote:
>
> > If I could easily load all the dataset into an array, loop
> > through it and then just update the computed field it would be
> > nice... but how?
>
> Are you sure you _really_ need a PL/PgSQL function for this?

Not really sure. I'm investigating.

But well, I noticed that even working with a fully fledged procedural
language I'd have to loop twice to replace/add the computed field to
the "original" one, so it's surely not a big loss if I "loop" twice
with SQL.

I think everything could be summed up as:

  select into t myaggregate1(field) from dataset where condition1;
  if (t > 10) then
    update dataset set field = myfunc1(a, b, c) where condition1;
  end if;

  select into t myaggregate2(field) from dataset where condition2;
  if (t > 44) then
    update dataset set field = myfunc2(a, b, c) where condition2;
  end if;

Actually there is a trick I could use to "skip" the update loop and
pass the result to the next loop, but it only looks convenient in the
more general case where the aggregate has to be computed on the
"updated" value. I'm still not sure whether the functions I'm working
with always fall into the above class, where myaggregate works on the
"original" field. Of course myaggregate could compute myfunc... but
then I'd have to compute myfunc twice.

In a loop I could:

  create temp table t on commit drop as
    select *, field as field1, field as field2, field as result
    from t1 join t2 on ... join t3 on ...
    where ...;

  flipflop = 'field1';
  flopflip = 'field2';

  foreach(row) {
    if (condition on row) {
      row[flopflip] = myfunc(row[flipflop]);
    } else {
      row[flopflip] = row[flipflop];
    }
    agg += row[flopflip];
  }
  if (condition on agg) {
    swap flipflop <-> flopflip
  }

  foreach(row) {
    if (condition on row) {
      row[flopflip] = myfunc(row[flipflop]);
    } else {
      row[flopflip] = row[flipflop];
    }
    agg += row[flopflip];
  }
  if (condition on agg) {
    swap flipflop <-> flopflip
  }

  execute 'update t set result = ' || flipflop;

I think this could be done using cursors. It is going to be much more
verbose, since to use the "flipflop" technique I'll have to use
dynamic statements and EXECUTE:

  EXECUTE 'UPDATE t SET ' || flipflop || ' = ' || newfield
       || ' WHERE CURRENT OF cursor';

Even "condition on row" would have to be expressed as a dynamic
statement if the condition involves the computed field.

Compared to the SELECT + UPDATE solution this avoids looping and
checking the condition twice, but it works on a larger dataset and
runs the update even for unchanged rows (maybe the else branch could
be omitted?). I'm not sure the actual set of functions has the update
condition equal to the select condition anyway. But I think wasted
updates have a larger cost if they require disk IO.

I think things might look more "convenient" if there were a simple
and cheap way to load a record set into an array, do stuff in the
procedural language and substitute the old record set with the
computed one.

Currently I'll get most of the speed-up from looping through a
smaller data set, avoiding joins over a very large table.
Considering the very small data set I'm expecting (no more than 20
records) I don't think dynamic statements, lack of indexes, etc. are
going to have an impact. As the number of clients increases,
optimising the loops may become necessary.

I'll have to see whether all the cases I'm dealing with can be solved
by the SELECT + conditional UPDATE technique, since it looks much
easier to maintain.
Furthermore I suspect that in my real case the conditional check will
be made on the original value, so that myfunc will be computed at
most once and I can delay any aggregate computation on the computed
field until after the last loop has been executed.

The remaining costs will be:
- looping twice over the same table (getting the aggregate + updating
  the table);
- the updates, if they cause disk IO.

I even suspect that the test will mostly fail, so that updates will
be rare, but I would still like to understand how this could work in
the most general case and how the temp table + cursor + flipflop
technique works internally.

I think I really don't have a clear picture of how temp tables really
work.
They can be seen by concurrent transactions in the same session.
But if the transaction in which a temp table is created is not
committed yet, other transactions won't see it.
If the transaction in which the temp table is created is aborted...
no other transaction will ever know about the existence of the temp
table.
If the temp table is defined as ON COMMIT DROP... somehow there is no
need to make the changes happening on the temp table land on disk.

So it may actually look more like temporary, private storage that
doesn't have to be aware of concurrency.

So temp tables should require less disk IO. Is that right?

thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
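[A compact, untested sketch of just the flipflop bookkeeping described
above, collapsing the per-row cursor loop into whole-set UPDATEs for
brevity. The table "t", its columns, myfunc() and the threshold are
the invented names from the pseudo-code:]

  CREATE OR REPLACE FUNCTION flipflop_passes() RETURNS void AS $$
  DECLARE
      flipflop text := 'field1';
      flopflip text := 'field2';
      tmp      text;
      agg      numeric;
  BEGIN
      -- one pass: write myfunc(current column) into the scratch column
      EXECUTE 'UPDATE t SET ' || quote_ident(flopflip)
           || ' = myfunc(' || quote_ident(flipflop) || ')';
      EXECUTE 'SELECT sum(' || quote_ident(flopflip) || ') FROM t'
         INTO agg;

      -- keep the new values only if the aggregate condition holds
      IF agg > 44 THEN
          tmp := flipflop;
          flipflop := flopflip;
          flopflip := tmp;
      END IF;

      -- ... repeat the block above for further passes ...

      -- finally copy the winning column into the result column
      EXECUTE 'UPDATE t SET result = ' || quote_ident(flipflop);
  END;
  $$ LANGUAGE plpgsql;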
Re: cursors, temp tables, dynamic sql technique (was and is: looping over a small record set over and over in a function)
From: Craig Ringer
On Sun, 2009-06-21 at 14:57 +0200, Ivan Sergio Borgonovo wrote:

> I think everything could be summed up as:
>
>   select into t myaggregate1(field) from dataset where condition1;
>   if (t > 10) then
>     update dataset set field = myfunc1(a, b, c) where condition1;
>   end if;
>
>   select into t myaggregate2(field) from dataset where condition2;
>   if (t > 44) then
>     update dataset set field = myfunc2(a, b, c) where condition2;
>   end if;

That's really too simplified to see what you're actually doing.

I've found that in the _vast_ majority of non-trigger cases where
I've considered using PL/PgSQL, a bit more thought and proper
consideration of the use of generate_series, subqueries in FROM, join
types, etc has allowed me to find a way to do it in SQL. It's almost
always faster, cleaner, and nicer if I do find a way to express it in
SQL, too.

> I think I really don't have a clear picture of how temp tables
> really work.
> They can be seen by concurrent transactions in the same session.

Eh? In this context, what do you mean by "session"? Did you mean
consecutive rather than concurrent, ie:

  BEGIN;
  CREATE TEMPORARY TABLE x (...);
  -- do other work
  COMMIT;
  BEGIN;
  -- Can see `x' from here
  COMMIT;

?

Normally, you couldn't see another session's temp tables, even after
commit, unless you explicitly schema-qualified their names - eg
'pg_temp_4.x'. If even then; I haven't actually checked.

Two concurrent sessions that issue 'CREATE TEMPORARY TABLE x(...)'
get two DIFFERENT tables, both named `x', in different pg_temp
schemas, eg 'pg_temp_2.x' and 'pg_temp_3.x'.

> But if the transaction in which a temp table is created is not
> committed yet, other transactions won't see it.

Of course, since PostgreSQL doesn't support the READ UNCOMMITTED
isolation level.

> So it may actually look more like temporary, private storage
> that doesn't have to be aware of concurrency.
>
> So temp tables should require less disk IO. Is that right?

The big thing is that they're private to a session, so different
sessions can concurrently be doing things with temp tables by the
same name without treading on each others' feet.

Because they're limited to the lifetime of the session, though,
PostgreSQL doesn't need to care about ensuring that they're
consistent in the case of a backend crash, unexpected server reset,
etc. Tom Lane recently pointed out that as a result writes don't need
to go through the WAL, so my understanding is that you're avoiding
the doubled-up disk I/O from that. They also don't need to be written
with O_SYNC or fsync()ed since we don't care if writes make it to the
table in order.

As a result I'm pretty sure temp tables don't ever have to hit the
disk. If the OS has enough write-cache space it can just store them
in RAM from creation to deletion.

--
Craig Ringer
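[A quick way to see the per-session pg_temp schemas Craig mentions;
the table name "x" follows his example and the schema numbers will
vary from backend to backend:]

  CREATE TEMPORARY TABLE x (i int);

  SELECT n.nspname, c.relname
  FROM   pg_class c
  JOIN   pg_namespace n ON n.oid = c.relnamespace
  WHERE  c.relname = 'x' AND n.nspname LIKE 'pg_temp%';

  -- one session might report pg_temp_2 | x, another pg_temp_3 | x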
Re: cursors, temp tables, dynamic sql technique (was and is: looping over a small record set over and over in a function)
From: Ivan Sergio Borgonovo
On Sun, 21 Jun 2009 21:43:16 +0800
Craig Ringer <craig@postnewspapers.com.au> wrote:

> On Sun, 2009-06-21 at 14:57 +0200, Ivan Sergio Borgonovo wrote:
>
> > I think everything could be summed up as:
> >
> >   select into t myaggregate1(field) from dataset where condition1;
> >   if (t > 10) then
> >     update dataset set field = myfunc1(a, b, c) where condition1;
> >   end if;
> >
> >   select into t myaggregate2(field) from dataset where condition2;
> >   if (t > 44) then
> >     update dataset set field = myfunc2(a, b, c) where condition2;
> >   end if;
>
> That's really too simplified to see what you're actually doing.

I'm still checking whether the real cases go far beyond the above or
whether I have some more general case.

> I've found that in the _vast_ majority of non-trigger cases where
> I've considered using PL/PgSQL, a bit more thought and proper
> consideration of the use of generate_series, subqueries in FROM,
> join types, etc has allowed me to find a way to do it in SQL. It's
> almost always faster, cleaner, and nicer if I do find a way to
> express it in SQL, too.

The stuff is really serial in its nature and can't be made parallel.

> > I think I really don't have a clear picture of how temp tables
> > really work.
> > They can be seen by concurrent transactions in the same session.
>
> Eh? In this context, what do you mean by "session"? Did you mean
> consecutive rather than concurrent, ie:

http://www.postgresql.org/docs/8.3/interactive/sql-createtable.html
"If specified, the table is created as a temporary table. Temporary
tables are automatically dropped at the end of a session"

I'd interpret it as a connection.

I don't even know if it is possible to send more than one command
over a single connection and wait for the results asynchronously.
Any clarification?

If a session is a connection, and it's not possible to send more than
one command and wait for the results asynchronously... or the server
doesn't run several commands sent across the same connection in
parallel... then a temp table looks like what I need.

> BEGIN;
> CREATE TEMPORARY TABLE x (...);
> -- do other work
> COMMIT;
> BEGIN;
> -- Can see `x' from here
> COMMIT;
>
> ?

I was planning to use ON COMMIT DROP.

> Normally, you couldn't see another session's temp tables, even
> after commit, unless you explicitly schema-qualified their names -
> eg 'pg_temp_4.x'. If even then; I haven't actually checked.
> Two concurrent sessions that issue 'CREATE TEMPORARY TABLE x(...)'
> get two DIFFERENT tables, both named `x', in different pg_temp
> schemas, eg 'pg_temp_2.x' and 'pg_temp_3.x'.

That's perfect, and what I really need.

> > But if the transaction in which a temp table is created is not
> > committed yet, other transactions won't see it.
>
> Of course, since PostgreSQL doesn't support the READ UNCOMMITTED
> isolation level.
>
> > So it may actually look more like temporary, private storage
> > that doesn't have to be aware of concurrency.
> > So temp tables should require less disk IO. Is that right?
>
> The big thing is that they're private to a session, so different
> sessions can concurrently be doing things with temp tables by the
> same name without treading on each others' feet.
> Because they're limited to the lifetime of the session, though,
> PostgreSQL doesn't need to care about ensuring that they're
> consistent in the case of a backend crash, unexpected server
> reset, etc. Tom Lane recently pointed out that as a result writes
> don't need to go through the WAL, so my understanding is that
> you're avoiding the doubled-up disk I/O from that.
> They also don't need to be written with O_SYNC or fsync()ed since
> we don't care if writes make it to the table in order.
> As a result I'm pretty sure temp tables don't ever have to hit the
> disk. If the OS has enough write-cache space it can just store
> them in RAM from creation to deletion.

So postgresql actually issues writes to disk and delegates to the OS
the management of the cache/actual writes to disk.

I thought it could just try to hold them in RAM and still delegate to
the OS saving them to disk in swap if the system is short on RAM.

thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Re: cursors, temp tables, dynamic sql technique (was and is: looping over a small record set over and over in a function)
From: Craig Ringer
On Sun, 2009-06-21 at 22:11 +0200, Ivan Sergio Borgonovo wrote:

> > > I think I really don't have a clear picture of how temp tables
> > > really work.
> > > They can be seen by concurrent transactions in the same session.
> >
> > Eh? In this context, what do you mean by "session"? Did you mean
> > consecutive rather than concurrent, ie:
>
> http://www.postgresql.org/docs/8.3/interactive/sql-createtable.html
> "If specified, the table is created as a temporary table. Temporary
> tables are automatically dropped at the end of a session"
>
> I'd interpret it as a connection.

Correctly.

> I don't even know if it is possible to send more than one command
> over a single connection and wait for the results asynchronously.
> Any clarification?

To an extent cursors provide that ability. The "result" is returned
quite promptly, but it's a placeholder that allows you to retrieve
the real results progressively as you need them. Whether the database
generates the results immediately and stores them to return later, or
whether it generates them on demand, isn't something you can easily
tell; it's up to the database.

I don't know if PostgreSQL will pause the execution of set-returning
PL/PgSQL functions connected to cursors until the client requests
more data. Anyone?

By the way, a cursor is only valid within the context of the
transaction in which it's created.

If you don't explicitly ask to use a cursor, then no, commands cannot
run concurrently within the same session.

> > As a result I'm pretty sure temp tables don't ever have to hit
> > the disk. If the OS has enough write-cache space it can just
> > store them in RAM from creation to deletion.
>
> So postgresql actually issues writes to disk and delegates to the
> OS the management of the cache/actual writes to disk.

Yes.

> I thought it could just try to hold them in RAM and still delegate
> to the OS saving them to disk in swap if the system is short on
> RAM.

For a variety of reasons, you REALLY don't want it to work that way.

OS memory managers tend to be _much_ better and faster at managing
pages that are backed by a file. They'll write dirty data out
pre-emptively so that execution doesn't stall when memory runs low;
they write data to the file in order for best disk performance; they
efficiently buffer and read ahead when pulling the data back in, etc.

The OS knows much less about what anonymous memory (memory not backed
by a file) "means" to a program and can't be as clever with it.
Swapping tends to be _much_ more CPU expensive than writing dirty
buffers to a file. It's a lot more expensive to retrieve from disk,
too, and usually involves lots of seeks for quite scattered pages
instead of nice block readahead.

The OS knows much better than PostgreSQL does when the table will fit
in RAM and when it needs to spill to disk, and it's much better at
managing that than Pg can ever be. It's great that Pg just uses the
OS's hardware knowledge, system-wide awareness, and highly optimised
memory manager + disk IO management to take care of the problem.

--
Craig Ringer
On Mon, 22 Jun 2009 07:26:56 +0800
Craig Ringer <craig@postnewspapers.com.au> wrote:

> > http://www.postgresql.org/docs/8.3/interactive/sql-createtable.html
> > "If specified, the table is created as a temporary table.
> > Temporary tables are automatically dropped at the end of a
> > session"
> >
> > I'd interpret it as a connection.
>
> Correctly.
>
> > I don't even know if it is possible to send more than one command
> > over a single connection and wait for the results asynchronously.
> > Any clarification?
>
> To an extent cursors provide that ability. The "result" is returned
> quite promptly, but it's a placeholder that allows you to retrieve
> the real results progressively as you need them. Whether the
> database generates the results immediately and stores them to
> return later, or whether it generates them on demand, isn't
> something you can easily tell; it's up to the database.

http://www.postgresql.org/docs/8.3/interactive/libpq-async.html
"PQsendQuery cannot be called again (on the same connection) until
PQgetResult has returned a null pointer, indicating that the command
is done."

Asynchronous calls can't be used to parallelise postgres queries in
the same "session", only to parallelise client and server work.
So a temp table seems as private as I need it.

I wonder what will happen if I put something like pgpool between
postgresql and a web app.

> > So postgresql actually issues writes to disk and delegates to the
> > OS the management of the cache/actual writes to disk.
>
> Yes.
>
> > I thought it could just try to hold them in RAM and still
> > delegate to the OS saving them to disk in swap if the system is
> > short on RAM.
>
> For a variety of reasons, you REALLY don't want it to work that
> way.

mmm... first, sorry for the noise... Interpret the following as
reality checks. I'm perfectly aware that building a DB is not easy,
and I'm not pretending I know how to write one. ;)

> OS memory managers tend to be _much_ better and faster at managing
> pages that are backed by a file. They'll write dirty data out
> pre-emptively so that execution doesn't stall when memory runs
> low; they write data to the file in order for best disk
> performance; they efficiently buffer and read ahead when pulling
> the data back in, etc.
> The OS knows much less about what anonymous memory (memory not
> backed by a file) "means" to a program and can't be as clever with
> it. Swapping tends to be _much_ more CPU expensive than writing

But by issuing a write to disk PostgreSQL doesn't actually tell the
OS anything more about what it is placing on the disk and how it is
going to access it... and it is actually adding overhead to move the
data back and forth, no matter whether this overhead happens in RAM
or on disk.

Actually, since temp tables are private to the connection they should
(?) be private to a postgresql process, so the OS should be able to
do a good job.

I don't see any atomicity constraint, so... if something fails while
writing to RAM, as you said, you shouldn't need a WAL.

> dirty buffers to a file. It's a lot more expensive to retrieve
> from disk, too, and usually involves lots of seeks for quite
> scattered pages instead of nice block readahead.

Once you're running out of memory I see no guarantee your file will
end up in a fast, easily accessible area of your disk... and you're
going to add the overhead associated with a file system (journalling,
permissions/ownership, locks). Swap is volatile... and that offers
the guarantees you "need" for a temp table.
> The OS knows much better than PostgreSQL does when the table will
> fit in RAM and when it needs to spill to disk, and it's much
> better at managing that than Pg can ever be. It's great that Pg
> just uses the OS's hardware knowledge, system-wide awareness, and
> highly optimised memory manager + disk IO management to take care
> of the problem.

The same should be true for virtual memory, not just file management,
and postgresql has a configuration file that should give the DB a
clue about the expected workload and hardware.

Surely postgresql can't forecast how many temp tables a single
connection will use and how large they will be... but substantially I
got the idea that a connection is somehow serial in its execution and
that the storage could be garbage collected or just released early
(drop table, on commit drop).

This looks like it would take temp tables very far from the standard.
And yeah... once you want to do memory management/resource management
inside SQL you've opened the doors of Hell.

But well, from what I could see the SQL99 definition of temp tables
is very terse... and a bit confusing (at least to me) about global
vs. local.

I had a quick look at what's available in MS SQL... and they have a
sort of "in memory temp table" but you can't modify its schema.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
On Mon, Jun 22, 2009 at 09:53:59AM +0200, Ivan Sergio Borgonovo wrote:

> > The OS knows much less about what anonymous memory (memory not
> > backed by a file) "means" to a program and can't be as clever
> > with it. Swapping tends to be _much_ more CPU expensive than
> > writing
>
> But by issuing a write to disk PostgreSQL doesn't actually tell the
> OS anything more about what it is placing on the disk and how it is
> going to access it... and it is actually adding overhead to move
> the data back and forth, no matter whether this overhead happens in
> RAM or on disk.
> Actually, since temp tables are private to the connection they
> should (?) be private to a postgresql process, so the OS should be
> able to do a good job.
> I don't see any atomicity constraint, so... if something fails
> while writing to RAM, as you said, you shouldn't need a WAL.

For the record, temp tables are in fact handled differently: in
particular they are not stored in the shared_buffers, but instead are
in backend-local (private) buffers, whose size is controlled by
temp_buffers. They are indeed not WAL archived, nor written to disk
unless needed.

So yes, small temp tables will likely stay in memory, but large temp
tables may spill to disk. There's no flushing or syncing so quite
likely they'll end up in the OS disk cache for a while. Once the temp
table is deleted, the file is deleted and the OS throws that data
away. So temp tables most likely won't use any disk I/O, but they
*can* if the need arises.

Have a nice day,

--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.
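[A small illustration of the knob Martijn mentions. temp_buffers is
per-session and can only be changed before the session first touches
a temporary table; the 32MB value and the "scratch"/"some_small_table"
names are just examples:]

  SET temp_buffers = '32MB';   -- default is 8MB

  CREATE TEMP TABLE scratch AS SELECT * FROM some_small_table;

  -- as long as "scratch" fits within temp_buffers, working on it
  -- should not force writes out to the operating system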
On Mon, 22 Jun 2009 11:40:08 +0200
Martijn van Oosterhout <kleptog@svana.org> wrote:

> On Mon, Jun 22, 2009 at 09:53:59AM +0200, Ivan Sergio Borgonovo
> wrote:
> > > The OS knows much less about what anonymous memory (memory not
> > > backed by a file) "means" to a program and can't be as clever
> > > with it. Swapping tends to be _much_ more CPU expensive than
> > > writing
> >
> > But by issuing a write to disk PostgreSQL doesn't actually tell
> > the OS anything more about what it is placing on the disk and how
> > it is going to access it... and it is actually adding overhead to
> > move the data back and forth, no matter whether this overhead
> > happens in RAM or on disk.
> > Actually, since temp tables are private to the connection they
> > should (?) be private to a postgresql process, so the OS should
> > be able to do a good job.
> > I don't see any atomicity constraint, so... if something fails
> > while writing to RAM, as you said, you shouldn't need a WAL.
>
> For the record, temp tables are in fact handled differently: in
> particular they are not stored in the shared_buffers, but instead
> are in backend-local (private) buffers, whose size is controlled by
> temp_buffers. They are indeed not WAL archived, nor written to disk
> unless needed.
> So yes, small temp tables will likely stay in memory, but large
> temp tables may spill to disk. There's no flushing or syncing so
> quite likely they'll end up in the OS disk cache for a while. Once
> the temp table is deleted, the file is deleted and the OS throws
> that data away. So temp tables most likely won't use any disk I/O,
> but they *can* if the need arises.

Just to make it extra clear to people unaware of pg internals...
since the second paragraph may seem to contradict the first one...
could "nor written to disk unless needed" be rephrased as: even
repeated UPDATEs/INSERTs won't issue writes to the OS (no matter
whether they end up on disk or not) as long as the table fits in the
buffer?

I see the default is somewhat "large" (8MB) and it is not
pre-allocated. Looks nice.

> Have a nice day,

thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it