Re: temp tables, sessions, pgpool and disk - Mailing list pgsql-general
From | Ivan Sergio Borgonovo |
---|---|
Subject | Re: temp tables, sessions, pgpool and disk |
Date | |
Msg-id | 20090622095359.52f715a8@dawn.webthatworks.it Whole thread Raw |
In response to | Re: cursors, temp tables, dynamic sql technique (was and is: looping over a small record set over and over in a function) (Craig Ringer <craig@postnewspapers.com.au>) |
Responses |
Re: temp tables, sessions, pgpool and disk
|
List | pgsql-general |
On Mon, 22 Jun 2009 07:26:56 +0800 Craig Ringer <craig@postnewspapers.com.au> wrote: > > http://www.postgresql.org/docs/8.3/interactive/sql-createtable.html > > "If specified, the table is created as a temporary table. > > Temporary tables are automatically dropped at the end of a > > session" > > > > I'd interpret it as a connection. > > Correctly. > > > I don't even know if it is possible to send more than one command > > over a single connection and wait for the results asynchronously. > > Any clarification? > > To an extent cursors provide that ability. The "result" is returned > quite promptly, but it's a placeholder that allows you to retrieve > the real results progressively as you need them. Whether the > database generates the results immediately and stores them to > return later, or whether it generates them on demand, isn't > something you can easily tell it's up to the database. http://www.postgresql.org/docs/8.3/interactive/libpq-async.html "PQsendQuery cannot be called again (on the same connection) until PQgetResult has returned a null pointer, indicating that the command is done." Asynchronous calls can't be made to parallelize postgres queries on the same "session", but just to parallelize client and server work. So a temp table seems as private as I need it. I wonder what will happen if I put something like pgpool between postgresql and a web app. > > So postgresql actually issues writes to disk and delegate to the > > OS management of the cache/actual write on disk. > Yes. > > I thought it could just try to hold them in RAM and still > > delegate to the OS to save them on disk in swap if the system is > > short on RAM. > For a variety of reasons, you REALLY don't want it to work that > way. mmm... first sorry for the noise... Interpret the following as reality checks. I'm perfectly aware building up a DB is not easy, and I'm not pretending I know how to write one. ;) > OS memory managers tend to be _much_ better and faster at managing > pages that're backed by a file. They'll write dirty data out > pre-emptively so that execution doesn't stall when memory runs > low; they write data to the file in order for best disk > performance; they efficiently buffer and read-ahead when pulling > the data back in, etc. > The OS knows much less about what anonymous memory (memory not > backed by a file) "means" to a program and can't be as clever with > it. Swapping tends to be _much_ more CPU expensive than writing But issuing a write to disk Postgresql doesn't actually say anything more about what it is placing on the disk and how it is going to access it... and it is actually adding overhead to move it back and forward, no matter if this overhead happens on RAM or disk. Actually since temp table are private to the connection they should (?) be private to a postgresql process, so the OS should be able to do a good job. I don't see any atomicity constraint, so... if something fail while writing to RAM, as you said you shouldn't need a WAL. > dirty buffers to a file. It's a lot more expensive to retrieve > from disk, too, and usually involves lots of seeks for quite > scattered pages instead of nice block readahead. Once you're running out of memory I see no guaranty your file will end up in a fast easily accessible area of your disk... and you're going to add the overhead associated with a file system (journalling, permissions/ownership, locks) swap is volatile... and it should offers the guaranty you "need" for a temp table. > The OS knows much better than PostgreSQL does when the table will > fit in RAM and when it needs to spill to disk, and it's much > better at managing that than Pg can ever be. It's great that Pg > just uses the OS's hardware knowledge, system-wide awareness, and > highly optimised memory manager + disk IO management to take care > of the problem. The same should be true for virtual memory, not just file management and postgresql has a configuration file that should give a clue to the DB about the expected workload and hardware. Surely postgresql can't forecast how many and how large the temp tables for a single connection will be... but substantially I got the idea that a connection is somehow serial in its execution and that storage could be garbage collected or just released early (drop table, on commit drop). This looks as it is taking temp tables very far from the standard. And yeah... once you want to do memory management/resource management inside SQL you've opened the doors of Hell. But well For what I could see about SQL99 the definition of temp table is very terse... and a bit confusing (at least for me) about global and local. I gave a quick look at what's available on MS SQL... and they have an sort of "in memory temp table" but you can't modify its schema. -- Ivan Sergio Borgonovo http://www.webthatworks.it
pgsql-general by date: