Thread: BUG #16373: Behavior of Temporary table creation
The following bug has been logged on the website: Bug reference: 16373 Logged by: Hritik Gupta Email address: hritik122@gmail.com PostgreSQL version: 10.11 Operating system: MacOS Description: Hi! Have a small doubt I’m hoping you can help me with..simply put, where are the temporary tables created? AFAIK, they stay in the memory till the size reaches the temp_buffers limit, and then to the storage, for where it is accessed using the temp buffer, and this is inline with the below PG forum as well : https://www.postgresql.org/message-id/flat/CAMuJg2SnL78C7kWE4Fu8c7NUEFW_bf-4G2c9-4Sa%3DfMLV4bLqA%40mail.gmail.com Kind of refers in the parameter itself as well: https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-TEMP-BUFFERS https://stackoverflow.com/questions/57306828/does-postgres-support-in-memory-temp-table Here, Tom supported the understanding as well so I was thinking this is how its supposed to work. But testing it recently bought a contradictory behavior where the temp tables are directly created on disk even if they are well below the temp_buffer size (which to my understanding should’ve been in the memory). Below is the testing: hrigupta=# select version(); -[ RECORD 1 ]--------------------------------------------------------------------------------------------------------------- version | PostgreSQL 10.11 on x86_64-apple-darwin18.7.0, compiled by Apple clang version 11.0.0 (clang-1100.0.33.12), 64-bit hrigupta=# show temp_buffers; -[ RECORD 1 ]+---- temp_buffers | 8MB hrigupta=# create temporary table hg1 (id int); CREATE TABLE Expanded display is on. hrigupta=# select * from pg_class where relname like 'hg1'; -[ RECORD 1 ]-------+------ relname | hg1 relnamespace | 57585 reltype | 57589 .. relfilenode | 57587 .. hrigupta=# select pg_relation_filepath(57587); -[ RECORD 1 ]--------+-------------------- pg_relation_filepath | base/32944/t3_57587 #inserted few records : hrigupta=# insert into hg1 values (generate_series(1,100000)); INSERT 0 100000 hrigupta=# \dt+ List of relations -[ RECORD 1 ]-------------------- Schema | pg_temp_3 Name | hg1 Type | table Owner | hrigupta Size | 3920 kB Here the size is ~3.8 MB and should’ve been in memory as its still within temp_buffers, but the file is actually created on disk with 3.8 MB size (and increases as I insert the records). this should’ve been the behavior after the size exceeds 8 MB (temp_buffers) 8c859021cf33:32944 hrigupta$ ls -lhr | head -3 total 33456 -rw------- 1 hrigupta admin 24K Apr 16 17:04 t3_57587_fsm -rw------- 1 hrigupta admin 3.8M Apr 16 17:05 t3_57587 So my questions are: - why are temp tables creating files in the disk when there the size is actually less than the memory limit (temp_buffers) — Please note that there is only one temp table for the session (and nothing else consuming the temp_buffers) - Another question I am concerned with is that as Tom mentioned, “toast table is also temp”. So technically access to toast tables are done through temp buffers rather than the shared buffer? How does that work.. because we would need to pull up pages for the relation in shared buffer to access it, but they might have associated toast tables as well. Are access to the data in those toast tables done separately in temp buffers? ==> has a discussion on this and it seems like the mentioned reference is for toast tables associated with temp tables only, which makes sense. i'd just like to re-confirm the statement.. Thanks so much for the help!
PG Bug reporting form <noreply@postgresql.org> writes: > Hi! Have a small doubt I’m hoping you can help me with..simply put, where > are the temporary tables created? > AFAIK, they stay in the memory till the size reaches the temp_buffers limit, > and then to the storage, for where it is accessed using the temp buffer, They are accessed through temp buffers, yes, but storage space gets reserved on-disk immediately whenever the relation is extended. This is the same as for non-temp tables. regards, tom lane
Hi Tom, thanks a lot for the reply and clearing this out for me!
I did some digging and figured that the backend file is created as soon as the create temp table is executed like you mentioned (with relpersistence as RELPERSISTENCE_TEMP, further calling smgrcreate() to create the backend file).
https://doxygen.postgresql.org/storage_8c_source.html#l00118
https://doxygen.postgresql.org/smgr_8c_source.html#l00333
Q=> So does that mean postgres does not have *only* “in memory” temporary tables in any case?
At best, what it is going to do is to cache the temp. table pages in the temp buffer, but the blocks are still there on the storage, its just cached in memory for the time being, no matter how small the temporary table is. is it correct?
There are conflicting information around this and postgres docs are also not very clear with regards to temp tables and how its handled.
In the below forum discussion as well, it is mentioned that “For a small table the data will be in the memory, For a large table if data is not fit in memory then data will be flushed to disk periodically as the database engine needs more working space for other requests.
..
The memory is controlled by temp_buffers parameter”
https://www.postgresql.org/message-id/CALnrrJThsS3ZshTvsUOpBNbSQKkRWSnRaPS9CtVHwYUJh4%2Bgww%40mail.gmail.com
Q=> I might seem to be misunderstanding the working here, but flushing from the temp buffers to the disk by checkpointer will be done every time the checkpointer kicks in, and not only if the temp_buffer is full/exceeded, right?
Thanks again for all the help!
Regards, I did some digging and figured that the backend file is created as soon as the create temp table is executed like you mentioned (with relpersistence as RELPERSISTENCE_TEMP, further calling smgrcreate() to create the backend file).
https://doxygen.postgresql.org/storage_8c_source.html#l00118
https://doxygen.postgresql.org/smgr_8c_source.html#l00333
Q=> So does that mean postgres does not have *only* “in memory” temporary tables in any case?
At best, what it is going to do is to cache the temp. table pages in the temp buffer, but the blocks are still there on the storage, its just cached in memory for the time being, no matter how small the temporary table is. is it correct?
There are conflicting information around this and postgres docs are also not very clear with regards to temp tables and how its handled.
In the below forum discussion as well, it is mentioned that “For a small table the data will be in the memory, For a large table if data is not fit in memory then data will be flushed to disk periodically as the database engine needs more working space for other requests.
..
The memory is controlled by temp_buffers parameter”
https://www.postgresql.org/message-id/CALnrrJThsS3ZshTvsUOpBNbSQKkRWSnRaPS9CtVHwYUJh4%2Bgww%40mail.gmail.com
Q=> I might seem to be misunderstanding the working here, but flushing from the temp buffers to the disk by checkpointer will be done every time the checkpointer kicks in, and not only if the temp_buffer is full/exceeded, right?
Thanks again for all the help!
On Fri, 17 Apr 2020 at 19:04, Tom Lane <tgl@sss.pgh.pa.us> wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
> Hi! Have a small doubt I’m hoping you can help me with..simply put, where
> are the temporary tables created?
> AFAIK, they stay in the memory till the size reaches the temp_buffers limit,
> and then to the storage, for where it is accessed using the temp buffer,
They are accessed through temp buffers, yes, but storage space gets
reserved on-disk immediately whenever the relation is extended.
This is the same as for non-temp tables.
regards, tom lane
--
Regards,
Hritik GuptaHritik Gupta <hritik122@gmail.com> writes: > Q=> So does that mean postgres does not have *only* “in memory” temporary > tables in any case? > At best, what it is going to do is to cache the temp. table pages in the > temp buffer, but the blocks are still there on the storage, its just cached > in memory for the time being, no matter how small the temporary table is. > is it correct? No. We *reserve space* on disk, typically by writing zeroes. But actual data would only get spilled there if we run out of temp buffer space. It's entirely possible for a temp table's data to stay in memory for its whole existence. For regular tables, there are effects such as checkpoints that would cause data to get written out from shared buffers periodically, even if there's plenty of buffer space. That doesn't apply to temp tables. The reason for reserving space is to try to avoid having to deal with ENOSPC failures during buffer spills. On newer filesystems that can happen anyway, of course, but it's still useful on non-COW filesystems. regards, tom lane
Ah okay, that makes perfect sense and connects the dots in my head with regards to temp tables and much more. Thanks so much for the explanation and the reasoning behind why we reserve the space on the disk (i'm guessing its the same reasoning for other entities as well, for instance, WAL files).
Again, thank you so much, it has been such a help! I hope you're safe, do have a great day!
Regards,
Hritik On Fri, 17 Apr 2020 at 23:35, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Hritik Gupta <hritik122@gmail.com> writes:
> Q=> So does that mean postgres does not have *only* “in memory” temporary
> tables in any case?
> At best, what it is going to do is to cache the temp. table pages in the
> temp buffer, but the blocks are still there on the storage, its just cached
> in memory for the time being, no matter how small the temporary table is.
> is it correct?
No. We *reserve space* on disk, typically by writing zeroes. But actual
data would only get spilled there if we run out of temp buffer space.
It's entirely possible for a temp table's data to stay in memory for
its whole existence.
For regular tables, there are effects such as checkpoints that would
cause data to get written out from shared buffers periodically, even
if there's plenty of buffer space. That doesn't apply to temp tables.
The reason for reserving space is to try to avoid having to deal with
ENOSPC failures during buffer spills. On newer filesystems that
can happen anyway, of course, but it's still useful on non-COW
filesystems.
regards, tom lane
--
Regards,
Hritik Gupta