Thread: Please help me understand unlogged tables

Please help me understand unlogged tables

From
Alexander Stoddard
Date:
If a table is set to unlogged is it inherently non-durable? That, is any crash or unsafe shutdown _must_ result in truncation upon recovery?

I can imagine a table that is bulk loaded in a warehousing scenario and then sitting statically could be safe, but maybe the question becomes how could the system know it is unchanged if it isn't logged...

Would a correct solution be to bulk load into an unlogged table and then (fairly cheaply?) set the table back to being logged? 

If I my mental model is correct the table would be liable to being lost to a crash during loading but once set back to logged it is durable. Is that correct?

Is it the case that setting a large table to logged status after loading it unlogged is a reasonably cheap operation?

Thank you,
Alex Stoddard

Re: Please help me understand unlogged tables

From
Jeff Janes
Date:
On Wed, Jan 31, 2018 at 8:52 AM, Alexander Stoddard <alexander.stoddard@gmail.com> wrote:
If a table is set to unlogged is it inherently non-durable? That, is any crash or unsafe shutdown _must_ result in truncation upon recovery?

Yes.
 
I can imagine a table that is bulk loaded in a warehousing scenario and then sitting statically could be safe, but maybe the question becomes how could the system know it is unchanged if it isn't logged...

Right, that is the problem.  It would be nice to solve it, but there are no immediate plans that I know of to implement that.  It is hard, because during recovery the system can't query system catalog tables to get information out of them.  That is why unlogged tables are indicate by the existence of certain files in the filesystem, so the system can know they are unlogged without querying the system catalogs to find out.
 
Would a correct solution be to bulk load into an unlogged table and then (fairly cheaply?) set the table back to being logged?

Unfortunately, it is not cheap for large tables.

I have some large datasets which I could always regenerate if needed, which I don't want blowing out my wal archive.  I've resigned myself to just reloading them after a crash (or after using pg_basebackup to reclone production to make a fresh test/dev server).  If the ETL step from the raw source is expensive, then I \copy the table out to a file, and then \copy it back in after a crash, to save on the ET time.
  

Cheers,

Jeff

Re: Please help me understand unlogged tables

From
Michael Paquier
Date:
On Wed, Jan 31, 2018 at 10:52:38AM -0600, Alexander Stoddard wrote:
> If a table is set to unlogged is it inherently non-durable? That, is any
> crash or unsafe shutdown _must_ result in truncation upon recovery?

Yes, they are designed like that.  Upon recovery all unlogged tables are
re-initialized.  What happens internally is that their init fork files
gets copied to become the one in use.

> I can imagine a table that is bulk loaded in a warehousing scenario and
> then sitting statically could be safe, but maybe the question becomes how
> could the system know it is unchanged if it isn't logged...
>
> Would a correct solution be to bulk load into an unlogged table and then
> (fairly cheaply?) set the table back to being logged?

Note that switching a table from unlogged to logged using ALTER TABLE
generates a large amount of WAL, as you need to make this table's data
consistent it is necessary to copy a full image of it.

> If I my mental model is correct the table would be liable to being lost to
> a crash during loading but once set back to logged it is durable. Is that
> correct?

Yes, ALTER TABLE makes that possible but be careful about the WAL
generated because of the table rewrite.

> Is it the case that setting a large table to logged status after loading it
> unlogged is a reasonably cheap operation?

If you load a logged table the cost would be the same as switching it
from unlogged to logged.  So what matters is how much operation is
generated between the time you created the table, like random writes on
it.  For an initial load on a very large table, you could reduce
wal_level temporarily to minimal, and make the WAL generated less
painful.  For a one-time load this can matter.
--
Michael

Attachment