Thread: Truncation of UNLOGGED tables upon restart.

Truncation of UNLOGGED tables upon restart.

From

Ravi Krishna

Date:

01 November 2018, 20:49:15

Per documentation unlogged tables are not crash safe and PG will truncate it when it restarts after a crash. Does this
applyto even read only unlogged tables.
 

For example:

On Monday I load data into unlogged tables.

Then from Tue onwards the table is only read by application.

On Fri morning PG crashes.  When it restarts will it truncate all unlogged tables even though they remained
unchanged.  I hope it does not.

Re: Truncation of UNLOGGED tables upon restart.

From

"David G. Johnston"

Date:

01 November 2018, 20:58:12

On Thu, Nov 1, 2018 at 1:49 PM Ravi Krishna <srkrishna1@aol.com> wrote:

Per documentation unlogged tables are not crash safe and PG will truncate it when it restarts after a crash. Does this apply to even read only unlogged tables.

For example:

On Monday I load data into unlogged tables.

Then from Tue onwards the table is only read by application.

On Fri morning PG crashes. When it restarts will it truncate all unlogged tables even though they remained
unchanged. I hope it does not.

"an unlogged table is automatically truncated after a crash or unclean shutdown." - there are no exceptions noted.

There is no such thing as a "read only" table in PostgreSQL. All tables are read/write no matter that frequency of either event. There is nothing inherently special about "no writes for 4 days" and "no writes for 10 seconds" that would allow for a distinction to be made. There could be write in progress on the table just as it crashes Friday.

David J.

Re: Truncation of UNLOGGED tables upon restart.

From

Ravi Krishna

Date:

01 November 2018, 21:20:19

> There is no such thing as a "read only" table in PostgreSQL.  All tables are read/write no matter that frequency of
eitherevent.  There is nothing > inherently special about "no writes for 4 days" and "no writes for 10 seconds" that
wouldallow for a distinction to be made.  There could be write > in progress on the table just as it crashes Friday. 

I am aware that unlogged tables have no entries in WAL, but I assumed (incorrectly) that PG will at least keep track
whetherany writes was done on a table since last checkpoint, and if none, it will find no reason to truncate it. 

The use case I was thinking about is that if we have to load a large set of data every weekend and use it for reporting
untilnext weekend, why not create those tables as unlogged.

Re: Truncation of UNLOGGED tables upon restart.

From

Ron

Date:

01 November 2018, 21:29:57

On 11/01/2018 03:58 PM, David G. Johnston wrote:

On Thu, Nov 1, 2018 at 1:49 PM Ravi Krishna <srkrishna1@aol.com> wrote:
Per documentation unlogged tables are not crash safe and PG will truncate it when it restarts after a crash. Does this apply to even read only unlogged tables.

For example:

On Monday I load data into unlogged tables.

Then from Tue onwards the table is only read by application.

On Fri morning PG crashes. When it restarts will it truncate all unlogged tables even though they remained
unchanged. I hope it does not.

"an unlogged table is automatically truncated after a crash or unclean shutdown." - there are no exceptions noted.

There is no such thing as a "read only" table in PostgreSQL. All tables are read/write no matter that frequency of either event. There is nothing inherently special about "no writes for 4 days" and "no writes for 10 seconds" that would allow for a distinction to be made. There could be write in progress on the table just as it crashes Friday.

Too bad you can't say ALTER TABLESPACE foo SET READONLY = false;

That would effectively give you read only tables. I've done that before on the legacy RDBMS I managed.

--
Angular momentum makes the world go 'round.

Re: Truncation of UNLOGGED tables upon restart.

From

Stephen Frost

Date:

01 November 2018, 23:06:32

Greetings,

* Ravi Krishna (srkrishna1@aol.com) wrote:
> > There is no such thing as a "read only" table in PostgreSQL.  All tables are read/write no matter that frequency of
eitherevent.  There is nothing > inherently special about "no writes for 4 days" and "no writes for 10 seconds" that
wouldallow for a distinction to be made.  There could be write > in progress on the table just as it crashes Friday. 
>
> I am aware that unlogged tables have no entries in WAL, but I assumed (incorrectly) that PG will at least keep track
whetherany writes was done on a table since last checkpoint, and if none, it will find no reason to truncate it. 

No, we don't currently track that information but it's an interesting
idea, at least imv.

> The use case I was thinking about is that if we have to load a large set of data every weekend and use it for
reportinguntil next weekend, why not create those tables as unlogged. 

Seems like a pretty useful use-case.  I had been thinking for a while,
based on a comment made be someone else (Vik Fearing..), that we should
have a way to turn an unlogged table into an 'init table' or similar-
that is, just copy the data from the main fork into the init fork and
then fsync it, then the data is there on restart.

Having a way to say 'this data has been fsyncd' is a pretty interesting
idea though.  I wonder how hard it'd be to make that work.

Thanks!

Stephen

Attachment

signature.asc

Re: Truncation of UNLOGGED tables upon restart.

From

Michael Paquier

Date:

02 November 2018, 00:17:58

On Thu, Nov 01, 2018 at 07:06:32PM -0400, Stephen Frost wrote:
> No, we don't currently track that information but it's an interesting
> idea, at least imv.

What would be the use case for it?  What you are looking for here is
gathering information about all pages in a relation and just aggregate
which one has the newest LSN, which you can do at SQL level using
pageinspect to grab all the page LSNs, then use pg_control_checkpoint()
with the LSN of the last checkpoint to know if a table has been written
as such.  I agree that there could be cheaper solutions than that, but
it is hard if the use cases in need of such a thing balance with the
extra maintenance involved by a new feature when there are already tools
allowing one to do that.

> Seems like a pretty useful use-case.  I had been thinking for a while,
> based on a comment made be someone else (Vik Fearing..), that we should
> have a way to turn an unlogged table into an 'init table' or similar-
> that is, just copy the data from the main fork into the init fork and
> then fsync it, then the data is there on restart.

That's interesting.  We already have all the facilities to be able to
handle properly init forks, and the code code better in making sure that
WAL-logging of init forks happens when it should.

> Having a way to say 'this data has been fsyncd' is a pretty interesting
> idea though.  I wonder how hard it'd be to make that work.

That's however not much different from a "CREATE TABLE save_data AS
SELECT * FROM unlogged_table" happening at some point in time?
--
Michael

Attachment

signature.asc

Re: Truncation of UNLOGGED tables upon restart.

From

Stephen Frost

Date:

02 November 2018, 00:52:23

Greetings,

* Michael Paquier (michael@paquier.xyz) wrote:
> On Thu, Nov 01, 2018 at 07:06:32PM -0400, Stephen Frost wrote:
> > No, we don't currently track that information but it's an interesting
> > idea, at least imv.
>
> What would be the use case for it?  What you are looking for here is
> gathering information about all pages in a relation and just aggregate
> which one has the newest LSN, which you can do at SQL level using
> pageinspect to grab all the page LSNs, then use pg_control_checkpoint()
> with the LSN of the last checkpoint to know if a table has been written
> as such.  I agree that there could be cheaper solutions than that, but
> it is hard if the use cases in need of such a thing balance with the
> extra maintenance involved by a new feature when there are already tools
> allowing one to do that.

I'm not clear at all on what you're getting at here.  Yes, we could scan
the file and find the newest LSN and see if it's been changed since the
last checkpoint, but when would we do that..?  My thinking was that we'd
handle this single "is this data any good" bit pretty much identically
to how the visibility map works today- something checks and says "this
data is all good" and then if anyone touches it, the bit gets flipped
back to 'dirty'.

One thing to realize is that we'd need to hold a lock that prevents
changes to the table while we're doing this scan though, so I don't
think this would be included in VACUUM or run by autovacuum; instead
it'd need to be some new user-level command, I think.

Another idea might be to have two bits- one which is "I'm checking this
unlogged table to see if it's been changed" and the other to say "it's
all good", and then the 'check' process can scan set the first bit, scan
the relation, and if the first bit is still set then it can set the
second bit.  Any update to the relation would clear both bits.

I have to say that I really do think we should probably have some
top-level user command for this though, and as mentioned elsewhere, I
bet users would really like a way to say "don't allow further updates to
this table until I say so", to prevent a change from mistakenly being
made and a subsequent crash causing the data to be lost.  Having
unlogged tables where sometimes, if you're lucky, your data isn't lost,
but, whoops, other times you aren't lucky and it *is* lost, just doesn't
seem very appealing and definitely goes against the POLA.  Having an
explicit command would address that too.

> > Seems like a pretty useful use-case.  I had been thinking for a while,
> > based on a comment made be someone else (Vik Fearing..), that we should
> > have a way to turn an unlogged table into an 'init table' or similar-
> > that is, just copy the data from the main fork into the init fork and
> > then fsync it, then the data is there on restart.
>
> That's interesting.  We already have all the facilities to be able to
> handle properly init forks, and the code code better in making sure that
> WAL-logging of init forks happens when it should.
>
> > Having a way to say 'this data has been fsyncd' is a pretty interesting
> > idea though.  I wonder how hard it'd be to make that work.
>
> That's however not much different from a "CREATE TABLE save_data AS
> SELECT * FROM unlogged_table" happening at some point in time?

Well, that's equivilant to just making the table 'logged' instead of
'unlogged', but that isn't what I think people are looking for- there's
quite a few cases where it'd be nice to just have a table that's
initialized on database crash/restart, or backup/restore, but otherwise
is unlogged.

Thanks!

Stephen

Attachment

signature.asc