Thread: Truncation of UNLOGGED tables upon restart.
Per documentation unlogged tables are not crash safe and PG will truncate it when it restarts after a crash. Does this applyto even read only unlogged tables. For example: On Monday I load data into unlogged tables. Then from Tue onwards the table is only read by application. On Fri morning PG crashes. When it restarts will it truncate all unlogged tables even though they remained unchanged. I hope it does not.
On Thu, Nov 1, 2018 at 1:49 PM Ravi Krishna <srkrishna1@aol.com> wrote:
Per documentation unlogged tables are not crash safe and PG will truncate it when it restarts after a crash. Does this apply to even read only unlogged tables.
For example:
On Monday I load data into unlogged tables.
Then from Tue onwards the table is only read by application.
On Fri morning PG crashes. When it restarts will it truncate all unlogged tables even though they remained
unchanged. I hope it does not.
"an unlogged table is automatically truncated after a crash or unclean shutdown." - there are no exceptions noted.
There is no such thing as a "read only" table in PostgreSQL. All tables are read/write no matter that frequency of either event. There is nothing inherently special about "no writes for 4 days" and "no writes for 10 seconds" that would allow for a distinction to be made. There could be write in progress on the table just as it crashes Friday.
David J.
> There is no such thing as a "read only" table in PostgreSQL. All tables are read/write no matter that frequency of eitherevent. There is nothing > inherently special about "no writes for 4 days" and "no writes for 10 seconds" that wouldallow for a distinction to be made. There could be write > in progress on the table just as it crashes Friday. I am aware that unlogged tables have no entries in WAL, but I assumed (incorrectly) that PG will at least keep track whetherany writes was done on a table since last checkpoint, and if none, it will find no reason to truncate it. The use case I was thinking about is that if we have to load a large set of data every weekend and use it for reporting untilnext weekend, why not create those tables as unlogged.
On 11/01/2018 03:58 PM, David G. Johnston wrote:
Too bad you can't say ALTER TABLESPACE foo SET READONLY = false;
That would effectively give you read only tables. I've done that before on the legacy RDBMS I managed.
On Thu, Nov 1, 2018 at 1:49 PM Ravi Krishna <srkrishna1@aol.com> wrote:Per documentation unlogged tables are not crash safe and PG will truncate it when it restarts after a crash. Does this apply to even read only unlogged tables.
For example:
On Monday I load data into unlogged tables.
Then from Tue onwards the table is only read by application.
On Fri morning PG crashes. When it restarts will it truncate all unlogged tables even though they remained
unchanged. I hope it does not."an unlogged table is automatically truncated after a crash or unclean shutdown." - there are no exceptions noted.There is no such thing as a "read only" table in PostgreSQL. All tables are read/write no matter that frequency of either event. There is nothing inherently special about "no writes for 4 days" and "no writes for 10 seconds" that would allow for a distinction to be made. There could be write in progress on the table just as it crashes Friday.
Too bad you can't say ALTER TABLESPACE foo SET READONLY = false;
That would effectively give you read only tables. I've done that before on the legacy RDBMS I managed.
--
Angular momentum makes the world go 'round.
Angular momentum makes the world go 'round.
Greetings, * Ravi Krishna (srkrishna1@aol.com) wrote: > > There is no such thing as a "read only" table in PostgreSQL. All tables are read/write no matter that frequency of eitherevent. There is nothing > inherently special about "no writes for 4 days" and "no writes for 10 seconds" that wouldallow for a distinction to be made. There could be write > in progress on the table just as it crashes Friday. > > I am aware that unlogged tables have no entries in WAL, but I assumed (incorrectly) that PG will at least keep track whetherany writes was done on a table since last checkpoint, and if none, it will find no reason to truncate it. No, we don't currently track that information but it's an interesting idea, at least imv. > The use case I was thinking about is that if we have to load a large set of data every weekend and use it for reportinguntil next weekend, why not create those tables as unlogged. Seems like a pretty useful use-case. I had been thinking for a while, based on a comment made be someone else (Vik Fearing..), that we should have a way to turn an unlogged table into an 'init table' or similar- that is, just copy the data from the main fork into the init fork and then fsync it, then the data is there on restart. Having a way to say 'this data has been fsyncd' is a pretty interesting idea though. I wonder how hard it'd be to make that work. Thanks! Stephen
Attachment
On Thu, Nov 01, 2018 at 07:06:32PM -0400, Stephen Frost wrote: > No, we don't currently track that information but it's an interesting > idea, at least imv. What would be the use case for it? What you are looking for here is gathering information about all pages in a relation and just aggregate which one has the newest LSN, which you can do at SQL level using pageinspect to grab all the page LSNs, then use pg_control_checkpoint() with the LSN of the last checkpoint to know if a table has been written as such. I agree that there could be cheaper solutions than that, but it is hard if the use cases in need of such a thing balance with the extra maintenance involved by a new feature when there are already tools allowing one to do that. > Seems like a pretty useful use-case. I had been thinking for a while, > based on a comment made be someone else (Vik Fearing..), that we should > have a way to turn an unlogged table into an 'init table' or similar- > that is, just copy the data from the main fork into the init fork and > then fsync it, then the data is there on restart. That's interesting. We already have all the facilities to be able to handle properly init forks, and the code code better in making sure that WAL-logging of init forks happens when it should. > Having a way to say 'this data has been fsyncd' is a pretty interesting > idea though. I wonder how hard it'd be to make that work. That's however not much different from a "CREATE TABLE save_data AS SELECT * FROM unlogged_table" happening at some point in time? -- Michael
Attachment
Greetings, * Michael Paquier (michael@paquier.xyz) wrote: > On Thu, Nov 01, 2018 at 07:06:32PM -0400, Stephen Frost wrote: > > No, we don't currently track that information but it's an interesting > > idea, at least imv. > > What would be the use case for it? What you are looking for here is > gathering information about all pages in a relation and just aggregate > which one has the newest LSN, which you can do at SQL level using > pageinspect to grab all the page LSNs, then use pg_control_checkpoint() > with the LSN of the last checkpoint to know if a table has been written > as such. I agree that there could be cheaper solutions than that, but > it is hard if the use cases in need of such a thing balance with the > extra maintenance involved by a new feature when there are already tools > allowing one to do that. I'm not clear at all on what you're getting at here. Yes, we could scan the file and find the newest LSN and see if it's been changed since the last checkpoint, but when would we do that..? My thinking was that we'd handle this single "is this data any good" bit pretty much identically to how the visibility map works today- something checks and says "this data is all good" and then if anyone touches it, the bit gets flipped back to 'dirty'. One thing to realize is that we'd need to hold a lock that prevents changes to the table while we're doing this scan though, so I don't think this would be included in VACUUM or run by autovacuum; instead it'd need to be some new user-level command, I think. Another idea might be to have two bits- one which is "I'm checking this unlogged table to see if it's been changed" and the other to say "it's all good", and then the 'check' process can scan set the first bit, scan the relation, and if the first bit is still set then it can set the second bit. Any update to the relation would clear both bits. I have to say that I really do think we should probably have some top-level user command for this though, and as mentioned elsewhere, I bet users would really like a way to say "don't allow further updates to this table until I say so", to prevent a change from mistakenly being made and a subsequent crash causing the data to be lost. Having unlogged tables where sometimes, if you're lucky, your data isn't lost, but, whoops, other times you aren't lucky and it *is* lost, just doesn't seem very appealing and definitely goes against the POLA. Having an explicit command would address that too. > > Seems like a pretty useful use-case. I had been thinking for a while, > > based on a comment made be someone else (Vik Fearing..), that we should > > have a way to turn an unlogged table into an 'init table' or similar- > > that is, just copy the data from the main fork into the init fork and > > then fsync it, then the data is there on restart. > > That's interesting. We already have all the facilities to be able to > handle properly init forks, and the code code better in making sure that > WAL-logging of init forks happens when it should. > > > Having a way to say 'this data has been fsyncd' is a pretty interesting > > idea though. I wonder how hard it'd be to make that work. > > That's however not much different from a "CREATE TABLE save_data AS > SELECT * FROM unlogged_table" happening at some point in time? Well, that's equivilant to just making the table 'logged' instead of 'unlogged', but that isn't what I think people are looking for- there's quite a few cases where it'd be nice to just have a table that's initialized on database crash/restart, or backup/restore, but otherwise is unlogged. Thanks! Stephen