Re: Accidental removal of a file causing various problems - Mailing list pgsql-hackers

From	Pavan Deolasee
Subject	Re: Accidental removal of a file causing various problems
Date	September 4, 2018 08:13:58
Msg-id	CABOikdPC=LCZ650F5ka8Bzx3NHaguwv6ZVQe6DByvGV0th83iw@mail.gmail.com
In response to	Re: Accidental removal of a file causing various problems (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

On Sat, Aug 25, 2018 at 1:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Actually, I think the main point is given that we've somehow got into
a situation like that, how do we get out again?

Alvaro and I discussed this off-list a bit and came up with a couple of ideas.

1. Reserve some buffers in shared buffers for system-critical functionality. As this case shows, the failure to write blocks filled the entire shared buffer pool with bad blocks, making the database completely inaccessible, even for remedial actions. So the idea is to set aside, say, the first 100 (or some such number) buffers for system catalogs and allocate buffers for user tables from the remaining pool. This will at least help ensure that one bad user table does not bring down the entire cluster. Of course, this may not help if the system catalogs themselves are unwritable. But that's probably a major issue anyway.

2. Provide either an automatic or a manual way to evict unwritable buffers to a spillover file or set of files. The buffer pool can then be rescued from the critical situation, and the DBA can manually inspect the spillover files and take corrective action, if needed and if feasible. My idea was to create a shadow relfilenode and write buffers to their logical location. Alvaro, though, thinks that writing one block per file (relfilenode/fork/block) is a better idea since it gives the DBA an easy way to take action. Irrespective of whether we pick one file per block or per relfilenode, a more interesting question is: should this be automatic or require administrative action?
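
To make the two spillover layouts in point 2 concrete, here is a minimal Python sketch, not PostgreSQL code. The function names, the file naming scheme and the ".shadow" suffix are all hypothetical, chosen only for illustration. The only real constant is BLCKSZ, PostgreSQL's default block size of 8192 bytes.

```python
import os
import tempfile

BLCKSZ = 8192  # PostgreSQL's default block size


def spill_one_file_per_block(spill_dir, relfilenode, fork, blockno, page):
    """One file per (relfilenode, fork, block). Hypothetical naming scheme."""
    if len(page) != BLCKSZ:
        raise ValueError("page must be exactly BLCKSZ bytes")
    path = os.path.join(spill_dir, f"{relfilenode}_{fork}_{blockno}")
    with open(path, "wb") as f:
        f.write(page)
    return path


def spill_to_shadow_relfilenode(spill_dir, relfilenode, fork, blockno, page):
    """One shadow file per relfilenode/fork; the block is written at its
    logical offset. The ".shadow" suffix is a hypothetical convention."""
    if len(page) != BLCKSZ:
        raise ValueError("page must be exactly BLCKSZ bytes")
    path = os.path.join(spill_dir, f"{relfilenode}_{fork}.shadow")
    # Open for update if the shadow file exists, otherwise create it.
    mode = "r+b" if os.path.exists(path) else "w+b"
    with open(path, mode) as f:
        f.seek(blockno * BLCKSZ)
        f.write(page)
    return path


with tempfile.TemporaryDirectory() as spill_dir:
    page = b"\x01" * BLCKSZ
    per_block = spill_one_file_per_block(spill_dir, 16384, "main", 7, page)
    shadow = spill_to_shadow_relfilenode(spill_dir, 16384, "main", 3, page)
    print(os.path.basename(per_block))  # 16384_main_7
    print(os.path.getsize(shadow))      # 32768: blocks 0..2 are a hole, block 3 holds the page
```

With one file per block, the DBA can see at a glance which blocks were spilled. With the shadow file, the block's position mirrors the original relation, but the file has to be scanned to tell spilled blocks apart from holes.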

Does either of the ideas sound interesting enough for further work?
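
For the first idea, a toy model of the reserved range may help the discussion. This is a minimal Python sketch under stated assumptions, not the real buffer manager. The BufferPool class and its methods are hypothetical, and the sketch assumes catalog pages use only the reserved range, which the proposal above does not specify.

```python
from dataclasses import dataclass, field

RESERVED_BUFFERS = 100  # "say first 100 (or some such number)"; illustrative value


@dataclass
class BufferPool:
    """Toy model: a buffer is either reusable or stuck with an unwritable page."""
    nbuffers: int
    reserved: int = RESERVED_BUFFERS
    stuck: set = field(default_factory=set)

    def candidates(self, is_catalog: bool) -> range:
        # Assumption: catalogs draw only from the reserved range,
        # user tables only from the rest of the pool.
        if is_catalog:
            return range(0, self.reserved)
        return range(self.reserved, self.nbuffers)

    def mark_unwritable(self, buf_id: int) -> None:
        # A dirty page that cannot be written out cannot be evicted.
        self.stuck.add(buf_id)

    def allocate(self, is_catalog: bool):
        # Return the first reusable buffer in the allowed range, or None.
        for buf_id in self.candidates(is_catalog):
            if buf_id not in self.stuck:
                return buf_id
        return None


pool = BufferPool(nbuffers=1000)

# A single bad user table fills the whole user range with unwritable pages.
for buf_id in pool.candidates(is_catalog=False):
    pool.mark_unwritable(buf_id)

print(pool.allocate(is_catalog=False))  # None: user tables are stuck
print(pool.allocate(is_catalog=True))   # 0: catalog access still works
```

In this model the user range is exhausted, yet catalog lookups can still get a buffer, which is the property that would keep remedial actions possible.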

Thanks,

Pavan

Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
