Thread: initdb should create a warning message [was Re: [ADMIN] Size on Disk]

initdb should create a warning message [was Re: [ADMIN] Size on Disk]

From
Oliver Elphick
Date:
On Wed, 2003-11-26 at 05:53, Tom Lane wrote:
> Grzegorz Dostatni <dostatnig@yahoo.com> writes:
> > Currently the database is roughly 80 Megs. About half of
> > the size is stored in pg_xlog directory. I managed to
> > figure out that those files are transaction log files?
> > How can I delete them safely?
>
> You can NOT.  Don't even think about going there.
>
> What you can do, if you intend only low-update-volume usage,
> is reduce checkpoint_segments to reduce the number of WAL files
> the system wants to keep around.
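
To put a number on that advice: the sketch below estimates how much disk
space pg_xlog may occupy for a given checkpoint_segments value. It assumes
the rule of thumb from the WAL configuration documentation of this era
(normally no more than 2 * checkpoint_segments + 1 segment files) and the
16 MB segment size. The function names are hypothetical, and the result is
an estimate rather than a hard limit.

```c
/* Size of one WAL segment file in megabytes (16 MB per segment). */
#define WAL_SEGMENT_MB 16

/*
 * Approximate upper bound on the number of segment files kept in pg_xlog.
 * Assumption: the "2 * checkpoint_segments + 1" rule of thumb applies.
 */
int
max_wal_segments(int checkpoint_segments)
{
    return 2 * checkpoint_segments + 1;
}

/* The same bound expressed in megabytes of disk space. */
int
max_wal_megabytes(int checkpoint_segments)
{
    return max_wal_segments(checkpoint_segments) * WAL_SEGMENT_MB;
}
```

With checkpoint_segments = 3 this gives 7 files, about 112 MB; with
checkpoint_segments = 1 it gives 3 files, about 48 MB.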

The use of the word "log" in the directory name does tend to invite this
error, and some have acted on it without asking first.  I think initdb
should put a README.IMPORTANT file in $PGDATA to say,

        pg_xlog and pg_clog are crucial to the preservation of your
        data. They do not contain standard log files.  Do not even think
        about deleting them to save space; you would destroy your
        database.

The cost is only one disk block per cluster, and it might deflect some
of the weaponry pointed at hapless feet...

Patch for initdb.c attached

I notice that pg_clog and pg_xlog are not mentioned in the index to the
documentation, which makes it more difficult for people to find out what
they are.  I therefore also attach a doc patch to add index entries for
those two files.

--
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight, UK                             http://www.lfix.co.uk/oliver
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839  932A 614D 4C34 3E1D 0C1C
                 ========================================
     "Who shall ascend into the hill of the LORD? or who
      shall stand in his holy place? He that hath clean
      hands, and a pure heart..."            Psalms 24:3,4
Index: src/bin/initdb/initdb.c
===================================================================
RCS file: /projects/cvsroot/pgsql-server/src/bin/initdb/initdb.c,v
retrieving revision 1.15
diff -c -r1.15 initdb.c
*** src/bin/initdb/initdb.c    29 Nov 2003 19:52:04 -0000    1.15
--- src/bin/initdb/initdb.c    30 Nov 2003 21:52:47 -0000
***************
*** 179,184 ****
--- 179,185 ----
  static int    set_paths(void);
  static char **replace_token(char **, char *, char *);
  static void set_short_version(char *, char *);
+ static void set_warning_file(void);
  static void set_null_conf(void);
  static void test_buffers(void);
  static void test_connections(void);
***************
*** 1064,1069 ****
--- 1065,1088 ----
  }

  /*
+  * write out the warning file in the data dir; this is to try to ensure
+  * that users don't delete pg_xlog in the belief that it is "just" a log
+  * file
+  */
+ static void
+ set_warning_file(void)
+ {
+     FILE       *warning_file;
+     char       *path;
+
+     path = xmalloc(strlen(pg_data) + 20);
+     sprintf(path, "%s/README.IMPORTANT", pg_data);
+     warning_file = fopen(path, PG_BINARY_W);
+     fprintf(warning_file, "pg_xlog and pg_clog are crucial to the preservation of your\ndata. They do not contain standard log files.  Do not even think\nabout deleting them to save space; you would destroy your\ndatabase.\n");
+     fclose(warning_file);
+ }
+
+ /*
   * set up an empty config file so we can check buffers and connections
   */
  static void
***************
*** 2427,2432 ****
--- 2446,2454 ----

      /* Top level PG_VERSION is checked by bootstrapper, so make it first */
      set_short_version(short_version, NULL);
+
+     /* Write the warning file - a warning not to delete pg_xlog! */
+     set_warning_file();

      /*
       * Determine platform-specific config settings
Index: doc/src/sgml/backup.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql-server/doc/src/sgml/backup.sgml,v
retrieving revision 2.32
diff -c -r2.32 backup.sgml
*** doc/src/sgml/backup.sgml    29 Nov 2003 19:51:36 -0000    2.32
--- doc/src/sgml/backup.sgml    30 Nov 2003 22:22:35 -0000
***************
*** 342,347 ****
--- 342,350 ----

      <listitem>
       <para>
+       <indexterm scope="All">
+          <primary>pg_clog</primary>
+       </indexterm>
        If you have dug into the details of the file system layout of the data you
        may be tempted to try to back up or restore only certain
        individual tables or databases from their respective files or
Index: doc/src/sgml/wal.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql-server/doc/src/sgml/wal.sgml,v
retrieving revision 1.26
diff -c -r1.26 wal.sgml
*** doc/src/sgml/wal.sgml    29 Nov 2003 19:51:38 -0000    1.26
--- doc/src/sgml/wal.sgml    30 Nov 2003 22:22:35 -0000
***************
*** 83,88 ****
--- 83,92 ----
     <title>Future Benefits</title>

     <para>
+    <indexterm scope="All">
+    <primary>pg_clog</primary>
+    </indexterm>
+
      The UNDO operation is not implemented. This means that changes
      made by aborted transactions will still occupy disk space and that
      a permanent <filename>pg_clog</filename> file to hold
***************
*** 283,288 ****
--- 287,295 ----
    </para>

    <para>
+    <indexterm scope="All">
+    <primary>pg_xlog</primary>
+    </indexterm>
     <acronym>WAL</acronym> logs are stored in the directory
     <filename>pg_xlog</filename> under the data directory, as a set of
     segment files, each 16 MB in size.  Each segment is divided into 8
Index: doc/src/sgml/ref/pg_resetxlog.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql-server/doc/src/sgml/ref/pg_resetxlog.sgml,v
retrieving revision 1.8
diff -c -r1.8 pg_resetxlog.sgml
*** doc/src/sgml/ref/pg_resetxlog.sgml    29 Nov 2003 19:51:39 -0000    1.8
--- doc/src/sgml/ref/pg_resetxlog.sgml    30 Nov 2003 22:22:35 -0000
***************
*** 73,78 ****
--- 73,84 ----
    </para>

    <para>
+       <indexterm scope="All">
+          <primary>pg_clog</primary>
+       </indexterm>
+       <indexterm scope="All">
+          <primary>pg_xlog</primary>
+       </indexterm>
     The <literal>-o</>, <literal>-x</>, and <literal>-l</> switches allow
     the next OID, next transaction ID, and WAL starting address values to
     be set manually.  These are only needed when
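
Note that set_warning_file() in the patch above does not check the result
of fopen(), so an unwritable data directory would lead to a crash in
fprintf(). Below is a more defensive standalone sketch of the same idea.
It is not the patch itself: write_warning_file() is a hypothetical name,
and plain malloc() and the "w" mode stand in for initdb's xmalloc(),
pg_data and PG_BINARY_W so that the example compiles on its own.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define WARNING_FILE_NAME "README.IMPORTANT"

/* Same wording as the message proposed in the patch. */
static const char warning_text[] =
    "pg_xlog and pg_clog are crucial to the preservation of your\n"
    "data. They do not contain standard log files.  Do not even think\n"
    "about deleting them to save space; you would destroy your\n"
    "database.\n";

/*
 * Write the warning file into data_dir.
 * Returns 0 on success, -1 on any failure (allocation, open, write, close).
 */
int
write_warning_file(const char *data_dir)
{
    /* directory + '/' + file name + terminating NUL */
    size_t      len = strlen(data_dir) + 1 + strlen(WARNING_FILE_NAME) + 1;
    char       *path = malloc(len);
    FILE       *fp;
    int         rc = 0;

    if (path == NULL)
        return -1;
    snprintf(path, len, "%s/%s", data_dir, WARNING_FILE_NAME);

    fp = fopen(path, "w");
    free(path);
    if (fp == NULL)
        return -1;              /* e.g. directory missing or not writable */

    if (fputs(warning_text, fp) == EOF)
        rc = -1;
    if (fclose(fp) != 0)
        rc = -1;
    return rc;
}
```

A caller would check the return value and report the failure, for example
by printing a message and aborting initialization.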

Re: initdb should create a warning message [was Re:

From
Neil Conway
Date:
Oliver Elphick <olly@lfix.co.uk> writes:
> The use of the word "log" in the directory name does tend to invite
> this error, and some have acted on it without asking first.  I think
> initdb should put a README.IMPORTANT file in $PGDATA to say [...]

If someone deletes something from $PGDATA without understanding what
it is, they deserve what they get.

I do agree that we could stand to document the purpose of pg_clog
and pg_xlog more clearly. However, this information belongs in the
standard documentation, not scattered throughout $PGDATA.

-Neil


Re: initdb should create a warning message [was Re:

From
Oliver Elphick
Date:
On Sun, 2003-11-30 at 23:18, Neil Conway wrote:
> Oliver Elphick <olly@lfix.co.uk> writes:
> > The use of the word "log" in the directory name does tend to invite
> > this error, and some have acted on it without asking first.  I think
> > initdb should put a README.IMPORTANT file in $PGDATA to say [...]
>
> If someone deletes something from $PGDATA without understanding what
> it is, they deserve what they get.

People have a distressing tendency to want to shoot themselves in the
foot; and the somewhat unfortunate naming of those files contributes to
the problem.  While it is satisfying to see stupidity properly rewarded,
it is more neighbourly at least to attempt to protect a fool from his
folly.  It is also kinder to those who may be depending on him for the
protection of their data.

> I do agree that we could stand to document the purpose of pg_clog
> and pg_xlog more clearly. However, this information belongs in the
> standard documentation, not scattered throughout $PGDATA.

Then it needs to be stated very prominently.  But the place to put a
sign saying "Dangerous cliff edge" is beside the path that leads along
it.

--
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight, UK                             http://www.lfix.co.uk/oliver
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839  932A 614D 4C34 3E1D 0C1C
                 ========================================
     "Who is like unto thee, O LORD, among the gods? who is
      like thee, glorious in holiness, fearful in praises,
      doing wonders?"             Exodus 15:11


Re: initdb should create a warning message [was Re: [ADMIN] Size on Disk]

From
Tom Lane
Date:
Oliver Elphick <olly@lfix.co.uk> writes:
> On Sun, 2003-11-30 at 23:18, Neil Conway wrote:
>> I do agree that we could stand to document the purpose of pg_clog
>> and pg_xlog more clearly. However, this information belongs in the
>> standard documentation, not scattered throughout $PGDATA.

> Then it needs to be stated very prominently.  But the place to put a
> sign saying "Dangerous cliff edge" is beside the path that leads along
> it.

How about changing the names of those directories?

            regards, tom lane

Re: initdb should create a warning message [was Re: [ADMIN]

From
Bruce Momjian
Date:
Tom Lane wrote:
> Oliver Elphick <olly@lfix.co.uk> writes:
> > On Sun, 2003-11-30 at 23:18, Neil Conway wrote:
> >> I do agree that we could stand to document the purpose of pg_clog
> >> and pg_xlog more clearly. However, this information belongs in the
> >> standard documentation, not scattered throughout $PGDATA.
>
> > Then it needs to be stated very prominently.  But the place to put a
> > sign saying "Dangerous cliff edge" is beside the path that leads along
> > it.
>
> How about changing the names of those directories?

I thought about that, but what would we call them?  We could change xlog
to wal, I guess.  That might actually be clearer.  clog could become
xstatus or xactstatus or just xact.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: initdb should create a warning message [was Re: [ADMIN]

From
Mike Mascari
Date:
Bruce Momjian wrote:
> Tom Lane wrote:
>
>>Oliver Elphick <olly@lfix.co.uk> writes:
>>
>>>On Sun, 2003-11-30 at 23:18, Neil Conway wrote:
>>>
>>>>I do agree that we could stand to document the purpose of pg_clog
>>>>and pg_xlog more clearly. However, this information belongs in the
>>>>standard documentation, not scattered throughout $PGDATA.
>>
>>>Then it needs to be stated very prominently.  But the place to put a
>>>sign saying "Dangerous cliff edge" is beside the path that leads along
>>>it.
>>
>>How about changing the names of those directories?
>
>
> I thought about that, but what would we call them?  We could change xlog
> to wal, I guess.  That might actually be clearer.  clog could become
> xstatus or xactstatus or just xact.
>

active_xdata
active_cdata

Mike Mascari
mascarm@mascari.com



Re: initdb should create a warning message [was Re:

From
Greg Stark
Date:
Oliver Elphick <olly@lfix.co.uk> writes:

> Then it needs to be stated very prominently.  But the place to put a
> sign saying "Dangerous cliff edge" is beside the path that leads along
> it.

The only way to make this prominent would be a file with the *name* "THIS
DIRECTORY CONTAINS CRITICAL DATA". Not a "README" with that message inside.

-- 
greg



Re: initdb should create a warning message [was Re:

From
Andrew Dunstan
Date:
Greg Stark wrote:

> Oliver Elphick <olly@lfix.co.uk> writes:
>
>> Then it needs to be stated very prominently.  But the place to put a
>> sign saying "Dangerous cliff edge" is beside the path that leads along
>> it.
>
> The only way to make this prominent would be a file with the *name* "THIS
> DIRECTORY CONTAINS CRITICAL DATA". Not a "README" with that message inside.

Renaming the directories is the only suggestion I've seen that makes 
sense. The others remind me of the warning that is now placed on coffee 
cup lids at fast food places: "Caution, Contents May Be Hot".

cheers

andrew



Re: initdb should create a warning message [was Re:

From
Oliver Elphick
Date:
On Mon, 2003-12-01 at 16:39, Andrew Dunstan wrote:
> Renaming the directories is the only suggestion I've seen that makes 
> sense. The others remind me of the warning that is now placed on coffee 
> cup lids at fast food places: "Caution, Contents May Be Hot".

I agree that renaming the directories is the best solution.

-- 
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight, UK                             http://www.lfix.co.uk/oliver
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839  932A 614D 4C34 3E1D 0C1C
                 ========================================
     "Who is like unto thee, O LORD, among the gods? who is
      like thee, glorious in holiness, fearful in praises,
      doing wonders?"             Exodus 15:11



Re: initdb should create a warning message [was Re:

From
Tilo Schwarz
Date:
Greg Stark writes:
> Oliver Elphick <olly@lfix.co.uk> writes:
> > Then it needs to be stated very prominently.  But the place to put a
> > sign saying "Dangerous cliff edge" is beside the path that leads along
> > it.
>
> The only way to make this prominent would be a file with the *name* "THIS
> DIRECTORY CONTAINS CRITICAL DATA". Not a "README" with that message inside.

That's exactly what I did, after some "root" came along and moved my pgdata
away while postmaster was running. The data was not that important in that
case, but nevertheless I put a file with a name like

NEVER_MOVE_THIS_DIRECTORY_WHILE_POSTMASTER_PROCESS_IS_RUNNING.txt

in pgdata and wrote a few lines in that file explaining how to shut down
the postmaster properly.

But renaming pgdata to something like that would be even better and could be
done already (if I'm right).

Regards, Tilo
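
The marker-file approach described in this message, where the file *name*
carries the warning, could be sketched roughly as follows. This is only an
illustration under stated assumptions: create_marker_file() is a
hypothetical helper, it relies on POSIX open() rather than anything in
initdb, and the note text is whatever the administrator chooses to write.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a read-only marker file called `name` inside `dir`, containing
 * `note`.  Returns 0 on success or if the marker already exists, and -1
 * on any other error.
 */
int
create_marker_file(const char *dir, const char *name, const char *note)
{
    char        path[1024];
    size_t      len = strlen(note);
    int         n = snprintf(path, sizeof path, "%s/%s", dir, name);
    int         fd;
    int         rc = 0;

    if (n < 0 || (size_t) n >= sizeof path)
        return -1;              /* path would not fit in the buffer */

    /* O_EXCL: never overwrite a marker that is already in place. */
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0444);
    if (fd < 0)
        return (errno == EEXIST) ? 0 : -1;

    /* The note is short, so a partial write is treated as a failure. */
    if (write(fd, note, len) != (ssize_t) len)
        rc = -1;
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

It would be called with the data directory, a file name such as the one
quoted above, and a few lines describing how to stop the server cleanly.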