I don't want to back up index files - Mailing list pgsql-general

From Glen Parker
Subject I don't want to back up index files
Date
Msg-id 49B719D6.70209@nwlink.com
Responses Re: I don't want to back up index files  ("Joshua D. Drake" <jd@commandprompt.com>)
List pgsql-general
I am wondering about the feasibility of having PG continue to work even if
non-essential indexes are gone or corrupt.  I brought this basic concept
up at some point in the past, but now I have a different motivation, so
I want to strike up discussion about it again.  This time around, I
simply don't want to back up indexes if I don't have to.  Because
indexes contain essentially redundant data, losing one does not equate
to losing real data.  Therefore, backing them up represents a lot of
overhead for very little benefit.

Here's the basic idea:

1) New field to pg_index (indvalid boolean).
2) Query planner skips indexes where indvalid = false.
3) Executor does not update indexes where indvalid = false.
4) Executor refuses insert or update to unique columns where indvalid =
false, throwing an error.
5) WAL roll forward marks indvalid = false if index file(s) are missing,
rather than panicking.
6) REINDEX accepts new syntax to rebuild only indexes with indvalid =
false, then marks them indvalid = true.
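
The six items above can be modeled as a toy sketch in Python.  This is
not PostgreSQL code and makes no claim about how the server would
implement it; the names Index, Table, mark_missing and reindex_invalid
are hypothetical, and an in-memory set stands in for the index file.

```python
from dataclasses import dataclass, field


@dataclass
class Index:
    """Toy stand-in for a pg_index row plus the index contents."""
    name: str
    column: str
    unique: bool = False
    indvalid: bool = True            # item 1: the proposed new flag
    entries: set = field(default_factory=set)


class Table:
    """Hypothetical in-memory table used only to illustrate the proposal."""

    def __init__(self, indexes):
        self.rows = []
        self.indexes = list(indexes)

    def usable_indexes(self):
        # Item 2: the planner only considers valid indexes.
        return [ix for ix in self.indexes if ix.indvalid]

    def insert(self, row):
        # Item 4: uniqueness cannot be verified without a valid unique
        # index, so refuse the write before changing anything.
        for ix in self.indexes:
            if ix.unique and not ix.indvalid:
                raise RuntimeError(f"unique index {ix.name} is invalid")
        for ix in self.usable_indexes():
            if ix.unique and row[ix.column] in ix.entries:
                raise ValueError(f"duplicate key in {ix.name}")
        self.rows.append(row)
        # Item 3: only valid indexes are maintained.
        for ix in self.usable_indexes():
            ix.entries.add(row[ix.column])

    def mark_missing(self, name):
        # Item 5: recovery flags a missing index instead of panicking.
        for ix in self.indexes:
            if ix.name == name:
                ix.indvalid = False
                ix.entries.clear()

    def reindex_invalid(self):
        # Item 6: rebuild only the invalid indexes from the table data.
        rebuilt = []
        for ix in self.indexes:
            if not ix.indvalid:
                ix.entries = {row[ix.column] for row in self.rows}
                ix.indvalid = True
                rebuilt.append(ix.name)
        return rebuilt
```

In this model, losing a non-unique index leaves inserts working (the
index is simply not maintained), losing a unique index blocks inserts
until reindex_invalid() has run, and the table rows are never affected.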

Close to 25% of the on-disk bulk of my database is index files.  It
would save a significant amount of the system resources used during
backup if I didn't have to archive the index files.  In the unlikely
event that a restore/roll forward becomes necessary, I could simply
issue something like "REINDEX DATABASE foo INVALID;" to restore all the
missing indexes and return the database to full function.  Prior to a
reindex, the database would perform poorly and refuse to do certain
inserts and updates, but the data would be available.  Backup files
would be smaller, and the restore/roll forward would be faster.

No downsides jump out at me, and it seems to me that for a regular PG
code hacker this could actually be fairly simple to implement.

Any chance of something like this being done in the future?


-Glen


