On Thu, 2004-07-08 at 16:19, Alvaro Herrera wrote:
> On Thu, Jul 08, 2004 at 02:58:01PM +0100, Simon Riggs wrote:
>
> > I've discovered that CREATE DATABASE doesn't redo correctly in an
> > archive recovery test.
> >
> > This isn't a bug --in the current code--, because when crash recovery
> > occurs, the database directories are already there, so this only doesn't
> > work when using the PITR patches. During archive recovery, nothing is
> > there, so needs to be created.
> >
> > It looks like CREATE DATABASE doesn't produce redo, nor is there a
> > replay command created for it.
>
> [...]
>
> > The FileNameOpenFile fails when the first relation in the database is
> > created. The code assumes that any failure of the FileNameOpenFile is
> > because the file is already there, then tries to open it which also
> > fails. The failure is caused by the fact that there is no directory (as
> > well as no file), but that isn't tested for.
>
> I don't think it's a good idea to just create a directory if it's not
> already there. It would mean creating a spurious directory with an
> empty file if the data is corrupted and a wrong RelFileNode is in memory
> for whatever reason.
>
> The correct solution would be to emit a XLog record for CREATE
> DATABASE ...
I'd prefer a formal approach, which is why I raised this.
Interesting what you say about the other stuff - a simple create
dir worked without any problems - I wonder if a whole bunch of stuff is
missing there?
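To make the distinction concrete, here is a rough sketch of an open path
that separates "file already exists" from "directory is missing". This is
not PostgreSQL code: open_for_redo and its in_archive_recovery flag are
names made up for the illustration, and it uses plain POSIX calls rather
than the fd.c routines. It is only the informal create-dir workaround, not
a substitute for a proper XLog record for CREATE DATABASE.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): open a relation file during
 * redo, creating it if needed. The parent directory is created only when
 * the caller says we are in archive recovery.
 */
static int
open_for_redo(const char *dir, const char *path, int in_archive_recovery)
{
    int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);

    if (fd >= 0)
        return fd;

    /* The file really was already there: just open it. */
    if (errno == EEXIST)
        return open(path, O_RDWR);

    /* ENOENT with O_CREAT means a directory component is missing. */
    if (errno == ENOENT && in_archive_recovery)
    {
        if (mkdir(dir, 0700) != 0 && errno != EEXIST)
            return -1;
        return open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    }

    /* Anything else is a genuine failure and should be reported. */
    return -1;
}
```

Outside archive recovery a missing directory is still reported as an
error, which keeps the corrupted-RelFileNode case from silently creating
a spurious directory.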
It does seem likely that other rm_redo functions for other resource
managers contain similar bugs when used with Archive Recovery.
Hopefully, these will emerge during beta...
...it's too late in the day for me to fix some of these - I need to
complete the PITR functionality of the archive recovery.
Best Regards, Simon Riggs