Thread: Testing needed for recent tablespace hacking

Testing needed for recent tablespace hacking

From: Tom Lane
Would someone check that I didn't break the Windows port with this
recent commit:

  Log Message:
  -----------
  Add WAL logging for CREATE/DROP DATABASE and CREATE/DROP TABLESPACE.
  Fix TablespaceCreateDbspace() to be able to create a dummy directory
  in place of a dropped tablespace's symlink.  This eliminates the open
  problem of a PANIC during WAL replay when a replayed action attempts
  to touch a file in a since-deleted tablespace.  It also makes for a
  significant improvement in the usability of PITR replay.

I had to do some fooling around with platform-specific code, so it's
possible that things are broken.  Please test that the above-mentioned
four commands still work.  Also see if they work when WAL-replayed.
(The easy way to check this is to do one and then "kill -9" the backend
as soon as it completes.)  One test case for the former PANIC bug is to
run the regression tests using "make installcheck", then immediately
start another backend and kill -9 it (you must do this before a
checkpoint occurs in order to exercise the bug case).

Some code I'm particularly worried about is in DROP TABLESPACE:

    /*
     * Okay, try to remove the symlink.  We must however deal with the
     * possibility that it's a directory instead of a symlink --- this
     * could happen during WAL replay (see TablespaceCreateDbspace),
     * and it is also the normal case on Windows.
     */
    if (lstat(location, &st) == 0 && S_ISDIR(st.st_mode))
    {
        if (rmdir(location) < 0)
            ereport(ERROR,
                    (errcode_for_file_access(),
                     errmsg("could not remove directory \"%s\": %m",
                            location)));
    }
    else
    {
        if (unlink(location) < 0)
            ereport(ERROR,
                    (errcode_for_file_access(),
                     errmsg("could not unlink symbolic link \"%s\": %m",
                            location)));
    }

Is there any reason lstat() wouldn't work on Windows?
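
For anyone who wants to exercise that branch outside the backend, here is a minimal standalone sketch of the same decision, assuming a POSIX system. The name remove_link_or_dir is hypothetical (it is not a PostgreSQL function), and a return code plus a message on stderr stands in for ereport(ERROR, ...).

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * remove_link_or_dir: hypothetical helper mirroring the DROP TABLESPACE
 * logic quoted above.  Returns 0 on success, -1 on failure.
 */
static int
remove_link_or_dir(const char *location)
{
    struct stat st;

    assert(location != NULL);

    /* lstat() inspects the link itself instead of following it */
    if (lstat(location, &st) == 0 && S_ISDIR(st.st_mode))
    {
        /* A real directory: rmdir() succeeds only if it is empty */
        if (rmdir(location) < 0)
        {
            fprintf(stderr, "could not remove directory \"%s\": %s\n",
                    location, strerror(errno));
            return -1;
        }
    }
    else
    {
        /* A symlink (or anything else, including a missing path) */
        if (unlink(location) < 0)
        {
            fprintf(stderr, "could not unlink symbolic link \"%s\": %s\n",
                    location, strerror(errno));
            return -1;
        }
    }
    return 0;
}
```

Calling it on an empty directory takes the rmdir() path; calling it on a symlink removes only the link and leaves the target alone. If lstat() itself fails, the code falls through to unlink(), which then reports the error.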

            regards, tom lane

Re: Testing needed for recent tablespace

From: markir@coretech.co.nz
Hmmm... not entirely sure if this is related - but I get a compile failure on
win2000 pro :

gcc -O2 -fno-strict-aliasing -Wall -Wmissing-prototypes -Wmissing-declarations
-L../../src/port -L/usr/local/lib  -o postgres.exe
-Wl,--base-file,postgres.base postgres.exp access/SUBSYS.o bootstrap/SUBSYS.o
catalog/SUBSYS.o parser/SUBSYS.o commands/SUBSYS.o executor/SUBSYS.o
lib/SUBSYS.o libpq/SUBSYS.o main/SUBSYS.o nodes/SUBSYS.o optimizer/SUBSYS.o
port/SUBSYS.o postmaster/SUBSYS.o regex/SUBSYS.o rewrite/SUBSYS.o
storage/SUBSYS.o tcop/SUBSYS.o utils/SUBSYS.o ../../src/timezone/SUBSYS.o
-lpgport -lz -lwsock32 -lm  -lws2_32
commands/SUBSYS.o(.text+0x2a827):tablespace.c: undefined reference to `lstat'
make[2]: *** [postgres] Error 1
make[2]: Leaving directory
`/home/Administrator/develop/c/postgresql-8.0.0beta1/src/backend'
make[1]: *** [all] Error 2
make[1]: Leaving directory
`/home/Administrator/develop/c/postgresql-8.0.0beta1/src'
make: *** [all] Error 2

regards

Mark

Quoting Tom Lane <tgl@sss.pgh.pa.us>:

> Would someone check that I didn't break the Windows port with this
> recent commit:
>



Re: Testing needed for recent tablespace hacking

From: "Magnus Hagander"
>Would someone check that I didn't break the Windows port with this
>recent commit:


Took a while, sorry. Been away or busy for a couple of days.



>I had to do some fooling around with platform-specific code, so it's
>possible that things are broken.  Please test that the above-mentioned
>four commands still work.  Also see if they work when WAL-replayed.
>(The easy way to check this is to do one and then "kill -9" the backend
>as soon as it completes.)  One test case for the former PANIC bug is to
>run the regression tests using "make installcheck", then immediately
>start another backend and kill -9 it (you must do this before a
>checkpoint occurs in order to exercise the bug case).

I've tested this:
psql to a backend in a test db, do create tablespace
kill -9 (using taskman, say about 5 seconds later)
psql to a backend - tablespace is there. do drop tablespace
kill -9 (using taskman, say about 5 seconds later)
psql to a backend - tablespace is gone


create database also appears to work from what I can tell.


Is that enough testing, or are more steps needed?


>Is there any reason lstat() wouldn't work on Windows?

It's #defined to stat(), which should work. It won't know anything about
our symlinks, but that shouldn't matter.


//Magnus

Re: Testing needed for recent tablespace hacking

From: Tom Lane
"Magnus Hagander" <mha@sollentuna.net> writes:
> Is that enough testing, or are more steps needed?

Sounds all right, but you might want to try the original failure case
too: run the regression tests (make installcheck) then start and kill
a backend.

            regards, tom lane