Thread: two questions related to tablespace in PG8.0.1

two questions related to tablespace in PG8.0.1

From
"Qingqing Zhou"
Date:
Here are two questions related to PG8.0.1:

1. Durability of "create tablespace": what happens if several checkpoints
are done after "create tablespace" and then the system crashes? Without redo,
will the PG_VERSION file and symlinks survive on win32? It seems the checkpoint
does not sync the content of the PG_VERSION file.

2. Possible race on "set_short_version(location)" while creating a
tablespace: what if two processes reach this point at the same time? Then the
directory emptiness check will not fail and both will create their own
PG_VERSION file ...

Thanks,
Qingqing






Re: two questions related to tablespace in PG8.0.1

From
Tom Lane
Date:
"Qingqing Zhou" <zhouqq@cs.toronto.edu> writes:
> Here are two questions related to PG8.0.1:
> 1. Durability of "create tablespace": what happens if several checkpoints
> are done after "create tablespace" and then the system crashes? Without redo,
> will the PG_VERSION file and symlinks survive on win32? It seems the checkpoint
> does not sync the content of the PG_VERSION file.

There is no such thing as crash without redo: that is what WAL is all
about.  The creation of the tablespace will be correctly replayed from
WAL.  (Of course, this claim depends on various assumptions about
whether fsync behaves per spec ... but if it does not, tablespace
creation is hardly the only thing that will fail.)

> 2. Possible race on "set_short_version(location)" while creating a
> tablespace: what if two processes reach this point at the same time?

There is no "race" --- the point of that code is to ensure that if
two users concurrently try to create two tablespaces pointing at the
same directory, only one will succeed.  In any case, since tablespace
creation requires superuser permissions, there is no issue about
whether the user might be malicious ... an attacker who has gained
database superuser can already break things in arbitrary ways.
        regards, tom lane
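
The reply above says that only one of two concurrent creators pointing at the same directory will succeed, but the thread does not show how that is enforced. One common way to get that guarantee atomically on POSIX systems is exclusive file creation with O_EXCL. The sketch below illustrates that technique only; claim_directory() is a hypothetical helper, and nothing here is a claim about what the 8.0.1 tablespace code actually does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): try to claim `dir` by
 * creating a PG_VERSION marker file in it.
 *
 * Returns 0 if this caller created the marker,
 *         1 if the marker already existed (someone else claimed it),
 *        -1 on any other error.
 */
int claim_directory(const char *dir)
{
    char path[1024];
    int  n;
    int  fd;

    assert(dir != NULL);

    n = snprintf(path, sizeof(path), "%s/PG_VERSION", dir);
    if (n < 0 || n >= (int) sizeof(path))
        return -1;

    /*
     * O_CREAT | O_EXCL makes "check that it does not exist" and "create it"
     * a single atomic step, so two concurrent callers cannot both succeed.
     */
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return (errno == EEXIST) ? 1 : -1;

    return (close(fd) == 0) ? 0 : -1;
}
```

With this pattern, the second caller sees EEXIST instead of silently creating its own file, which closes the window between a separate emptiness check and the file creation.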


Re: two questions related to tablespace in PG8.0.1

From
"Qingqing Zhou"
Date:
> There is no such thing as crash without redo: that is what WAL is all
> about.  The creation of the tablespace will be correctly replayed from
> WAL.  (Of course, this claim depends on various assumptions about
> whether fsync behaves per spec ... but if it does not, tablespace
> creation is hardly the only thing that will fail.)

Yes, if replayed, the creation will be OK. But in the case I mentioned, the WAL
will not be replayed. The point is that the current mdsync() implementation does
not take care of stdio streams, so the files opened by AllocateFile() will not
get flushed. Most files like "pg_fsm.cache" are OK, since we don't expect them
to survive a crash. But is PG_VERSION, written during tablespace creation, OK?
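
To make the concern concrete: data written through a stdio stream sits in a user-space buffer until it is flushed, and then in the kernel cache until it is synced. A minimal POSIX sketch of writing a small file durably follows. write_version_file_durably() is a hypothetical helper, not PostgreSQL code, and win32 would need different calls for the sync steps.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): write `version` plus a newline
 * to <dir>/PG_VERSION and force it to stable storage.
 * Returns 0 on success, -1 on failure.
 */
int write_version_file_durably(const char *dir, const char *version)
{
    char  path[1024];
    FILE *fp;
    int   n;
    int   dirfd;

    assert(dir != NULL && version != NULL);

    n = snprintf(path, sizeof(path), "%s/PG_VERSION", dir);
    if (n < 0 || n >= (int) sizeof(path))
        return -1;

    fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    /* fprintf only fills the stdio buffer in user space. */
    if (fprintf(fp, "%s\n", version) < 0 ||
        fflush(fp) != 0 ||          /* push the buffer to the kernel */
        fsync(fileno(fp)) != 0)     /* ask the kernel to write it to disk */
    {
        fclose(fp);
        return -1;
    }
    if (fclose(fp) != 0)
        return -1;

    /* Sync the directory as well so the new directory entry is durable. */
    dirfd = open(dir, O_RDONLY);
    if (dirfd < 0)
        return -1;
    if (fsync(dirfd) != 0)
    {
        close(dirfd);
        return -1;
    }
    return close(dirfd);
}
```

Without the fflush() and fsync() steps, a crash shortly after the write could leave an empty or missing file unless something else (such as WAL replay) recreates it.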

> > 2. Possible race on "set_short_version(location)" while creating a
> > tablespace: what if two processes reach this point at the same time?
>
> There is no "race" --- the point of that code is to ensure that if
> two users concurrently try to create two tablespaces pointing at the
> same directory, only one will succeed.  In any case, since tablespace
> creation requires superuser permissions, there is no issue about
> whether the user might be malicious ... an attacker who has gained
> database superuser can already break things in arbitrary ways.
>

Understood.
