Thread: two questions related to tablespace in PG8.0.1
Here are two questions related to PG8.0.1:

1. Durability of "create tablespace" - what happens if several checkpoints are done after "create tablespace" and then the system crashes? Without redo, will the PG_VERSION file and symlinks survive on win32? It seems checkpoint doesn't sync the content of the PG_VERSION file.

2. Possible race on "set_short_version(location)" while creating a tablespace - what if two processes reach this point at the same time? Then the directory emptiness check will not fail, and both will create their own PG_VERSION file ...

Thanks,
Qingqing
"Qingqing Zhou" <zhouqq@cs.toronto.edu> writes: > Here are two questions related to PG8.0.1: > 1. durability of "create tablespace" - what happens if several checkpoints > done after "create tablespace" then system crashes - without redo, will the > PG_VERSION file and symlinks survive in win32? Seems checkpoint didn't sync > the content of PG_VERSION file. There is no such thing as crash without redo: that is what WAL is all about. The creation of the tablespace will be correctly replayed from WAL. (Of course, this claim depends on various assumptions about whether fsync behaves per spec ... but if it does not, tablespace creation is hardly the only thing that will fail.) > 2. possible race on "set_short_version(location)" while creating > tablespace - what if two processes reach this point at the same time? There is no "race" --- the point of that code is to ensure that if two users concurrently try to create two tablespaces pointing at the same directory, only one will succeed. In any case, since tablespace creation requires superuser permissions, there is no issue about whether the user might be malicious ... an attacker who has gained database superuser can already break things in arbitrary ways. regards, tom lane
> There is no such thing as crash without redo: that is what WAL is all
> about. The creation of the tablespace will be correctly replayed from
> WAL. (Of course, this claim depends on various assumptions about
> whether fsync behaves per spec ... but if it does not, tablespace
> creation is hardly the only thing that will fail.)

Yes, if replayed, the creation will be OK. But the case I mentioned will not replay the WAL. The point is that the current mdsync() implementation does not take care of streams, so the files opened by AllocateFile() will not get flushed. Most files, like "pg_fsm.cache", are fine, since we don't expect them to survive a crash. But is PG_VERSION, written during tablespace creation, OK?

> > 2. possible race on "set_short_version(location)" while creating
> > tablespace - what if two processes reach this point at the same time?
>
> There is no "race" --- the point of that code is to ensure that if
> two users concurrently try to create two tablespaces pointing at the
> same directory, only one will succeed. In any case, since tablespace
> creation requires superuser permissions, there is no issue about
> whether the user might be malicious ... an attacker who has gained
> database superuser can already break things in arbitrary ways.

Understood.
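
To make the concern about stdio streams concrete: data written through a FILE * sits first in a user-space buffer and then in the kernel's cache, and nothing forces it to disk unless the writer asks. The following is a minimal sketch assuming a POSIX system. The function write_version_file_durably is hypothetical, not the PostgreSQL implementation, and only illustrates what "flushing the stream" would involve.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): write a short version string
 * to `path` through a stdio stream and force it to stable storage.
 * Returns 0 on success, -1 on failure.
 */
int write_version_file_durably(const char *path, const char *version)
{
    FILE *fp = fopen(path, "w");

    if (fp == NULL)
        return -1;

    if (fprintf(fp, "%s\n", version) < 0)
    {
        fclose(fp);
        return -1;
    }

    /* Step 1: push stdio's user-space buffer down into the kernel */
    if (fflush(fp) != 0)
    {
        fclose(fp);
        return -1;
    }

    /* Step 2: ask the kernel to write its cached copy to the device */
    if (fsync(fileno(fp)) != 0)
    {
        fclose(fp);
        return -1;
    }

    return (fclose(fp) == 0) ? 0 : -1;
}
```

A call such as write_version_file_durably("PG_VERSION", "8.0") would return only after both steps succeed. On some filesystems the new directory entry may additionally need an fsync on the containing directory, and win32 would need a platform-specific equivalent of fsync.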