two questions related to tablespace in PG8.0.1
Here are two questions related to PG8.0.1:
1. Durability of "create tablespace": what happens if several checkpoints
are done after "create tablespace" and then the system crashes? Without redo,
will the PG_VERSION file and the symlinks survive on win32? It seems that
checkpoint doesn't sync the contents of the PG_VERSION file.
2. Possible race on "set_short_version(location)" while creating a
tablespace: what if two processes reach this point at the same time? Then
the directory emptiness check will not fail for either of them, and both
will create their own PG_VERSION file ...
Thanks,
Qingqing
"Qingqing Zhou" <zhouqq@cs.toronto.edu> writes:
Here are two questions related to PG8.0.1:
1. Durability of "create tablespace": what happens if several checkpoints
are done after "create tablespace" and then the system crashes? Without redo,
will the PG_VERSION file and the symlinks survive on win32? It seems that
checkpoint doesn't sync the contents of the PG_VERSION file.
There is no such thing as crash without redo: that is what WAL is all
about. The creation of the tablespace will be correctly replayed from
WAL. (Of course, this claim depends on various assumptions about
whether fsync behaves per spec ... but if it does not, tablespace
creation is hardly the only thing that will fail.)
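
The exchange in this thread turns on where replay begins after a crash. Recovery starts from the redo location recorded by the most recent checkpoint, so a record logged before that location is not replayed and its effects must already be on disk. The toy model below illustrates only that rule; `ToyWalRecord` and `count_replayed` are made-up names for this sketch and are not PostgreSQL data structures or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a WAL record: a log position and a label. */
typedef struct
{
    unsigned long lsn;   /* position of the record in the log */
    const char   *desc;  /* human-readable label, unused by the logic */
} ToyWalRecord;

/*
 * Count how many records recovery would replay if it starts at redo_lsn.
 * Records positioned before redo_lsn are skipped: recovery assumes the
 * checkpoint already made their effects durable.
 */
size_t
count_replayed(const ToyWalRecord *log, size_t n, unsigned long redo_lsn)
{
    size_t replayed = 0;

    assert(log != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        if (log[i].lsn >= redo_lsn)
            replayed++;
    }
    return replayed;
}
```

With a "create tablespace" record at position 100 and a checkpoint whose redo location is 200, the tablespace record falls outside the replayed range in this model.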
2. possible race on "set_short_version(location)" while creating
tablespace - what if two processes reach this point at the same time?
There is no "race" --- the point of that code is to ensure that if
two users concurrently try to create two tablespaces pointing at the
same directory, only one will succeed. In any case, since tablespace
creation requires superuser permissions, there is no issue about
whether the user might be malicious ... an attacker who has gained
database superuser can already break things in arbitrary ways.
regards, tom lane
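
For readers following the race question: a generic POSIX way to guarantee that only one of two concurrent creators succeeds is to create the marker file with O_CREAT | O_EXCL, which makes the existence check and the creation a single atomic step. The sketch below is an illustration of that technique only. It is not a description of what the 8.0.1 tablespace code does, and `claim_directory` is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: claim a directory by exclusively creating a marker
 * file inside it (marker_path is the full path of that file).
 *
 * Returns 0 if this caller created the marker,
 *         1 if the marker already existed (someone else won),
 *        -1 on any other error.
 */
int
claim_directory(const char *marker_path, const char *version)
{
    int     fd;
    size_t  len;

    assert(marker_path != NULL && version != NULL);

    /* O_EXCL makes open() fail with EEXIST if the file is already there. */
    fd = open(marker_path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return (errno == EEXIST) ? 1 : -1;

    len = strlen(version);
    if (write(fd, version, len) != (ssize_t) len || fsync(fd) != 0)
    {
        close(fd);
        return -1;
    }
    return (close(fd) == 0) ? 0 : -1;
}
```

Calling `claim_directory()` twice on the same path returns 0 the first time and 1 the second, regardless of which process gets there first.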
There is no such thing as crash without redo: that is what WAL is all
about. The creation of the tablespace will be correctly replayed from
WAL. (Of course, this claim depends on various assumptions about
whether fsync behaves per spec ... but if it does not, tablespace
creation is hardly the only thing that will fail.)
Yes, if replayed, the creation will be OK. But in the case I mentioned, the
WAL record will not be replayed, because the checkpoints taken after it move
the redo starting point past it. The point is that the current mdsync()
implementation does not handle stdio streams, so files opened by
AllocateFile() will not get flushed. Most such files, like "pg_fsm.cache",
are OK, since we don't expect them to survive a crash. But is the PG_VERSION
file written during tablespace creation OK?
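
To make the concern concrete: data written through a stdio stream sits first in a user-space buffer and then in the kernel cache, so making it durable takes both fflush() and fsync() on the underlying descriptor. Below is a minimal POSIX sketch of that sequence. `write_version_file_durably` is a hypothetical helper and not PostgreSQL code, and the sketch does not cover win32, where the descriptor-level flush is spelled differently.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: write a one-line version file through a stdio
 * stream and force it to stable storage before returning.
 * Returns 0 on success, -1 on any failure.
 */
int
write_version_file_durably(const char *path, const char *version)
{
    FILE   *fp;
    int     ok;

    assert(path != NULL && version != NULL);

    fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    ok = (fprintf(fp, "%s\n", version) >= 0);

    /* Step 1: push the stdio buffer down to the kernel. */
    if (ok && fflush(fp) != 0)
        ok = 0;

    /* Step 2: ask the kernel to write the file contents to disk. */
    if (ok && fsync(fileno(fp)) != 0)
        ok = 0;

    /* Always close the stream; a failed close also counts as failure. */
    if (fclose(fp) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

Skipping step 2 is the situation described above: fclose() alone hands the data to the kernel but does not guarantee it has reached disk by the time of a crash.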
2. possible race on "set_short_version(location)" while creating
tablespace - what if two processes reach this point at the same time?
There is no "race" --- the point of that code is to ensure that if
two users concurrently try to create two tablespaces pointing at the
same directory, only one will succeed. In any case, since tablespace
creation requires superuser permissions, there is no issue about
whether the user might be malicious ... an attacker who has gained
database superuser can already break things in arbitrary ways.
Understood.
regards, tom lane