Thursday, August 12, 2010

Atomicity and files

We all greatly value atomic operations. Either they fully succeed or they completely fail. Even better, if they fail, it is like they never occurred, so that it is not necessary to clean things up.

The term itself is widely used in the database community (the A of ACID stands for Atomicity) and in the parallel computing community. The meaning is essentially very similar, though the scope is different. Moreover, the idea is pervasive in computer science.

Writing exception safe code in C++[0] is essentially a strive for atomicity. C++ code is strongly exception safe if it has rollback semantics. That is to say failed operations have no side effects and consequently leave all data as it never happened. More details here [1]. It is usually considered extremely expensive to write all the code to be strongly exception safe.

In languages such as Clojure this comes for free. Memory is transactional[2], that is to say, it support the ACI (without D) properties of a relational database. Of course durability is not an issue: we are dealing with memory, after all. This is not extremely expensive as, by default, variables in Clojure are not modifiable. Essentially they are but names of const object. If a variable needs to change its value, vars, refs or atoms (but they are an entirely different story) must be used. And they are governed by a software transactional memory system [3].

Another famous source of atomic operations, is the POSIX standard. Some system calls are guaranteed to be atomic. In the following, we are mostly concerned with open (2)[4] and rename (2)[5].

In the open(2) system call, the check for the existence of the file and the creation of the file if it does not exist shall be atomic with respect to other threads executing open() naming the same filename in the same directory with O_EXCL and O_CREAT set.
For this reason, open can be used to implement concurrency structures such as locks; the link (2)[6] system call is often used as well. Moreover, the open function shall be used in Easier to Ask Forgiveness (EAFP) strategies [7], which are fare more secure and effective because open is atomic.

Unfortunately for us, the write(2) [8] system call is only partially atomic. That is to say if the requested write is “small enough” (that is to say the number of bytes are < PIPE_BUF), then the single write action is atomic. If this is not the case, the write is not atomic. Moreover, multiple write actions are, of course, not atomic.

To make things even worse, the whole open-write-close thing is not atomic at all. And is destructive. The typical problem is that when you open a file with O_TRUNC (or with something which in turns calls an open with O_TRUNC set) you destroy the content of the file. However, you may have successive errors which prevents the correct and complete writing of the new file. The typical solution is using temporaries.

References

[0] “Strive for exception safe code”, Effective C++, 3rd ed, S. Meyers, Addison-Wesley, pp. 127-134
[1] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/1997/N1077.asc
[2] http://clojure.org/refs
[3] http://en.wikipedia.org/wiki/Software_transactional_memory
[4] http://www.opengroup.org/onlinepubs/009695399/functions/open.html
[5] http://www.opengroup.org/onlinepubs/009695399/functions/rename.html
[6] http://www.opengroup.org/onlinepubs/009695399/functions/link.html
[7] Python in a Nutshell, 2nd ed., A. Martelli, O’Reilly Media, pp. 134-136
[8] http://www.opengroup.org/onlinepubs/009695399/functions/write.html

No comments: