FAQ for Frangipani, Thekkath, Mann, Lee, SOSP 1997
Q: Why are we reading this paper?
A: Primarily as an illustration of cache coherence.
But there are other interesting aspects. It's impressive that recovery is possible if a Frangipani workstation crashes while in the middle of updating the shared file system; we usually think of a thread crashing while holding mutexes as being not recoverable. The idea of each Frangipani workstation having its own log, stored in a public place so that anyone can recover from it, is clever. Further, the logs are intertwined in an unusual way: the updates to a given object may be spread over many logs. This makes replaying a single log tricky (hence Frangipani's version numbers). Building a system out of simple shared storage (Petal) and smart but decentralized participants is interesting.
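To make the version-number point concrete, here is a minimal sketch of replaying one workstation's log. The record layout, the block names, and the function replay_log are illustrative assumptions, not Frangipani's actual format. The only idea taken from the paper is that a logged update is applied only when its version number is newer than the version already on the block in Petal.

```python
def replay_log(log, blocks):
    """Replay a crashed workstation's log against blocks stored in Petal.

    Hypothetical layout: each block is {"version": int, "data": ...} and
    each log record is {"block": key, "version": int, "data": ...}.
    Returns the number of records that were actually applied.
    """
    applied = 0
    for record in log:
        block = blocks[record["block"]]
        # Skip updates that some workstation has already superseded.
        if record["version"] > block["version"]:
            block["data"] = record["data"]
            block["version"] = record["version"]
            applied += 1
    return applied


# ws1 logged two updates and crashed. In the meantime another workstation
# got the directory lock and wrote a newer directory block (version 2).
blocks = {
    "dir7": {"version": 2, "data": ["a", "b"]},
    "inode9": {"version": 0, "data": None},
}
ws1_log = [
    {"block": "dir7", "version": 1, "data": ["a"]},
    {"block": "inode9", "version": 1, "data": {"name": "a"}},
]
applied = replay_log(ws1_log, blocks)
```

In this example only the inode9 record is applied; the stale dir7 record is skipped, so the other workstation's newer directory contents survive the replay.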
Q: What are the clients and servers in this system?
A: Petal servers store block data. Each user's workstation runs an instance of Frangipani. Frangipani acts as a client of Petal, but also acts as a file server for applications running on the same workstation. So it's not clear whether one should use "client" or "server" to refer to a Frangipani instance.
Q: Section 5 says that a Frangipani workstation must invalidate cached data when it releases a read lock, and write modified cached data back to Petal when it releases a write lock. Doesn't that severely limit how much Frangipani benefits from caching?
A: After a Frangipani workstation acquires a lock, it will hold onto the lock until some other workstation tries to acquire it. That is, if a user creates a file in a directory (thus acquiring a write lock for that directory, as well as caching the directory's information), and then pauses a bit, and then creates a second file in the same directory, the user's workstation will likely not have to acquire the directory lock twice. And as a consequence, the workstation will read the directory information from Petal just once, and modify that information only in its local cached copy. If some other workstation needs to use that directory, then that other workstation will send an acquire message to the lock server, which will send a revoke message to the first workstation, which will write the directory's modified information from its cache to Petal (and then delete the data from its cache), and then tell the lock service it is giving up the lock. So, as long as there isn't much read/write sharing, Frangipani does a pretty good job of caching locks and data.
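The sequence described above can be sketched as a small simulation. This is a simplified model under stated assumptions: it has only exclusive locks (no read/write distinction), no logging, and no networking, and the class names Petal, LockServer, and Workstation and all of their methods are hypothetical. The counters on Petal show how often the shared storage is actually touched.

```python
class Petal:
    """Stand-in for shared storage: a dict plus read/write counters."""
    def __init__(self):
        self.blocks = {}
        self.reads = 0
        self.writes = 0

    def read(self, key):
        self.reads += 1
        return list(self.blocks.get(key, []))

    def write(self, key, value):
        self.writes += 1
        self.blocks[key] = list(value)


class LockServer:
    """One holder per lock; on conflict, revoke from the current holder."""
    def __init__(self):
        self.holder = {}

    def acquire(self, key, ws):
        current = self.holder.get(key)
        if current is not None and current is not ws:
            current.revoke(key)
        self.holder[key] = ws

    def release(self, key, ws):
        if self.holder.get(key) is ws:
            del self.holder[key]


class Workstation:
    def __init__(self, petal, locks):
        self.petal = petal
        self.locks = locks
        self.cache = {}     # cached data, valid only while the lock is held
        self.dirty = set()  # keys modified in the cache but not yet in Petal

    def _ensure_locked(self, key):
        # A cached entry implies the lock is still held: no messages needed.
        if key not in self.cache:
            self.locks.acquire(key, self)
            self.cache[key] = self.petal.read(key)

    def create_file(self, directory, name):
        self._ensure_locked(directory)
        self.cache[directory].append(name)
        self.dirty.add(directory)

    def list_dir(self, directory):
        self._ensure_locked(directory)
        return list(self.cache[directory])

    def revoke(self, key):
        # Write back modified data, drop the cached copy, give up the lock.
        if key in self.dirty:
            self.petal.write(key, self.cache[key])
            self.dirty.discard(key)
        del self.cache[key]
        self.locks.release(key, self)
```

With two workstations sharing one Petal and one LockServer, two create_file calls on the same directory from the first workstation cost one Petal read and no Petal writes. The write-back happens only when the second workstation asks for the directory and the lock is revoked.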
Q: Why do the Petal servers in the Frangipani system have a block interface? Why not have file servers (like AFS), that know about things like directories and files?
A: One reason is that the authors developed Petal first. Petal already solved many fault tolerance and scaling problems, so using it simplified some aspects of the Frangipani design. And this arrangement moves work from centralized servers to workstations, which helps Frangipani performance scale well as more workstations are added.
However, the Petal/Frangipani split makes enforcing invariants on file-system-level structures harder, since no one entity is in charge. Frangipani builds its own transaction system (using the lock service and Frangipani's logs) in order to be able to make complex atomic updates to the file system stored in Petal.
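As a rough illustration of how a log makes a multi-block update atomic, here is a minimal write-ahead sketch. It assumes the relevant locks are already held and leaves out version numbers and the lock service entirely. Petal is modeled as a plain dict, and create_file, apply_record, recover, and the crash_after_log flag are hypothetical names invented for this example.

```python
class CrashBeforeApply(Exception):
    """Simulates a workstation crash after logging but before updating."""


def apply_record(petal, record):
    # Install every block write described by one log record.
    for block, value in record["writes"].items():
        petal[block] = value


def create_file(petal, log, dir_block, inode_block, name,
                crash_after_log=False):
    # Describe the whole operation (directory entry plus inode) in one
    # record, and make sure the record is in the log before touching blocks.
    record = {"writes": {
        dir_block: petal.get(dir_block, []) + [name],
        inode_block: {"name": name, "size": 0},
    }}
    log.append(record)
    if crash_after_log:
        raise CrashBeforeApply()
    apply_record(petal, record)


def recover(petal, log):
    # Any workstation can finish the crashed one's operations by replaying
    # its log; replaying a record twice leaves the same result.
    for record in log:
        apply_record(petal, record)
```

If the workstation crashes after the log append, neither block has changed yet; running recover on its log then installs both the directory entry and the inode together, so other workstations never see one without the other.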
Q: Can a workstation running Frangipani break security?
A: Yes. Since the file system logic is in users' workstations, the design trusts the workstations. A user could modify the local Frangipani software and read/write other users' data in Petal. This makes Frangipani unattractive if users are not trusted. It might still make sense in a small organization, or if Frangipani ran on separate dedicated servers (not on workstations) and talked to user workstations with a protocol like NFS.
Q: How does Frangipani differ from GFS?
A: A big architectural difference is that GFS has most of the file system logic in the servers, while Frangipani distributes the logic over the workstations that run Frangipani. That is, Frangipani doesn't really have a notion of file server in the way that GFS does. Frangipani puts the file system logic in the workstations in order to allow them to perform file system operations purely in their local caches. This makes sense when most activity is workstations reading and writing a single user's (cached) files. Frangipani has a lot of mechanism to ensure that workstation caches stay coherent, both so that a write on one workstation is immediately visible to a read on another workstation, and so that complex operations (like creating a file) are atomic even if other workstations are trying to look at the file or directory involved. This last situation is tricky for Frangipani because there's no designated file server that executes all operations on a given file or directory.
In contrast, GFS doesn't cache file data at all (clients cache only metadata, such as chunk locations), since its focus is sequential read and write of giant files that are too big to fit in any cache. It gets high performance for reads of giant files by striping each file over many GFS servers. Because GFS has no data caches, GFS does not have a cache coherence protocol. Because the file system logic is in the servers, GFS clients are relatively simple; only the servers need to have locking and worry about crash recovery.
Frangipani appears as a real file system that you can use with any existing workstation program. GFS doesn't present as a file system in that sense; applications have to be written explicitly to use GFS via library calls.
Q: What is Digital Equipment Corporation (DEC)?
A: It's the company at which the authors worked. DEC sold computers and systems software. Unix was originally developed on DEC hardware (though at Bell Labs, not at Digital).
Q: What does the comment "File creation takes longer..." in Section 9.2 mean?