Mercurial > p > roundup > code
diff roundup/backends/blobfiles.py @ 3960:bc412bb2ccd3
Workaround race condition in file storage during transaction commit [SF#883580]
| author | Richard Jones <richard@users.sourceforge.net> |
|---|---|
| date | Thu, 07 Feb 2008 00:57:59 +0000 |
| parents | 1dab48842cbd |
| children | 35d2976cb4ba |
line wrap: on
line diff
--- a/roundup/backends/blobfiles.py Wed Jan 30 21:04:17 2008 +0000 +++ b/roundup/backends/blobfiles.py Thu Feb 07 00:57:59 2008 +0000 @@ -15,7 +15,7 @@ # BASIS, AND THERE IS NO OBLIGATION WHATSOEVER TO PROVIDE MAINTENANCE, # SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS. # -#$Id: blobfiles.py,v 1.23 2007-10-26 06:52:26 richard Exp $ +#$Id: blobfiles.py,v 1.24 2008-02-07 00:57:59 richard Exp $ '''This module exports file storage for roundup backends. Files are stored into a directory hierarchy. ''' @@ -36,8 +36,179 @@ return num_files class FileStorage: - """Store files in some directory structure""" + """Store files in some directory structure + + Some databases do not permit the storage of arbitrary data (i.e., + file content). And, some database schema explicitly store file + content in the fielsystem. In particular, if a class defines a + 'filename' property, it is assumed that the data is stored in the + indicated file, outside of whatever database Roundup is otherwise + using. + + In these situations, it is difficult to maintain the transactional + abstractions used elsewhere in Roundup. In particular, if a + file's content is edited, but then the containing transaction is + not committed, we do not want to commit the edit. Similarly, we + would like to guarantee that if a transaction is committed to the + database, then the edit has in fact taken place. + + This class provides an approximation of these transactional + requirements. + + For classes that do not have a 'filename' property, the file name + used to store the file's content is a deterministic function of + the classname and nodeid for the file. The 'filename' function + computes this name. The name will contain directories and + subdirectories, but, suppose, for the purposes of what follows, + that the filename is 'file'. + + Edit Procotol + ------------- + + When a file is created or edited, the following protocol is used: + + 1. The new content of the file is placed in 'file.tmp'. + + 2. A transaction is recored in 'self.transactions' referencing the + 'doStoreFile' method of this class. + + 3. At some subsequent point, the database 'commit' function is + called. This function first performs a traditional database + commit (for example, by issuing a SQL command to commit the + current transaction), and, then, runs the transactions recored + in 'self.transactions'. + + 4. The 'doStoreFile' method renames the 'file.tmp' to 'file'. + + If Step 3 never occurs, but, instead, the database 'rollback' + method is called, then that method, after rolling back the + database transaction, calls 'rollbackStoreFile', which removes + 'file.tmp'. + + Race Condition + -------------- + + If two Roundup instances (say, the mail gateway and a web client, + or two web clients running with a multi-process server) attempt + edits at the same time, both will write to 'file.tmp', and the + results will be indeterminate. + + Crash Analysis + -------------- + + There are several situations that may occur if a crash (whether + because the machine crashes, because an unhandled Python exception + is raised, or because the Python process is killed) occurs. + + Complexity ensues because backuping up an RDBMS is generally more + complex than simply copying a file. Instead, some command is run + which stores a snapshot of the database in a file. So, if you + back up the database to a file, and then back up the filesystem, + it is likely that further database transactions have occurred + between the point of database backup and the point of filesystem + backup. + + For the purposes, of this analysis, we assume that the filesystem + backup occurred after the database backup. Furthermore, we assume + that filesystem backups are atomic; i.e., the at the filesystem is + not being modified during the backup. + + 1. Neither the 'commit' nor 'rollback' methods on the database are + ever called. + + In this case, the '.tmp' file should be ignored as the + transaction was not committed. + + 2. The 'commit' method is called. Subsequently, the machine + crashes, and is restored from backups. + The most recent filesystem backup and the most recent database + backup are not in general from the same instant in time. + + This problem means that we can never be sure after a crash if + the contents of a file are what we intend. It is always + possible that an edit was made to the file that is not + reflected in the filesystem. + + 3. A crash occurs between the point of the database commit and the + call to 'doStoreFile'. + + If only one of 'file' and 'file.tmp' exists, then that + version should be used. However, if both 'file' and 'file.tmp' + exist, there is no way to know which version to use. + + Reading the File + ---------------- + + When determining the content of the file, we use the following + algorithm: + + 1. If 'self.transactions' reflects an edit of the file, then use + 'file.tmp'. + + We know that an edit to the file is in process so 'file.tmp' is + the right choice. If 'file.tmp' does not exist, raise an + exception; something has removed the content of the file while + we are in the process of editing it. + + 2. Otherwise, if 'file.tmp' exists, and 'file' does not, use + 'file.tmp'. + + We know that the file is supposed to exist because there is a + reference to it in the database. Since 'file' does not exist, + we assume that Crash 3 occurred during the initial creation of + the file. + + 3. Otherwise, use 'file'. + + If 'file.tmp' is not present, this is obviously the best we can + do. This is always the right answer unless Crash 2 occurred, + in which case the contents of 'file' may be newer than they + were at the point of database backup. + + If 'file.tmp' is present, we know that we are not actively + editing the file. The possibilities are: + + a. Crash 1 has occurred. In this case, using 'file' is the + right answer, so we will have chosen correctly. + + b. Crash 3 has occurred. In this case, 'file.tmp' is the right + answer, so we will have chosen incorrectly. However, 'file' + was at least a previously committed value. + + Future Improvements + ------------------- + + One approach would be to take advantage of databases which do + allow the storage of arbitary date. For example, MySQL provides + the HUGE BLOB datatype for storing up to 4GB of data. + + Another approach would be to store a version ('v') in the actual + database and name files 'file.v'. Then, the editing protocol + would become: + + 1. Generate a new version 'v', guaranteed to be different from all + other versions ever used by the database. (The version need + not be in any particular sequence; a UUID would be fine.) + + 2. Store the content in 'file.v'. + + 3. Update the database to indicate that the version of the node is + 'v'. + + Now, if the transaction is committed, the database will refer to + 'file.v', where the content exists. If the transaction is rolled + back, or not committed, 'file.v' will never be referenced. In the + event of a crash, under the assumptions above, there may be + 'file.v' files that are not referenced by the database, but the + database will be consistent, so long as unreferenced 'file.v' + files are never removed until after the database has been backed + up. + """ + + tempext = '.tmp' + """The suffix added to files indicating that they are uncommitted.""" + def __init__(self, umask): self.umask = umask @@ -54,6 +225,13 @@ subdir = str(int(nodeid) / 1000) return os.path.join(subdir, name) + def _tempfile(self, filename): + """Return a temporary filename. + + 'filename' -- The name of the eventual destination file.""" + + return filename + self.tempext + def filename(self, classname, nodeid, property=None, create=0): '''Determine what the filename for the given node and optionally property is. @@ -64,14 +242,45 @@ ''' filename = os.path.join(self.dir, 'files', classname, self.subdirFilename(classname, nodeid, property)) - if create or os.path.exists(filename): + # If the caller is going to create the file, return the + # post-commit filename. It is the callers responsibility to + # add self.tempext when actually creating the file. + if create: return filename - # try .tmp - filename = filename + '.tmp' + tempfile = self._tempfile(filename) + + # If an edit to this file is in progress, then return the name + # of the temporary file containing the edited content. + for method, args in self.transactions: + if (method == self.doStoreFile and + args == (classname, nodeid, property)): + # There is an edit in progress for this file. + if not os.path.exists(tempfile): + raise IOError('content file for %s not found'%tempfile) + return tempfile + if os.path.exists(filename): return filename + # Otherwise, if the temporary file exists, then the probable + # explanation is that a crash occurred between the point that + # the database entry recording the creation of the file + # occured and the point at which the file was renamed from the + # temporary name to the final name. + if os.path.exists(tempfile): + try: + # Clean up, by performing the commit now. + os.rename(tempfile, filename) + except: + pass + # If two Roundup clients both try to rename the file + # at the same time, only one of them will succeed. + # So, tolerate such an error -- but no other. + if not os.path.exists(filename): + raise IOError('content file for %s not found'%filename) + return filename + # ok, try flat (very old-style) if property: filename = os.path.join(self.dir, 'files', '%s%s.%s'%(classname, @@ -98,7 +307,7 @@ os.makedirs(os.path.dirname(name)) # save to a temp file - name = name + '.tmp' + name = self._tempfile(name) # make sure we don't register the rename action more than once if not os.path.exists(name): @@ -136,13 +345,13 @@ name = self.filename(classname, nodeid, property, 1) # the file is currently ".tmp" - move it to its real name to commit - if name.endswith('.tmp'): + if name.endswith(self.tempext): # creation dstname = os.path.splitext(name)[0] else: # edit operation dstname = name - name = name + '.tmp' + name = self._tempfile(name) # content is being updated (and some platforms, eg. win32, won't # let us rename over the top of the old file) @@ -159,8 +368,8 @@ ''' # determine the name of the file to delete name = self.filename(classname, nodeid, property) - if not name.endswith('.tmp'): - name += '.tmp' + if not name.endswith(self.tempext): + name += self.tempext os.remove(name) def isStoreFile(self, classname, nodeid):
