We previously spoke about atomic file clobbering with open(2) and renameat2(2).

The same technique can be used to create a file atomically too.

At a really high level the idea is:

  1. Create a temporary file.
  2. Complete setting it up.
  3. Rename the temporary file into place.

In more detail:

  1. Decide whether you are creating a new file or replacing an existing one.

    This determines which flags renameat2(2) gets later. You can stat(2) the destination path before proceeding, which later allows you to detect whether the target was added or removed while you were building the file.
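
As a rough illustration of that check, here is a minimal Python sketch. The helper names `snapshot_destination` and `destination_changed` are made up for this example, and it uses lstat rather than stat on the assumption that a symlink sitting at the destination should count as "something is there".

```python
import os

def snapshot_destination(path):
    """Return (st_dev, st_ino) for path, or None if nothing exists there."""
    try:
        st = os.lstat(path)  # lstat: do not follow a symlink at the destination
    except FileNotFoundError:
        return None
    return (st.st_dev, st.st_ino)

def destination_changed(path, before):
    """True if path was added, removed or replaced since the snapshot."""
    return snapshot_destination(path) != before
```

Take the snapshot before creating the temporary file, then compare again just before renaming. This only narrows the window for surprises; the renameat2(2) flags are what actually make the final step safe.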

  2. Pick a temporary file path.

    This needs to be on the same file system as the destination, since we are going to use renameat2(2) later.
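
A quick way to sanity-check a candidate temporary directory is to compare device numbers. This is only a sketch: `same_filesystem` is a hypothetical helper, and matching st_dev values are necessary but not sufficient, since a rename across two bind mounts of the same file system can still fail with EXDEV.

```python
import os

def same_filesystem(tmp_dir, dest_path):
    """True if tmp_dir and the directory holding dest_path share a device."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    return os.stat(tmp_dir).st_dev == os.stat(dest_dir).st_dev
```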

    Ideally this would be in a directory that is cleaned up automatically so that if your program or computer crashes you won't have the temporary file left around.

    It's rare to be able to do this, since that pretty much just leaves creating files on a tmpfs when your /tmp is a mounted tmpfs, or creating files on your root partition when it is not.

    You could find the root directory of the mount point the destination is in and put the temporary file in a temporary directory there, but this would still require system integration to have it cleaned up automatically on mount.

    With either /tmp or a custom tempdir on the mount point you also have the problem that the later renameat2(2) can fail if another process mounts a file system on top of one of the directories in the destination path.

    So, even though it can leave a temporary file behind, creating your temporary file in the same directory as your target is probably the best approach when atomically creating a file with a temporary file.

    linkat(2) and O_TMPFILE provide a better solution for this, but they only work for regular files, and that's a topic for another article.

  3. Pick a random file path in your temporary directory.

    This is what mkstemp(3) does under the hood.

    Since we intend to check whether the target file already exists, we could safely use [tmpnam(3)][] or [mktemp(3)][] to generate the name. [tempnam(3)][] is not suitable, since it prefers the TMPDIR environment variable as the directory for the temp file even if a directory is passed. However, in its zeal to stop people writing insecure programs, the POSIX standard has either removed these functions or marked them obsolete, so you probably want your own implementation.
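
Rolling your own name generator is not much work. The sketch below is one possible shape, not what any libc does internally: `random_temp_name` is a hypothetical helper, and the leading dot and ".tmp" suffix are just conventions chosen for this example.

```python
import os
import secrets

def random_temp_name(dest_path, suffix=".tmp"):
    """Return a random sibling path of dest_path, e.g. /dir/.name.<hex>.tmp"""
    directory, base = os.path.split(os.path.abspath(dest_path))
    token = secrets.token_hex(8)  # 64 random bits, so collisions are unlikely
    return os.path.join(directory, "." + base + "." + token + suffix)
```

The name is only a candidate. It still has to be created in a way that fails if it already exists, which is what the next step is about.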

  4. Create your new file in a way that will fail if it already exists.

    For regular files this requires passing O_CREAT|O_EXCL in the flags to open(2).

    The system calls for creating fifos, device nodes, symlinks, directories and unix sockets already default to failing if the target already exists.

    open(2), mkfifo(3), mknod(2), symlink(2) and mkdir(2) fail with errno set to EEXIST.

    bind(2) (for unix sockets) returns EADDRINUSE.
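
To see the two different error codes side by side, here is a small Python sketch. `try_create` and `already_exists` are hypothetical helpers for this example, the symlink target is a placeholder, and device nodes are left out because mknod(2) usually needs privileges.

```python
import errno
import os
import socket

def try_create(kind, path):
    """Try to create path; return None on success or the errno on failure."""
    try:
        if kind == "fifo":
            os.mkfifo(path, 0o600)
        elif kind == "dir":
            os.mkdir(path, 0o700)
        elif kind == "symlink":
            os.symlink("placeholder-target", path)
        elif kind == "socket":
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
                sock.bind(path)  # the socket file stays behind after close
        else:
            raise ValueError(kind)
    except OSError as exc:
        return exc.errno
    return None

def already_exists(err):
    """True for either spelling of 'that path is taken'."""
    return err in (errno.EEXIST, errno.EADDRINUSE)
```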

    open(2), mkfifo(3), mknod(2), symlink(2) and mkdir(2) have variant syscalls with the "at" suffix which take a file descriptor for the directory to create them in.

    Using these or changing directory into the destination provides resilience against the filesystem mounts changing mid-operation.

    When using bind(2) with a unix socket you probably want to chdir(2) anyway, since the maximum path length for a socket address (the size of sun_path, 108 bytes on Linux) is much shorter than PATH_MAX.
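
A sketch of the chdir approach for sockets follows. `bind_unix_in_dir` is a hypothetical helper; note that the working directory is process-wide, so this assumes nothing else in the process (other threads, for instance) depends on it while the bind happens.

```python
import os
import socket

def bind_unix_in_dir(directory, name):
    """Bind a unix socket at directory/name using a short relative path."""
    saved_cwd = os.open(".", os.O_RDONLY | os.O_DIRECTORY)
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        os.chdir(directory)
        try:
            sock.bind(name)  # relative path, so only len(name) counts
        finally:
            os.fchdir(saved_cwd)  # always restore the old working directory
    except OSError:
        sock.close()
        raise
    finally:
        os.close(saved_cwd)
    return sock
```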

    If creation fails because the path already exists, go back to step 3 and pick a new path.
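
For regular files, steps 3 and 4 together might look like the sketch below. It assumes you already hold a file descriptor for the destination directory (the openat(2) style mentioned above); `create_temp_in`, the ".tmp-" prefix and the retry limit are all choices made for this example.

```python
import errno
import os
import secrets

def create_temp_in(dir_fd, prefix=".tmp-", attempts=100):
    """Create a new, empty regular file in dir_fd; return (name, fd)."""
    for _ in range(attempts):
        name = prefix + secrets.token_hex(8)
        try:
            # O_EXCL makes the open fail with EEXIST instead of reusing a file.
            fd = os.open(name, os.O_WRONLY | os.O_CREAT | os.O_EXCL,
                         0o600, dir_fd=dir_fd)
        except FileExistsError:
            continue  # name was taken: pick another one and try again
        return name, fd
    raise FileExistsError(errno.EEXIST, "no unused temporary name found")
```

Usage would be along the lines of `dir_fd = os.open(dest_dir, os.O_RDONLY | os.O_DIRECTORY)` followed by `name, fd = create_temp_in(dir_fd)`.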

  5. Complete initialisation of the file.

    What this involves depends on the type of file and your application.

    Regular files will want their contents written and metadata set.

    Everything else probably just wants a few metadata tweaks.
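
For a regular file, "complete initialisation" could be as simple as the following sketch. `finish_regular_file` is a hypothetical helper; the fsync(2) call is an extra assumption on my part (it is about surviving crashes, not about atomicity) and can be dropped if you don't need it.

```python
import os

def finish_regular_file(fd, data, mode=0o644):
    """Write all of data to fd, then set the permission bits."""
    view = memoryview(data)
    while len(view) > 0:
        written = os.write(fd, view)  # write(2) may write less than asked
        view = view[written:]
    os.fchmod(fd, mode)  # metadata is set via the fd, not via the path
    os.fsync(fd)         # assumption: we want the data on disk before renaming
```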

  6. Rename the file into place.

    This is the renameat2(2) trick mentioned in an earlier article.

    1. If you explicitly want to create the file pass RENAME_NOREPLACE.
    2. If you explicitly want to replace a file, pass RENAME_EXCHANGE and then [unlink(2)][] the temporary path, which now names the old file.
    3. If you're not sure, stat(2) the destination before creating, and pass RENAME_NOREPLACE if it didn't exist beforehand or RENAME_EXCHANGE as above if it did, so that if you get a failure you can warn, ask or abort.
    4. If you're really sure you don't care if it either creates a new file or replaces an old one then don't pass any flags.
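
Python's os module does not expose renameat2(2), so the sketch below calls it through ctypes. It assumes Linux with glibc 2.28 or later (which provides a `renameat2` wrapper) and a file system that supports the flags. `rename_into_place` and its mode strings are names invented for this example, and the "replace" branch assumes the old file is not a directory.

```python
import ctypes
import os

RENAME_NOREPLACE = 1  # values from <linux/fs.h>
RENAME_EXCHANGE = 2
AT_FDCWD = -100       # "relative to the current directory" on Linux

_libc = ctypes.CDLL(None, use_errno=True)
_libc.renameat2.argtypes = [ctypes.c_int, ctypes.c_char_p,
                            ctypes.c_int, ctypes.c_char_p, ctypes.c_uint]
_libc.renameat2.restype = ctypes.c_int

def renameat2(src, dst, flags=0, src_dir_fd=AT_FDCWD, dst_dir_fd=AT_FDCWD):
    """Thin wrapper that raises OSError (or a subclass) on failure."""
    rc = _libc.renameat2(src_dir_fd, os.fsencode(src),
                         dst_dir_fd, os.fsencode(dst), flags)
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), src, None, dst)

def rename_into_place(tmp, dest, mode):
    """mode is 'create', 'replace' or 'either' (cases 1, 2 and 4 above)."""
    if mode == "create":
        renameat2(tmp, dest, RENAME_NOREPLACE)
    elif mode == "replace":
        renameat2(tmp, dest, RENAME_EXCHANGE)
        os.unlink(tmp)  # after the exchange, tmp names the old file
    elif mode == "either":
        renameat2(tmp, dest, 0)
    else:
        raise ValueError(mode)
```

Case 3 is then just a matter of choosing between "create" and "replace" based on the earlier stat(2) result.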

    Few of the errors that may happen are recoverable (though EEXIST when you don't care about clobbering can be handled by removing the destination and trying again), but checking errno can help you provide useful error messages.

    • ENOENT when using RENAME_EXCHANGE means the destination file doesn't exist when it should.
    • EEXIST when using RENAME_NOREPLACE means the destination file exists when it should not.
    • EROFS or EXDEV mean some process changed the file system mounts while the operation was in progress. EROFS means the file system was made read-only, and EXDEV means something was mounted on top of your directory.
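
Turning those cases into messages could look like the sketch below. `explain_rename_error` and the message wording are invented for this example; the messages describe the likely cause in the situations listed above rather than every possible one.

```python
import errno
import os

RENAME_NOREPLACE = 1  # values from <linux/fs.h>
RENAME_EXCHANGE = 2

def explain_rename_error(err, flags):
    """Map an errno from renameat2(2) to a hint for the user."""
    if err == errno.ENOENT and flags & RENAME_EXCHANGE:
        return "destination does not exist, but we expected to replace it"
    if err == errno.EEXIST and flags & RENAME_NOREPLACE:
        return "destination already exists, but we expected to create it"
    if err == errno.EROFS:
        return "file system became read-only during the operation"
    if err == errno.EXDEV:
        return "a mount appeared between the temporary file and destination"
    return os.strerror(err)  # anything else: fall back to the system text
```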

    If you're particularly paranoid you can use stat(2) to check whether the destination file is still reachable from the root directory, by checking whether the st_dev and st_ino of the destination path match those of the file you created.
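
One way to do that comparison, assuming you kept a file descriptor open for the file you created and know the absolute destination path, is sketched below. `still_reachable` is a hypothetical helper.

```python
import os

def still_reachable(fd, path):
    """True if path currently names the same file that fd has open."""
    try:
        by_path = os.stat(path, follow_symlinks=False)
    except FileNotFoundError:
        return False
    by_fd = os.fstat(fd)
    return (by_path.st_dev, by_path.st_ino) == (by_fd.st_dev, by_fd.st_ino)
```

Keeping the descriptor open matters here: it stops the inode number from being reused by some other file in the meantime.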

And there you have it. Atomic file updates.

What would this be useful for?

This is useful if you have many processes that might make use of a file and you need to update it to a new version.

This would be useful for shared caches (such as /etc/ld.so.cache) which can't use a lock file to synchronise updates, either for performance reasons or because of legacy clients.

It can also be used to atomically upgrade a service running on a unix socket, by starting the new version of the service on a temporary socket then renaming it into place, then closing and removing the old socket.
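
A simplified sketch of the socket upgrade is below. To keep it short it uses a plain rename(2) with no flags (the "don't care" case from step 6) instead of RENAME_EXCHANGE, so the old socket's directory entry is dropped by the rename itself. `listen_on`, `upgrade_listener` and the backlog of 8 are choices made for this example.

```python
import os
import socket

def listen_on(path):
    """Bind and listen on a unix stream socket at path."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(path)
    sock.listen(8)
    return sock

def upgrade_listener(old_sock, path, tmp_path):
    """Start a new listener on tmp_path and rename it over path."""
    new_sock = listen_on(tmp_path)  # new service is ready before the swap
    os.rename(tmp_path, path)       # clients connecting now reach new_sock
    old_sock.close()                # old listener is no longer reachable
    return new_sock
```

Clients that connect to the path after the rename get the new listener, and there is no moment at which the path does not exist.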

With care this can be generalised up to a directory with no subdirectories by hard-linking the contents into a temporary directory, making whatever changes are necessary, then swapping directories and unlinking the temporary directory and contents.

This doesn't work for subdirectories because you can't hard-link a directory, and any process that was using one of those subdirectories would carry on using the old directory rather than your new version of it, with very different results.