You should be familiar with the mv(1) command by now, which moves a file from one place to another.
If you're writing shell scripts, or using a library which lets you move files, then you don't need to worry about how it works, but if you don't have such a library available you might be surprised by the amount of effort required.
It's just a rename(2), right?
If you can guarantee that the destination is on the same filesystem and that you don't care if it replaces some other file, then yes.
Not clobbering
mv(1) has the -n or --no-clobber option to prevent accidentally overwriting a file.
The naive way to do this would be to check whether the file exists before calling rename(2), but this is a TOCTTOU (time-of-check-to-time-of-use) bug: if another process or thread puts a file there between the check and the rename, it gets overwritten.
To do this safely, use renameat2(2) with the RENAME_NOREPLACE flag, which will make it fail with EEXIST if the destination already exists.
The destination is on another filesystem
On a modern Linux distribution your files are usually spread across multiple file systems: your persistent files are on a file system mounted from local storage, while your operating system puts temporary files on a different file system so that they get removed when your computer shuts down.
Unfortunately, the rename(2) system call does not work if the destination is on a different file system.
Checking ahead of time whether a path is on a different file system is traditionally handled by calling stat(2) on the source and on the destination's directory, and checking whether the st_dev fields differ, but this is another TOCTTOU bug waiting to happen.
It is also unnecessary: rename(2) sets errno(3) to EXDEV, which lets you know it failed because the destination is on another file system, in the same system call you would have made anyway.
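To make that concrete, here is a small sketch that tries the rename first and only reports that a copy is needed when the kernel says so. The function name try_rename and the move_result codes are hypothetical, chosen just for this example.

```c
#include <errno.h>
#include <stdio.h>   /* rename */

/* Hypothetical result codes for this sketch */
enum move_result { MOVED = 0, NEEDS_COPY = 1, FAILED = -1 };

/* Try the rename first and let the kernel report a cross-filesystem
 * move, rather than comparing st_dev values beforehand. */
enum move_result try_rename(const char *source, const char *target) {
    if (rename(source, target) == 0)
        return MOVED;
    if (errno == EXDEV)
        return NEEDS_COPY;   /* target is on another file system */
    return FAILED;           /* errno describes the failure */
}
```

Only the NEEDS_COPY case calls for the fallback; any other failure should be reported as-is.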
If you care about still being able to move the file when its destination is on a different file system then you need a fallback when this happens.
So we fall back to copying the file and removing the old one?
In principle, yes, though actually implementing this is surprisingly difficult.
Handling the fallback logic itself is not straightforward, so we'll get that out of the way first.
We can fall back to rename(2) if renameat2(2) is not implemented, but only if we don't need to avoid clobbering the target.
If we do need to avoid clobbering, we have to fall back to the copy, which can use O_EXCL with O_CREAT to only write to the file if it didn't already exist.
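A minimal sketch of that exclusive create, assuming a hypothetical helper called create_new_file that returns either a file descriptor or a negative errno value:

```c
#include <errno.h>
#include <fcntl.h>

/* Hypothetical helper: create target for writing, failing rather than
 * reusing a file that already exists. Returns a file descriptor, or a
 * negative errno value such as -EEXIST. */
int create_new_file(const char *target) {
    /* The kernel does the existence check and the creation as one step,
     * so nothing can slip a file in between them. */
    int fd = open(target, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -errno;
    return fd;
}
```

Calling it twice with the same path succeeds the first time and returns -EEXIST the second.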
If unlinking the source file fails because the file doesn't exist, then that means that we were able to copy its contents while something else removed it.
Given that the file was written to its destination and no longer exists where it used to, it can be argued that the operation as a whole was successful.
/* my-mv.c */
#include <stdbool.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <sys/syscall.h>

#if !HAVE_DECL_RENAMEAT2
static inline int renameat2(int oldfd, const char *oldname, int newfd, const char *newname, unsigned flags) {
    return syscall(__NR_renameat2, oldfd, oldname, newfd, newname, flags);
}
#endif

#ifndef RENAME_NOREPLACE
#define RENAME_NOREPLACE (1<<0)
#endif

int copy_file(const char *source, const char *target, bool no_clobber);

int move_file(const char *source, const char *target, bool no_clobber) {
    int ret;

    ret = renameat2(AT_FDCWD, source, AT_FDCWD, target, no_clobber ? RENAME_NOREPLACE : 0);
    if (ret == 0)
        return ret;
    if (errno == EXDEV)
        goto xdev;
    if (errno != ENOSYS) {
        perror("renaming file");
        return ret;
    }

    /* Have to skip to copy if unimplemented since rename can't detect EEXIST */
    if (no_clobber)
        goto xdev;

    /* renameat2 is unimplemented and clobbering is fine, so plain rename will do */
    ret = rename(source, target);
    if (ret == 0)
        return ret;
    if (errno == EXDEV)
        goto xdev;
    perror("renaming file");
    return ret;

xdev:
    ret = copy_file(source, target, no_clobber);
    if (ret < 0)
        return ret;
    /* ENOENT means something else already removed the source for us */
    if (unlink(source) < 0 && errno != ENOENT) {
        perror("unlinking source file");
        return -1;
    }
    return ret;
}
So we open both files, and loop reading data then writing it?
This will produce a file that, when read, will produce the same stream of bytes as the original.
You could use stdio(3) to copy the contents, but that will have to be left as an exercise for the reader, since I don't like its record-based interface, I prefer to deal with file descriptors over FILE* handles, and the buffering makes error handling more… interesting.
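For comparison only, one possible shape for the stdio version is sketched below. It is not what the rest of this article uses, and the name stdio_copy_file, the 64KB buffer and the 0/-1 return values are arbitrary choices of mine.

```c
#include <stdio.h>

/* Hypothetical stdio-based copy: returns 0 on success, -1 on failure. */
int stdio_copy_file(const char *source, const char *target) {
    char buf[64 * 1024];
    size_t n;
    int ret = 0;
    FILE *in = fopen(source, "rb");
    if (in == NULL)
        return -1;
    FILE *out = fopen(target, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }
    /* A record size of 1 makes fread and fwrite count bytes */
    while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            ret = -1;
            break;
        }
    }
    /* fread returns 0 for both end of file and error, so ask which */
    if (ferror(in))
        ret = -1;
    fclose(in);
    /* Buffered data may only reach the file here, so a failure at this
     * point cannot say how much of the file was actually written. */
    if (fclose(out) != 0)
        ret = -1;
    return ret;
}
```

Note how the fclose of the output has to be checked too, which is the sort of deferred error reporting that makes the buffering awkward.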
So, broadly, the idea is to read into a buffer, then write from the buffer to the target file.
However, EINTR is a problem: many system calls can be interrupted before they do anything, and read(2) and write(2) may return less than you asked for.
Glibc has a handy TEMP_FAILURE_RETRY macro for handling EINTR, but to handle the short reads and writes, you need to always work in a loop.
int naive_contents_copy(int srcfd, int tgtfd) {
    /* 1MB buffer, too small makes it slow,
       shrink this if you feel memory pressure on an embedded device */
    char buf[1 * 1024 * 1024];
    ssize_t ret;

    for (;;) {
        ssize_t n_read;
        /* Start of the data that still has to be written */
        char *unwritten = buf;

        ret = TEMP_FAILURE_RETRY(read(srcfd, buf, sizeof(buf)));
        if (ret < 0) {
            perror("Reading from source");
            return ret;
        }
        n_read = ret;

        /* Reached the end of the file */
        if (n_read == 0)
            return 0;

        while (n_read > 0) {
            ret = TEMP_FAILURE_RETRY(write(tgtfd, unwritten, n_read));
            if (ret < 0) {
                perror("Writing to target");
                return ret;
            }
            /* After a short write, carry on from where it stopped
               rather than from the start of the buffer */
            unwritten += ret;
            n_read -= ret;
        }
    }
}
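int copy_file(const char *source, const char *target, bool no_clobber) {
    int srcfd = -1;
    int tgtfd = -1;
    int ret;

    srcfd = open(source, O_RDONLY);
    if (srcfd == -1) {
        perror("Opening source file");
        return srcfd;
    }

    /* O_EXCL refuses to clobber, otherwise O_TRUNC discards the old
       contents so a longer target doesn't leave stale data at the end */
    tgtfd = open(target, O_WRONLY|O_CREAT|(no_clobber ? O_EXCL : O_TRUNC), 0600);
    if (tgtfd == -1) {
        perror("Opening target file");
        close(srcfd);
        return tgtfd;
    }

    ret = naive_contents_copy(srcfd, tgtfd);
    close(srcfd);
    /* close(2) may report a write error that was deferred */
    if (close(tgtfd) < 0 && ret == 0) {
        perror("Closing target file");
        ret = -1;
    }
    return ret;
}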
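TEMP_FAILURE_RETRY is specific to glibc, so on another libc the retry has to be written out by hand. The sketch below shows one way the write side could look, handling both EINTR and short writes. The name write_all and the negative-errno return are assumptions of mine rather than anything standard.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write all len bytes of buf to fd.
 * Returns 0 on success or a negative errno value. */
int write_all(int fd, const char *buf, size_t len) {
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before writing anything, retry */
            return -errno;
        }
        /* A short write is not an error: skip what was written
         * and go round again for the rest. */
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}
```

The read side only needs the EINTR retry, since a short read simply means fewer bytes to pass on to the writer.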
Making use of our new function
So now we have a nice move_file function that will fall back to copying the file if renaming does not work.
But code is of no use in isolation: we need a program for it to live in, and the simplest way to use it is a command-line program.
#include <getopt.h>
#include <string.h>

int main(int argc, char *argv[]) {
    char *source;
    char *target;
    bool no_clobber = false;
    enum opt {
        OPT_NO_CLOBBER = 'n',
        OPT_CLOBBER = 'N',
    };
    static const struct option opts[] = {
        { .name = "no-clobber", .has_arg = no_argument, .val = OPT_NO_CLOBBER, },
        { .name = "clobber", .has_arg = no_argument, .val = OPT_CLOBBER, },
        {0},
    };

    for (;;) {
        int ret = getopt_long(argc, argv, "nN", opts, NULL);
        if (ret == -1)
            break;
        switch (ret) {
        case '?':
            return 1;
        case OPT_NO_CLOBBER:
        case OPT_CLOBBER:
            /* The last of -n and -N on the command line wins */
            no_clobber = (ret == OPT_NO_CLOBBER);
            break;
        }
    }

    if (optind == argc || argc > optind + 2) {
        fprintf(stderr, "1 or 2 positional arguments required\n");
        return 2;
    }

    source = argv[optind];
    if (argc == optind + 2)
        target = argv[optind + 1];
    else
        /* Move into the current directory with the same name */
        target = basename(source);

    if (move_file(source, target, no_clobber) >= 0)
        return 0;
    return 1;
}
$ if echo 'int main(){(void)renameat2;}' | gcc -include stdio.h -xc - -o/dev/null 2>/dev/null; then
> HAVE_DECL_RENAMEAT2=1
> else
> HAVE_DECL_RENAMEAT2=0
> fi
$ make CFLAGS="-D_GNU_SOURCE -DHAVE_DECL_RENAMEAT2=$HAVE_DECL_RENAMEAT2" my-mv
$ ./my-mv
1 or 2 positional arguments required
$ touch test-file
$ ./my-mv test-file clobber-file
$ ls test-file clobber-file
ls: cannot access test-file: No such file or directory
clobber-file
$ ./my-mv -n test-file clobber-file
renaming file: No such file or directory
$ touch test-file
$ ./my-mv --no-clobber test-file clobber-file
renaming file: File exists
$ ./my-mv test-file clobber-file
$ ls test-file clobber-file
ls: cannot access test-file: No such file or directory
So we've got a complete fallback for rename(2) now?
Not quite.
For most purposes this is likely to be sufficient, but there's a lot more to a file than the data you can read out of it, and far more than I can cover in this article.
There will be follow-up articles to cover copying other aspects of files, including:
- Sparseness
- Speed
- Metadata
- Atomicity
- Other types of file
Thanks for this article it's really useful, it would be interesting to know why you're not fond of the stdio interface, and also potentially worth mentioning that EINTR is really only something that needs to be explicitly handled on Linux, from signal(7):
This behaviour seems less than helpful to me, it would be really good to know if there's a good reason why GNU/Linux doesn't just restart the call (as the BSDs do)
Also, I had no idea you could make system calls without having a C wrapper for them, so thanks for that as well!
I don't like the buffering behaviour, it defers writes late enough that I lose context about what bit of the write failed, so when I go to flush and close I can't say how much was actually written.
There's also the fact that some signals can be configured to restart, but not others. Signals in general are a bit of a mess, so my way of dealing with it is to wrap everything in a retry and leave signal handling for shut down and use a better form of IPC for everything else.
Yep, it's a pain when the wrappers don't exist because you need to handle the error return calling convention yourself, since part of what the libc wrappers do is set errno. I tend to use negative error number returns in my own code rather than errno which makes it actually closer to what I prefer.