So now we've got an equivalent to a slow rename(2), right?

Not quite: rename(2) is atomic. The file disappears from the old location and reappears whole at the new one in a single step.

Our move leaves a partial file at the new location while it is copying, which can confuse another program that looks at the file mid-copy.

Atomic creation with a temporary file

This approach was described in a previous article.

The bulk of the change is to include the clobber options; the rest is opening a temporary file in the target's directory (as seen in open_tmpfile) and calling rename_file once the contents have been written.

int open_tmpfile(const char *target, char **tmpfn_out) {
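    /* Room for target plus the longest possible additions: "./", ".tmp", "XXXXXX" and the NUL */
    char *template = malloc(strlen(target) + sizeof("./.tmpXXXXXX"));
    char *dir = NULL;
    int ret;
    if (template == NULL)
        return -1;
    strcpy(template, target);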
    dir = dirname(template);
    if (dir != template)
        strcpy(template, dir);
    strcat(template, "/");
    strcat(template, ".tmp");
    strcat(template, basename(target));
    strcat(template, "XXXXXX");
    ret = mkstemp(template);
    if (ret >= 0)
        *tmpfn_out = template;
    else
        free(template);
    return ret;
}
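
For comparison, here is a rough sketch of the same idea that avoids manipulating the template in place by working on two private copies of the target path. It is not the code from my-mv.c: the name open_tmpfile_sketch is made up for illustration, and it assumes the POSIX dirname and basename from libgen.h, both of which are allowed to modify their argument.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <libgen.h>   /* POSIX dirname, basename */
#include <stdio.h>    /* snprintf */
#include <stdlib.h>   /* malloc, free, mkstemp */
#include <string.h>   /* strdup, strlen */

/* Hypothetical variant of open_tmpfile: builds "<dir>/.tmp<base>XXXXXX"
 * from two separate copies of target, so no buffer is copied onto itself. */
int open_tmpfile_sketch(const char *target, char **tmpfn_out) {
    char *dircopy = NULL;
    char *basecopy = NULL;
    char *template = NULL;
    const char *dir;
    const char *base;
    size_t len;
    int fd = -1;

    assert(target != NULL && tmpfn_out != NULL);

    dircopy = strdup(target);   /* dirname may modify its argument */
    basecopy = strdup(target);  /* POSIX basename may too */
    if (dircopy == NULL || basecopy == NULL)
        goto cleanup;

    dir = dirname(dircopy);
    base = basename(basecopy);
    len = strlen(dir) + strlen("/.tmp") + strlen(base) + strlen("XXXXXX") + 1;

    template = malloc(len);
    if (template == NULL)
        goto cleanup;
    snprintf(template, len, "%s/.tmp%s%s", dir, base, "XXXXXX");

    /* mkstemp replaces the trailing XXXXXX and creates the file */
    fd = mkstemp(template);
    if (fd >= 0) {
        *tmpfn_out = template;  /* ownership passes to the caller */
        template = NULL;
    }
cleanup:
    free(template);
    free(basecopy);
    free(dircopy);
    return fd;
}
```

The cost is two extra allocations per call; the benefit is that there is no overlapping copy to reason about, and the length calculation follows directly from the pieces being joined.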

int copy_file(char *source, char *target, struct stat *source_stat,
              enum clobber clobber, enum setgid setgid, int required_flags) {
    int srcfd = -1;
    int tgtfd = -1;
    int ret = -1;
    char *tmppath = NULL;

    ret = open(source, O_RDONLY);
    if (ret == -1) {
        perror("Open source file");
        goto cleanup;
    }
    srcfd = ret;

    ret = set_selinux_create_context(target, source_stat->st_mode);
    if (ret != 0) {
        perror("Set selinux create context");
        goto cleanup;
    }

    ret = open_tmpfile(target, &tmppath);
    if (ret == -1) {
        perror("Open temporary target file");
        goto cleanup;
    }
    tgtfd = ret;

    ret = copy_contents(srcfd, tgtfd);
    if (ret < 0)
        goto cleanup;

    ret = fchmod(tgtfd, source_stat->st_mode);
    if (ret < 0)
        goto cleanup;

    ret = fix_owner(target, source_stat, setgid, tgtfd);
    if (ret < 0)
        goto cleanup;

    ret = copy_flags(srcfd, tgtfd, required_flags);
    if (ret < 0)
        goto cleanup;

    ret = copy_xattrs(srcfd, tgtfd);
    if (ret < 0)
        goto cleanup;

    ret = copy_posix_acls(srcfd, tgtfd);
    if (ret < 0)
        goto cleanup;

    {
        struct timespec times[] = { source_stat->st_atim, source_stat->st_mtim, };
        ret = futimens(tgtfd, times);
        if (ret < 0)
            goto cleanup;
    }

    ret = rename_file(tmppath, target, clobber);
cleanup:
    /* Only close descriptors that were actually opened */
    if (srcfd >= 0)
        close(srcfd);
    if (tgtfd >= 0)
        close(tgtfd);
    if (tmppath && ret != 0)
        (void)unlink(tmppath);
    free(tmppath);
    return ret;
}

Atomic creation with an anonymous temporary file

Astute readers may have noticed that if your program crashes before renaming, the temporary file may be left behind.

A potential solution to this is to use the O_TMPFILE option of open(2) and then linkat(2) to link the temporary file into place.

This is handy, because an unlinked temporary file is cleaned up automatically when its file descriptor is closed.

However, linkat(2) always fails with EEXIST if the target exists. If you want to clobber a file you need to fall back to something like the temporary file approach above: use linkat(2) to link the file in at a temporary path, then use renameat2(2) to put it in place. That loses the primary benefit of using linkat(2) in the first place.

This may be worthwhile as it reduces the period where the file may be left behind, but such an approach is left as an exercise to the reader.
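
For readers who want a starting point, the following is a rough sketch rather than the code from my-mv.c. The function names (open_anon_tmpfile, link_fd, link_into_place) and the temporary naming scheme are invented for illustration. It assumes Linux 3.11 or later, a file system that supports O_TMPFILE, and a mounted /proc, and it uses the /proc/self/fd form of linkat(2) described in open(2), which does not need extra privileges. For the clobbering case plain rename(2) is enough, since replacing the target is the point.

```c
#define _GNU_SOURCE          /* O_TMPFILE */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>           /* open, O_TMPFILE, AT_FDCWD, AT_SYMLINK_FOLLOW */
#include <stdio.h>           /* snprintf, rename */
#include <unistd.h>          /* linkat, unlink, getpid */

/* Open an unnamed file inside dir; no directory entry exists yet. */
int open_anon_tmpfile(const char *dir, mode_t mode) {
    return open(dir, O_TMPFILE | O_WRONLY, mode);
}

/* Give the unnamed file behind fd a name, via its /proc/self/fd symlink. */
static int link_fd(int fd, const char *path) {
    char procpath[64];
    snprintf(procpath, sizeof(procpath), "/proc/self/fd/%d", fd);
    return linkat(AT_FDCWD, procpath, AT_FDCWD, path, AT_SYMLINK_FOLLOW);
}

/* Link fd at target. If target exists and clobber is non-zero, link at a
 * temporary name next to target and rename over it instead. */
int link_into_place(int fd, const char *target, int clobber) {
    char tmppath[4096];
    int n;
    int ret;

    assert(target != NULL);

    ret = link_fd(fd, target);
    if (ret == 0 || errno != EEXIST || !clobber)
        return ret;

    /* Assumed naming scheme: "<target>.tmp<pid>". A fuller version would
     * retry with random suffixes if this name is also taken. */
    n = snprintf(tmppath, sizeof(tmppath), "%s.tmp%ld", target, (long)getpid());
    if (n < 0 || (size_t)n >= sizeof(tmppath)) {
        errno = ENAMETOOLONG;
        return -1;
    }

    ret = link_fd(fd, tmppath);
    if (ret != 0)
        return ret;

    ret = rename(tmppath, target);
    if (ret != 0) {
        int saved = errno;
        (void)unlink(tmppath);   /* do not leave the named copy behind */
        errno = saved;
    }
    return ret;
}
```

A caller would write the contents and metadata through the descriptor returned by open_anon_tmpfile and only then call link_into_place. The file can only be left behind under its temporary name during the short window between the second linkat(2) and the rename(2).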

As usual, the code may be downloaded as my-mv.c, along with the accompanying Makefile.

So we've got something we can use to move any file now?

If you define "file" as a regular file, yes.

If you say "everything is a file", then you open it up to character devices, block devices, unix sockets, symbolic links, directories and btrfs subvolumes.

However, every enterprise needs a definition of done.

For now I think copying a regular file is enough, so the details of other file types will have to be left for a future series.

Firstly, it isn't const-correct:

test.c: In function ‘open_tmpfile’:
test.c:12:31: warning: passing argument 1 of ‘__xpg_basename’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
     strcat(template, basename(target));
                               ^~~~~~
In file included from test.c:1:0:
/usr/include/libgen.h:34:14: note: expected ‘char *’ but argument is of type ‘const char *’
 extern char *__xpg_basename (char *__path) __THROW;
              ^~~~~~~~~~~~~~

Second, strcpy(template, dirname(template)); has undefined behaviour since dirname(template) will probably point to template and strcpy() does not work with overlapping buffers. I think you have to use memmove() or two separate buffers.

Finally, it leaks the template string if mkstemp() fails, though that's not very likely.

Comment by womble2 [livejournal.com] Thu Nov 17 02:45:18 2016

Sorry I didn't notice your reply sooner.

Firstly, it isn't const-correct:

  test.c: In function ‘open_tmpfile’:
  test.c:12:31: warning: passing argument 1 of ‘__xpg_basename’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
       strcat(template, basename(target));
                                 ^~~~~~
  In file included from test.c:1:0:
  /usr/include/libgen.h:34:14: note: expected ‘char *’ but argument is of type ‘const char *’
   extern char *__xpg_basename (char *__path) __THROW;
                ^~~~~~~~~~~~~~

This is from the awkwardness of there being two different definitions of basename. I don't get this error in the full version of the code (http://yakking.branchable.com/posts/moving-files-8-atomicity/my-mv.c) since I need to pull in the definitions with:

#include <libgen.h>          /* dirname */
#undef basename
#include <string.h>          /* basename */

If you are aware of a better way to pull in the definitions please enlighten me, I'm working from the manpages as to how to pull the definitions in.

Second, strcpy(template, dirname(template)); has undefined behaviour since dirname(template) will probably point to template and strcpy() does not work with overlapping buffers. I think you have to use memmove() or two separate buffers.

Thanks, I apparently missed the warning about overlaps when I checked the documentation. Given I only actually want to do any copy when dirname(template) doesn't point to template I've gone with:

char *dir = NULL;
strcpy(template, target);
dir = dirname(template);
if (dir != template)
    strcpy(template, dir);

Finally, it leaks the template string if mkstemp() fails, though that's not very likely.

Thanks. I have no excuse for how I missed that, I was probably writing late into the night and then didn't go back and check.

Comment by Richard Maw Mon Nov 28 23:13:53 2016