Previously we covered the common POSIX file metadata.

This is not the only metadata that a program that handles copying files has to worry about on Linux.

Files also have some additional flags for changing their behaviour, or possibly read-only flags for providing extra information.

Accessing flags

Flags were originally a feature of the ext2 filesystem, which means they don't have a dedicated system call, since filesystem specific features are often implemented as ioctls.

It also explains why you might see the ioctls called EXT2_IOC_GETFLAGS or EXT2_IOC_SETFLAGS rather than the generic FS_IOC_GETFLAGS and FS_IOC_SETFLAGS.

When using ioctls, it's good to be paranoid, since the same ioctl number can be used for different devices, and you wouldn't want to accidentally do something unintended.

It's possible to check whether it's an appropriate file by using stat and checking the file mode.

We previously used this pattern for the file clone ioctl on btrfs, but included a check that it was a btrfs filesystem.

Since file flags are applicable to multiple filesystems, checking the filesystem type should not be necessary.

#include <sys/stat.h>
#include <errno.h>
#include <linux/fs.h>
#include <sys/ioctl.h>

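/* Read the flags of fd into *flags_out, refusing file types that
   cannot have flags so the ioctl is never sent to a device.
   Returns 0 on success, or -1 with errno set. */
int get_flags(int fd, int *flags_out) {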
    struct stat st;
    int ret = 0;
    ret = fstat(fd, &st);
    if (ret < 0)
        return ret;
    if (!S_ISREG(st.st_mode) && !S_ISDIR(st.st_mode)
        && !S_ISLNK(st.st_mode)) {
        errno = ENOTTY;
        return -1;
    }
    return ioctl(fd, FS_IOC_GETFLAGS, flags_out);
}

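/* Replace the flags of fd with *flags, with the same file type check
   as get_flags. Returns 0 on success, or -1 with errno set. */
int set_flags(int fd, const int *flags) {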
    struct stat st;
    int ret = 0;
    ret = fstat(fd, &st);
    if (ret < 0)
        return ret;
    if (!S_ISREG(st.st_mode) && !S_ISDIR(st.st_mode)
        && !S_ISLNK(st.st_mode)) {
        errno = ENOTTY;
        return -1;
    }
    return ioctl(fd, FS_IOC_SETFLAGS, flags);
}

As an aside, I find it odd that the set flags ioctl takes a const int* rather than an int since I know of no CPU that has shorter pointers than integers.
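
As a usage sketch, the helper below reads the flags of a file given its path. The name path_get_flags is hypothetical and not part of my-mv.c, and it assumes the file can be opened read-only. It repeats the file type check from above before issuing the ioctl, and opens with O_NONBLOCK so that a FIFO at that path can't make open block.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: read the flags of the file at path.
   Returns 0 on success, or -1 with errno set. */
int path_get_flags(const char *path, int *flags_out) {
    struct stat st;
    int saved_errno;
    int ret;
    /* O_NONBLOCK stops open() from blocking if path is a FIFO. */
    int fd = open(path, O_RDONLY | O_NONBLOCK);

    if (fd < 0)
        return -1;

    ret = fstat(fd, &st);
    if (ret == 0 && !S_ISREG(st.st_mode) && !S_ISDIR(st.st_mode)) {
        /* Not a file type that has flags, so don't risk the ioctl. */
        errno = ENOTTY;
        ret = -1;
    }
    if (ret == 0)
        ret = ioctl(fd, FS_IOC_GETFLAGS, flags_out);

    /* close() may overwrite errno, so keep the value we care about. */
    saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return ret;
}
```

A caller would then test individual bits, for example `flags & FS_IMMUTABLE_FL`, and treat ENOTTY or EINVAL as "this file has no flags to read".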

Copying flags

Since filesystems have different capabilities, they unfortunately accept different sets of flags.

include/linux/fs.h has definitions for all the flags whose meaning is agreed on by every filesystem, though a given filesystem may not support all of them.

Since you can't trust filesystem-specific flags to mean the same thing on two filesystems, if the source and target files are on different filesystems then you must only attempt to set the flags that both agree on.

include/linux/fs.h defines FS_FL_USER_MODIFIABLE for this.
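
The masking step can be sketched in isolation. The helper name portable_flags is hypothetical, and the remark about FS_NOCOW_FL relies on the header values I'm aware of, so check your own linux/fs.h before depending on it.

```c
#include <linux/fs.h>

/* Hypothetical helper: keep only the generic, user-modifiable flags,
   dropping anything whose meaning may be filesystem-specific. */
int portable_flags(int flags) {
    return flags & FS_FL_USER_MODIFIABLE;
}
```

With those header values, `portable_flags(FS_IMMUTABLE_FL | FS_NOCOW_FL)` keeps the immutable flag but drops the no-copy-on-write flag, since the latter falls outside the mask.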

Filesystems may not implement every commonly defined flag, and will refuse to set any flags at all if one of those provided isn't recognised. You can either define logic for looking up the supported flags and set those all at once, or try setting each flag in turn so you can determine whether failing to copy that particular flag is a problem.

Since the kernel doesn't expose which flags a filesystem supports at runtime, the set of flags your program thinks a filesystem supports can get out of date, so setting the flags one at a time is the most flexible option.

The code below uses ffs(3) to iterate through the bits set in the integer since C doesn't have an operator to do it, but ffs(3) may be a compiler builtin which uses special instructions.
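
The ffs(3) idiom on its own looks something like the sketch below. The helper name split_flags is hypothetical, and it assumes a 32-bit int. It works on an unsigned copy so that shifting into the top bit stays well defined.

```c
#include <strings.h>

/* Hypothetical helper: split flags into its individual set bits,
   lowest bit first. out must have room for 32 entries.
   Returns the number of entries written. */
int split_flags(int flags, int out[32]) {
    unsigned int remaining = (unsigned int)flags;
    int count = 0;

    while (remaining) {
        /* ffs() returns the 1-based index of the lowest set bit. */
        unsigned int flag = 1U << (ffs((int)remaining) - 1);

        out[count++] = (int)flag;
        /* Clear the bit we just handled so the loop terminates. */
        remaining &= ~flag;
    }
    return count;
}
```

For an input of 0x15 this writes 0x1, 0x4 and 0x10 to the array and returns 3.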

#include <strings.h>  /* ffs */
#include <sys/vfs.h>  /* fstatfs, struct statfs */

int copy_flags(int srcfd, int tgtfd, int required_flags) {
    int ret;
    int srcflags;
    int tgtflags;
    int newflags;
    struct statfs srcfs, tgtfs;

    ret = get_flags(srcfd, &srcflags);
    if (ret != 0) {
        /* If we don't support flags we have none to update. */
        if (errno == EINVAL || errno == ENOTTY)
            return 0;
        return ret;
    }

    ret = get_flags(tgtfd, &tgtflags);
    if (ret != 0) {
        if (required_flags == 0 && (errno == EINVAL || errno == ENOTTY))
            return 0;
        return ret;
    }

    ret = fstatfs(srcfd, &srcfs);
    if (ret != 0)
        return ret;

    ret = fstatfs(tgtfd, &tgtfs);
    if (ret != 0)
        return ret;

    /* If on different fs need to mask to commonly agreed flags */
    if (srcfs.f_type != tgtfs.f_type) {
        srcflags &= FS_FL_USER_MODIFIABLE;
        tgtflags &= FS_FL_USER_MODIFIABLE;
        if ((srcflags & required_flags) != required_flags) {
            errno = EINVAL;
            return -1;
        }
    }

    /* Skip setting flags if they are the same */
    if (srcflags == tgtflags)
        return 0;

    /* Clear any flags that are set which we want to remove */
    newflags = tgtflags & srcflags;
    ret = set_flags(tgtfd, &newflags);
    if (ret != 0) {
        /* Can't set flags on the target, but we didn't require any. */
        if (required_flags == 0 && errno == EINVAL)
            return 0;
        return ret;
    }
    tgtflags = newflags;

    /* Use srcflags for flags we want to set,
       which are everything not already set. */
    srcflags &= ~tgtflags;
    while (srcflags) {
        /* Shift an unsigned 1 so that reaching the top bit is well defined */
        int flag = (int)(1U << (ffs(srcflags) - 1));

        newflags = tgtflags | flag;
        ret = set_flags(tgtfd, &newflags);
        /* Fail if this flag is required and unsettable */
        if (ret != 0 && (flag & required_flags))
            return ret;
        if (ret == 0)
            tgtflags = newflags;

        srcflags &= ~flag;
    }

    return 0;
}

Driving it

Since failing to copy a flag may or may not be a problem, you need a way to decide, and since that may depend on the context it's being called in, feedback from the caller may be more useful than a heuristic.

A command-line application could ask for confirmation of whether it's acceptable to not set a flag, but this is awkward for programs used in batch scripts. You may have already noticed that the above code takes required_flags, so the user can declare which flags they consider essential.

Bitwise or-ing flags together is an acceptable C API, but it's not manageable from a command line.

We could do our own thing and name each flag, but chattr(1) already exists for modifying a file's flags, and as awkward as it can be to remember the character-to-flag association, it's better to imitate something that users would be familiar with.

#include <string.h>  /* strchr */

/* Convert a chattr style flags string into flags.
   Characters that don't name a known flag are silently ignored. */
static int parse_flags(const char *flagstr) {
    static struct flags_char {
        int flag;
        char flagchar;
    } flags_chars[] = {
        { FS_SECRM_FL, 's' },
        { FS_UNRM_FL, 'u' },
        { FS_COMPR_FL, 'c' },
        { FS_SYNC_FL, 'S' },
        { FS_IMMUTABLE_FL, 'i' },
        { FS_APPEND_FL, 'a' },
        { FS_NODUMP_FL, 'd' },
        { FS_NOATIME_FL, 'A' },
        { FS_JOURNAL_DATA_FL, 'j' },
        { FS_NOTAIL_FL, 't' },
        { FS_DIRSYNC_FL, 'D' },
        { FS_TOPDIR_FL, 'T' },
#ifdef FS_EXTENT_FL
        { FS_EXTENT_FL, 'e' },
#endif
        { FS_NOCOW_FL, 'C' },
#ifdef FS_PROJINHERIT_FL
        { FS_PROJINHERIT_FL, 'P' },
#endif
    };
    int flags = 0;

    for (size_t i = 0; i < (sizeof flags_chars / sizeof *flags_chars); i++) {
        if (strchr(flagstr, flags_chars[i].flagchar))
            flags |= flags_chars[i].flag;
    }

    return flags;
}
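
One thing to be aware of is that parse_flags quietly ignores characters it doesn't know, so a typo in a required flag goes unnoticed. A stricter variant might look like the sketch below. The name parse_flags_strict is hypothetical, and the table is abbreviated to four flags purely to keep the example short.

```c
#include <errno.h>
#include <linux/fs.h>
#include <stddef.h>

/* Hypothetical stricter parser. On success stores the flags in
   *flags_out and returns 0. On an unknown character returns -1
   with errno set to EINVAL and leaves *flags_out untouched. */
int parse_flags_strict(const char *flagstr, int *flags_out) {
    /* Abbreviated table; a real one would list every flag. */
    static const struct { int flag; char flagchar; } table[] = {
        { FS_IMMUTABLE_FL, 'i' },
        { FS_APPEND_FL, 'a' },
        { FS_NODUMP_FL, 'd' },
        { FS_NOATIME_FL, 'A' },
    };
    const size_t ntable = sizeof table / sizeof *table;
    int flags = 0;

    for (const char *p = flagstr; *p; p++) {
        size_t i;

        for (i = 0; i < ntable; i++) {
            if (table[i].flagchar == *p) {
                flags |= table[i].flag;
                break;
            }
        }
        if (i == ntable) {
            /* Reached the end of the table without a match. */
            errno = EINVAL;
            return -1;
        }
    }

    *flags_out = flags;
    return 0;
}
```

The result could then be passed as required_flags to copy_flags, with a parse failure reported to the user before any copying starts.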

As with previous articles, the full version of the my-mv.c source file and the Makefile may be downloaded.

Conclusion

Well, that was more work than I expected, but we've copied all the metadata now, right?

Well, no. It turns out 32 possible flags isn't enough.

Flags are compact and relatively easy to set, but 32 booleans is just not enough for everything you need, especially when some get reserved for other filesystems or are read-only.

A solution to this would be to just add more ioctls, but that would lead to the same problems with needing to know how to translate them between different filesystems.

What's needed is a unified API for setting this extra information, these… extended attributes. We'll cover these next time.