Locking is a concurrency primitive, usually allowing shared/read locking and exclusive/write locking.

Linux allows you to take locks on files.

There are the POSIX lockf(3) file locks, which let you lock a specific range of a file, but have various pitfalls: they belong to the process rather than the file descriptor, and are released when the process closes any descriptor for the file.

There are BSD flock(2) file locks, which are per-file-descriptor rather than per-process locks, but don't allow locking a file range.

There are also the newer (since Linux 3.15) Linux-specific F_OFD_{SETLK,SETLKW,GETLK} fcntl(2) commands, which are file-descriptor bound and offer file ranges.

I'm only interested in the file-descriptor bound locks, and to keep that simple I'm not going to use file ranges, so we're going to discuss flock(2) and flock(1).

For the sake of conciseness, I will be using python and shell for examples, rather than using the C API directly.

Taking locks

The basic principle is that, since it's a file-descriptor bound lock, you need to open the file, then use the locking call on the file descriptor.

The following programs can be used to atomically generate unique IDs.

They will wait for any other programs that may be using the file to finish before reading, updating and writing to it.

#!/bin/bash
LOCKFILE="$1"
exec 100<>"$LOCKFILE"          # open the lock file read-write on descriptor 100
flock 100                      # take an exclusive lock on that descriptor
read -u 100 num                # read the current ID
printf '%d\n' "$num"
num="$(expr "${num:-0}" + 1)"  # treat an empty file as 0
# Need to re-open (via /proc) to be able to write to the beginning of the file
printf '%d\n' "$num" >/proc/self/fd/100

The logic is similar in python, but without having to shell out to flock(1).

#!/usr/bin/python
import sys
from fcntl import flock, LOCK_EX, LOCK_SH, LOCK_UN
lockfile = sys.argv[1]
# The counter file must already exist; open it read-write without truncating.
with open(lockfile, 'r+') as f:
    flock(f.fileno(), LOCK_EX)   # take an exclusive lock
    num = int(f.read() or 0)     # treat an empty file as 0
    print(num)
    num += 1
    f.seek(0)                    # rewind so the new value overwrites the old
    f.write('%d\n' % num)

Releasing locks

File descriptor locks are released when every descriptor referring to the open file description is closed, or explicitly with flock(fd, LOCK_UN).

Since file descriptors are closed on process termination, the shell program will release the lock when its process terminates.

The python program uses a context manager for the file, which closes the file at the end of the with block, so the lock is released before the process terminates.

If this were made more explicit, the shell program would end with:

flock -u 100  # explicitly release the lock
exec 100>&-   # close the file descriptor

The python program would end with:

    flock(f.fileno(), LOCK_UN)

Using flock(1) with a command to run (flock LOCKFILE COMMAND...) hence releases the lock once the command has finished.

Lock contention

The purpose of locking is to ensure that resources are protected while the lock is held.

If a lock is not held by any open file descriptors, you can always take it.

There are two ways to hold a lock, exclusively and shared.

The former is the default for the flock(1) command, though it can be explicitly chosen with the --exclusive option, or with the LOCK_EX flag as shown with the python program using the flock(2) syscall.

The latter can be requested with the --shared option, or the LOCK_SH flag.

If the file is currently locked exclusively, or you want to take an exclusive lock and any lock is already held, you can't take the lock.

If the held lock is a shared lock though, you can take a shared lock on the file.
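
To illustrate, here is a minimal python sketch (example.lock is just a placeholder file name) showing that two descriptors on the same file can hold shared locks at the same time:

#!/usr/bin/python
# A minimal sketch: two descriptors on the same file can hold shared locks
# at the same time ('example.lock' is just a placeholder file name).
from fcntl import flock, LOCK_SH

with open('example.lock', 'w') as a, open('example.lock', 'w') as b:
    flock(a.fileno(), LOCK_SH)
    flock(b.fileno(), LOCK_SH)  # does not block: shared locks can coexist
    print('both shared locks held')
    # an exclusive lock (LOCK_EX) on a third descriptor would block here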

Blocking vs non-blocking

When the lock you are trying to take is contended, you can either wait for it to be released, or fail immediately so you can try something else.

When attempting to take a contended lock, the default is to wait for it to be released; with flock(2), however, you can pass LOCK_EX|LOCK_NB or LOCK_SH|LOCK_NB instead of LOCK_EX or LOCK_SH to make the request non-blocking, so it returns immediately (failing with EWOULDBLOCK) if the lock is contended.
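
A minimal sketch of such a non-blocking attempt (again with a placeholder lock file name; on Linux EWOULDBLOCK is the same value as EAGAIN):

#!/usr/bin/python
# A minimal sketch of a non-blocking lock attempt; 'example.lock' is just a
# placeholder name.
from errno import EAGAIN
from fcntl import flock, LOCK_EX, LOCK_NB

with open('example.lock', 'w') as f:
    try:
        flock(f.fileno(), LOCK_EX | LOCK_NB)
    except IOError as e:
        if e.errno != EAGAIN:
            raise
        print('lock is contended, doing something else instead')
    else:
        print('lock taken')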

When using flock(1) you would pass --nonblock to do this, and while blocking is the default, you can pass --wait to make it block explicitly.

Blocking locks have the advantage that your process is suspended until the lock becomes available, you are woken up as soon as you can take it, and if there is a queue of processes wanting the lock, those that were already waiting get it before those that weren't.

However, the danger of blocking locks is that if the lock is never released, you will never be woken up.

This is a problem when your process needs to be responsive to input.

This can be worked around by having a separate thread to handle user responses, but at some point you've got to draw the line, and say that not being able to take the lock in time is an error.

The neatest way to do this is to use flock(1)'s --timeout option, which you would use from python as:

from subprocess import check_call, CalledProcessError
from errno import EAGAIN
from os import strerror
def take_lock(fd, timeout=None, shared=False):
    # flock(1) takes the lock on the inherited descriptor, so it is held by
    # this process too (with Python 3's subprocess you may need pass_fds=[fd]).
    try:
        check_call(['flock',
                    '--wait' if timeout is None else '--timeout=%d' % timeout,
                    '--shared' if shared else '--exclusive',
                    '--conflict-exit-code=75',  # EX_TEMPFAIL
                    str(fd)])
    except CalledProcessError as e:
        if e.returncode == 75:
            raise IOError(EAGAIN, strerror(EAGAIN))
        raise

with open(lockfile, 'r') as f:
    take_lock(f.fileno(), timeout=30, shared=True)

Note: Old versions of flock(1) may not support --conflict-exit-code.

It is possible to do locking with a timeout in native python code by using setitimer(2):

#!/usr/bin/python
from fcntl import flock, LOCK_SH, LOCK_EX, LOCK_NB
from os import strerror
from signal import signal, SIGALRM, setitimer, ITIMER_REAL
from sys import exit

def take_lock(fd, timeout=None, shared=False):
    if timeout is None or timeout == 0:
        flock(fd, (LOCK_SH if shared else LOCK_EX)|(LOCK_NB if timeout == 0 else 0))
        return
    # A no-op handler: the alarm just interrupts flock(2), which then fails
    # with EINTR (under Python 3.5+ interrupted calls are retried unless the
    # handler raises an exception, so there the handler would need to raise)
    signal(SIGALRM, lambda *_: None)
    setitimer(ITIMER_REAL, timeout)
    # Racy: alarm could be delivered before we try to lock
    flock(fd, LOCK_SH if shared else LOCK_EX)

if __name__ == '__main__':
    from argparse import ArgumentParser
    parser = ArgumentParser()
    parser.add_argument('--shared', action='store_true', default=False)
    parser.add_argument('--exclusive', dest='shared', action='store_false')
    parser.add_argument('--timeout', default=None, type=int)
    parser.add_argument('--wait', dest='timeout', action='store_const', const=None)
    parser.add_argument('--nonblock', dest='timeout', action='store_const', const=0)
    parser.add_argument('file')
    parser.add_argument('argv', nargs='*')
    opts = parser.parse_args()
    if len(opts.argv) == 0:
        fd = int(opts.file)
        take_lock(fd, opts.timeout, opts.shared)
    else:
        from subprocess import call
        with open(opts.file, 'r') as f:
            take_lock(f.fileno(), opts.timeout, opts.shared)
            exit(call(opts.argv))

However, since signals and timers are process-global state, doing it that way can give a process surprising side-effects, especially in a threaded environment, which makes the behaviour of the program harder to reason about.

It may be nicer to run the blocking flock(2) in a subprocess, just to avoid having to make your main program deal with signals.

The following version works as before, but uses multiprocessing to run the flock(2) call in a subprocess and pass any exception back to the main process.

#!/usr/bin/python
from errno import EINTR, EAGAIN
from fcntl import flock, LOCK_SH, LOCK_EX, LOCK_NB
from multiprocessing import Pipe, Process
from os import strerror
from signal import signal, SIGALRM, setitimer, ITIMER_REAL
from sys import exit

def _set_alarm_and_lock(fd, pipew, timeout, shared):
    try:
        signal(SIGALRM, lambda *_: None)
        setitimer(ITIMER_REAL, timeout)
        # Racy: alarm could be delivered before we try to lock
        flock(fd, LOCK_SH if shared else LOCK_EX)
    except BaseException as e:
        # This loses the traceback, but it's not pickleable anyway
        pipew.send(e)
        exit(1)
    else:
        pipew.send(None)
        exit(0)

def take_lock(fd, timeout=None, shared=False):
    if timeout is None or timeout == 0:
        flock(fd, (LOCK_SH if shared else LOCK_EX)|(LOCK_NB if timeout == 0 else 0))
        return
    piper, pipew = Pipe(duplex=False)
    # The child (forked on Linux) shares our open file description for fd, so
    # the lock it takes is still held by this process after the child exits.
    p = Process(target=_set_alarm_and_lock,
                args=(fd, pipew, timeout, shared))
    p.start()
    err = piper.recv()
    p.join()
    if err:
        if isinstance(err, IOError) and err.errno == EINTR:
            raise IOError(EAGAIN, strerror(EAGAIN))
        raise err

if __name__ == '__main__':
    from argparse import ArgumentParser
    parser = ArgumentParser()
    parser.add_argument('--shared', action='store_true', default=False)
    parser.add_argument('--exclusive', dest='shared', action='store_false')
    parser.add_argument('--timeout', default=None, type=int)
    parser.add_argument('--wait', dest='timeout', action='store_const', const=None)
    parser.add_argument('--nonblock', dest='timeout', action='store_const', const=0)
    parser.add_argument('file')
    parser.add_argument('argv', nargs='*')
    opts = parser.parse_args()
    if len(opts.argv) == 0:
        fd = int(opts.file)
        take_lock(fd, opts.timeout, opts.shared)
    else:
        from subprocess import call
        with open(opts.file, 'r') as f:
            take_lock(f.fileno(), opts.timeout, opts.shared)
            exit(call(opts.argv))

Converting locks

Calling flock(2) with LOCK_SH on a descriptor on which you hold LOCK_EX (or flock(1) with --shared on a descriptor you locked with --exclusive) converts the exclusive lock into a shared lock.

Similarly, you can go the other way, converting a shared lock into an exclusive one, though this counts as a contended lock if there are any other holders of shared locks.

You may want to do this when managing the lifetime of a resource.

You would want to hold an exclusive lock while creating the resource, so that any concurrent users know it is being set up and can take a blocking lock to wait for it to be ready.

After the resource has been set up, you would convert it to a shared lock, so you can use it yourself, and any other processes wanting to take a shared lock to use it can be woken up and start using it.

When you are finished using the resource you can convert it to an exclusive lock.

You will then know that when you have taken the exclusive lock, that there can be no other users of the resource, so it is safe to remove it.
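
Put together, a minimal sketch of that lifetime pattern might look like the following (resource.lock and the setup/use/removal steps are placeholders):

#!/usr/bin/python
# A minimal sketch of the resource lifetime pattern described above;
# 'resource.lock' and the setup/use/removal steps are placeholders.
from fcntl import flock, LOCK_SH, LOCK_EX

with open('resource.lock', 'w') as f:
    flock(f.fileno(), LOCK_EX)  # exclusive while setting the resource up
    # ... create the resource ...
    flock(f.fileno(), LOCK_SH)  # convert to shared: other users can now join in
    # ... use the resource ...
    flock(f.fileno(), LOCK_EX)  # convert back: blocks until we are the only user
    # ... safe to remove the resource ...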

Lock conversion is important because you don't want to unlock and then re-take the lock: that leaves a window in which the file is unlocked, which other concurrent users might take to mean the resource can be cleaned up. (Be aware, though, that flock(2) documents conversion on Linux as not guaranteed to be atomic either: the existing lock is removed and a new one established, so a pending request may be granted in between.)

You'd end up cleaning it up just after setting it up, before you had a chance to use it.

Posted Wed Oct 7 11:00:06 2015

An esoteric programming language might be defined as a language which is Turing Complete but not necessarily useful. Examples include brainf**k, whitespace, befunge, malbolge and many more.

There are no real practical reasons to learn an esoteric language. It will never be the easiest way to solve a problem, nor will it be the most readable or most efficient. However, as well as being a lot of fun, it can be a valuable lesson in thinking logically and, of course, is a great way to impress potential mates.

Writing a program in an esoteric language could be thought of as an artistic endeavour. A sculptor doesn't sculpt in the hope that their sculpture may one day support a bridge or be used as part of a wall. They sculpt because they enjoy it and also perhaps to show off their ability.

If you want to learn an esoteric language then a good way to start is to look at some existing example scripts. Learn, in great depth, how those examples work. Try modifying them; if that works as expected, you can try writing your own scripts from scratch. You will want to think carefully about how to break the problem you are trying to solve down into very small and very easily measurable pieces. This is because an esoteric script is very, very hard to debug.

Another way to have fun with esoteric languages is to create your own. There are plenty of existing tools you can use to help you create a language. However, that is a subject for another article, another day.

Posted Wed Oct 14 11:00:07 2015

Something which a lot of people don't realise is that (a) it is possible to write free/open software for closed platforms, and (b) that doing so can be fraught with interesting licence issues. Many of you will, by now (hopefully), be aware of the implications of copyleft licences such as the GNU GPL. Some of you may even be aware of Creative Commons and their range of licences which include clauses such as non-commercial use.

When you're looking to include free/open software or other artefacts into your closed platform, you need to be very careful to audit their licences to ensure that you're not doing something you're not licensed to do. For example, you cannot even aggregate GPL content into your product without complying with certain terms around advertising the fact that it's there, offering to provide the source including your changes, etc. It can get very dangerous if you're incorporating source code files directly into a product binary. There are compatible options such as the BSD or ISC licences, though they provide weaker assurances of freedom.

If you do decide to go down the path of using free/open software and resources on your closed platform (and these days, the proliferation of Linux-based products suggests that you are far more likely to do so than not) then you should ensure that you're aware of the licences present on your system and the implications of having them there on the delivered product.

Posted Wed Oct 21 11:00:06 2015

You will, from time to time, feel the need to run a command at a specific time.

at(1) will run a command at a specified time, so if you need a reminder that you should go to bed at midnight, you could run the following command:

$ at midnight
at> echo Go to bed | wall
at> ^D

If you instead need to run a command at regular times, such as a backup job, then you can add a persistent timed job with crontab(1).

You would run crontab -e to add a command to your crontab, as described in crontab(5).

To run a nightly obnam backup (instructions taken from Bastian Rieck's blog), add the following to the crontab file.

0 20 * * * /usr/bin/obnam backup $HOME

This could instead be written as a systemd timer unit, which can be used without a separate cron service running, by creating two configuration files as follows:

$ mkdir -p ~/.config/systemd/user
$ cat >~/.config/systemd/user/backup.timer <<'EOF'
> [Unit]
> Description=Backup timer
> [Timer]
> OnCalendar=daily
> [Install]
> WantedBy=default.target
> EOF
$ cat >~/.config/systemd/user/backup.service <<'EOF'
> [Unit]
> Description=Backup Service
> [Service]
> Type=simple
> ExecStart=/usr/bin/obnam backup %h
> [Install]
> WantedBy=default.target
> EOF
$ systemctl --user daemon-reload
$ systemctl --user enable backup.timer
$ systemctl --user start backup.timer

This is more verbose than the cron syntax, though arguably less arcane.

However, systemd timer units have the advantage of allowing you to set WakeSystem=true in the [Timer] section, which will wake your system from suspend to react to timer events.

There are instructions on how to use this to make an alarm clock on Joey Hess' blog.

Posted Wed Oct 28 12:00:07 2015