Disclaimer

systemd is a contentious subject in the Linux world; you don't need to go far to read arguments for or against it. This article is not about whether it's good or bad, but in the interests of full disclosure, I feel it is important to declare that I am a systemd contributor: as at the time of writing I have 1 line of code in systemd. Hence I am biased, as I wouldn't waste my time on something I didn't think was worthwhile.

What's with the title?

One of the lead developers wrote an article series called systemd for administrators, whose goal was to explain how useful systemd is for system administrators. The systemd project has widened in scope since then, and the full extent of what systemd provides is not common knowledge.

What is systemd?

systemd is both the name of a project to provide the boring bits needed to have a working Linux system, and the name of the init process systemd(1), which is the first process run in a Linux system, and is responsible for starting all others.

systemd as PID 1

systemd(1) for initial start-up and management of services

You configure services by putting systemd.service(5) files in /etc/systemd/system, and do run-time manipulation with the systemctl(1) tool.
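For example, a minimal service file might look like this (the unit name and binary path here are hypothetical):

```
# /etc/systemd/system/my-daemon.service
[Unit]
Description=My example daemon

[Service]
ExecStart=/usr/local/bin/my-daemon
```

After running systemctl daemon-reload so systemd picks up the new file, you can manage it with systemctl start my-daemon.service, check it with systemctl status my-daemon.service, and make it start at boot with systemctl enable my-daemon.service.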

systemd.service(5) files are a DSL for all the tricky start-up configuration that was previously encoded in fragile init scripts, and they generalise the on-demand service start-up of inetd(8) into socket activation.
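As a sketch of socket activation (the unit name and port are hypothetical), a matching systemd.socket(5) unit declares the listening address, and systemd starts the service when the first client connects:

```
# /etc/systemd/system/my-daemon.socket
[Socket]
# systemd listens here and hands the socket over to
# my-daemon.service when the first connection arrives.
ListenStream=1234

[Install]
WantedBy=sockets.target
```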

One of the trickiest parts of start-up is ensuring correct dependency ordering, so services that depend on other services are started after they are ready.

To handle this, you can set Before= in the unit file of the service you depend on, or After= in your own unit file, and ensure that your services use the right Type= setting so that systemd is notified when each service is ready.
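As a sketch (unit names hypothetical), a web service that must wait for a database would carry:

```
# /etc/systemd/system/my-web.service
[Unit]
Description=My web service
# After= orders us after my-db.service; Requires= additionally
# pulls it in as a dependency.
After=my-db.service
Requires=my-db.service

[Service]
# Type=notify: systemd waits for the daemon to announce READY=1
# via sd_notify(3) before considering it started, so units ordered
# After= this one really do start once it is ready.
Type=notify
ExecStart=/usr/local/bin/my-web
```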

For one-off commands that you'd still like to benefit from systemd's service management, you can use systemd-run(1).
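For example (the command here is a placeholder):

```shell
# Run a one-off command as a transient service unit...
systemd-run --unit=my-batch-job /usr/local/bin/some-long-command
# ...which can then be inspected like any other service.
systemctl status my-batch-job.service
```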

Conclusion

I emphasised the difference between systemd the project and systemd(1) the daemon that inhabits PID 1, because the former provides more services than just the latter, which will be described in future articles.

Posted Wed Apr 1 11:00:09 2015 Tags:

A daemon is a process that is always running in the background to provide some form of service. This is different from the nasal demons that may be summoned when you invoke undefined behaviour while writing a C program.

This form of daemon is named after Maxwell's demon, rather than having any diabolic undertones.

The tricky parts of starting a daemon are disassociating oneself from the process that launched it, since a great deal of context is inherited from the parent process; and providing notification when the service has started.

We're going to illustrate how this works with a python script that creates a daemon that writes the numbers 1 to 10 to a file, one every second before exiting.

Detaching from parent environment

#!/usr/bin/python
#test.py
import os, sys, time
filename = os.path.abspath(sys.argv[1])

# To disassociate ourself from our parent process we do a "double fork"
# the first fork being the fork that started our launcher script
daemonpid = os.fork()
if daemonpid == 0:
    # In the daemon process

    # Another thing we inherited from our parent process is our
    # "controlling tty", which can be accessed by opening /dev/tty. We want
    # to isolate ourself from our controlling tty so that a daemon we start
    # cannot access our terminal for input or output.
    # One way to do this is call setsid(), which gives us a new session.
    # Starting a new session also means that we no longer get signals
    # intended for all processes in the launcher session, or all processes
    # in our process group, as would be invoked by using kill() with a
    # negative process group ID.
    os.setsid()

    # Now we also inherited a lot of environment variables, so we're going
    # to clean them all out. Other daemons may need to preserve some
    # variables, but we don't need to.
    for var in os.environ:
        os.unsetenv(var)

    # There's also the umask(), which is inherited and affects what we set
    # the file permissions to by default.
    os.umask(0)

    # We also inherited a bunch of file descriptors. A daemon should not
    # hold on to any of them, so we must close them all. One rookie
    # mistake is forgetting to close stdout, and having some debug
    # output be produced by the program. If this is left in, then the
    # daemon will crash when the user logs out, because its output
    # terminal no longer exists.
    for fdnum in os.listdir('/proc/self/fd'):
        try:
            # The fd used to list /proc/self/fd is already closed by
            # the time we iterate, so closing it again fails harmlessly.
            os.close(int(fdnum))
        except OSError:
            pass

    # Because a lot of standard library functions assume that the standard
    # input, output and error are connected, we should replace them here.
    os.open('/dev/null', os.O_RDONLY)
    os.open('/dev/null', os.O_WRONLY)
    os.open('/dev/null', os.O_WRONLY)

    # We should also change the current directory, as we will prevent it
    # from being unmounted otherwise.
    os.chdir('/')

    # Now we have disassociated ourselves from the majority of our
    # context, we can perform our operation
    # Note: we add the O_NOCTTY flag because the file name comes from
    # user input, and if the user pointed us at a terminal device,
    # opening it without O_NOCTTY would give us a controlling tty again.
    fd = os.open(filename, os.O_WRONLY|os.O_NOCTTY|os.O_CREAT)
    with os.fdopen(fd, 'w') as f:
        for i in xrange(1, 11):
            f.write('%s\n' % i)
            f.flush()
            time.sleep(1)
    sys.exit(0)

elif daemonpid > 0:
    # In the launcher process

    # Normally when a process dies, its parent process has to call some
    # form of wait() to get information about how it exited, so it can be
    # cleaned up.
    # It gets asynchronously notified with a SIGCHLD, which is annoying,
    # since we would have to have our launcher process live as long as our
    # daemon so we could clean it up.
    # However, if we exit here, then our daemon process is re-parented to
    # process 1, the "init" process, so it is owned by that, rather than
    # the process that started the "launcher" script.
    sys.exit(0)

Instructing the daemon to terminate

However, what if we want it to continue operating until we say it should stop? The traditional way to do this is to have the daemon write a "pid file" to say which process ID should be killed to stop the daemon.

#!/usr/bin/python
#test.py
import os, sys, time, itertools
filename = os.path.abspath(sys.argv[1])
pidfile = os.path.abspath(sys.argv[2])

daemonpid = os.fork()
if daemonpid == 0:
    os.setsid()
    for var in os.environ:
        os.unsetenv(var)
    os.umask(0)
    for fdnum in os.listdir('/proc/self/fd'):
        try:
            os.close(int(fdnum))
        except OSError:
            pass
    os.open('/dev/null', os.O_RDONLY)
    os.open('/dev/null', os.O_WRONLY)
    os.open('/dev/null', os.O_WRONLY)
    os.chdir('/')
    fd = os.open(filename, os.O_WRONLY|os.O_NOCTTY|os.O_CREAT)

    # As soon as we are ready, and not sooner, we write our daemon's pid
    # to the pid file. Writing it earlier could leave behind a pid file
    # for a process that failed to start, hence cannot be stopped, and
    # some software may opt to wait for the pid file to be written as a
    # form of start-up notification.
    pidfd = os.open(pidfile, os.O_WRONLY|os.O_NOCTTY|os.O_CREAT)
    with os.fdopen(pidfd, 'w') as pidfobj:
        pidfobj.write('%s\n' % os.getpid())

    with os.fdopen(fd, 'w') as f:
        for i in itertools.count():
            f.write('%s\n' % i)
            f.flush()
            time.sleep(1)
elif daemonpid > 0:
    sys.exit(0)

We can then, if we invoke the script as ./test.py test.out test.pid, run kill $(cat test.pid) to end the service.

Synchronous daemon start

However, this still claims to have successfully started the daemon before it has actually finished starting, and has no way of reporting whether starting actually failed. We alluded to possibly using the contents of the pid file as a signal for whether it started successfully, but that would require you either to poll the pid file or to watch for writes to it with inotify, and also to handle SIGCHLD if the daemon fails to start.

It's a lot easier to just open a pipe and use the end of file marker for whether the process is either ready or failed. This works because you get an end of file if the write-end is closed, which also happens if the process terminates.
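The trick can be demonstrated in isolation with a few lines of Python (a sketch, not part of the daemon itself):

```python
import os

read_end, write_end = os.pipe()
child = os.fork()
if child == 0:
    # Daemon side: close the read end, do some "start-up work",
    # then close the write end to signal readiness.
    os.close(read_end)
    os.close(write_end)  # the readiness signal: EOF on the pipe
    os._exit(0)
else:
    # Launcher side: close our copy of the write end, then block
    # until the child closes (or exits and thereby closes) its copy.
    os.close(write_end)
    data = os.read(read_end, 4096)
    # An empty read means end of file: the child is ready (or dead).
    assert not data
    os.waitpid(child, 0)
    print('child signalled readiness via pipe EOF')
```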

#!/usr/bin/python
#test.py
import os, sys, time, itertools
filename = os.path.abspath(sys.argv[1])
pidfile = os.path.abspath(sys.argv[2])

# Here we create our pipe for status reporting, we want it in blocking mode
# so that our launcher process can sleep until it has status to report.
status_pipe_read, status_pipe_write = os.pipe()

daemonpid = os.fork()
if daemonpid == 0:

    # We don't _need_ to close it here, as we close every unused file
    # descriptor later, but it is closed here to illustrate that the
    # daemon process does not need the reading end.
    os.close(status_pipe_read)

    os.setsid()
    for var in os.environ:
        os.unsetenv(var)
    os.umask(0)
    for fdnum in os.listdir('/proc/self/fd'):
        if int(fdnum) == status_pipe_write:
            continue
        try:
            os.close(int(fdnum))
        except OSError:
            pass
    os.open('/dev/null', os.O_RDONLY)
    os.open('/dev/null', os.O_WRONLY)
    os.open('/dev/null', os.O_WRONLY)
    os.chdir('/')
    fd = os.open(filename, os.O_WRONLY|os.O_NOCTTY|os.O_CREAT)
    pidfd = os.open(pidfile, os.O_WRONLY|os.O_NOCTTY|os.O_CREAT)
    with os.fdopen(pidfd, 'w') as pidfobj:
        pidfobj.write('%s\n' % os.getpid())
    with os.fdopen(fd, 'w') as f:

        # We must close the status pipe file descriptor now to signal eof
        os.close(status_pipe_write)

        for i in itertools.count():
            f.write('%s\n' % i)
            f.flush()
            time.sleep(1)
elif daemonpid > 0:

    # In the launcher process we *must* close the write end of the pipe, or
    # we would not get an end of file on the pipe when it is closed in the
    # daemon process.
    os.close(status_pipe_write)

    # Now we wait for status output
    status = os.read(status_pipe_read, 4096)
    assert status == ''

    # Now we must handle the exit status of the process
    deadpid, status = os.waitpid(daemonpid, os.WNOHANG)
    # If our daemon has prematurely exited, we use its exit code
    if os.WIFEXITED(status):
        ecode = os.WEXITSTATUS(status)
        if ecode != 0:
            sys.stderr.write('Daemon launch failed: %d\n' % ecode)
        sys.exit(ecode)
    else:
        sys.exit(0)

Supporting fork-unsafe libraries

However, there are some libraries that misbehave after a fork, perhaps because they use threads: after a fork, the child only gets a copy of the thread that called fork(). To handle this appropriately you need to exec a helper binary after the fork, so you ought to split your daemon into launcher and payload binaries if you ever use such libraries.

#!/usr/bin/python
#launcher.py
import os, sys
filename = os.path.abspath(sys.argv[1])
pidfile = os.path.abspath(sys.argv[2])
payload = os.path.abspath('payload.py')
status_pipe_read, status_pipe_write = os.pipe()
daemonpid = os.fork()
if daemonpid == 0:
    os.setsid()
    for var in os.environ:
        os.unsetenv(var)
    os.umask(0)
    for fdnum in os.listdir('/proc/self/fd'):
        if int(fdnum) == status_pipe_write:
            continue
        try:
            os.close(int(fdnum))
        except OSError:
            pass
    os.open('/dev/null', os.O_RDONLY)
    os.open('/dev/null', os.O_WRONLY)
    os.open('/dev/null', os.O_WRONLY)
    os.chdir('/')

    # call out to our payload, with the protocol that the status fd is
    # the first argument, the pid file path the second, and the output
    # file path the third
    os.execv(payload, [payload, str(status_pipe_write), pidfile, filename])

elif daemonpid > 0:
    os.close(status_pipe_write)
    status = os.read(status_pipe_read, 4096)
    assert status == ''
    deadpid, status = os.waitpid(daemonpid, os.WNOHANG)
    if os.WIFEXITED(status):
        ecode = os.WEXITSTATUS(status)
        if ecode != 0:
            sys.stderr.write('Daemon launch failed: %d\n' % ecode)
        sys.exit(ecode)
    else:
        sys.exit(0)

We also have a separate payload script, that should be located in the same directory we run the launcher from.

#!/usr/bin/python
#payload.py
import sys, os, time, itertools
status_pipe_write = int(sys.argv[1])
pidfile = sys.argv[2]
filename = sys.argv[3]

fd = os.open(filename, os.O_WRONLY|os.O_NOCTTY|os.O_CREAT)
pidfd = os.open(pidfile, os.O_WRONLY|os.O_NOCTTY|os.O_CREAT)
with os.fdopen(pidfd, 'w') as pidfobj:
    pidfobj.write('%s\n' % os.getpid())
with os.fdopen(fd, 'w') as f:
    os.close(status_pipe_write)
    for i in itertools.count():
        f.write('%s\n' % i)
        f.flush()
        time.sleep(1)

Now we have a daemon we can start in either a Type=forking systemd service unit or SysV initscript, and our launcher would be reusable for other kinds of payload with a bit of extra work.
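A corresponding Type=forking unit might look like this (all paths hypothetical):

```
# /etc/systemd/system/counter.service
[Service]
# Type=forking: the service counts as started once the launcher
# process exits; PIDFile= tells systemd which process is the daemon.
Type=forking
PIDFile=/run/counter.pid
ExecStart=/usr/local/bin/launcher.py /var/lib/counter/out /run/counter.pid
```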

Shedding most of the code and using systemd notify

If we accepted the limitation of only working with systemd, we could use a Type=notify systemd unit, changing our notification mechanism and doing away with the launcher script entirely.

#!/usr/bin/python
#payload-notify.py
import sys, os, time, itertools
from systemd.daemon import notify
pidfile = sys.argv[1]
filename = sys.argv[2]

fd = os.open(filename, os.O_WRONLY|os.O_NOCTTY|os.O_CREAT)
pidfd = os.open(pidfile, os.O_WRONLY|os.O_NOCTTY|os.O_CREAT)
with os.fdopen(pidfd, 'w') as pidfobj:
    pidfobj.write('%s\n' % os.getpid())
with os.fdopen(fd, 'w') as f:
    notify('READY=1')
    for i in itertools.count():
        f.write('%s\n' % i)
        f.flush()
        time.sleep(1)
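The matching Type=notify unit would then be (paths hypothetical):

```
# /etc/systemd/system/counter.service
[Service]
# Type=notify: systemd waits for READY=1 on the notification socket
# before considering the service started.
Type=notify
ExecStart=/usr/local/bin/payload-notify.py /run/counter.pid /var/lib/counter/out
```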
Posted Wed Apr 8 11:00:14 2015 Tags:

udev(8) is the latest development in the complicated history of device management in Linux. It has not been entirely uncontroversial. It was previously hated for its decision to change syntax. Its integration into systemd caused the eudev fork to happen, and the decision to remove firmware loading support made Linus grumble.

This article isn't about that though, this is about what udev(8) is, what it can do, and how you would use it.

udev(8) manages the contents of /dev: it controls which users and groups may access which devices, and under which names, and it provides an API for other services to receive notification when devices change.

You tend not to need to interact with udev(8) directly unless you have obscure hardware that you need to categorise, or you need to perform actions when a device appears.

Granting permission to access devices

If the device has already been processed such that there's an entry in /dev, then you can use udevadm(8) and the steps in this ArchLinux wiki page on writing udev rules to work out how to create a persistent symlink.

To change the group of the device node, so you don't need to be root to use it, you can add GROUP="GROUPNAME", substituting the group name wanted, and set the file permission bits with MODE="0660" to make that group able to read and write that device.
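For example, a rule like the following (the USB vendor and product IDs are placeholders you would discover with udevadm info) grants the plugdev group access and creates a stable symlink:

```
# /etc/udev/rules.d/50-usb-thing.rules
SUBSYSTEM=="usb", ATTRS{idVendor}=="1234", ATTRS{idProduct}=="5678", \
    SYMLINK+="my-usb-thing", GROUP="plugdev", MODE="0660"
```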

Performing actions when a device appears

Short, simple scripts

The simplest way to do this is to add RUN+="/path/to/your/script" to your udev(8) rule; however, this is only suitable for short-lived scripts, as they will be killed if they are still running after a timeout period.

Starting a daemon per-device

If you need a longer-running service, then you should integrate it with a systemd unit, by defining the unit like:

cat >/etc/systemd/system/my-service@.service <<'EOF'
[Unit]
Description=My Service
[Service]
Type=simple
ExecStart=/path/to/your/script %I
EOF

And add ENV{SYSTEMD_WANTS}="my-service@%k.service" to the udev rule.

This will run /path/to/your/script and pass it the path to the device that has just appeared.
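A complete rule might look like the following (the match itself is hypothetical); as far as I'm aware the device also needs TAG+="systemd", so that systemd creates a device unit to attach the wants to:

```
# /etc/udev/rules.d/99-my-service.rules
SUBSYSTEM=="block", KERNEL=="sd?", TAG+="systemd", \
    ENV{SYSTEMD_WANTS}="my-service@%k.service"
```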

Handling device events from a currently-running daemon

Sometimes it is not appropriate to have a model where there is an instance of your service running for every device that is added. In this case, it is more appropriate to have your service listen for udev events and handle the devices when they appear itself.

The details on how to do this vary with the programming language bindings, but the general flow is:

  1. Initialize the udev library context
  2. Create a udev monitor
  3. Configure the monitor to filter it to the class of devices you care about
  4. Add the monitor to an event loop
  5. Wait for udev to have a matching entry for you
  6. Read the device entry out of udev and process the event

Depending on your programming language of choice, you may want a guide for how to listen for devices with libudev, or pyudev.
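As a sketch with the pyudev bindings (filtering for block devices as an arbitrary example), the steps above look something like:

```python
import pyudev

# 1. Initialize the udev library context
context = pyudev.Context()
# 2-3. Create a monitor and filter it to the devices we care about
monitor = pyudev.Monitor.from_netlink(context)
monitor.filter_by(subsystem='block')
# 4-6. A simple blocking event loop; poll() returns a Device per event.
# A real service would add the monitor's fd to its own event loop.
for device in iter(monitor.poll, None):
    print('%s: %s' % (device.action, device.device_node))
```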

Posted Wed Apr 15 11:00:10 2015 Tags:

It has been a while since I last wrote an article about how I make my life easier for myself and I thought today I might share a tidbit with you, dear reader, which might make your life a bit easier too.

Many blogs, news sites, etc, produce what is known as a 'feed' (commonly an rss or atom feed) and there exist a number of 'feed readers' such as liferea if you like desktop apps, or feedly if you want a website. I (as you might expect if you've read any of my other articles where I detail how I work) use something else. I use a tool called rss2email (or r2e).

I love r2e because it lets me keep up with a number of feeds (currently 4, though that number has been as low as 1 and as high as 7 in the past year) without making it hard for me to do so. I thought I'd share my configuration with you all, so that if it piques your interest, you might give it a go.

To do this, you'll need a system which is up at least most of the time, and a friendly mail server.

Getting and Setting up r2e

If you're using Debian or a derivative such as Ubuntu:

$ sudo apt-get install rss2email

Otherwise, have a poke in your distribution's package repository and grab it, or get it from the rss2email website and install it yourself.

Once you have the r2e program available, there's a few steps to follow.

First make a .rss2email directory in your $HOME and put a config.py into it. The full possibilities of config.py are detailed in the example config.py provided by the rss2email author. Personally I simply have:

DEFAULT_FROM = 'dsilvers+yakking-changeme@digital-scurf.org'
DATE_HEADER = 1
UNICODE_SNOB = 1

These ensure that emails are sent from a valid email address so they will actually get through to me (you'll want to substitute your own address there), that r2e will put a date header on the email using the date of the posting, so postings are ordered nicely in my mailbox, and that the emails will make reasonable use of Unicode.

Next prepare a feed file and in doing so, tell r2e how to contact you:

$ r2e new dsilvers+yakking-changeme@digital-scurf.org

Substitute your own address since clearly I don't really want to be receiving your feeds unsolicited :)

Next we should add a feed; this bit can be repeated as much as you like:

$ r2e add http://yakking.branchable.com/blog/index.atom
$ r2e list
$ r2e run --no-send 1

In the above, you would substitute the number of the feed for the 1 if you are adding more than one feed. The r2e run --no-send N command causes the r2e tool to run that feed but not to email you what it finds -- this is a good way to prime a feed since otherwise the next time r2e runs, it'll find all the posts in that feed as "new" and give you them all, which is not particularly optimal. (If you fancy that behaviour then simply don't do the r2e run --no-send N command and let the flood come).

Finally we need to set up r2e to run regularly. I simply use cron to do this, and have the following in my crontab:

0,30 * * * * r2e run

This means that on the hour and on the half hour, r2e run will be executed and will email me any new postings. There's no need to run it this often normally, but if you follow any high-volume feeds then you might want to run it reasonably often, otherwise you'll get big batches.

Final thoughts

In addition to the above, and perhaps a little out of scope for this article, I have my email filtered on my server, so I have an 'RSS' folder into which my r2e email is filtered automatically, meaning that it doesn't clog my INBOX.

I hope that this piques your interest, I know I enjoy not having to remember to visit blogs in my web browser all the time.

Posted Wed Apr 22 11:00:11 2015 Tags:

journald(8) is for collecting log output from processes and textual output from services started by systemd(1).

As is usual for things involving systemd, there is some controversy surrounding it, because of some technical decisions and because it is a mandatory component on systems that use systemd(1).

The main concern is that it will replace the venerable syslog, because journald(8) takes over the /dev/log socket used by the syslog(3) logging protocol. However, this does not actually force you to use the Journal, as rsyslogd(8) can read information out of the journal.

The other concern with journald(8) is that it saves logs in a binary format requiring it to be read out with journalctl(1), as opposed to syslog, which saves logs as a text file, so can be processed by other tools.

The source of concern is that the Journal may change to an incompatible format, leaving no way to read the logs out; but this can be mitigated by installing rsyslogd(8) in case it ever happens, and the systemd developers have been careful about keeping the format backwards compatible.

The reason the Journal stores logs in a binary format is performance when searching, as the Journal contains more logging information than syslog used to handle.

By default, the Journal stores the standard output and standard error of services in addition to syslog(3) output. This is usually sufficient, but applications can send messages directly too with the sd-journal(3) library.

There is not a great deal of configuration required to use journald(8); messages are read out with the journalctl(1) tool.

I find the most useful option to journalctl(1) is --unit, which most recently saved me a lot of time when debugging service start-up issues, as it allowed me to deduce why things were failing to work properly.
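For example (the unit name is hypothetical):

```shell
journalctl --unit=my-service.service   # all logs for one service
journalctl -f                          # follow new messages, like tail -f
journalctl --since=yesterday           # time-based filtering
journalctl -p err                      # only errors and worse
```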

Posted Wed Apr 29 11:00:09 2015 Tags: