C is one of the most common languages that programs are written in. This is generally because of toolchain concerns, performance, familiarity and access to low-level hardware details, rather than how easy the language makes resource allocation.

Many resource leak bugs stem from how little help the language gives you here, as the previous article describes, so it's important to be familiar with the various ways resources can be handled in C and C++.

Stack allocated memory

When you declare a variable, you declare its type, which is used to determine how much memory needs to be allocated for it.

Static variables (either globals, or variables explicitly marked as static in functions) have their memory reserved at compile time, and it stays reserved for the whole life of the program.

Automatic variables can only be declared in functions. Memory allocated for automatic variables is released when the function returns; they are called automatic because you don't need to do anything for the memory to be released again.

This only works for memory though, and if you allocate too large an object on the stack, you are likely to overflow it and crash your program, at which point all its memory is released back to the operating system, to be re-used by another process.

goto cleanup;

When automatic cleanup is not sufficient, you need to be explicit.

In C you have to call the appropriate function to release the resource. For memory allocated with malloc(3), it needs to be free(3)d, and files opened with open(2) need to be close(2)d.

Because you also need to check the return values of functions you call, you can end up with code like this:

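#include <fcntl.h>   /* open(2), O_RDONLY */
#include <stdlib.h>  /* malloc(3), free(3) */
#include <unistd.h>  /* close(2) */

/* qux is assumed to fill mem from fd and return 0 on success */
int qux(int fd, char *mem, int baz);

char *foo(size_t bar, int baz){
    int fd = -1;
    char *mem = NULL;
    fd = open("file/path", O_RDONLY);
    if (fd != -1){
        mem = malloc(bar);
        if (mem != NULL){
            int ret = qux(fd, mem, baz);
            close(fd);
            if (ret == 0){
                return mem;
            } else {
                free(mem);
                return NULL;
            }
        } else {
            close(fd);
            return NULL;
        }
    } else {
        return NULL;
    }
}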

This doesn't scale well with the number of resources you require, and can easily end up with very wide code.

As an alternative, we're going to use goto.

There are a lot of horror stories on the internet that say you should never use goto, but these generally come from the uninformed, just parroting "Go To Statement Considered Harmful" without understanding the context; it was written at a time when you would traditionally use goto where we would now use a while loop.

If it's good enough for the Linux kernel then it should be good enough for you.

The previous code can be instead written as:

char *foo(size_t bar, int baz){
    int fd = -1;
    char *mem = NULL;
    fd = open("file/path", O_RDONLY);
    if (fd == -1){
        return NULL;
    }
    mem = malloc(bar);
    if (mem == NULL){
        goto cleanup;
    }
    if (qux(fd, mem, baz) == 0){
        goto cleanup;
    }
    free(mem);
    mem = NULL;
cleanup:
    close(fd);
    return mem;
}

catch { object.close(); }

Now we stray into the land of C++. Previously we said that you should use goto because there's no suitable language construct. C++ is a much larger language, and it does have a language construct to handle this: exceptions.

char *foo(size_t bar, int baz){
    int fd = -1;
    fd = open("file/path", O_RDONLY);
    if (fd == -1){
        return 0;
    }
    try {
        char *mem = new char[bar];
        try {
            qux(fd, mem, baz);
            close(fd);
            return mem;
        } catch (...) {
            delete[] mem;
            throw;
        }
    } catch (...) {
        close(fd);
        throw;
    }
}

In C++ you can handle exceptions rather than checking return codes and using goto. This reduces the amount of code at the call site dedicated to error handling, but unfortunately in C++, this still ends up with a lot of nested scopes.

Other languages, such as Java and Python, have a finally block, which is always run, whether an exception was raised or not. With one, the above code wouldn't need a separate close(fd) in the success path.

RAII

This doesn't bother C++ programmers, since for guaranteed resource cleanup, an idiom called RAII is preferred, where you have special objects which release the resource in their destructors.

shared_ptr is a wrapper object that will destroy the object you pass it once every shared_ptr referring to that object has itself been destroyed or reset.

auto_ptr will destroy the object when the auto_ptr goes out of scope; since C++11 it is deprecated in favour of unique_ptr, which does the same job.

The ifstream classes will close the file descriptor that they opened when they are destroyed.

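#include <fstream>
#include <vector>

std::vector<char> foo(int baz){
    std::ifstream in;
    in.open("file/path");
    std::vector<char> mem;
    qux(in, mem, baz);
    // in is closed by its destructor on return or on an exception
    return mem;
}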

cleanup attribute

C has evolved in parallel to C++, and has its own ways of implementing something RAII-like.

GCC has an extension called the cleanup attribute.

This allows you to annotate a variable declaration with a function that will be called when it exits the scope.

This sounds a little ugly, but if you look at the cleanup attribute example, you can see that it uses a macro to hide the definition.

Others

This article has been mainly about cleaning up run-time resources, but you can also have specific code for cleanup of global resources.

You can use atexit(3) to clean up resources at process exit, provided it isn't terminated too aggressively.

In C++ there are also global object destructors, which let you specify the cleanup code per object.

Posted Wed Nov 5 12:00:10 2014

We previously looked at resource cleanup in C++.

C++ has smart objects that release their resources when they go out of scope.

Java also has objects that release their resources when they are no longer used.

However, Java doesn't clean up its objects immediately after they stop being used.

This is because memory management can be an expensive operation, so batching it up with other memory cleanups can be useful.

However, since you can't assume that the smart objects have cleaned themselves up after they are no longer referenced, you need to explicitly tell them to clean up.

To make this easier, Java has a finally block to go with its try block, which is always executed, so you don't need separate cleanup code in your success path and error path.

import java.io.FileReader;
import java.io.FileNotFoundException;
import java.io.IOException;

public class Main {
    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("Must be given 1 argument");
            System.exit(1);
        }
        FileReader input;
        try {
            input = new FileReader(args[0]);
        } catch (FileNotFoundException e) {
            System.exit(1);
            return; /* unreachable, but javac is dumb */
        }
        try {
            char[] buf = new char[1024];
            int ret;
            while ((ret = input.read(buf, 0, buf.length)) != -1){
                String s = new String(buf, 0, ret);
                System.out.print(s);
            }
        } catch (IOException e) {
            System.err.println("Failure while reading file contents");
        } finally {
            try {
                input.close();
            } catch (IOException e) {
            }
        }
    }
}

Python also has a finally block, with much less verbose syntax.

#!/usr/bin/python
import sys
if len(sys.argv) != 2:
    sys.stderr.write("Must be given 1 argument\n")
    sys.exit(1)
# Open outside the try block: if open() fails there is nothing to close.
input = open(sys.argv[1], 'r')
try:
    while True:
        d = input.read(1024)
        if not d:
            break
        sys.stdout.write(d)
finally:
    input.close()
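
To convince yourself that the finally block runs on both the success path and the error path, here is a minimal sketch. The names process and events are made up for illustration; they stand in for acquiring, using and releasing a real resource.

```python
events = []  # records the order in which things happen


def process(fail):
    """Hypothetical helper: pretend to use a resource, optionally failing."""
    events.append("acquire")
    try:
        if fail:
            raise ValueError("simulated failure")
        events.append("use")
        return "ok"
    finally:
        # Runs on the normal return path and on the exception path alike.
        events.append("release")
```

Calling process(False) records acquire, use and release; calling process(True) raises ValueError, but release is still recorded before the exception reaches the caller.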

A downside to the try...finally idiom is that it litters your code with the cleanup code for resources you acquire. The RAII idiom from C++ handles this by putting the cleanup code in the destructors.

Unfortunately, while Python's reference counting may give it deterministic resource cleanup, much as C++ has (IronPython and Jython may not), you can't reliably write your own destructors in Python.

To make this nicer, Python has with blocks, which allow the code for cleanup to be kept with the code for allocation.

To use a with block, you pass it an object that obeys the context manager protocol. There are dedicated context manager objects, and some other objects, such as open files, can be used as context managers.

With context managers, the code now looks like:

#!/usr/bin/python
import sys
if len(sys.argv) != 2:
    sys.stderr.write("Must be given 1 argument\n")
    sys.exit(1)
with open(sys.argv[1], 'r') as input:
    while True:
        d = input.read(1024)
        if not d:
            break
        sys.stdout.write(d)

To create your own context manager, you can write it like this:

#!/usr/bin/python
import contextlib
import sys
@contextlib.contextmanager
def open_input(path):
    input = open(path, 'r')
    try:
        yield input
    finally:
        input.close()
if len(sys.argv) != 2:
    sys.stderr.write("Must be given 1 argument\n")
    sys.exit(1)
with open_input(sys.argv[1]) as input:
    while True:
        d = input.read(1024)
        if not d:
            break
        sys.stdout.write(d)
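
The contextlib decorator is a shorthand for the context manager protocol itself, which is just two methods: __enter__ and __exit__. As a rough sketch of the same manager written out by hand (the class name OpenInput is made up for illustration):

```python
class OpenInput(object):
    """Hypothetical context manager that opens a file for reading."""

    def __init__(self, path):
        self.path = path
        self.file = None

    def __enter__(self):
        # Called on entry to the with block; the return value is what
        # gets bound by "as".
        self.file = open(self.path, 'r')
        return self.file

    def __exit__(self, exc_type, exc_value, traceback):
        # Called when the with block is left, whether or not it raised.
        self.file.close()
        # Returning a false value lets any exception carry on propagating.
        return False
```

It is used in the same way as before: with OpenInput(sys.argv[1]) as input. If open fails inside __enter__, then __exit__ is never called, so there is nothing to clean up.
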
Posted Wed Nov 12 12:00:08 2014

Recently, Richard told us all about his work and hack stations and that prompted me to think a little more about my "workspace" and then perhaps discuss that a little with you all.


For me, the critical thing about being comfortable with my computers is consistency. This means that my work laptop, my personal laptop and my desktop PC share many features in terms of setup and indeed in terms of hardware. I have a Lenovo ThinkPad x220 as my work computer, an x201 as my personal laptop and I have a Lenovo USB keyboard attached to my desktop PC. My input surface is pretty much the same across all three of my systems. This contributes to a much more comfortable typing experience for me no matter where I am or what I'm up to. I can highly recommend normalising your input surfaces.

Output-wise, all of my systems use LCDs. The two laptops obviously do (and have similar screen resolutions), and my desktop has a pair of LCDs (since my laptops have both internal and external display capability). My desktop is also configured to offer me, generally, approximately the same amount of screen real-estate as my laptops, although at a higher resolution (larger default fonts etc). All this contributes to ensuring that, to my eyes, all three of my systems have consistent output characteristics. Again I can highly recommend normalising your output surfaces.

Despite being able to use my laptop on my lap (shock! horror!) I tend to only use it when sat at a desk of some kind (unless I'm watching a film in bed). This also contributes to the consistency feeling which means that my brain and body automatically fall into the right "mode" when I'm at a keyboard. I have a comfortable desk and chair at work and at home, and I do my best to make myself comfortable when I'm away from either. While it's trite and often over-stated, good body positioning relative to your input and output devices can also lead to a better computing experience.

For general usability consistency I run the same OS everywhere (Debian) and for user-interface consistency I run the same window manager (XMonad), the same shell (Zshell), the same email client (Mutt), and the same editor (Emacs) across all of my systems, with a consistent configuration thanks to my dotfiles being in revision control. I can highly recommend this approach to give yourself a consistent and comfortable working environment.

Finally there's how I use all that. I keep one desktop for my email, because email is important to me, and one for my IRC terminal and similar bits; one for my web browser and one for my editor round out my default four desktops, which you can find in my XMonad configuration. This consistency of desktop layout contributes to my feeling of "being in the right place" when I use my computers. Extra desktops are created and destroyed at whim, but 1, 2, 3, and 4 are always mail, term, www and emacs, and they're always right.

My web browser has a set of pinned tabs which are always the same across all of my platforms (though my work laptop has a few more tabs for work related stuff pinned in addition) and I synchronise passwords etc between them. Being able to find that tab I was looking at, at home the night before, in the office the following day leads to a sense of integration which again increases my joy in using my computers.

Richard talked about using music when in the office to reduce distractions from noise around him. I use music similarly at work, and at home I tend to listen to music when I code in order to keep consistency of environment. I tend to prefer lyricless music when I am writing prose, and I prefer bouncy music when I write code. This means that, much to my chagrin, I listen to quite a bit of cheesy 80s and 90s pop. I have my entire music collection available to me wherever I am, thanks to a removable hard drive, and I keep a few favoured albums on my laptop hard drive as well, just in case.

I guess my message, more than anything, is simply that consistency is the key to my comfort in my computing environment and I cannot recommend it enough.

Happy hacking.

Posted Wed Nov 19 12:00:09 2014
Lars Wirzenius Tools I use

Sometimes it's interesting to look at the tools someone else likes. Here's a snapshot of my current toolbox.

Hardware

I use a Lenovo Thinkpad X220 laptop, running Debian wheezy. It has a 2.6 GHz CPU, 16 GiB RAM (twice the official maximum, thanks to ThinkWiki), and both a terabyte spinning drive and a 240 GB SSD.

I'll be upgrading the Debian version to jessie in the coming months, now that Debian has frozen jessie in preparation for the next release.

Having two storage devices in the laptop is very nice. I get the benefits both of SSD (very fast) and a spinning disk (cheap per gigabyte). I have the operating system, my home directory, and some virtual machines on the SSD, and a Debian mirror, music and other media files, and other virtual machines on the spinning disk.

I don't use an external keyboard or mouse. The X220 keyboard is very nice, for a laptop keyboard, and I've used it and the trackpoint mouse emulator so much over the past several years that they fade away. I don't need to look at, or think about, using the keyboard, and they don't limit or slow me down. It's also very convenient to not have to attach, detach, or carry extra input devices.

Sometimes I do use an external monitor, but it's not my usual mode. I'd like much more screen real estate, and someday I'll arrange that, but in recent years that's been excluded by other life choices.

I don't have a desktop machine at all at this time, which is dictated by the same life choices: we've moved to a different country several times in recent years, so having as little stuff as possible is important to me.

General environment

I use GNOME3 as my desktop environment, with Xmonad as the window manager. I like the GNOME desktop in general, but I have fallen for a tiling window manager. I chose Xmonad over the other options, because my Yakking colleague Daniel likes it. Also, it's another reason for me to learn Haskell.

I use Emacs as my text editor, with a little vi for quick edits. This is not a fanatical choice: I've used a number of text editors over the years, including several I've written myself. Currently, I seem to like Emacs best, but I may change again some day.

My preferred login shell is bash. I don't do a whole lot of configuration of it, mostly about the prompt, which is in bold face so I can more easily spot prompts in tall windows.

For virtualisation, I use libvirt, virt-manager, and KVM. I sometimes generate the virtual machine disk images with vmdebootstrap. I use virtualisation a fair bit, which makes the 16 GiB of RAM on my laptop a nice thing to have.

I browse the web with Firefox. Chromium is nice, too, but I find the Mozilla Foundation's commitment to freedom and openness to be more credible than Google's.

I follow RSS/Atom feeds with Liferea. I IRC with irssi. I handle e-mail with a combination of mutt, offlineimap, and notmuch. Files get synchronised across machines with git-annex.

Backups are done with Obnam (of course; I wrote it).

My web sites are made with Ikiwiki, and hosted on Branchable.

Documents are mostly written in Markdown, and processed with Pandoc.

Programming

I write most of my code in Python 2. I haven't started the transition to Python 3, because I want my code to run on the current stable version of Debian, and Python 2 has been much better supported. Also, lack of free time. I use the coverage.py tool for coverage measurement, and my own CoverageTestRunner as the unit test runner. (I've written a lot of my own tools; I won't mention all of them.)

I also use a bit of C, and the usual assortment of shell, awk, and other Unix command line scripting tools.

Version control is handled by git. My git server runs gitano.

Stuff I don't use

I don't use much office software.

I don't use Facebook or Twitter, though I am on LinkedIn for professional reasons. I keep in touch with my people mostly by means that do not result in profiling information sold to advertisers.

I don't use integrated development environments. I prefer a traditional text editor, and to do things on the command line: this lets me pick and choose between various tools and doesn't tie me to one specific megatool.

Posted Wed Nov 26 12:00:15 2014