Unix Operating System tools

dd

dd(1) is commonly said to stand for disk duplicator, though it is often humorously called disk destroyer, since misuse can mean total loss of a disk's data.

It is an old tool. It pre-dates the convention of starting options with a -, so the command to make a backup of an attached disk is:

$ dd if=/dev/sdb of=disk-backup.img

if=/dev/sdb specifies that the input file is /dev/sdb, which is typically the second disk attached to the computer.

of=disk-backup.img says that the contents of the disk should be written to disk-backup.img.

Another use of dd(1) is creating large files. if=/dev/urandom lets you create a file with random contents; if=/dev/zero lets you create a file filled with zeros.

Just substituting the if= in the previous command would result in a command that will fill your filesystem. To limit the amount of data copied, specify bs= and count=. bs= specifies the block size, which is how much data to read before starting to write; count= specifies how many full blocks to copy, so the amount of data copied is the product of the two.

With that in mind, the following command makes a 512 MiB file full of random data, in blocks of 1KiB.

$ dd if=/dev/urandom of=random-file bs=1024 count=524288

The following will create a file full of zeroes.

$ dd if=/dev/zero of=zero-file bs=1024 count=524288

dd(1) also supports the seek= option, which can be used to start writing data at a given offset into the output file. This can be useful for writing to a partition elsewhere on the disk, or for creating a sparse file.

This command writes the disk image, skipping the first block of the disk.

$ dd if=disk.img of=/dev/sdb seek=1

This command creates a 1GiB sparse file on file-systems that support it.

$ dd if=/dev/zero bs=1 of=sparse-file \
      seek=$(( (1024 * 1024 * 1024) - 1 )) count=1

The intent of truncate -s 1GiB sparse-file is clearer.

shred

Unlike dd(1), which is only jokingly known as the disk destroyer, shred(1) is actually supposed to destroy data.

It does this by writing random data over the file in place, attempting to replace the file's contents on disk so that they cannot be recovered.

Unfortunately, file-systems are not so simple that this works any more. Journalled file systems like ext4 may have a copy in the journal, and CoW file-systems like btrfs may have multiple copies around.

Partly this is the result of an evolutionary arms race, with file-systems doing their very best not to lose data in spite of their storage devices. For this reason, I would recommend using shred(1) on the device you believe the file-system to be backed by, or, if you're feeling particularly paranoid, removing the drive from the machine and physically shredding it into as many pieces as you feel comfortable with.
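For example, a minimal sketch of wiping a whole disk, assuming /dev/sdb is a disk whose contents you genuinely want gone (this is irreversible, so double-check the device name first):

$ shred -v -n 3 /dev/sdb

-v reports progress, and -n 3 overwrites the whole device with three passes of random data.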

sync

There are various layers of caching involved when writing data to a hard-disk, to provide better throughput.

If you're using the C standard library, fflush(3) is the first thing you need to call. This will write any data that is being buffered by the C library out to the VFS layer.

This just guarantees that any reads or writes performed by talking to your operating system will see that version of the data. It will be cached in memory until a convenient time when it can be written out to the disk.

It has not yet made it onto the disk; to ensure that it does, you need to call one of the sync family of system calls, such as fsync(2).

These give as good a guarantee as you can get, that if you were to suddenly lose power, your data would be on the disk.

It is not possible to directly sync the writes associated with a particular file descriptor from the shell, but you can use the sync(1) command, which will do its best with every disk.

Of course, none of this can actually guarantee that your data is safely written to the disk. The drive itself may lie about writes having reached persistent storage, caching them for better throughput, and unless it can guarantee, with some form of internal power supply, that it will finish those writes before it loses power, your data may still be lost.

uname

uname(1) is an old interface for telling you more about your operating system. Common uses are uname -r to show which kernel version you are running, uname -m to show which machine architecture you are running on, and uname -a to show everything it can.

For example, since I use a Debian chroot on an Android tablet, I get the following:

$ uname -a
Linux localhost 3.1.10-gea45494 #1 SMP PREEMPT Wed May 22 20:26:32 EST 2013 armv7l GNU/Linux

mknod and mkfifo

mknod(1) and mkfifo(1) create special files. /dev/null is one such file.

mkfifo(1) is effectively a special case of mknod(1): it creates a named pipe (FIFO), a special file with well-known properties, while mknod(1) is capable of creating arbitrary device nodes, which may be backed by any physical or virtual device provided by the kernel.

Because of this, creating device nodes is a privileged operation, as otherwise you could bypass the permissions imposed on the devices in /dev.
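As a sketch of the difference, the following creates a copy of the null device (on Linux, character device major 1, minor 3) and a named pipe; the file names are just examples, and mknod needs root and a file-system that is not mounted with the nodev option:

$ sudo mknod my-null c 1 3   # character device node: major 1, minor 3 is the null device
$ mkfifo my-pipe             # a FIFO; no special privileges needed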

Modern Linux systems use a special file-system called devtmpfs to provide device nodes, rather than requiring people to use mknod(1) to populate /dev.

mkfifo(1) is useful for shell scripts though, when there are complicated control flows that can't be easily expressed with a pipeline.

The following script will pipe ls through cat, and tell you the exit status of both commands without having to rely on bash's PIPESTATUS array.

td="$(mktemp -d)"
mkfifo "$td/named-pipe"
ls >"$td/named-pipe" & # start ls in background, writing to pipe
lspid="$!"
cat <"$td/named-pipe" & # read pipe contents from pipe
# you may start getting ls output to your terminal now
catpid="$!"
wait "$lspid"
echo ls exited "$?"
wait "$catpid"
echo cat exited "$?"
rm -rf "$td"

df and du

df(1) displays the amount of space used and available on all your mounted file systems. It can be given a path, at which point it will try to give you an appropriate value for the file system it thinks is mounted there.

Modern Linux systems can have very complicated file systems though, so it may not always give correct results. df(1), for example, can give incorrect results for btrfs, where there is not a one-to-one mapping between disks and file-system usage, and it is not smart about things like bind-mounts and mount namespaces, so smarter tools like findmnt(1) from util-linux are required.

du(1) attempts to inform you of how much disk space is used for a specific file-system tree, so du -s . tells you how much space your current directory is using.

du -s / is unlikely to match the number provided by df /, because there are metadata overheads involved, so on normal file-systems the result of df(1) is likely to be larger.
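You can compare the two yourself with something like the following; the -h flags ask for human-readable sizes, -x stops du crossing into other mounted file-systems, and running it as root avoids permission errors (it may still take a while):

$ df -h /
$ sudo du -shx /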

btrfs can also produce different results, since it can share data between files.

chroot

chroot(1) is a useful command for allowing programs with different userland requirements to work on the same computer, assuming you don't need too much security.

It changes a process' view of the file-system to start at a different point, so you can hypothetically use it to restrict access to certain resources.
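For example, assuming you have a Debian root file-system unpacked at /srv/squeeze (the path is just an example), you can get a shell inside it with:

$ sudo chroot /srv/squeeze /bin/bash

Everything run from that shell sees /srv/squeeze as its root directory.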

There are various ways of escaping a chroot, so it's insufficient to protect your system from untrusted programs.

Containers or virtual machines are more secure, but don't have a standard interface.

Chroots are still useful for trusted programs though. For example, I run Debian squeeze on my Android tablet to do my blogging.

nice

nice(1) can be used to hint to your operating system's scheduler that a process should be given a larger or smaller share of the CPU than others.
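For example, to run a disk-usage scan of your home directory at the lowest priority so it stays out of the way of interactive work (niceness ranges from -20 to 19, and only root may use negative values):

$ nice -n 19 du -sh ~ >/dev/null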

This can't be used to prevent a mis-behaving process from consuming more than its fair share of CPU, since it could always fork further worker processes.

Linux attempts to handle this deficiency with cgroups.

Posted Wed Jul 2 11:00:12 2014
Lars Wirzenius: Influential works to read

The history of free software and open source software has a few influential works that it would be good for everyone involved to read. You don't need to agree, and some of them may be a bit obsolete, but it's good to know them.

  • The GNU Manifesto by RMS is where the free software movement formally started. There was free software before it, but it defined the concept in a way that had perhaps never been expressed before, and galvanised the movement.

  • The Cathedral and the Bazaar by ESR is to open source what the GNU Manifesto is to free software.

  • The Debian Social Contract uses the principles of the above two to explicitly specify, for one major Linux distribution, the ethical foundation on which the project is built, and the values it holds.

  • The Four freedoms of software, as defined by the FSF, underlie all answers to the question "is this free software?". The Debian Free Software Guidelines and the OSI Open Source Definition build and expand on that, to be more practical and detailed for answering questions in practice.

  • In the beginning was the command line, an essay by Neal Stephenson, explains why the command line is maybe not a bad idea, even today.

  • A Declaration of the Independence of Cyberspace, by John Perry Barlow, proposes the online world as worthy of being thought of as an independent entity. That vision didn't really pan out. Code v2 by Lawrence Lessig discusses the reasons why not, and suggests another vision for the future.

Posted Wed Jul 9 11:00:06 2014
Daniel Silverstone: Version Numbering

In the past I've talked about starting projects, but one thing which is often overlooked in software projects is versioning the outputs. Clearly by now you are all version-control aficionados, but those of you using DVCSs of various kinds know that the "numbers" given to you by those systems are not suitable for use as version numbers. But while you might know that instinctively, do you have a good grasp of why it's not a good idea?

Version numbers communicate a number of potential data points to observers. These include, but are not limited to, the relative newness of a release, how much it differs from its predecessors, when the release was made, and what kind of release it is (alpha, beta, 'golden', etc.). Of course, you can communicate an awful lot more in a version number, such as exactly what revision was built, or who built it, or where it was built, if you so choose.

When I number releases, I attempt to communicate a difference-to-previous-releases metric and a type-of-change metric in the form of a "major" and "minor" version number pair (e.g. 1.1) and sometimes I extend to a third "patch" number if I'm making a release which really is just a bugfix (rare, not because I write bug-free code but rather because I tend to already have made other changes and don't have the user-base which would demand a patch release). I also try and make sure that packaging systems such as Debian's can understand my version numbers. I stick to a numbering scheme compatible with how dpkg compares version numbers because of my preference for not making life harder for Debian developers.
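If you want to check how dpkg will order two version numbers, you can ask it directly; this assumes a Debian-style system with dpkg installed:

$ dpkg --compare-versions 1.2~rc1 lt 1.2 && echo "1.2~rc1 sorts first"

The command succeeds because a tilde sorts before anything, even the end of the version string, which is why it is conventionally used for pre-releases.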

What matters is that, of the not inconsiderably bounteous, immensely ample, myriad approaches there are to select from, you should be as consistent and careful as you can be with numbering your software releases. Be aware that there are many possible consumers of your version numbers -- developers, users, packaging systems, and marketers to name but a few.

Posted Wed Jul 16 11:00:07 2014
Richard Maw: Job control in bash

What is a job?

A job is an abstraction: a pipeline is made up of many processes, and they are all in the same job.

I'm going to provide a brief overview, for more technical details see the "JOB CONTROL" section of the bash(1) manpage.

Starting a job in the background.

Similarly to how ending a command with && or || changes the logic flow for the next command, ending a command with a single & causes the command to be run "in the background", i.e. while the command is running, you can still enter other commands.
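For example (sleep here is just a stand-in for any long-running command):

$ sleep 300 &

bash prints the new job's number and process ID, then immediately gives you your prompt back.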

Identifying a running job

Immediately after starting a background job, the $! variable contains the process ID of the most recently started background command. This is different from the job ID, but it is still rather useful, provided you are only running one command at a time.

Additionally, bash will assign each job an ID, starting with the % symbol, so the first background job is %1. %, %% and %+ all mean the current job (i.e. the most recent), and %- refers to the previous job (the second most recent).

The jobs command will list all active jobs, listing their job ID, the command they are running, and markers to say which is the current and which is the previous job.

Suspending a job.

Pressing ^Z (holding Control and pressing Z) is the usual shortcut key. This will return you to your shell prompt, and the job you were running is suspended (i.e. processing stops, it will not print any more output etc.)

Moving a job to the background.

The bg command can be used to resume an existing job, while leaving you with access to the command line. Without any further options, it will resume the most recently run command. bg can also be given a job spec as described above. bg, bg %, bg %% and bg %+ are all equivalent, as are bg - and bg %-.

Moving a job to the foreground.

fg will move a job to the foreground, making it behave as if it had been run without the & specifier or had never been suspended with ^Z.

fg has all the short-cuts that bg does; in addition, a bare job specifier on its own is synonymous with passing it to fg, so % by itself is equivalent to fg %.

The background specifier & can also be appended to a bare job specifier, which makes it equivalent to bg, so %1 & is equivalent to bg %1.

The logical consequence of this is that % & is equivalent to bg %+.
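Putting the last few sections together, a typical interactive sequence might look like this (long-running-command is a placeholder for whatever you are actually running):

$ long-running-command
^Z                # suspend it with Control-Z
$ bg %+           # let it carry on in the background
$ fg %+           # later, bring it back to the foreground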

Waiting for a job to finish.

The wait command can be used to synchronise execution by waiting until an existing command finishes before returning to the command-line, or starting another job.

This differs from using the fg command, as wait may also be given a process ID, as retrieved from $!.

This is useful for when you have a long-running command, but you forgot you need to run another command afterwards. To handle this, you can suspend with ^Z, then run wait %+ && some_other_command.

Terminating a job.

There are a few ways to do this. You can foreground a job with fg then kill it with ^C. Alternatively, you could kill the job with the kill command.

This is not the same as the kill(1) program. Unless you invoke /bin/kill or "$(which kill)", you will get the kill built-in.

The kill built-in has the advantage of being able to be given a job spec, such as %1, instead of just a process ID.
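For example:

$ kill %1         # ask job 1 to terminate (sends SIGTERM by default)
$ kill -KILL %1   # force it to die if it ignores that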

Severing all ties with a job

Usually when a shell exits, it ensures all its child processes are terminated. This is a useful default behaviour, as it stops stray processes being left lying around.

However, this is not always wanted, as you may care more about a job completing than being around to observe it.

Traditionally, this is solved by using the nohup(1) command, or running the process in a detached screen(1) session. However, both these approaches require you to know in advance that this job is going to take longer than you have your terminal open.

To solve this, bash has a disown built-in command, which can remove running commands from job-control, and optionally prevent them being terminated by the shell.

$ some command </dev/null >~/nohup.out &
$ disown -h %+

This is functionally equivalent to running nohup some command. Running disown without -h won't prevent the process being killed when your shell exits.

Your shell will send the SIGHUP signal to its child processes when it exits, hence the naming of the nohup(1) command, and the -h option to disown.

Further reading

LWN is doing a series about process grouping. Later articles go into detail about more modern Linux features, but the first article is good for a history of process management, and some technical discussion of how job control works.

Posted Wed Jul 23 11:00:10 2014

Recently I have been learning about actigraphy as you will know if you read my last posting on my browser history. As I said in that article, I was working up to writing a new app for my Pebble Smartwatch to replace my Jawbone Up. It struck me as I wrote that article that I was doing something which seems more than a little unusual in the Open Source and Free Software community. Namely writing software for something I knew little to nothing about -- but taking the time to learn about it before putting hand to keyboard to get coding.

When we start projects, we naturally tend to select a name for it (after all, how can one create the git repositories, register domains, create a dev mailing list etc without a name?), and we decide on other important stuff like what licence to put it under. We write the README, perhaps a basic NEWS file and then comes time to start to write the design documents (or at least notes).

But, how many of us actually sit down and think "Perhaps I should gather up resources on the topic before I write my design?" Sometimes we're doing a project which is the research into the topic -- under that circumstance you might be forgiven for skipping the pre-research step. Most of us, however, are writing code for already understood problems, even if we're going to be taking novel approaches to it.

I often find myself wanting to write a program just to learn about something. How many people do you know who have written regex execution engines "for fun"? But I always try and do a lot of research up-front so that I don't go into projects blind. Oftentimes that research, when written up, takes the form of a trivial README with an explanation of what the program will do, or perhaps a set of design notes (not end-user documentation, but simply notes for myself) which distil what I have learned into my design, but the research work really is behind the scenes.

For those of you who have followed the truisms series, and delighted in the homeworks set at the end of each article, I offer this: For whatever you're currently working on, take a step back and think if you really understand the problem domain you're working in. If you can't answer with an emphatic yes, and you're not doing a pure research project, then stop and take some time (perhaps only half an hour) to search around on the web and read a few papers (if any exist) to help you along.

Why not comment on this posting with how you get on?


Posted Wed Jul 30 11:00:09 2014