We previously discussed issues with using PIDfiles.

One issue we encountered was that we need a way to handle multiple processes.

Process groups

If you've ever started a program in the background in a shell you might have noticed it gave you a "Job ID" to refer to it rather than a process ID.

This is not just to give you a memorable number for each task, but because jobs may contain multiple processes, which is how a pipeline of multiple processes may be a single job.

This is accomplished in Linux and traditional UNIXes with the setpgrp(2) system call, which assigns a new process group to a process; that group is then inherited by its subprocesses.

This entire process group may be killed by passing the negation of the process group ID to the kill(2) system call.
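
As a rough sketch of the above, assuming Python 3 on Linux, with sh and sleep standing in for a real job and the helper names invented for illustration: the child calls setpgrp before it execs, so the pipeline it starts inherits the new group, and os.killpg (equivalent to kill(2) with the negated group ID) signals the whole group in one call.

```python
import os
import signal
import subprocess


def run_in_new_group(shell_command):
    """Start a shell command as the leader of a new process group."""
    # os.setpgrp runs in the child between fork and exec, so the child's
    # PID becomes the process group ID inherited by everything it spawns.
    return subprocess.Popen(["sh", "-c", shell_command], preexec_fn=os.setpgrp)


def kill_group(leader, sig=signal.SIGTERM):
    """Signal every process in the leader's group, then reap the leader."""
    # os.killpg(pgid, sig) has the same effect as kill(-pgid, sig).
    os.killpg(leader.pid, sig)
    return leader.wait()


if __name__ == "__main__":
    job = run_in_new_group("sleep 300 | sleep 300")
    print("process group:", os.getpgid(job.pid))
    print("leader exit status:", kill_group(job))
```

Any process in the job that calls setpgrp(2) itself leaves the group and will not be reached by this signal.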

A process may only be part of one process group though, so if you have processes that may call setpgrp(2) themselves then it is not possible to use process groups to manage terminating a whole process tree of a service.

UNIX Sessions

You may be wondering how anything can possibly work if you can't use process groups to track a user's processes.

The answer to this is that UNIX has a concept of sessions.

Every process is a part of a session, and each session has a "controlling TTY", which can be accessed via /dev/tty.

When a process creates a new session with setsid(2) it becomes the session leader.

If the session leader process is terminated then the entire session receives the SIGHUP signal, which by default terminates the process.

The controlling TTY was traditionally a virtual terminal, which emulates the old teletype terminals on modern computers. Terminal windows in graphical interfaces use pseudo terminals, which allow sessions to be used for grouping processes that don't belong to a device.

This is typically done by getty and login(1), a terminal emulator, or sshd, which also update utmp(5) to include the controlling TTY and session ID, to track the currently active sessions.

There are a number of issues with using UNIX sessions for tracking processes.

  1. utmp(5) is an awkward interface, requiring multiple processes to access a single file without trampling over each other, requiring file range locking, which can't be done portably and in a thread-safe manner.

    I consider this to be analogous to /etc/mtab, which was an old, manually maintained file, which had to be replaced with a more reliable, kernel-provided interface.

  2. setsid(2) describes sessions and process groups as a strict two-level hierarchy.

    The implication of this is that any process can escape with setsid(2), so bypassing utmp(5) and inspecting the contents of the sessionid file in /proc won't work.

  3. Escaping session cleanup is by necessity a well-documented procedure, since traditional daemons are started by detaching from the current session rather than asking the init process to start the daemon.

    See nohup(1) for details about how to escape session cleanup.
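
How easily a process escapes (point 2) can be seen with a small sketch, assuming Python 3 on Linux; the helper names are made up and sleep stands in for a daemon. The child calls setsid(2) before exec, after which it leads a session of its own and no longer belongs to ours.

```python
import os
import subprocess


def session_of(pid=0):
    """Return the session ID of a process (0 means the calling process)."""
    return os.getsid(pid)


def spawn_escaped(argv):
    """Start argv in a brand new session, as a daemonising program would."""
    # start_new_session=True makes the child call setsid(2) before exec.
    return subprocess.Popen(argv, start_new_session=True)


if __name__ == "__main__":
    child = spawn_escaped(["sleep", "300"])
    try:
        print("our session:    ", session_of())
        print("child's session:", session_of(child.pid))
    finally:
        child.kill()
        child.wait()
```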


The traditional UNIX system calls came from a time when it was believed you could trust programs to be well written and benign.

We do not live in this world, so we need a better approach to track which processes we run on our computers, which we will discuss in a future article.

Posted Wed Apr 26 12:00:07 2017
Daniel Silverstone Find your motivation

A while ago I wrote about ensuring that you know why you're writing something in order that you keep focussed on that goal while you code. My focus at that point was on the specific project you were looking to undertake, but it may behoove us to look at the wider picture from time to time.

What motivates you?

It's important that you can answer this question, ideally without hesitation or backtracking. For each of us the answer will be different, and no one's answer is any less "right" than anyone else's. For myself, it took several years to be in a position to answer the question confidently, quickly, and consistently. That's not to say that my answer won't change in the future, but at least for now I know what motivates me and how that manifests in my day-to-day hacking.

I have had a word with Richard and he has explained his motivation to me, and so for your perusal and criticism, here's what motivates us both.


For me, the primary motivation for writing free software is that I enjoy making it possible for other people to achieve things. I am, as it were, an "enabler" or "facilitator". This manifests itself in an interest in processes, meta-programming, and tooling. I find myself writing libraries, services, and test tooling; and I enjoy reading papers and architecture designs, thinking of new ways to solve old problems, and novel problems to solve. (And of course, I write articles on this here blog :-) )


My motivation in general is to learn something such that it can be applied to something which in some way may be construed as to the betterment of society. Or indeed those things which may improve society directly. In the free-software world, this has manifested in the topic of reliability and also freeing people from vendor lock-in.

(* note, I kinda paraphrased what Richard said)


You didn't think I'd let you get away with no homework this week did you? Hah! I'd like you to sit down, consider your motivation in the free software world and a few ways in which that manifests into projects you work on or with. If you're feeling super-enthusiastic about it, why not post a comment on this post and share your motivation with the rest of us?

Posted Wed Apr 19 11:24:16 2017

Previously we lamented that we had to read the cmdline for every process to work out whether the program we want to run is already running, and that we'd like to be able to look up a name to see if it's running.

The traditional way to do this is to write the process identifier (or PID) to a file named after your program, such as /var/run/$PROGNAME.pid.

This is an obvious solution that you will probably have seen before; until systemd and upstart became popular, it was the way you handled running services in Linux.

How people do pid files wrong

On the surface just writing the PID looks like a good idea, since you can use the presence of the file to tell that the process is running or read the contents of the file to see which PID it's using so you can kill(2) the process.

This however is less reliable than parsing files in procfs, since if you're reading from /proc you know it's always up to date since it's provided by the kernel's process table.

There are no such guarantees from the PID file. Your process could terminate abnormally and not clean it up.

Be extra cautious with PID files not stored in /run or /var/run, since there is no guarantee that they weren't left over from a previous boot.

Reliability can be improved by checking whether the PID from the file is in use, by running kill(pid, 0) == 0 in C, or kill -0 $pid in shell, since this will evaluate to true if that process is running.

This is not yet reliable though, since while killing with signal 0 will tell you whether a process with that PID is running, you can't tell if it's the correct process, since PIDs get reused fairly frequently.
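
The signal 0 check looks like this as a minimal sketch in Python (the function name is made up). It only answers "is something using this PID right now?", which is exactly the limitation described above.

```python
import os


def pid_in_use(pid):
    """Return True if some process currently has this PID."""
    if pid <= 0:
        # 0 and negative values address process groups, not one process.
        raise ValueError("expected a positive PID")
    try:
        os.kill(pid, 0)  # signal 0: only the existence and permission checks
    except ProcessLookupError:
        return False  # ESRCH: nothing has this PID
    except PermissionError:
        return True  # EPERM: it exists, but we may not signal it
    return True


if __name__ == "__main__":
    print("our own PID in use:", pid_in_use(os.getpid()))
```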

For PID files to be reliable you need some way to guarantee that the PID in the file is actually referring to the correct process.

You need some way of tying the lifetime of the process to the file.

When a process terminates all the file descriptors it had open are closed, so we can know that the contents are valid if a process is holding it open.

You may be tempted to use lsof -Fp -f -- /var/run/$PROGNAME.pid, or to parse procfs manually to determine whether a process is using the file, but this is awkward for the same reason as not parsing the output of the ps(1) command to tell whether it's running.

We need to be able to do something to the file descriptor that will have an observable effect through other file descriptors to that file.

The solution to this is to take a lock on the file.

You may see some programs use fcntl(2) with F_SETLK. This is tempting because F_GETLK will include the PID in the result instead of requiring the service to serialise and write the PID to the file and the checking process having to read and parse the file.

F_SETLK should be avoided because the lock is removed when any file descriptor to that file owned by that process is closed, so you need to be able to guarantee that neither your code nor any other code you use via a library will ever open that PID file again, even if your process normally opens arbitrary files by user input.
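
This pitfall can be reproduced with a short sketch, assuming Python 3 on Linux, where fcntl.lockf wraps the fcntl(2) locking calls. The function names are invented; a forked child plays the part of the checking process, since a process never conflicts with its own fcntl locks.

```python
import fcntl
import os
import tempfile


def another_process_can_lock(path):
    """Fork a child and report whether it could take a write lock on path."""
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            fd = os.open(path, os.O_RDWR)
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            status = 0
        except OSError:
            status = 1
        finally:
            os._exit(status)  # never fall back into the parent's code
    _, wait_status = os.waitpid(pid, 0)
    return os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0


def demo(path):
    """Return (lockable while held, lockable after an unrelated close)."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    before = another_process_can_lock(path)
    # Some library opens and closes the same file; fd itself stays open.
    os.close(os.open(path, os.O_RDONLY))
    after = another_process_can_lock(path)
    os.close(fd)
    return before, after


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        print(demo(os.path.join(directory, "demo.pid")))
```

On Linux I would expect this to report that the lock holds at first and is gone after the unrelated close, even though the original descriptor was never touched.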

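So rather than using fcntl(2) with F_SETLK, use flock(2) with LOCK_EX, or fcntl(2) with F_WRLCK, to take a write or exclusive lock on the file, and test whether the contents are live by trying to take a read or shared lock, with flock(2)'s LOCK_SH or fcntl(2)'s F_RDLCK. Note that an F_WRLCK lock set through plain F_SETLK still has the problem described above; on Linux the open file description variant, F_OFD_SETLK, does not.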

If you succeed at taking a read lock then the contents of the file aren't valid and the service isn't running.
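
A minimal sketch of both sides using flock(2) through Python's fcntl module. The function names are made up, and this is an illustration of the idea rather than a hardened implementation: the service may still exit, or not have written its PID yet, between the failed lock attempt and the read.

```python
import fcntl
import os


def hold_pidfile(path):
    """Service side: lock the PID file exclusively and record our PID."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        raise
    os.ftruncate(fd, 0)
    os.write(fd, b"%d\n" % os.getpid())
    return fd  # keep this open for the lifetime of the service


def running_pid(path):
    """Checker side: return the recorded PID if the lock is held, else None."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except FileNotFoundError:
        return None
    try:
        try:
            fcntl.flock(fd, fcntl.LOCK_SH | fcntl.LOCK_NB)
        except BlockingIOError:
            # Someone holds the exclusive lock, so the contents are live.
            text = os.read(fd, 32).decode("ascii", "replace").strip()
            return int(text) if text.isdigit() else None
        return None  # we got the shared lock, so nobody holds the file
    finally:
        os.close(fd)  # closing also drops any shared lock we took
```

The service calls hold_pidfile once at start-up and keeps the descriptor; anything else calls running_pid and gets None for a stale or missing file.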

Implementing this logic can be awkward; fortunately, if you're writing a C program you can use the libbsd-pidfile functions.

It's the wrong tool for the job

Setting it up correctly is tricky

The libbsd-pidfile functions will handle locks correctly, but as permissively licensed as it is, it is not always available, particularly if you're not writing a C program.

A brief survey of the top results for Python libraries for handling PID files turned up pid, python-pidfile and pidfile.

python-pidfile does not keep the file open or take a lock. pidfile only improves by taking the lock and holding it. pid checks the validity and sets correct permissions on the file. None of them have a mechanism to safely retrieve the PID from the file.

So if you have to do it yourself, you'll need to do the following:

  1. open with O_CREAT to make the lock file if it doesn't exist.
  2. Try to take a non-blocking shared lock on the file. If you fail it's running, and depending on whether you want to replace it, leave it or end it, you should respectively instruct it to terminate and continue, exit, or instruct it to terminate and then exit. If you succeed then it's not running.
  3. If you want to start a new service or replace it, upgrade your shared lock to an exclusive lock with a timeout. If you take a blocking lock then your processes will deadlock waiting for their turn to start the service, and if you take a non-blocking lock then if your process is pre-empted by another process between trying to lock and releasing the shared lock, then you could end up with every process exiting.
  4. If you took the lock, replace the contents of the file with your PID; if you timed out, unlock the file or exit.
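
The four steps might be sketched as follows in Python, for the simple case where an already running service is left alone. The function name, timeout and polling interval are assumptions of mine; flock(2) has no timeout of its own, so the sketch polls with non-blocking attempts, and since flock(2) does not promise an atomic upgrade the shared lock may be dropped between attempts.

```python
import fcntl
import os
import time


def try_become_service(path, timeout=2.0, poll=0.05):
    """Return a locked descriptor if we became the service, else None."""
    # Step 1: create the file if it doesn't exist.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    # Step 2: a non-blocking shared lock fails if the service holds it.
    try:
        fcntl.flock(fd, fcntl.LOCK_SH | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None  # already running, so leave it alone
    # Step 3: upgrade to an exclusive lock, giving up after the timeout.
    deadline = time.monotonic() + timeout
    while True:
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            break
        except BlockingIOError:
            if time.monotonic() >= deadline:
                os.close(fd)  # step 4: timed out, closing unlocks the file
                return None
            time.sleep(poll)
    # Step 4: we hold the lock, so replace the contents with our PID.
    os.ftruncate(fd, 0)
    os.write(fd, b"%d\n" % os.getpid())
    return fd
```

The caller has to keep the returned descriptor open for as long as the service runs, since closing it releases the lock.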

The above is already complicated and still doesn't handle edge-cases, such as another process trying to start between taking the lock and writing the PID, which libbsd-pidfile handles with a retry loop.

It also doesn't handle the file being unlinked while starting, which would cause you to have multiple services running.

libbsd-pidfile doesn't take the shared lock then upgrade, so if many processes often want to know whether it's running, then they would be unnecessarily contesting the lock.

Views of PIDs are not universal

Process namespaces exist.

While a single process' view of its PID does not change, other processes can see it with different PIDs since processes in sub-namespaces are viewable in multiple namespaces and have a different PID in each namespace.

You can detect whether the process is running by whether locking fails, but you can't use the contents to terminate it, since the PID it knows it to be is different from the PID you can reach it with, unless you use a restricted system call to enter the namespace of the process in order to terminate it.

fcntl(2)'s F_GETLK would return the PID correctly, but is unreliable in other areas.
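
The different PIDs can be inspected on Linux 4.1 and later through the NSpid field of /proc/<pid>/status. A small sketch with a made-up function name, which returns an empty list where the field is missing:

```python
import os


def pids_by_namespace(pid):
    """Return the PID as seen in each nested PID namespace.

    The first entry is relative to the namespace this procfs was mounted
    in; the last entry is the PID the process sees for itself.
    """
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            if line.startswith("NSpid:"):
                return [int(field) for field in line.split()[1:]]
    return []  # no NSpid field, for example on an older kernel


if __name__ == "__main__":
    print(pids_by_namespace(os.getpid()))
```

A process outside any nested namespace has a single entry; one running inside a container, viewed from the host, has one entry per level.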

Services can span more than one process

The PIDfile model works acceptably if your service only runs in one process, but some programs spawn helper processes and terminating the lead process may not tidy up all the helper processes.

There are ways to have the termination of one process cause the termination of others, however this just moves the problem of "how do you know your process is running" to the supervising process without solving the general problem.

So, we know that PID files aren't necessarily the right tool for the job. In the next article we explore some alternatives.

Posted Wed Apr 19 11:20:30 2017

So you're starting a new project. You have an idea for something, and a vision for how to implement it. What should you do first?

If you're like me, you burn to start with the interesting coding bit first. After a few hours of furious typing, you've done the fun bit, and you need to add all the boring bits: automated tests (unless you did TDD, and even then there may be missing tests), documentation, a manual page, a README, perhaps packaging (Debian, RPM, language specific), a CI project, a home page, etc. There's a lot of it. That is usually when my enthusiasm fades. A lot of my projects end up as a single file in ~/bin.

I call the boring bits scaffolding, and I've learnt that it pays to do them first. Not only is that often the only time I will actually do them, but once the scaffolding is in place, the fun bits are also more fun to do.

If I have unit and integration test frameworks in place, adding another test is only a small incremental task. If I haven't got them in place (particularly for integration tests), I tend to postpone writing tests. No chance of TDD, whereas when I put in an integration test framework, I often do TDD at the program level, not just at the individual class and method level.

Likewise, if I start by writing a manual page, adding another sentence or paragraph is easy. Also, writing a manual page, or any documentation, clarifies my thinking about what the program should actually do, and how it should be used. This makes writing the code more pleasant.

Having CI in place from the beginning means the tests I write get exercised from the start. This finds bugs earlier, even if I run my tests manually as well. CI never thinks "I can run the tests faster this one time by taking this clever shortcut".

Of course, putting up a lot of scaffolding takes effort, and it's all wasted if you don't actually want to write the program after all. Sometimes what sounds like a brilliant idea on the last train home from a party makes no sense at all in the morning. (Personal experience, ahem.)

Thus it may make sense to start with a simple proof of concept, or prototype implementation to verify the soundness of your idea, then throw that away, and set up scaffolding, before you start writing production code.

My latest personal project has a manual page, unit and integration tests, Debian packaging, a CI project, and a home page. I can install it and run it. It doesn't yet do anything useful. Before all this, I wrote a prototype to prove the idea I had, and threw that away. Some time this year I will start dropping in modules to actually do useful stuff, and that'll be easy, since I won't need to worry about all the boring bits. As soon as it does anything useful, I can also point friends at a Debian package and ask them to try my latest masterpiece.

What's more, I have a better chance of getting them to help by writing a module they need themselves, since they just need to write the module and don't need to worry about the rest.

Posted Wed Apr 5 11:00:06 2017

From time to time, you write something which you really want to show off. If it's a GUI application then you might start with some good screenshots, or possibly even make a screen-capture video of you using the application, so that people can watch and marvel at your amazing creation.

If your wonderful program happens to be a command-line tool, then such videos are often a little lackluster or simply overly-large for the information they convey. It's better if the text being displayed is actually text and that it can be copy-pasted out (so others can follow along in their own terminals). To that end, there are a number of tools for recording operations in a terminal and allowing that to be played back. I will mention a few of them here, but this is by no means an exhaustive list; and I shall endeavour to not recommend any specific option over any other.


Asciinema is a pretty slick project housed at https://asciinema.org/ with its source available at https://github.com/asciinema/asciinema and https://github.com/asciinema/asciinema-player. Asciinema is available in Debian and other Linux distributions and works very well. The Asciinema community provides a hosted system and also the tooling necessary to provision the playback of recordings on your own website providing you can honour the GPLv3. The client application is written in Python and has few dependencies.

You can see an example at: https://asciinema.org/a/42383


Like Asciinema, showterm provides a hosted service, over at https://showterm.io/ and an open-source tool hosted at https://github.com/ConradIrwin/showterm. Showterm's server-side is also open, but is focussed on a hosted experience, so expect to be running up a Rails app if you want to host it yourself.

Showterm is written in Ruby and is under the MIT licence as is its server. It also relies on ttyrec in some circumstances.

You can see an example at: https://showterm.io/7b5f8d42ba021511e627e


If you're hoping for standalone HTML output, then TermRecord might be what you're after. It can be found at https://github.com/theonewolf/TermRecord and is written in Python. Unlike the other offerings, TermRecord produces entirely standalone output which doesn't depend on any other JavaScript or server resources.

The licence terms for the TermRecord output include MIT licensed JavaScript, and font content under the Ubuntu Font License 1.0.

You can see an example at: http://theonewolf.github.io/TermRecord/figlet-static.html


The TTYGIF project deserves an honourable mention because, despite its output not being copy/pasteable, an animated GIF is much smaller than a video file would likely be. TTYGIF is housed at https://github.com/icholy/ttygif and produces a simple animated GIF of the terminal it is running in. Since almost every web browser which isn't purely text-based offers animated GIF playback, the output of TTYGIF doesn't need any additional support. No JavaScript players or complex plugins.

TTYGIF is MIT licensed, but doesn't seem to place any particular licence on its output. It's also written in C and needs a few dependencies to get itself going, one of which is a tty recorder anyway :-)

You can see an example at: https://camo.githubusercontent.com/acff4cc740350a784cb4539e501fcce1815329c0/687474703a2f2f692e696d6775722e636f6d2f6e76454854676e2e676966


There are plenty of other options, including those which tend to only offer playback in a terminal themselves, such as script and ttyrec. Your homework (You didn't think you'd got away with another week without homework did you?) is to have a play with some of the options listed here (or any others you can find), and then perhaps comment on this posting showing what you've managed to get up to while creating a tty-cast to demonstrate your latest impressive terminal skills.

Posted Wed Mar 29 11:00:06 2017

Sometimes it is necessary to leave a process running, performing some service in the background while doing something else.

It would be redundant and possibly harmful to start a new one if it is already running.

Ideally all programs would safely shut themselves down if already running, since checking whether it's running before starting only guarantees that it was running when you checked, rather than that it is running when you need it. For most purposes though it is reasonable to check first.

So how do we know if our service is running?

You may have run ps(1) before to see if a process is running, so you might naturally think this would be how to do it.

This would of course fall into the trap of parsing the output of shell commands. Why should we write fragile code when ps(1) is using a proper API to do it?

The way this is accomplished is the procfs virtual file system traditionally mounted at /proc. There is a subdirectory in this file system for each process listed by its process ID.

We can list all directories that are processes by running:

find /proc -mindepth 1 -maxdepth 1 -name '[0-9]*'

Inside each of these directories are files describing the process.

Check comm

When you look at the output of ps it shows the name of the process, which is normally the base name of the file path of the executable that the process was started with.

This is stored in the file in /proc called comm.

So if the name of your program is "myprogram", you can find out if your program is running with the following command:

find /proc -mindepth 1 -maxdepth 1 ! -name '*[^0-9]*' -type d -exec sh -c \
    '[ "$(cat "$1/comm")" = myprogram ] && echo Is running' - {} ';'

I would recommend against checking if your program is running this way though, as processes may call themselves whatever they want, by writing the new name to comm.

$ cat /proc/$$/comm
$ printf dash >/proc/$$/comm
$ cat /proc/$$/comm

This is often used by services that fork off helper processes to name the subprocesses after their role to make it easier for developers or sysadmins to know what they do.
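
For comparison, the same comm check can be written without spawning a shell per process. This is a sketch in Python with a made-up function name, and it shares the weakness just described, because it trusts whatever name the process has given itself. Note also that the kernel truncates comm to 15 characters.

```python
import os


def pids_named(name):
    """Return the PIDs whose /proc/<pid>/comm equals name."""
    matches = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(f"/proc/{entry}/comm", errors="replace") as comm:
                if comm.read().rstrip("\n") == name:
                    matches.append(int(entry))
        except OSError:
            continue  # the process exited while we were looking
    return matches


if __name__ == "__main__":
    print(pids_named("myprogram"))
```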

Check exe

The procfs entry also includes the path of the executable the process was started from as a symbolic link.

Thus if your program is installed at /usr/bin/myprogram then we can check whether it is running with:

find /proc -mindepth 1 -maxdepth 1 ! -name '*[^0-9]*' -type d -exec sh -c \
    '[ "$(readlink "$1/exe")" = /usr/bin/myprogram ] && echo Is running' - {} ';'

This cannot be modified by the process after it has started, but as usual caveats apply:

  1. Not all processes have an initial executable. This symbolic link may be unreadable (fails with errno of ENOENT) in the case of kernel threads.

  2. It could be a program that has subcommands, one of which may be a long-running service (e.g. git-daemon), which you wouldn't want to fail to start just because a shorter operation with a different subcommand happened to be running at the same time.

  3. This is unhelpful in the case of interpreted languages, since it is always the name of the interpreter rather than the name of the script.

  4. The same program may be reachable by multiple file paths if the executable has been hard-linked.

  5. The program's executable may be removed while it is running, which changes exe by appending " (deleted)" to the file path.

    If this file is then replaced then another process may have the same executable path but an incompatible behaviour.

    This isn't even unusual if the name of the process is generic, like "sh" or "httpd".

So it's useless for interpreted programs and unreliable if the executable can be replaced.
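
A sketch of the exe check in Python, with a made-up function name, that also reports the " (deleted)" case from caveat 5 so the caller can tell when the file on disk is no longer the one the process was started from. The other caveats still apply.

```python
import os

DELETED_SUFFIX = " (deleted)"


def pids_running(executable):
    """Return (pid, deleted) pairs for processes started from executable."""
    matches = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            target = os.readlink(f"/proc/{entry}/exe")
        except OSError:
            continue  # kernel thread, exited process, or not permitted
        if target == executable:
            matches.append((int(entry), False))
        elif target == executable + DELETED_SUFFIX:
            matches.append((int(entry), True))
    return matches


if __name__ == "__main__":
    print(pids_running("/usr/bin/myprogram"))
```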

Check cmdline

It could be perfectly safe to run the same program multiple times provided it is passed different configuration.

The cmdline file can be parsed to infer this configuration as a list of strings that are NUL terminated.

A problem with this approach is the need to reimplement parsing logic and know for all command-lines whether it's appropriate to start another.

This logic could be quite difficult, but you could add a parameter just for determining whether it is the same.

This is far from ideal because:

  1. Lookup time gets worse as your system has more processes running.
  2. Processes can modify their command-line too, so another process could arrange to have the same command-line, and make this unreliable.
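
Here is a sketch of the cmdline approach in Python. The function names and the --instance=NAME marker argument are hypothetical, invented only to illustrate "a parameter just for determining whether it is the same"; both drawbacks listed above apply to it.

```python
import os


def read_cmdline(pid):
    """Return a process's argument list from /proc/<pid>/cmdline."""
    with open(f"/proc/{pid}/cmdline", "rb") as handle:
        raw = handle.read()
    # Each argument is NUL terminated, so drop the final terminator
    # before splitting; kernel threads have an empty cmdline.
    if raw.endswith(b"\0"):
        raw = raw[:-1]
    return [os.fsdecode(arg) for arg in raw.split(b"\0")] if raw else []


def pids_with_argument(marker):
    """Return the PIDs whose command line contains marker as an argument."""
    matches = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            if marker in read_cmdline(entry):
                matches.append(int(entry))
        except OSError:
            continue  # the process exited while we were looking
    return matches


if __name__ == "__main__":
    print(pids_with_argument("--instance=example"))
```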

Next time we are going to look at a better use for that parameter.

Posted Wed Mar 22 12:00:08 2017

Bit rot, specifically the phenomenon of software working less well even if it hasn't changed, is annoying, but a fact of life. There might be a thing or two you can do to make it happen less.

Examples from your author's personal experience from the past year:

  • Cloud provider changes the default username on base images from ec2-user to debian, requiring simple changes in many places.
  • Cloud provider upgrades their virtualisation platform, which introduces a new API version, and breaks the old version. All API using automation needs upgrading.
  • Configuration management software introduces a new feature (become), and deprecates the old corresponding feature (sudo). Simple changes, but in many places.
  • Configuration management software breaks the new feature (can no longer switch to an unprivileged user to run shell script snippets), requiring more complicated changes in several places (run shell as root, invoke sudo explicitly).
  • Author's software depends on enterprise-grade software for a specific service, which switches to requiring Oracle Java, instead of OpenJDK. Author's software isn't fully free software anymore.

Bit rot happens for various reasons. The most common reason is that the environment changes. For example, software that communicates over the network may cease to function satisfactorily if the other computers change. A common example is the web browser: even though your computer works just as well as before, in isolation, web sites use new features of HTML, CSS, and JavaScript, not to mention media formats, and web pages become bigger, and in general everything becomes heavier. Also, as your browser version ages, sites stop caring about testing with it, and start doing things that expose bugs in your version. Your web experience becomes worse every year. Your browser bit rots.

There is no way to prevent bit rot. It is a constant that everything is variable. However, you can reduce it by avoiding common pitfalls. For example, avoid dependencies that are likely to change, particularly in ways that will break your software. An HTML parsing library will necessarily change, but that shouldn't break your software if the library provides a stable API. If the library adds support for a new syntactic construction in HTML, your program should continue to work as before.

You should be as explicit as possible in what you expect from the environment. Aim to use standard protocols and interfaces. Use standard POSIX system calls, when possible, instead of experimental Linux-specific ones from out-of-tree development branches. Sometimes that isn't possible: document that clearly.

Have automated ways of testing that your software works, preferably tests that can be run against an installed instance. Run those tests from time to time. This will let you and your users notice earlier that something's broken.

Posted Wed Mar 15 12:00:07 2017

I have the dubious honour of being one of the people, at my place of work, charged with interviewing technical applicants. Without giving the game away too much, I thought I might give a few hints for things I look for in a CV, and in the wider world, when considering and interviewing a candidate.

First a little context - I tend to interview candidates who are applying for higher-level technical roles in the company, and I have a particular focus on those who claim on their CV to have a lot of experience. I start by reading the cover letter and CV looking for hints of F/LOSS projects the applicant has worked with; either as a user or a developer. I like it when an applicant provides a bitbucket, github or gitlab URL for their personal work if they have any; but I really like it when they provide a URL for their own Git server (as you might imagine).

Once I have identified places on the Internet where I might find someone, I look to dig out their internet ghosts and find out what they are up to in the wider F/LOSS world. The best candidates show up in plenty of places, are easily found making nice commits which show their capability, and seem well spoken on mailing lists, fora, et al. Of course, if someone doesn't show up on Internet searches then that doesn't count against them because to have the privilege of being able to work on F/LOSS is not something afforded to all; but if you do show up and you look awful it will count against you.

Also remember, there are more ways to contribute than writing code. I love it when I find candidates have made positive contributions to projects outside of just coding for them. Help a project's documentation, or be part of mentoring or guide groups, and I'll likely be very pleased to talk with you.

Beyond the Internet Stalking, I like to get my candidates to demonstrate an ability to compare and contrast technologies; so a good way to get on my good side is to mention two similar but conflicting capabilities (such as Subversion and Git), be prepared to express a preference between them, and be able to defend that preference.

Finally a few basic tips -- don't lie, dissemble, or over-inflate in your CV or cover-letter (I will likely find out) and don't let your cover letter be more than a single side of A4, nor your CV more than 2 sides of A4.

If I ever interview you, and I find out you read this article, I will be most pleased indeed. (Assuming you take on my recommendations at least :-) )

Posted Wed Mar 8 12:00:07 2017

FOSS projects are mostly developed on a volunteer basis.

This makes free time and motivation the currencies by which they are developed.

Oftentimes you have the free time, but not the motivation. Often this is not from feeling that the work isn't worth doing, but that you feel inadequate to do it.

Don't be disheartened. There's plenty you can do that helps.

  1. Just be there, whether in-person or online.

    You can do whatever else you want while being there, but it's encouraging to not be alone in your endeavours.

    You may even find some motivation of your own.

  2. When others are talking about what they want to achieve, respond enthusiastically.

    It makes them more likely to follow through and do so, and at the very least makes them feel good.

    This does risk making them feel worse if they never get around to it, but sometimes that's sufficient to shame them into action later, and other times it's sufficient to say "these things happen".

  3. Engage in discussion about what others want to achieve.

    It's extremely valuable for refining ideas, so they can implement what they want to do better, it keeps it fresh in their mind so motivation lasts longer, and it leaves a clearer idea of what to do so it may be completed before motivation runs out.

  4. Mention what other people are doing to people who might be interested.

    You could end up with anecdotes of other people thinking it's a cool idea, which when relayed to people doing the work provides their own motivation.

  5. Remind people of the successes they've had.

    It makes people feel good about what they've already done, and can put any issues they are currently struggling with into perspective.

    Lars pointed out that Yakking has published more than 180 articles at a rate of one per week! We've managed to get this far, we can continue for a good while yet.

Posted Wed Mar 1 12:00:07 2017
Daniel Silverstone Please be careful when you test

We have spoken before about testing your software. In particular we have mentioned how if your code isn't tested you can't be confident that it works. We also spoke about how the technique of testing and the level at which you test your code will vary based on what you need to test.

What I'd like to talk about this time is understanding the environment in which your tests exist. Since "nothing exists in a vacuum" it is critical to understand that even if you write beautifully targeted tests, they still exist and execute within the wider context of the computer they are running on.

As you are no doubt aware by now, I have a tendency to indulge in the hoary old developer habit of teaching by anecdote, and today is no exception to that. I was recently developing some additional tests for Gitano and exposed some very odd issues with one test I wrote. Since I was engaged in the ever-satisfying process of adding tests for a previously untested portion of code I, quite reasonably, expected that the issue I was encountering was a bug in the code I was writing tests for. I dutifully turned up the logging levels, sprinkled extra debug information around the associated bits of code, and puzzled over the error reports and debug logs for a good hour or so.

Predictably, given the topic of this article, I discovered that the error in question made absolutely no sense given the code I was testing, and so I had to cast my net wider. Eventually I found a bug in a library which Gitano depends on, which gave me a somewhat hirsute yak to deal with. Once I had written the patch to the library, tested it, committed it, made an upstream release, packaged that, reported the bug in Debian, uploaded the new package to Debian, and got that new package installed onto my test machine - lo and behold, my test for Gitano ran perfectly.

This is, of course, a very particular kind of issue. You are not likely to encounter this type of scenario very often, unless you also have huge tottering stacks of projects which all interrelate. However you are likely to encounter issues where tests assume things about their environment without necessarily meaning to. Shell scripts which use bashisms, or test suites which assume they can bind test services to particular well known (and well-used) ports are all things I have encountered in the past.

Some test tools offer mechanisms for ensuring the test environment is "sane" for a value of sanity which applies only to the test suite in question. As such, your homework is to go back to one of your well-tested projects and consider if your tests assume anything about the environment which might need to be checked for (outside of things which you're already checking for in order to build the project in the first place). If you find any unverified assumptions then consider how you might ensure that, if the assumption fails, the user of your test suite is given a useful report on which to act.
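
As a sketch of what such a check might look like, here is a small pre-flight function in Python. The function name, the tools and the port you pass in are all illustrative assumptions; the point is only that a failed assumption turns into a readable report rather than a baffling test failure.

```python
import shutil
import socket


def environment_problems(required_tools=(), required_port=None):
    """Return human-readable descriptions of unmet test-suite assumptions."""
    problems = []
    for tool in required_tools:
        if shutil.which(tool) is None:
            problems.append(f"required tool {tool!r} was not found on PATH")
    if required_port is not None:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind(("127.0.0.1", required_port))
            except OSError as error:
                problems.append(
                    f"cannot bind 127.0.0.1:{required_port}: {error.strerror}"
                )
    return problems


if __name__ == "__main__":
    for problem in environment_problems(required_tools=("sh", "git")):
        print("test environment problem:", problem)
```

Where you can, it is better still to remove the assumption, for example by binding test services to port 0 and letting the kernel pick a free port.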

Posted Wed Feb 22 12:00:06 2017