A number of movements which call themselves Semantic $THING have, of late, gained significant mindshare. These usually encode some extra meaning in content which already exists. Commonly they are intended to make it easier for computers to read extra meaning into something intended for humans, but sometimes they're intended to allow humans to deal more easily with something meant for computers.
Semantics are, in a very basic sense, how meaning is overlaid onto the syntax of content. I like to think of it as: syntax is the 'how', but semantics are the 'what'.
Here are three Semantic $THINGs which I think you ought to know about:
The Semantic Linefeeds concept is intended to make it easier for humans to grok the delta between two versions of a text file intended to be processed (for example, Markdown).
The Semantic Versioning concept is intended to make it possible for humans, and software, to understand the relationship between different releases of a piece of software.
The Semantic Commits concept is intended to make it easier to produce changelogs for projects, and there are a number of tools built up around this.
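To make the Semantic Versioning idea a little more concrete, here is a minimal Python sketch of how software might reason about the relationship between two releases. It assumes plain `MAJOR.MINOR.PATCH` version strings with no pre-release or build metadata, and the names `parse_version` and `is_compatible_upgrade` are invented for this illustration rather than taken from any particular tool.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Split a plain MAJOR.MINOR.PATCH string into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def is_compatible_upgrade(current: str, candidate: str) -> bool:
    """Return True if candidate is newer than current within the same MAJOR.

    Under Semantic Versioning a change in MAJOR signals an incompatible
    change, so this sketch only treats upgrades that keep MAJOR as safe.
    """
    cur = parse_version(current)
    cand = parse_version(candidate)
    # Tuples compare element by element, so (1, 10, 0) > (1, 9, 0),
    # which a naive string comparison would get wrong.
    return cand[0] == cur[0] and cand > cur
```

With this, `is_compatible_upgrade("1.4.2", "1.5.0")` is true while `is_compatible_upgrade("1.4.2", "2.0.0")` is not. Note that the sketch ignores the special case of major version zero, where Semantic Versioning allows anything to change at any time.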
If you know of any other useful Semantic $THINGs then why not comment on this article to let others know about them? For homework, I simply suggest you read the above linked articles, and then do a little of your own research around the topics and consider if you might need to take on any of the points in your own projects. I am considering semantic commits for my main projects at the time of writing this article.
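To give a flavour of why changelog tooling likes semantic commits, here is a minimal Python sketch that groups commit subject lines by type. It assumes the widespread `type(scope): subject` convention (with types such as `feat` and `fix`); your project may well pick a different one, and `group_by_type` is a name made up for this example.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Assumed convention: "type(scope): subject", where the scope is optional.
COMMIT_RE = re.compile(r"^(?P<type>\w+)(?:\((?P<scope>[^)]+)\))?: (?P<subject>.+)$")


def group_by_type(subjects: Iterable[str]) -> Dict[str, List[str]]:
    """Group commit subject lines by their semantic type.

    Lines which do not follow the assumed convention are filed under "other".
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for line in subjects:
        line = line.strip()
        match = COMMIT_RE.match(line)
        if match:
            groups[match.group("type")].append(match.group("subject"))
        else:
            groups["other"].append(line)
    return dict(groups)
```

Feeding it the subject lines from your history gives you the raw material for a changelog: everything under `feat` becomes the list of new features, everything under `fix` the list of bug fixes, and so on.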
Any software that is used by more than a trivial number of people will at some point be used by people living in different countries, with different cultures. Different cultures have different conventions which might cover date format, decimal mark, currency, names, units of measurement, spelling and of course, language (that's natural language, not programming language).
The practice of building support for a variety of cultures into your software is called internationalisation, often referred to as I18N. This is the practice of writing software that is independent of any particular culture. There is a closely related topic, localisation (L10N). This can be thought of as taking an internationalised piece of software and localising it for one specific region.
You should be careful to separate your internationalisation from your business logic.
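As a rough illustration of that separation, here is a minimal Python sketch in which the business logic returns plain values and a separate layer decides how to present them. The `Culture` class and the two hand-written entries are hypothetical and purely illustrative; they are not a substitute for a proper internationalisation tool.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Culture:
    """Presentation conventions for one culture (hypothetical, hand-written)."""
    decimal_mark: str
    date_format: str  # strftime pattern


# Illustrative entries only; a real project would take these from an
# internationalisation library rather than maintaining them by hand.
CULTURES = {
    "en_GB": Culture(decimal_mark=".", date_format="%d/%m/%Y"),
    "de_DE": Culture(decimal_mark=",", date_format="%d.%m.%Y"),
}


def order_total(unit_price: float, quantity: int) -> float:
    """Business logic: knows nothing about how the result is displayed."""
    return unit_price * quantity


def format_number(value: float, culture: Culture) -> str:
    """Presentation: render a number using the culture's decimal mark."""
    return f"{value:.2f}".replace(".", culture.decimal_mark)


def format_date(day: date, culture: Culture) -> str:
    """Presentation: render a date in the culture's conventional order."""
    return day.strftime(culture.date_format)
```

Because `order_total` never touches a `Culture`, supporting another region means adding presentation data, not changing or re-testing the calculation.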
It is unlikely that you can cover every possible variation in this yourself. Luckily, there are several tools designed to deal with the problem of internationalisation. These will be covered in a future article so stay tuned!
This year I was fortunate enough to attend FOSDEM.
Since Gitano, a project I work on, recently made its 1.0 release and is scheduled to be part of the next Debian stable release, it will plausibly attract more attention.
Being inexperienced in community engagement, I decided to attend various talks on this topic.
Building an accessible community (video)
This is mostly not relevant to Gitano because it is about how to run a conference and handle accessibility.
It was entertaining though, and offered a useful bit of advice: little things like using gender-neutral pronouns in text can help to foster an inclusive atmosphere.
As a result I will try to proofread my writing in case I have unintentionally included non-inclusive language.
Overcoming culture clash (video)
This was a talk covering some theory about what kinds of cultural differences there are, and some specific cultural differences that often cause issues.
It started out with a metric which quantifies cultural attitudes along six supposed characteristics of culture, so that the difference between cultural attitudes can be measured.
The speaker admitted that they weren't an expert on the topic, so I couldn't ask for clarification of how some of the characteristics differed; my impression was that at least three of them overlapped.
More practical advice included:
- Communities are built one person at a time. So we should try to foster good relations with existing and new users.
- Local support groups should be encouraged, but should be helped to interact with the wider community by inviting members to events and visiting them.
- Get to know the cultural differences of local groups.
- Avoid real-time (face to face or IRC) meetings. Text is better than video calls, and asynchronous is better than synchronous since it is a lot easier to translate.
- Plan events with awareness of religious and national holidays so you're not accidentally excluding someone. For example, FOSDEM is set during Chinese New Year, so can be problematic.
- Don't be afraid to ask if people have issues, but be aware that not wanting to impose is also a cultural value, so they may attempt to appease unnecessarily.
Like the ants (Growing communities)
This was a talk from a community manager at Google about how to foster a community without driving it.
The gist is that it should emulate a hive-mind like ants, where rather than dictating direction you would provide feedback mechanisms to encourage what is wanted.
This is not relevant to Gitano since we will be part of any community that happens.
Open source is just about the source, isn't it? (video)
This was a talk from a community manager about a bunch of non-code parts of a project to worry about.
Trademarking
First was trademark handling.
For Gitano we've started well by picking a name that is not already used for a git server and creating a logo that resembles no other git server's.
We will need to ask anyone who names their git server Gitano to rename it, though.
Finding users
We need to go out and find potential users, rather than waiting for them to come to us.
Talking about it on social media may help, and getting users to talk about it would help.
As would submitting talks to relevant conferences.
Larger projects can submit an article to a relevant journal since journalists are lazy and will print articles to fill space.
I intend to speak about Gitano at more conferences as a result.
Supporting users
Supporting users is essential. You don't know who might become a valuable contributor, so be friendly to everyone.
You can't always expect users to come to you; you need to go where they are, which may mean subscribing to Stack Overflow to see if Gitano is mentioned.
Retaining contributors
Non-coder contributors can be hugely valuable, since they provide support for other users and may be able to provide support in languages you don't understand.
Retaining contributors depends heavily on how responsive you are. If you can provide automated feedback on code style etc. it helps.
If a useful contributor is approaching burn-out, it is handy if you can arrange for them to be employed to work on the project. This is not helpful for Gitano since we're not a big foundation.
There isn't much we can do about this until other contributors turn up or leave.
Managing infrastructure
If infrastructure is required for development then upstream must provide it, since contributors are even less likely to be able to provide their own.
If infrastructure is expensive then a tip jar helps.
Try not to spread infrastructure across too many systems, or at least provide a landing page to locate everything.
Have one place for recording canonical decisions; don't split them across mailing lists or wikis.
For Gitano we expect the canonical place for policy to be the wiki, so we're going to have a policy page and link to all the infrastructure from the wiki.
All Gitano's infrastructure is either free or cheap and paid out of the lead developer's pocket, so we don't need to think about a tip jar yet.
Expect to leave
Plan for your own exit from the project. Nothing lasts forever.
It is helpful if you can centralise contact details so they can be changed more easily.
For Gitano we plan to have a contact details page to centralise as much as we can, and avoid giving out the details of any particular person as a contact.
Deciding on communications channels
Given the discussion of various contact channels and the importance of using appropriate ones I asked the speaker if she had any tips on how to evaluate which to use.
She recommended deciding based on:
- Which is easy for your developers to use.
- Which is easy for your contributors to use.
Ideally you find this out by asking existing users what they would prefer, but otherwise make an educated guess based on what the target users might want.
In Gitano we have opinionated developers who like mailing lists, IRC and RSS feeds, so to widen the support net we're going to add a link to webchat.
We have spoken before about testing your software. In particular we have mentioned how if your code isn't tested you can't be confident that it works. We also spoke about how the technique of testing and the level at which you test your code will vary based on what you need to test.
What I'd like to talk about this time is understanding the environment in which your tests exist. Since "nothing exists in a vacuum" it is critical to understand that even if you write beautifully targeted tests, they still exist and execute within the wider context of the computer they are running on.
As you are no doubt aware by now, I have a tendency to indulge in the hoary old developer habit of teaching by anecdote, and today is no exception to that. I was recently developing some additional tests for Gitano and exposed some very odd issues with one test I wrote. Since I was engaged in the ever-satisfying process of adding tests for a previously untested portion of code I, quite reasonably, expected that the issue I was encountering was a bug in the code I was writing tests for. I dutifully turned up the logging levels, sprinkled extra debug information around the associated bits of code, and puzzled over the error reports and debug logs for a good hour or so.
Predictably, given the topic of this article, I discovered that the error in question made absolutely no sense given the code I was testing, and so I had to cast my net wider. Eventually I found a bug in a library which Gitano depends on, which gave me a somewhat hirsute yak to deal with. Once I had written the patch to the library, tested it, committed it, made an upstream release, packaged that, reported the bug in Debian, uploaded the new package to Debian, and got that new package installed onto my test machine - lo and behold, my test for Gitano ran perfectly.
This is, of course, a very particular kind of issue. You are not likely to encounter this type of scenario very often, unless you also have huge tottering stacks of projects which all interrelate. However, you are likely to encounter issues where tests assume things about their environment without necessarily meaning to. Shell scripts which use bashisms, or test suites which assume they can bind test services to particular well-known (and well-used) ports, are all things I have encountered in the past.
Some test tools offer mechanisms for ensuring the test environment is "sane" for a value of sanity which applies only to the test suite in question. As such, your homework is to go back to one of your well-tested projects and consider if your tests assume anything about the environment which might need to be checked for (outside of things which you're already checking for in order to build the project in the first place). If you find any unverified assumptions then consider how you might ensure that, if the assumption fails, the user of your test suite is given a useful report on which to act.
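As a starting point for that homework, here is a minimal Python sketch of a pre-flight check which a test suite could run before anything else. The function names and the idea of passing in lists of required tools and ports are assumptions made for this example; adapt them to whatever your own suite actually relies upon.

```python
import shutil
import socket
from typing import List


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if we can bind a TCP socket to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def check_environment(required_tools: List[str],
                      required_ports: List[int]) -> List[str]:
    """Return human-readable problems; an empty list means all checks passed."""
    problems = []
    for tool in required_tools:
        # shutil.which returns None when the tool is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"required tool '{tool}' was not found on PATH")
    for port in required_ports:
        if not port_is_free(port):
            problems.append(
                f"TCP port {port} is already in use; stop whatever is using "
                f"it or configure the tests to use a different port"
            )
    return problems
```

The point is less the checks themselves than the messages: each failure tells the user what was assumed and what they can do about it. Bear in mind that a port check like this is inherently racy, since something else may grab the port between the check and the test; where you can, it is more robust to bind to port 0 and let the operating system pick a free port for you.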