Informational shell utilities
Primitives
echo(1) prints its command line arguments to standard output. Despite its simple behaviour, it is remarkably useful, since it gives you a way to get your shell to write the values of its variables to standard output.
$ VAR=foo
$ echo "$VAR"
foo
Because it is so simple and so frequently used, the echo you actually run is usually the version built into your shell rather than the standalone echo(1) program.
It can be used to write strings to a file, when used with output redirection.
$ echo foo >bar
cat(1) can be used to retrieve the contents of a file.
$ cat bar
foo
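You can put the contents of a file into a variable using command substitution, also known as subshell expansion. The backtick form is the older syntax; the $( ) form nests more easily.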
$ baz=`cat bar`
$ echo $baz
foo
$ qux=$(cat bar)
$ echo $qux
foo
So with echo(1), cat(1), output redirection and subshell expansion, you can move data between variables, command line arguments and files.
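As a minimal sketch of that round trip, the following moves a value from a variable into a file and back again. The scratch file created with mktemp(1) and the variable names are assumptions made for illustration.

```shell
# Round trip: variable -> file -> variable.
scratch=$(mktemp)             # hypothetical scratch file for the demonstration
original='foo bar'

echo "$original" >"$scratch"  # variable -> file, via output redirection
copy=$(cat "$scratch")        # file -> variable, via command substitution

echo "$copy"                  # variable -> command line argument -> standard output
rm -f "$scratch"
```

Command substitution strips trailing newlines, which is why the newline that echo wrote to the file does not end up in the variable.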
cat(1) has one last trick up its sleeve; it can be used to concatenate files, hence its name.
$ echo foo >1
$ echo bar >2
$ cat 1 2
foo
bar
File trimming
head(1) and tail(1), without further arguments, read the first 10 or last 10 lines of a file and write them to standard output.
$ seq 20 >numbers # write the numbers 1 to 20 to the file called numbers
$ head numbers
1
2
3
4
5
6
7
8
9
10
$ tail numbers
11
12
13
14
15
16
17
18
19
20
The number of lines can be tuned with the -n flag, so head -n3 prints the first 3 lines, and tail -n3 prints the last 3 lines.
$ head -n3 numbers
1
2
3
$ tail -n3 numbers
18
19
20
The number given to the -n option of head(1) can be prefixed with a - to print all but the last n lines (an extension in GNU head), and the number given to tail(1) can be prefixed with a + to start output at line n, which skips the first n-1 lines.
$ seq 5 >numbers
$ head -n -1 numbers
1
2
3
4
$ tail -n +2 numbers
2
3
4
5
The option -c can be used to perform a similar operation, but on the number of bytes rather than the number of lines.
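Chaining the two commands selects a range from the middle of the input, and -c does the same job for bytes. This is a small sketch, assuming a head(1) and tail(1) that accept -c, as the GNU and BSD versions do; the variable names are illustrative.

```shell
# Lines 11 to 15: keep the first 15 lines, then the last 5 of those.
middle=$(seq 20 | head -n 15 | tail -n 5)
echo "$middle"

# The same idea with bytes instead of lines.
first3=$(printf 'abcdefghij' | head -c 3)   # abc
last3=$(printf 'abcdefghij' | tail -c 3)    # hij
echo "$first3 $last3"
```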
tail(1) has a few more tricks. One of its key uses is to display the last lines of a log, which are generally the most relevant for finding out what just happened to the program that generated it.
It's also useful to have new log output printed to an open terminal as it arrives, to see what the program is doing without having to reconfigure it to log differently, so tail(1) also has the -f flag, which stands for follow. When -f is specified, it will print the last 10 lines, then keep printing new lines as they are added.
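Interactively you would just run tail -f on the log and press Ctrl-C when done. The sketch below does the same thing non-interactively so that it can run to completion: it follows a hypothetical log file in the background, appends a few lines, then stops the follower. The file names, the two-second pause and the "event" lines are all assumptions for the demonstration.

```shell
log=$(mktemp)     # hypothetical log file
seen=$(mktemp)    # collects whatever the follower prints

tail -f "$log" >"$seen" &     # start following in the background
follower=$!

for i in 1 2 3; do
    echo "event $i" >>"$log"  # stand-in for a program writing to its log
done

sleep 2                       # give tail time to notice the new lines
kill "$follower"
wait "$follower" 2>/dev/null || true

followed=$(cat "$seen")
echo "$followed"
rm -f "$log" "$seen"
```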
Pagers
less(1) and more(1) are pagers. A pager makes it easier to read the output of a program that generates a lot of it, by showing one screenful at a time and letting you say when you've finished reading it and want to continue to the next page.
less(1) is like more(1), but better. If you're interested, someone also wrote a pager called most(1).
Please note that cat "$file" | less is rarely the right thing to do. Unless $file is special in some way, this is equivalent to less <"$file", and it's clearer, in almost all cases, to simply run less "$file".
Environment processing
Both env(1) and printenv(1) can be used to list all the environment variables. Indeed, they behave the same if they are called with no arguments.
However, if you are writing a shell script and need to programmatically get a list of all the environment variables, and maybe their values, then you need the GNU coreutils versions of these programs. You can then use the -0 switch to get them to NUL terminate each entry, rather than newline terminate it, allowing you to process them with read -d '' in shells that support it, such as bash.
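To see why NUL termination matters, consider a variable whose value contains a newline. The sketch below assumes GNU env(1) for the -0 switch and deliberately avoids read -d '' so that it stays within POSIX sh syntax; MULTILINE is a hypothetical variable created for the demonstration.

```shell
# A value containing a newline, to show where line-based parsing goes wrong.
export MULTILINE='first line
second line'

# Counting output lines treats the second half of MULTILINE as its own entry.
by_newline=$(env | grep -c '')

# Counting NUL terminators gives exactly one per variable.
by_nul=$(env -0 | tr -cd '\000' | wc -c)

echo "lines: $by_newline, variables: $((by_nul))"
```

The line count is always higher than the variable count here, because every newline embedded in a value adds a line without adding a variable.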
Beyond this, env(1) and printenv(1) differ. printenv(1) prints the values of the variables you name on the command line.
$ export FOO=bar
$ export BAZ=qux
$ printenv FOO BAZ
bar
qux
However, this is of limited use, since your shell will support variable interpolation natively.
$ echo $FOO $BAZ
bar qux
env(1) will set variables then run a command.
$ env BADGER=stoat printenv BADGER
stoat
env(1) can be used to run a program in a reduced environment with the -i flag.
$ env -i BADGER=stoat printenv
BADGER=stoat
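The following sketch checks both behaviours from a script: a variable set through env(1) is visible only to the command it runs, and -i hides variables that the calling shell has. The variable names in_child, in_parent and home_seen are illustrative.

```shell
unset BADGER

# The assignment applies only to the command that env runs.
in_child=$(env BADGER=stoat printenv BADGER)
in_parent=${BADGER-unset}
echo "child: $in_child, parent: $in_parent"

# With -i the command starts from an empty environment, so HOME is not there.
# printenv exits non-zero for a missing variable, which triggers the fallback.
home_seen=$(env -i printenv HOME || echo 'not set')
echo "HOME with -i: $home_seen"
```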
Apart from all this, env(1) is used on the she-bang line of scripts, to allow looking up the interpreter in $PATH, since various programs may be installed in different places on different systems, but env(1) is usually installed at /usr/bin/env.
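A minimal sketch of such a script follows. It uses sh as the interpreter only so that the example is self-contained; the same pattern is common for interpreters whose location varies more. The script path from mktemp(1) is an assumption, and it must be on a filesystem that allows execution.

```shell
script=$(mktemp)    # hypothetical script location

cat >"$script" <<'EOF'
#!/usr/bin/env sh
# env searches $PATH for "sh", wherever it happens to be installed.
echo "hello from sh"
EOF

chmod +x "$script"
greeting=$("$script")
echo "$greeting"
rm -f "$script"
```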
Directory contents information
ls
ls(1) can be used to list the contents of a directory.
$ ls /
bin dev initrd.img lib64 opt run sys var
boot etc initrd.img.old media proc sbin tmp vmlinuz
cdrom home lib mnt root srv usr vmlinuz.old
If it can determine that it is writing to a terminal, it will tabulate the results like above, otherwise it prints one entry per line.
$ ls / | cat
bin
boot
cdrom
dev
etc
home
initrd.img
initrd.img.old
lib
lib64
media
mnt
opt
proc
root
run
sbin
srv
sys
tmp
usr
var
vmlinuz
vmlinuz.old
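Your own scripts can make the same kind of decision with the shell's -t test, which reports whether a file descriptor is attached to a terminal. This is a sketch of the general technique, not a claim about how ls(1) is implemented; describe_stdout is a hypothetical helper.

```shell
# Report whether standard output (file descriptor 1) is a terminal.
describe_stdout() {
    if [ -t 1 ]; then
        echo 'terminal'
    else
        echo 'not a terminal'
    fi
}

describe_stdout          # depends on where the script's output is going
describe_stdout | cat    # "not a terminal", since standard output is a pipe
```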
It's common to see shell scripts use ls(1) to perform an action on every entry in a directory.
$ for dir in `ls /`; do
echo $dir
done
However, this would have problems if any of the directory names had spaces in them. Shell globbing is shorter and handles spaces, though it skips any files or directories whose names start with a dot.
$ for dir in /*; do
    echo "$dir"
done
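If you want to stay with globbing, extra patterns can pick up the dot files as well. The sketch below is one common way to write this in POSIX sh; the temporary directory and the file names in it are assumptions made for the demonstration.

```shell
dir=$(mktemp -d)    # hypothetical directory to iterate over
touch "$dir/plain" "$dir/with space" "$dir/.hidden"

found=0
# * skips dot files; .[!.]* and ..?* pick them up without matching . or ..
for entry in "$dir"/* "$dir"/.[!.]* "$dir"/..?*; do
    # A pattern that matches nothing is left as literal text, so skip it.
    [ -e "$entry" ] || [ -L "$entry" ] || continue
    found=$((found + 1))
    echo "${entry##*/}"
done

rm -rf "$dir"
```

Quoting "$entry" everywhere is what keeps the name containing a space in one piece.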
You can use ls -a to also list files whose names start with a dot, though ls -A is generally more useful since it excludes the . and .. entries.
To do this safely in scripts is more complicated.
A fairly portable way to do this, given a find(1) that supports -mindepth and -maxdepth, is:
$ command='for arg; do echo "$arg"; done'
$ find / -mindepth 1 -maxdepth 1 -exec sh -c "$command" - {} +
/home
/etc
/media
/var
/bin
/boot
/dev
...
A less portable way that uses fewer subprocesses is:
$ find / -mindepth 1 -maxdepth 1 -print0 |
while IFS= read -r -d '' arg
do
    echo "$arg"
done
/home
/etc
/media
/var
/bin
/boot
/dev
/lib
/mnt
/opt
...
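If your loop needs to change variables that are read after the loop, you need to do this instead. In bash, a loop on the receiving end of a pipe runs in a subshell, so its variable changes are lost when the loop ends; process substitution keeps the loop in the current shell.
$ count=0
$ while IFS= read -r -d '' arg
do
    count=$(( count + 1 ))
done < <(find / -mindepth 1 -maxdepth 1 -print0)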
$ echo $count
24
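The effect of the subshell is easy to see with a smaller input. The sketch below contrasts piping into a loop with redirecting into it; it uses a here-document rather than process substitution so that it also runs in plain sh, and since a here-document is line based it illustrates only the variable scoping, not safe handling of arbitrary file names.

```shell
# Piping into the loop: bash and dash run the loop in a subshell,
# so this increment never reaches the outer shell.
piped=0
printf 'a\nb\nc\n' | while read -r line; do
    piped=$((piped + 1))
done

# Redirecting into the loop keeps it in the current shell.
redirected=0
while read -r line; do
    redirected=$((redirected + 1))
done <<EOF
$(printf 'a\nb\nc\n')
EOF

echo "piped: $piped, redirected: $redirected"
```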
pwd
pwd(1) will print the path to the current directory to standard output.
$ pwd
/home/richardmaw
This is not generally useful when using a shell interactively, since your shell's prompt will usually have the current directory in it:
$ PS1='\w\$ '
~$ cd /
/$ pwd
/
However, pwd can be of use in scripting if you need to change directory and wish to return to the current directory later in your shell script.
#!/bin/sh
HERE="$(pwd)"
# Do work including changing directory
# ...
cd "$HERE"
# Do more work back in the original directory
# ...
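When the work in the other directory is self-contained, an alternative is to do it inside parentheses, so there is nothing to return from. A minimal sketch, with / standing in for whichever directory the work needs:

```shell
start=$(pwd)

# Parentheses run the commands in a subshell, so the cd ends with it.
( cd / && pwd )    # prints /

after=$(pwd)
echo "still in: $after"
```

The trade-off is that variables set inside the parentheses are lost along with the directory change, so the saved-path approach is still needed when the work has to update the script's state.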
readlink
readlink(1) is a thin wrapper over the readlink(2) system call in its default mode of operation.
$ readlink /vmlinuz
boot/vmlinuz-3.11.0-17-generic
It is generally more common to discover this by running ls -l, since it's fewer characters to type, though readlink(1) is more useful in scripts.
$ ls -l /vmlinuz
lrwxrwxrwx 1 root root 30 Mar 3 08:07 /vmlinuz -> boot/vmlinuz-3.11.0-17-generic
Another useful mode of readlink(1) is readlink -f, which is "canonicalize" mode. It follows every symbolic link in the path and prints the resulting absolute path.
$ readlink -f /vmlinuz
/boot/vmlinuz-3.11.0-17-generic
$ cd ~
$ readlink -f ../../vmlinuz
/boot/vmlinuz-3.11.0-17-generic
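The difference between the two modes shows up with a chain of links. The sketch below assumes a readlink(1) that supports -f, such as the GNU version; the directory and link names are made up for the demonstration.

```shell
dir=$(mktemp -d)    # hypothetical scratch directory
touch "$dir/target"
ln -s target "$dir/link1"
ln -s link1 "$dir/link2"

one_hop=$(readlink "$dir/link2")        # link1: only the first link is read
resolved=$(readlink -f "$dir/link2")    # absolute path of target
expected=$(readlink -f "$dir/target")

echo "$one_hop"
echo "$resolved"
rm -rf "$dir"
```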
Another use for readlink -f is to get an absolute path to the directory containing a shell script, so that the script can access resources packaged alongside it.
$ install -m755 /dev/stdin /tmp/thisdir-test.sh <<'EOF'
#!/bin/sh
THISDIR="$(readlink -f "$(dirname "$0")")"
echo "$THISDIR"
EOF
$ /tmp/thisdir-test.sh
/tmp
$ mv /tmp/thisdir-test.sh ~
$ ~/thisdir-test.sh
/home/richardmaw
stat
stat(1) uses the stat(2) system call to report information about files.
It defaults to showing a wide selection of information.
$ stat thisdir-test.sh
File: ‘thisdir-test.sh’
Size: 69 Blocks: 8 IO Block: 4096 regular file
Device: 1ah/26d Inode: 237403 Links: 1
Access: (0755/-rwxr-xr-x) Uid: ( 1000/richardmaw) Gid: ( 1000/richardmaw)
Access: 2014-03-30 20:50:10.266400076 +0100
Modify: 2014-03-30 20:49:01.786402099 +0100
Change: 2014-03-30 20:49:42.522400896 +0100
Birth: -
stat(1) can be told to output specific information with its -c argument, which takes a format string. %n is the file name, %s is the file size.
$ stat -c '%n: %s' thisdir-test.sh
thisdir-test.sh: 69
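Format strings are most useful in scripts, where a single value is wanted without parsing the full report. The sketch below assumes GNU stat(1); besides %s it uses %a, the permission bits in octal. The scratch file is hypothetical.

```shell
file=$(mktemp)    # hypothetical file to inspect
printf 'hello' >"$file"
chmod 640 "$file"

summary=$(stat -c '%s bytes, mode %a' "$file")
echo "$summary"    # 5 bytes, mode 640
rm -f "$file"
```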
User status
whoami(1) prints the user name of the current user.
$ whoami
richardmaw
who(1) and users(1) list who is logged in, in slightly different formats.
$ users
richardmaw richardmaw richardmaw richardmaw
$ who
richardmaw tty1 2014-03-30 15:49
richardmaw tty7 2014-03-30 18:06 (:0)
richardmaw pts/1 2014-03-30 18:13 (:0)
richardmaw pts/2 2014-03-30 20:10 (:0)
id(1) displays more information about the current user, including the primary group, supplementary groups, and the numeric ids for all of them. The groups are comma separated, so splitting the output on commas makes it easier to read.
$ id | tr ',' '\n'
uid=1000(richardmaw) gid=1000(richardmaw) groups=1000(richardmaw)
4(adm)
20(dialout)
24(cdrom)
25(floppy)
27(sudo)
29(audio)
30(dip)
44(video)
46(plugdev)
104(scanner)
113(netdev)
114(bluetooth)
120(fuse)
id(1) can be given various options to limit its output, such as -u to print only the user id, -G to print only the group ids, or -Z to print only the SELinux security context.
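These options are what make id(1) convenient in scripts, for example to check whether a script is running as root. A minimal sketch using only the numeric forms; adding -n to any of them prints names instead of numbers. The variable names are illustrative.

```shell
uid=$(id -u)         # numeric effective user id
gid=$(id -g)         # numeric effective group id
all_gids=$(id -G)    # every group id, space separated

if [ "$uid" -eq 0 ]; then
    echo 'running as root'
else
    echo "running as uid $uid"
fi
echo "primary group $gid, all groups: $all_gids"
```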
date
date(1) prints the current date and time to standard output.
$ date
Sun Mar 30 21:36:16 BST 2014
It also takes a format string.
$ date +%F
2014-03-30
It can be told to format a date which isn't now, with the -d option.
$ date -d yesterday
Sat Mar 29 20:37:58 GMT 2014
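The -d option and a format string combine naturally. The sketch below assumes GNU date(1), whose -d accepts both calendar dates and @ followed by seconds since the epoch; -u and LC_ALL=C are there only to make the output independent of the local time zone and language.

```shell
# Seconds since the epoch, rendered as an ISO 8601 timestamp in UTC.
epoch_start=$(date -u -d @0 +%Y-%m-%dT%H:%M:%SZ)
echo "$epoch_start"    # 1970-01-01T00:00:00Z

# The day of the week for a given calendar date.
weekday=$(LC_ALL=C date -u -d 2014-03-30 +%A)
echo "$weekday"        # Sunday
```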
wc
wc(1) prints line, word and byte counts. By default it prints all 3 and the file name.
$ wc thisdir-test.sh
3 7 69 thisdir-test.sh
To print only one of those, specify -l, -w or -c respectively.
$ wc -l thisdir-test.sh
3 thisdir-test.sh
To prevent wc(1) printing the file name, the file has to be provided as the standard input.
$ wc -l <thisdir-test.sh
3
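Reading from standard input is also how you capture a bare count in a variable. A small sketch with a made-up two-line string:

```shell
text='one two
three'

lines=$(echo "$text" | wc -l)    # 2
words=$(echo "$text" | wc -w)    # 3
bytes=$(echo "$text" | wc -c)    # 14, including the two newlines

echo "$((lines)) lines, $((words)) words, $((bytes)) bytes"
```

Passing the values through $(( )) discards the leading padding that some wc(1) implementations add.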
A common solution to finding out how many files are in a directory is ls $dir | wc -l, though this will give incorrect results if any file names have newline characters in them.
The most concise, correct way to do this is:
$ find / -maxdepth 1 -mindepth 1 -print0 | grep -cz .
24
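The difference can be demonstrated with a directory that contains an awkward name. The sketch below assumes GNU find(1) and grep(1), as the command above does; the directory and its two files are made up for the demonstration.

```shell
dir=$(mktemp -d)    # hypothetical directory with two entries
touch "$dir/one" "$dir/two
lines"

# Counts lines, so the newline inside the second name inflates the total.
naive=$(ls "$dir" | wc -l)

# Counts NUL-terminated records, one per directory entry.
robust=$(find "$dir" -mindepth 1 -maxdepth 1 -print0 | grep -cz .)

echo "ls | wc -l: $((naive)), find | grep -cz: $robust"
rm -rf "$dir"
```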