<h1>When all is sed and done</h1>
<p>Will Holland, 2016-04-20. Posted on Yakking: <a href="http://yakking.branchable.com/posts/sed/">http://yakking.branchable.com/posts/sed/</a></p>
<p>The utility <a href="https://www.gnu.org/software/sed/manual/sed.html">Sed</a>, <em>s</em>tream <em>ed</em>itor, can be found on almost all Unix-based
systems. Sed takes a stream of text and, as the name suggests, edits it
according to the instructions you give it. It is a very flexible tool
and can be very useful on the command line or in a shell script.</p>
<p>As a classic Hello World example, run</p>
<pre><code>sed 's/.*/Hello World/g'
</code></pre>
<p>Then press enter a few times and see sed replace the lines with <code>Hello World</code>.</p>
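<p>The same experiment can be run without typing at the terminal by piping some text into sed. This is only a sketch: the three input lines and the variable name <code>hello_output</code> are made up for the illustration.</p>

```shell
# Feed three made-up lines to sed on standard input instead of typing them.
hello_output=$(printf 'one\ntwo\nthree\n' | sed 's/.*/Hello World/g')

# Each input line has been replaced, so this prints "Hello World" three times.
printf '%s\n' "$hello_output"
```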
<p>This article contains some (fairly artificial) examples in order to demonstrate
the features of sed. There are far more features than are in the scope of this
article; I suggest checking out the <a href="http://linux.die.net/man/1/sed">man page</a> if you are curious for more.</p>
<h2>Uses</h2>
<p>There is nothing that sed can do that is unique. Other tools such as <a href="http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_04_02.html">grep</a>,
<a href="https://www.gnu.org/software/gawk/manual/html_node/Simple-Sed.html">awk</a> and <a href="https://wincent.com/wiki/Using_Perl_as_a_stream_editor_on_standard_input">perl</a>, to name just a few, can be used for similar things. Sed
provides a style of working that some people prefer.</p>
<p>A simple use for sed is when you are running some software that
produces a lot of output, but you only care about certain messages and do not
want to miss them amid the output you don't care about. Suppose these messages
appear in blocks starting with a line containing <code>WARNING:</code> and ending with an empty line.
In this case you can do</p>
<pre><code>./some_software | sed -n '/WARNING:/, /^$/p'
</code></pre>
<p>The <code>-n</code> option tells sed not to print the output of <code>./some_software</code> unless
explicitly told to do so by the print instruction, written <code>p</code>, which comes at
the end of our sed command. Before it there are two regular expressions in
forward slashes separated by a comma. This tells sed to print every range of
lines that starts at a line matching the first regular expression and ends at
the next line matching the second.</p>
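<p>To see the range address in action without real software to hand, here is a minimal sketch. The shell function <code>some_software</code> and its messages are hypothetical stand-ins for <code>./some_software</code>.</p>

```shell
# Hypothetical stand-in for ./some_software: two warning blocks,
# each ended by an empty line, mixed in with other output.
some_software() {
    printf '%s\n' \
        'starting up' \
        'WARNING: disk almost full' \
        '  free space is low' \
        '' \
        'processing data' \
        'WARNING: config missing' \
        '  using defaults' \
        '' \
        'done'
}

# -n turns off automatic printing; p prints only lines inside each range.
warnings=$(some_software | sed -n '/WARNING:/,/^$/p')
printf '%s\n' "$warnings"
```

<p>Only the two warning blocks are printed; <code>starting up</code>, <code>processing data</code> and <code>done</code> are dropped because they fall outside every range.</p>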
<p>Another useful command that can go in place of the print command is the delete
line command, which is written <code>d</code>. Now when your boss is watching you can show
him that it runs without any warnings by removing all the lines containing the
string <code>WARNING</code> with</p>
<pre><code>./some_software | sed '/WARNING/d'
</code></pre>
<p>Notice that the <code>-n</code> option is not there because we want sed to print all the
text that is being output by <code>./some_software</code> <em>unless</em> it matches <code>WARNING</code>,
in which case sed deletes that line from the stream.</p>
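<p>A small sketch of the delete command, again using a hypothetical <code>some_software</code> function with made-up messages in place of the real program:</p>

```shell
# Hypothetical stand-in for ./some_software.
some_software() {
    printf '%s\n' \
        'starting up' \
        'WARNING: disk almost full' \
        'processing data' \
        'done'
}

# No -n here: every line is printed automatically unless d deletes it first.
clean=$(some_software | sed '/WARNING/d')
printf '%s\n' "$clean"
```

<p>The three lines without <code>WARNING</code> pass through unchanged and the warning line disappears.</p>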
<p>Yet another use case is taking the output of one command and changing parts of
it. This is called a substitution and is done with the <code>s</code> command. For example, maybe
you are having problems with a piece of software. You want to ask for help on
an online forum but need to be careful not to post personal information which
is included in the command output. Maybe it gives away that your name is Joe
Bloggs, so you do</p>
<pre><code>./some_software | sed 's/Joe Bloggs/&lt;Name Redacted&gt;/g'
</code></pre>
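<p>The sketch below runs the substitution on two made-up lines of output (the <code>some_software</code> function is a hypothetical stand-in) and also shows what the trailing <code>g</code> flag does: without it, sed replaces only the first match on each line.</p>

```shell
# Hypothetical stand-in for ./some_software; the second line
# mentions the name twice.
some_software() {
    printf '%s\n' \
        'Loading profile for Joe Bloggs' \
        'Joe Bloggs owns 3 files; last login by Joe Bloggs'
}

# With g, every occurrence on a line is replaced.
redacted=$(some_software | sed 's/Joe Bloggs/<Name Redacted>/g')

# Without g, only the first occurrence on each line is replaced,
# so the second mention on the last line leaks through.
first_only=$(some_software | sed 's/Joe Bloggs/<Name Redacted>/')

printf '%s\n' "$redacted"
printf '%s\n' "$first_only"
```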
<p>For a more involved example, imagine you want to get the IP address of the
<code>wlan0</code> interface on your machine and put it in a variable in a shell script.
<code>ifconfig</code> might output something that looks like:</p>
<pre>
wlan0 Link encap:Ethernet HWaddr 08:11:96:05:6b:6c
inet addr:192.168.1.122 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::a11:96ff:fe05:6b6c/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:67062 errors:0 dropped:0 overruns:0 frame:0
TX packets:18075 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:21572026 (20.5 MiB) TX bytes:3448315 (3.2 MiB)
</pre>
<p>We want to extract the address <code>192.168.1.122</code>. By eyeballing it we can see
that we want the line after the one containing <code>wlan0</code>, then we want to match the regex
<code>inet addr:[^ ]*</code> and print the address part of that match. You could use the following
line</p>
<pre><code>wlan0ip=$(ifconfig | sed -n '/wlan0/{n; s/.*inet addr:\([^ ]*\).*/\1/p; q}')
</code></pre>
<p>Notice that in this command, after the <code>/wlan0/</code> address to match, there is a curly
brace. This opens a block, like in other programming languages: all the
commands between the braces are executed when a line matches.
Commands are separated by semicolons. The <code>n</code> command simply tells sed to move on
to the next line. The next bit is a little fiddly. We only want to output the
IP address, not the whole line, so instead of doing a normal print <code>p</code> I used
the <code>s/.*\(&lt;some regex&gt;\).*/\1/p</code> trick. This is a fairly standard trick which
replaces the entire line with just the captured match before printing it, so that
only what you want is printed. Finally the <code>q</code> tells sed to quit because you have
found what you want and do not need to process the remainder of the output.</p>
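<p>Since not every machine has a <code>wlan0</code> interface, or an <code>ifconfig</code> that prints this format, the command can be tried against canned text. In this sketch <code>fake_ifconfig</code> is a hypothetical function that replays part of the sample output; the <code>eth0</code> entry is invented to show that other interfaces are skipped.</p>

```shell
# Hypothetical stand-in for ifconfig. The eth0 entry is invented;
# the wlan0 lines are taken from the sample output.
fake_ifconfig() {
    cat <<'EOF'
eth0 Link encap:Ethernet HWaddr 08:11:96:05:6b:6a
inet addr:10.0.0.5 Bcast:10.0.0.255 Mask:255.255.255.0

wlan0 Link encap:Ethernet HWaddr 08:11:96:05:6b:6c
inet addr:192.168.1.122 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::a11:96ff:fe05:6b6c/64 Scope:Link
EOF
}

# On the wlan0 line: n moves to the next line, s keeps only the captured
# address and prints it, then q stops sed from reading any further.
wlan0ip=$(fake_ifconfig | sed -n '/wlan0/{n; s/.*inet addr:\([^ ]*\).*/\1/p; q}')
printf '%s\n' "$wlan0ip"
```

<p>This prints <code>192.168.1.122</code>. The <code>eth0</code> address is never printed because the block only runs for the line matching <code>/wlan0/</code>.</p>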
<h2>Limitations</h2>
<p>Sed is bad when you need a line that comes before a matching line. Sed never
goes backwards, so by the time it finds the matching line it has already thrown
the earlier lines away.</p>
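<p>One workaround, shown here only as a sketch, is to make sed carry the previous line along itself using its hold space, a feature not otherwise covered in this article. The <code>h</code> command copies the current line into the hold space and <code>x</code> swaps the hold space with the current line. The <code>log_output</code> function and its messages are hypothetical.</p>

```shell
# Hypothetical program output with one error line.
log_output() {
    printf '%s\n' \
        'step 1 ok' \
        'step 2 ok' \
        'ERROR: step 3 failed' \
        'step 4 ok'
}

# h at the end saves every line in the hold space, so when the next line
# is read the hold space contains the previous one. On a match, x swaps
# the previous line in, p prints it, and x swaps the current line back.
before_error=$(log_output | sed -n '/ERROR/{x;p;x;};h')
printf '%s\n' "$before_error"
```

<p>This prints <code>step 2 ok</code>, the line just before the error. It works, but the bookkeeping illustrates why looking backwards is awkward in sed.</p>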