Ampersands and File Descriptors in Bash - xSicKxBot - 02-13-2019


<div style="margin: 5px 5% 10px 5%;"><img src="http://www.sickgaming.net/blog/wp-content/uploads/2019/02/ampersands-and-file-descriptors-in-bash.png" width="1969" height="1034" title="" alt="" /></div><div>
<p>In our quest to examine all the clutter (<code>&amp;</code>, <code>|</code>, <code>;</code>, <code>&gt;</code>, <code>&lt;</code>, <code>{</code>, <code>[</code>, <code>(</code>, <code>)</code>, <code>]</code>, <code>}</code>, etc.) that is peppered throughout most chained Bash commands, <a href="https://www.linux.com/blog/learn/2019/2/and-ampersand-and-linux">we have been taking a closer look at the ampersand symbol (<code>&amp;</code>)</a>.</p>
<p><a href="https://www.linux.com/blog/learn/2019/2/and-ampersand-and-linux">Last time, we saw how you can use <code>&amp;</code> to push processes that may take a long time to complete into the background</a>. But <code>&amp;</code>, in combination with angle brackets, can also be used to redirect output and input elsewhere.</p>
<p>In the <a href="https://www.linux.com/blog/learn/2019/1/understanding-angle-brackets-bash">previous tutorials on</a> <a href="https://www.linux.com/blog/learn/2019/1/more-about-angle-brackets-bash">angle brackets</a>, you saw how to use <code>&gt;</code> like this:</p>
<pre>
ls &gt; list.txt
</pre>
<p>to redirect the output from <code>ls</code> to the <i>list.txt</i> file.</p>
<p>Now we see that this is really shorthand for</p>
<pre>
ls 1&gt; list.txt
</pre>
<p>And that <code>1</code>, in this context, is a file descriptor that points to the standard output (<code>stdout</code>).</p>
<p>In a similar fashion <code>2</code> points to standard error (<code>stderr</code>), and in the following command:</p>
<pre>
ls 2&gt; error.log
</pre>
<p>all error messages are redirected to the <i>error.log</i> file.</p>
<p>To recap: <code>1&gt;</code> is the standard output (<code>stdout</code>) and <code>2&gt;</code> the standard error output (<code>stderr</code>).</p>
<p>There is a third standard file descriptor, <code>0&lt;</code>, the standard input (<code>stdin</code>). You can see it is an input because the arrow (<code>&lt;</code>) is pointing into the <code>0</code>, while for <code>1</code> and <code>2</code>, the arrows (<code>&gt;</code>) are pointing outwards.</p>
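<p>To see the three descriptors side by side, here is a minimal sketch. The <code>emit</code> function and the temporary directory are made up for this illustration (they are not part of the series), and the <code>&gt;&amp;2</code> inside <code>emit</code> is only there to produce an error message on demand:</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper: writes one line to stdout (1) and one line to stderr (2).
emit() {
    echo "regular output"
    echo "error message" >&2
}

# Throwaway directory so the example does not touch your own files.
workdir=$(mktemp -d)

# 1> keeps stdout only; stderr is thrown away to keep the terminal quiet.
emit 1> "$workdir/out.txt" 2> /dev/null

# 2> keeps stderr only; stdout is thrown away.
emit 2> "$workdir/err.txt" 1> /dev/null

# 0< feeds a file into a command's stdin; wc -l counts the lines it receives.
lines=$(wc -l 0< "$workdir/out.txt")

cat "$workdir/out.txt"   # prints: regular output
cat "$workdir/err.txt"   # prints: error message
echo "lines read from stdin: $((lines))"
```

<p>Each file ends up with exactly one line, because each redirection only touches the descriptor named in front of the arrow.</p>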
<h3>What are the standard file descriptors good for?</h3>
<p>If you are following this series in order, you have already used the standard output (<code>1&gt;</code>) several times in its shorthand form: <code>&gt;</code>.</p>
<p>Things like <code>stderr</code> (<code>2</code>) are also handy when, for example, you know that your command is going to throw an error, but what Bash informs you of is not useful and you don’t need to see it. If you want to make a directory in your <i>home/</i> directory, for example:</p>
<pre>
mkdir newdir
</pre>
<p>and if <i>newdir/</i> already exists, <code>mkdir</code> will show an error. But why would you care? (Ok, there are some circumstances in which you may care, but not always.) At the end of the day, <i>newdir</i> will be there one way or another for you to fill up with stuff. You can suppress the error message by pushing it into the void, which is <i>/dev/null</i>:</p>
<pre>
mkdir newdir 2&gt; /dev/null
</pre>
<p>This is not just a matter of “<i>let’s not show ugly and irrelevant error messages because they are annoying,</i>” as there may be circumstances in which an error message may cause a cascade of errors elsewhere. Say, for example, you want to find all the <i>.service</i> files under <i>/etc</i>. You could do this:</p>
<pre>
find /etc -iname "*.service"
</pre>
<p>But it turns out that on most systems, many of the lines spat out by <code>find</code> show errors because a regular user does not have read access rights to some of the folders under <i>/etc</i>. It makes reading the correct output cumbersome and, if <code>find</code> is part of a larger script, it could cause the next command in line to bork.</p>
<p>Instead, you can do this:</p>
<pre>
find /etc -iname "*.service" 2&gt; /dev/null
</pre>
<p>And you get only the results you are looking for.</p>
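<p>One thing worth keeping in mind: sending <code>stderr</code> to <i>/dev/null</i> hides the message, not the failure. The sketch below reuses the <code>mkdir</code> case from earlier inside a throwaway directory (the <code>workdir</code> and <code>status</code> variables are just names picked for this example) and shows that the exit status is still there if a script wants to react to it:</p>

```shell
#!/usr/bin/env bash
# Sketch only: a throwaway directory stands in for your home directory.
workdir=$(mktemp -d)
mkdir "$workdir/newdir"

# The second mkdir fails because newdir/ already exists.
# Its complaint goes to /dev/null, but the exit status is still recorded.
status=0
mkdir "$workdir/newdir" 2> /dev/null || status=$?

if [ "$status" -ne 0 ]; then
    echo "mkdir failed quietly with exit status $status"
fi
```

<p>Nothing from <code>mkdir</code> itself appears on the terminal, yet the script still knows something went wrong.</p>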
<h3>A Primer on File Descriptors</h3>
<p>There are some caveats to having separate file descriptors for <code>stdout</code> and <code>stderr</code>, though. If you want to store the output in a file, doing this:</p>
<pre>
find /etc -iname "*.service" 1&gt; services.txt
</pre>
<p>would work fine because <code>1&gt;</code> means “<i>send standard output, and only standard output (NOT standard error) somewhere</i>”.</p>
<p>But herein lies a problem: what if you <i>do</i> want to keep a record within the file of the errors along with the non-erroneous results? The instruction above won’t do that because it ONLY writes the correct results from <code>find</code>, and</p>
<pre>
find /etc -iname "*.service" 2&gt; services.txt
</pre>
<p>will ONLY write the errors.</p>
<p>How do we get both? Try the following command:</p>
<pre>
find /etc -iname "*.service" &amp;&gt; services.txt
</pre>
<p>… and say hello to <code>&amp;</code> again!</p>
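<p>If you don’t have a handy source of errors to try this on, the following sketch fakes one. The <code>emit</code> function is a stand-in invented for this example; it prints one normal line and one error line. Note that <code>&amp;&gt;</code> is a Bash feature, so run this with <code>bash</code> rather than a plain <code>sh</code>:</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper: one line to stdout, one line to stderr.
emit() {
    echo "regular output"
    echo "error message" >&2
}

workdir=$(mktemp -d)

emit 1> "$workdir/stdout-only.txt" 2> /dev/null   # normal results only
emit 2> "$workdir/stderr-only.txt" 1> /dev/null   # errors only
emit &> "$workdir/both.txt"                       # both streams in one file

# grep -c '' counts the lines in each file.
grep -c '' "$workdir"/*.txt
```

<p>The first two files hold one line each, while <i>both.txt</i> holds two.</p>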
<p>We have been saying all along that <code>stdin</code> (<code>0</code>), <code>stdout</code> (<code>1</code>), and <code>stderr</code> (<code>2</code>) are <i>file descriptors</i>. A file descriptor is a special construct that points to a channel to a file, either for reading, or writing, or both. This comes from the old UNIX philosophy of treating everything as a file. Want to write to a device? Treat it as a file. Want to write to a socket and send data over a network? Treat it as a file. Want to read from and write to a file? Well, obviously, treat it as a file.</p>
<p>So, when managing where the output and errors from a command go, treat the destination as a file. Hence, when you open them to read and write to them, they all get file descriptors.</p>
<p>This has interesting effects. You can, for example, redirect one file descriptor to another:</p>
<pre>
find /etc -iname "*.service" 1&gt; services.txt 2&gt;&amp;1
</pre>
<p>This redirects <code>stderr</code> to <code>stdout</code>, and <code>stdout</code> is redirected to a file, <i>services.txt</i>.</p>
<p>And there it is again: the <code>&amp;</code>, signaling to Bash that <code>1</code> is the destination file descriptor.</p>
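<p>A quick way to convince yourself of what the <code>&amp;</code> is doing here is to leave it out. In this sketch (the <code>emit</code> helper and the file names are invented for the example), the first command sends errors to file descriptor <code>1</code>, while the second one, missing the <code>&amp;</code>, sends them to an ordinary file that happens to be called <i>1</i>:</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper: one line to stdout, one line to stderr.
emit() {
    echo "regular output"
    echo "error message" >&2
}

workdir=$(mktemp -d)

# With the ampersand: 1 is read as file descriptor 1 (stdout).
emit 1> "$workdir/services.txt" 2>&1

# Without the ampersand: 1 is read as a file name.
# The subshell keeps the stray file inside the throwaway directory.
( cd "$workdir" && emit 1> stdout-only.txt 2>1 )

cat "$workdir/1"   # prints: error message
```

<p>After running it, <i>services.txt</i> contains both lines, <i>stdout-only.txt</i> contains only the regular output, and the error has ended up in the file named <i>1</i>.</p>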
<p>Another thing with the standard file descriptors is that, when you redirect one to another, the order in which you do this is a bit counterintuitive. Take the command above, for example. It looks like it has been written the wrong way around. You may be reading it like this: “<i>send the output to a file and then send errors to the output.</i>” It would seem the error output comes too late and is sent when <code>1</code> is already done.</p>
<p>But that is not how file descriptors work. A file descriptor is not a placeholder for the file, but for the <i>input and/or output channel</i> to the file. In this case, when you do <code>1&gt; services.txt</code>, you are saying “<i>open a write channel to services.txt and leave it open</i>”. <code>1</code> is the name of the channel you are going to use, and it remains open until the end of the line.</p>
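<p>The “channel that stays open” idea is easier to see with a descriptor of your own. The sketch below goes a little beyond what this article covers: it assumes you are happy to use <code>exec</code> to open descriptor <code>3</code> by hand (the number and the file name are arbitrary choices for this example), writes through it twice, and then closes it:</p>

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)

# Open file descriptor 3 as a write channel to a file and leave it open.
# (3 is simply the first number not taken by stdin, stdout and stderr.)
exec 3> "$workdir/channel.txt"

# Both writes travel down the same open channel, one after the other.
echo "first line" >&3
echo "second line" >&3

# Close the channel again.
exec 3>&-

cat "$workdir/channel.txt"
```

<p>The second <code>echo</code> does not overwrite the first, because both use the channel that was opened once, rather than reopening the file.</p>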
<p>If you still think it is the wrong way around, try this:</p>
<pre>
find /etc -iname "*.service" 2&gt;&amp;1 1&gt;services.txt
</pre>
<p>And notice how it doesn’t work: errors still show up on the terminal and only the non-erroneous output (that is, <code>stdout</code>) gets pushed to <code>services.txt</code>.</p>
<p>That is because Bash processes the redirections on a command line from left to right, before <code>find</code> even starts. Think about it like this: when Bash gets to <code>2&gt;&amp;1</code>, <code>stdout</code> (<code>1</code>) is still a channel that points to the terminal, so <code>2</code> becomes a channel to the terminal as well. Any error <code>find</code> produces is popped into <code>2</code> and, away it goes, off to the terminal!</p>
<p>Only afterwards does Bash see that you want to open <code>stdout</code> as a channel to the <i>services.txt</i> file. That re-points <code>1</code>, but <code>2</code> keeps pointing to the terminal, so only the non-erroneous results go through <code>1</code> into the file.</p>
<p>By contrast, in</p>
<pre>
find /etc -iname "*.service" 1&gt;services.txt 2&gt;&amp;1
</pre>
<p><code>1</code> is pointing at <code>services.txt</code> right from the beginning, so when <code>2</code> is hooked up to <code>1</code>, it ends up pointing at the same place. Anything that pops into <code>2</code> lands in its final resting place in <code>services.txt</code>, and that is why it works.</p>
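<p>Both orders can be compared without hunting for unreadable folders. In this sketch, the <code>emit</code> helper and the file names are invented for the example, and the “wrong” order is run inside <code>$( )</code> purely so that whatever escapes the file can be caught in a variable instead of scrolling past on the terminal:</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper: one line to stdout, one line to stderr.
emit() {
    echo "regular output"
    echo "error message" >&2
}

workdir=$(mktemp -d)

# "Wrong" order: 2 is copied from 1 while 1 still points outside the file,
# and only then is 1 moved to the file. The error never reaches the file.
leaked=$(emit 2>&1 1> "$workdir/wrong-order.txt")

# "Right" order: 1 is moved to the file first, then 2 is copied from 1.
emit 1> "$workdir/right-order.txt" 2>&1

echo "escaped the file: $leaked"   # prints: escaped the file: error message
```

<p>Afterwards <i>wrong-order.txt</i> holds only the regular output, while <i>right-order.txt</i> holds both lines.</p>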
<p>In any case, as mentioned above, <code>&amp;&gt;</code> is shorthand for “<i>both standard output and standard error</i>”: writing <code>&amp;&gt; services.txt</code> has the same effect as <code>1&gt; services.txt 2&gt;&amp;1</code>.</p>
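<p>If you would rather check that equivalence than take it on trust, a sketch like the following will do. As before, <code>emit</code> and the file names are made up for the example, and it needs to be run with <code>bash</code> because <code>&amp;&gt;</code> is a Bash extension:</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper: one line to stdout, one line to stderr.
emit() {
    echo "regular output"
    echo "error message" >&2
}

workdir=$(mktemp -d)

emit &> "$workdir/short.txt"           # shorthand
emit 1> "$workdir/long.txt" 2>&1       # spelled out

# cmp -s is silent and succeeds only when the two files are byte-for-byte equal.
if cmp -s "$workdir/short.txt" "$workdir/long.txt"; then
    echo "identical"
fi
```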
<p>This is probably all a bit much, but don’t worry about it. Re-routing file descriptors here and there is commonplace in Bash command lines and scripts. And, you’ll be learning more about file descriptors as we progress through this series. See you next week!</p>
</div>