Skip to content

Latest commit

 

History

History
253 lines (184 loc) · 9.15 KB

01_shell.md

File metadata and controls

253 lines (184 loc) · 9.15 KB

IPython Cookbook, Second Edition This is one of the 100+ free recipes of the IPython Cookbook, Second Edition, by Cyrille Rossant, a guide to numerical computing and data science in the Jupyter Notebook. The ebook and printed book are available for purchase at Packt Publishing.

Text on GitHub with a CC-BY-NC-ND license
Code on GitHub with a MIT license

Chapter 2 : Best practices in Interactive Computing

2.1. Learning the basics of the Unix shell

Learning how to interact with the operating system using a command-line interface (or terminal) is a required skill in interactive computing and data analysis. We will use a command-line interface in most recipes of this book. IPython and the Jupyter Notebook are typically launched from a terminal. Installing Python packages is typically done from a terminal.

In this recipe, we will show the very basics of the Unix shell, which is natively available in Linux distributions (such as Debian, Ubuntu, and so on) and macOS. On Windows 10, one can install the Windows Subsystem for Linux, a command-line interface to a Unix subsystem integrated with the Windows operating system (see https://docs.microsoft.com/windows/wsl/about).

Getting ready

Here are the instructions to open a Unix shell on macOS, Linux, and Windows. bash is the most common Unix shell and this is what we will use in this recipe.

On macOS, bring up the Spotlight Search, type terminal, and press Enter.

On Windows, follow the instructions at https://docs.microsoft.com/en-us/windows/wsl/install-win10. Then, open the Windows menu, type bash, and press Enter.

On Linux, open the Dash by clicking on the top-left icon on the desktop, type terminal, and open the Terminal application.

If you want to run this notebook in Jupyter, you need to install bash_kernel, available at https://github.com/takluyver/bash_kernel. Open a terminal and type pip install bash_kernel && python -m bash_kernel.install.

This will install a bash kernel in Jupyter, and it will allow you to run this recipe's code directly in the Notebook.

How to do it...

The Unix shell comes with hundreds of commands. We will see the most common ones in this recipe.

  1. The terminal lets us write text commands with the keyboard. We execute them by pressing Enter, and the output is displayed below the command. The working directory is the directory of our file system that is currently "active" in the terminal. We can get the absolute path of the working directory as follows:
pwd
~/git/cookbook-2nd/chapter02_best_practices

the dollar $ sign must not be typed: it is typically used by the shell to indicate where the user can start typing. The information written before it may show the user name, the computer name, and part of the working directory. Here, only the three characters pwd should be typed before pressing Enter.

  1. We can list all files and subdirectories in the working directory as follows:
ls
00_intro.md  03_git.md           07_high_quality.md
01_shell.md  04_git_advanced.md  08_test.md
02_py3       05_workflows.md     09_debugging.md
02_py3.md    06_tips.md          images
ls -l
total 100
-rw-rw-r-- 1 owner   769 Dec 12 10:23 00_intro.md
-rw-rw-r-- 1 owner  2473 Dec 12 14:21 01_shell.md
...
-rw-rw-r-- 1 owner  9390 Dec 12 11:46 08_test.md
-rw-rw-r-- 1 owner  5032 Dec 12 10:23 09_debugging.md
drwxrwxr-x 2 owner  4096 Aug  1 16:49 images

The -l option displays the directory contents as a detailed list, showing the permissions and owner of the files, the file sizes, and the last modified dates. Most shell commands come with many options that alter their behavior and that can be arbitrarily combined.

  1. We use the cd command to navigate between subdirectories. The current directory is named . (single dot), and the parent directory is named .. (double dot):
cd images
pwd
~/git/cookbook-2nd/chapter02_best_practices/images
ls
folder.png  github_new.png
cd ..
pwd
~/git/cookbook-2nd/chapter02_best_practices
  1. Paths can be specified as relative (depending on a reference directory, generally the working directory) or absolute. The home directory, specified as ~, contains the user's personal files. Configuration files are often stored in a directory like ~/.program_name. For example, ~/.ipython contains configuration files of IPython:
ls -la ~/.ipython
total 20
drwxr-xr-x  5 cyrille 4096 Nov 14 16:16 .
drwxr-xr-x 93 cyrille 4096 Dec 12 10:50 ..
drwxr-xr-x  2 cyrille 4096 Nov 14 16:16 extensions
drwxr-xr-x  2 cyrille 4096 Nov 14 16:16 nbextensions
drwxr-xr-x  7 cyrille 4096 Dec 12 14:18 profile_default

in most terminals, we can use the arrow keys on the keyboard to navigate in the history of past commands. Also, the Tab key enables tab completion, which automatically completes the first characters of a command or a file. For example, typing ls -la ~/.ipy and pressing Tab would automatically complete to ls -la ~/.ipython, or it would present the list of possible options if there are several files or directories that begin with ~/.ipy.

  1. We can create, move, rename, copy, delete files and directories from the terminal:
# We create an empty directory:
mkdir md_files
# We copy all Markdown files into the new directory:
cp *.md md_files
# We rename the directory:
mv md_files markdown_files
ls markdown_files
00_intro.md         05_workflows.md
01_shell.md         06_tips.md
02_py3.md           07_high_quality.md
03_git.md           08_test.md
04_git_advanced.md  09_debugging.md
rmdir markdown_files
rmdir: failed to remove 'markdown_files':
    Directory not empty
rm markdown_files/*
rmdir markdown_files

The rm command lets us delete files and directories. The rm -rf path deletes the given path recursively, even if subdirectories are not empty. It is an extremely dangerous command as it cannot be undone: the files are immediately and permanently deleted, they do not go into a trash directory first. See https://github.com/sindresorhus/guides/blob/master/how-not-to-rm-yourself.md for more details.

  1. There are several useful commands to deal with text files:
# Show the first three lines of a text file:
head -n 3 01_shell.md
# Learning the basics of the Unix shell

Learning how to interact with the operating system (...)
# Show the last line of a text file:
tail -n 1 00_intro.md
We will also cover more general topics (...)
# We display some text:
echo "Hello world!"
Hello world!
# We redirect the output of a command to
# a text file with `>`:
echo "Hello world!" > myfile.txt
# We display the entire contents of the file:
cat myfile.txt
Hello world!

Several command-line text editors are available, such as pico, nano, or vi. Learning these text editors requires time and effort, especially vi.

  1. The grep command lets us search substrings in text. In the following example, we find all instances of Unix followed by a word (using regular expressions):
grep -Eo "Unix \w+" 01_shell.md
Unix shell
Unix shell
Unix subsystem
Unix shell
(...)
Unix shell
Unix shell
  1. A major strength of the Unix shell is that commands can be combined with pipes: the output of one command can be directly transferred to the input of another command:
echo "This is a Unix shell" | grep -Eo "Unix \w+"
Unix shell

There's more...

We only scratched the surface of the Unix shell in this recipe. There are many other commands that can be combined in an infinite number of ways. Many repetitive tasks that would take hours of manual work can be done in a few minutes by writing the appropriate commands. Mastering the Unix shell may take a lot of effort, but it leads to dramatic time gains in the long term.

Here are a few references:

See also

  • Ten tips for conducting reproducible interactive computing experiments