The below are parsed from the same code that generated the lecture slides. The
material on this page in no way is a replacement for reading through the full slides, as
it only contains excerpts and potentially lacks relevant context. That said, there’s
little reason for you to need to hunt through the slides when you just want a refresher
on how something works / what commands you learned when.
ssh <username>@remote
username
is the username on the
remote
host.
remote
is the url of the server you want to log into.
IP Address, e.g.,
128.253.141.34
Symbolic name, e.g.,
wash.cs.cornell.edu
Use
@
to specify username.
ssh username@remote
On
wash
I am
mpm288
:
v1:
ssh mpm288@128.253.3.197
v2:
ssh mpm288@wash.cs.cornell.edu
(
or
fill in the fields if you’re using a graphical SSH client)
(if you need the port number, it’s
22
)
pwd
Prints the “full” path of the current directory.
The
-P
flag is needed when
symbolic
links are present.
ls [flags] [file]
The
-l
flag lists detailed file / directory information (we’ll learn more about flags later).
Use
-a
to list hidden files.
cd [directory name]
Changes directory to
[directory name]
.
If not given a destination defaults to the user’s home directory.
Reminder: the home directory is
~
cat [files]...
Prints (“concatenates”) the listed files to your terminal
With no arguments, does something more advanced
touch [flags] <file>
Adjusts the timestamp of the specified file.
With no flags uses the current date and time.
If the file does not exist,
touch
creates it.
“But I swear I haven’t changed the file, look at the timestamp.”
… timestamps prove nothing.
mkdir [flags] <dir1> <dir2> <...> <dirN>
Can use relative or absolute paths.
Not restricted to making directories in the current directory only.
Need to specify at least one directory name.
Can specify multiple, separated by spaces.
The
-p
flag is commonly used in scripts:
Makes all parent directories if they do not exist.
Convenient because if the directory exists,
mkdir
will not fail.
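A minimal sketch of the `-p` behavior described above, run inside a throwaway directory from `mktemp` so nothing real is touched. The directory names `a/b/c` are made up for illustration.

```shell
# Work in a scratch directory so nothing in your home directory is touched.
scratch=$(mktemp -d)

# Without -p this would fail, because a/ and a/b/ do not exist yet.
mkdir -p "$scratch/a/b/c"

# Running it again is not an error: -p tolerates directories that already exist.
second_run_status=0
mkdir -p "$scratch/a/b/c" || second_run_status=$?

# A plain mkdir on an existing directory does fail (non-zero exit code).
plain_status=0
mkdir "$scratch/a/b/c" 2>/dev/null || plain_status=$?

echo "with -p: $second_run_status, without -p: $plain_status"
```

This is why `-p` shows up so often in scripts: the script can be re-run without first checking whether the directory is already there.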
rm [flags] <filename>
Removes the file
<filename>
.
Remove multiple files with wildcards (more on this later).
Remove every file in the current directory:
rm *
Remove every
.jpg
file in the current directory:
rm *.jpg
Prompt before deletion:
rm -i <filename>
rmdir [flags] <directory>
Removes an
empty
directory.
Throws an error if the directory is not empty.
You are encouraged to use this command: failing on non-empty can and will save you!
cp [flags] <file> <destination>
Copies from one location to another.
To copy multiple files, use wildcards (such as
*
).
Globs / patterns can only be used for
<src>
.
<dest>
must be explicit and singularly defined.
Completely reasonable…how would it know what to do if there is ambiguity in where to send the file(s)?
To copy a complete directory:
cp -r <src> <dest>
To overwrite more aggressively:
cp -f <src> <dest>
mv [flags] <source> <destination>
Moves a file or directory from one place to another.
Also used for renaming, rename
<oldname>
to
<newname>
.
mv badFolderName correctName
handin <assignment> <file_name>
Hands in a
single file
or a
directory you own
for the named assignment
If you need to hand in more than one file, make a directory and
cp
the files into it
check-handin <assignment>
What you should see now (modulo colors)
NetID@wash ~ $
NetID
is your username
wash
is the
hostname
of the computer you’re accessing
~
is the path to your current
directory
(we call folders “directories” in *nix land because AT&T invented these words)
This is the
bash prompt
, the default command line.
everything in bash is based on a
current directory
You are currently
inside
the
~
Directory. What does this mean?
~
is a special symbol for your
home
directory
you own everything in your home directory
(on personal computers) contains Desktop, Downloads, etc.
Commands work like functions for bash
Command is a single word, like
command
Commands can take arguments
arguments are space-separated:
command arg1 arg2
passes
arg1
and
arg2
to
command
Most arguments are optional
position-independent
arguments are called “flags” and are prefixed with a
-
or
--
example:
command --flag
example:
command -f
A
path
describes how to access a file
Most paths are
relative
paths – they start in your current working directory
Simple paths are just file names in the current directory
example: I’m in
~
, which contains
course
; while I’m in
~
the path
course
will refer to this directory
A path can
traverse
directories using the
/
separator
example: the path
~/course
will
always
mean the directory
course
in my home directory, no matter what my current working directory is.
example: to get to the directory
bar
in the directory
baz
in the directory
~
, I could
cd ~/baz/bar
.
Relative path shortcuts worth remembering:
Shortcut    Expands To
~           current user’s home directory
.           the current directory
..          the parent directory of the current directory
-           for cd, return to previous working directory
An example:
~/course/cs2043
arbitrary choice, nothing special about it.
After each
cd
command, execute
pwd
to confirm.
$ cd ~/course/cs2043 # go to starting location
$ cd # now at /home/mpm288
$ cd - # now at ~/course/cs2043
$ cd .. # now at ~/course
man command_name
Loads the manual (manpage) for the specified command.
Unlike google, manpages are
system-specific
.
Usually very comprehensive. Sometimes
too
comprehensive.
Type
/keyword
to search for
keyword
, and hit
<enter>
.
The
n
key jumps to the next search result.
Flags and Options: Formats
A flag that is
One letter is specified with a single dash (
-a
).
More than one letter is specified with two dashes (
--all
).
The reason is because of how switches can be combined.
We generally use “flag” and “switch” interchangeably:
“flag” the command, telling it that “action X” should occur
specify to the command to “switch on/off action X”
Flags and Options: Switches
Switches
take no arguments, and can be specified in a couple of different ways.
Switches are usually one letter, and multiple letter switches usually have a one letter alias.
One option:
ls -a
ls --all
Two options:
ls -l -Q
ls -lQ
Usually
applied from left to right in terms of operator precedence, but not always:
This is up to the developer of the tool.
Prompts:
rm -fi <file>
Does
not
prompt:
rm -if <file>
Flags and Options: Argument Specifiers
The
--argument="value"
format, where the
=
and quotes are needed if
value
is more than one word.
Yes:
ls --hide="Desktop" ~/
Yes:
ls --hide=Desktop ~/
One word, no quotes necessary
No:
ls --hide = "Desktop" ~/
Spaces by the
=
will be misinterpreted
It used
=
as the argument to
hide
The
--argument value
format (space after the
argument
).
Quote rules same as above.
ls --hide "Desktop" ~/
ls --hide Desktop ~/
Usually,
--argument value
and
--argument=value
are interchangeable.
Not always!
groups [user name]
Lists groups to which [argument] belongs.
With no argument, lists your groups
chmod <mode> <file>
Changes file or directory permissions to
<mode>
.
The format of
<mode>
is a combination of three fields:
Who is affected: a combination of
u
,
g
,
o
, or
a
(all).
Use a
+
to add permissions, and a
-
to remove.
Specify type of permission: any combination of
r
,
w
,
x
.
# Add read, write, & execute for user, group, & other
$ chmod ugo+rwx <file> # or chmod a+rwx <file>
# Remove read and write for other
$ chmod o-rw <file>
Can specify mode in octal: user, then group, then other.
E.g.,
750
means
user=7
,
group=5
,
other=0
permissions.
chgrp group <file>
Changes the group ownership of
<file>
to
group
.
The
-R
flag will recursively change the group ownership of a directory and its contents.
chown user:group <file>
Changes the ownership of
<file>
.
The
group
is optional (
chown user <file>
).
The
-R
flag will recursively change the ownership of a directory and its contents.
stat [opts] <filename>
Gives you a wealth of useful information.
Uid
(
%U
) is the user,
Gid
(
%G
) is the group.
BSD/OSX:
stat -x <filename>
for “standard” behavior.
Can be useful to mimic file permissions you don’t know.
Human readable:
--format=%A
, e.g.
-rw-rw-r--
BSD/OSX:
-f %Sp
is used instead.
Octal:
--format=%a
(great for
chmod
), e.g.
664
BSD/OSX:
-f %A
is used instead.
umask <mode>
Remove
mode
from the file’s permissions.
Similar syntax to
chmod
:
umask 077
:
+rwx
for
owner
,
-
for all others.
umask g+w
: enables group write permissions.
umask -S
: display the current mask.
Just a bit mask: the bits set in your
mode
are cleared from the permissions being requested.
Full permissions: 0o777
Sample user mask: 0o002
Logical & with the inverted mask (0o777 & ~0o002) gives: 0o775
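To make the masking concrete, here is a small sketch. It assumes the usual convention that new files are requested with mode 666 and new directories with 777; the file name `private.txt` is made up, and the `umask` change is kept inside a subshell so it does not leak into your session.

```shell
# The mask's bits are removed from the permissions a program asks for.
file_mode=$(printf '%o' $(( 0666 & ~0002 )))
dir_mode=$(printf '%o' $(( 0777 & ~0002 )))
echo "umask 002 -> files $file_mode, directories $dir_mode"

# Try it for real; the parentheses start a subshell, so the mask is temporary.
scratch=$(mktemp -d)
( umask 077; touch "$scratch/private.txt" )

# Keep only the ten permission characters from the ls -l output.
private_perms=$(ls -l "$scratch/private.txt" | cut -c1-10)
echo "$private_perms"
```

With `umask 077` the new file ends up readable and writable by the owner only.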
more <filename>
Scroll through one page at a time.
Program
exits
when end is reached.
less <filename>
Scroll pages or lines (mouse wheel, space bar, and arrows).
Program does
not
exit when end is reached.
head -[numlines] <filename>
tail -[numlines] <filename>
Prints the first / last
numlines
of the file.
First 5 lines:
head -5 file.txt
or
head -n5 file.txt
Last 5 lines:
tail -5 file.txt
or
tail -n5 file.txt
Default is 10 lines.
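A quick way to try these out is on a file where you already know what each line contains. The sketch below assumes `seq` is available (it is on most Linux and macOS systems) and uses a made-up file name.

```shell
scratch=$(mktemp -d)
seq 1 20 > "$scratch/numbers.txt"      # a file with the lines 1, 2, ..., 20

first_three=$(head -n 3 "$scratch/numbers.txt")
last_two=$(tail -n 2 "$scratch/numbers.txt")

# With no count given, head prints its default number of lines.
default_count=$(head "$scratch/numbers.txt" | wc -l | tr -d ' ')

echo "$first_three"
echo "$last_two"
echo "default: $default_count lines"
```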
echo <text>
Prints the input string to the standard output (the terminal).
We will soon learn how to use
echo
to put things into files, append to files, etc.
Show off to your friends how cool you are:
$ echo 'I can have a conversation with my computer!'
$ echo 'But it always copies me. RUDE.'
File Ownership
You can discern who owns a file many ways, the most immediate being
ls -l
Permissions with
ls
$ ls -l README
-rwxrw---- 1 milano cs2043tas 20 Jan 26 15:48 README
# milano <-- the user
# cs2043tas <-- the group
Third column is the
user
, fourth column is the
group
.
Other columns are the
link count
and
size
we’ll talk about link count in …. 5 lectures?
What is this RWX Nonsense?
r
= read,
w
= write,
x
= execute.
-rwx------
User
permissions
----rwx---
Group
permissions
-------rwx
Other
permissions
Directory permissions begin with a
d
instead of a
-
Other
: “neither the owner, nor a member of the group”.
An example
What would the permissions
-rwxr-----
mean?
It is a file.
User can read and write to the file, as well as execute it.
Group members can read the file
Group members
cannot
write to or execute the file.
Other cannot do
anything
with it.
Octal Ownership Permissions
For the formula hungry, you can represent r, w, and x as binary variables (where 0 is off, and 1 is on). Then the formula for the modes is
$r \cdot 2^2 + w \cdot 2^1 + x \cdot 2^0$
Examples
chmod 755
:
rwxr-xr-x
chmod 777
:
rwxrwxrwx
chmod 600
:
rw-------
If that makes less sense to you, feel free to ignore it.
Just use the
stat
command to help you convert :)
The octal version can be confusing, but will save you time. Excellent resource in
[Computer Hope
2016
]
.
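If you would rather see the conversion spelled out, here is a minimal bash sketch. The function name `perms_to_octal` is made up for illustration; it takes the nine permission characters (without the leading file-type character) and weights r, w, and x as 4, 2, and 1 for each of user, group, and other.

```shell
# Convert a 9-character permission string such as rwxr-xr-x to octal.
perms_to_octal() {
    local perms=$1 octal="" i triplet digit
    for i in 0 3 6; do                      # user, group, other
        triplet=${perms:i:3}
        digit=0
        if [[ ${triplet:0:1} == r ]]; then digit=$(( digit + 4 )); fi   # r * 2^2
        if [[ ${triplet:1:1} == w ]]; then digit=$(( digit + 2 )); fi   # w * 2^1
        if [[ ${triplet:2:1} == x ]]; then digit=$(( digit + 1 )); fi   # x * 2^0
        octal+=$digit
    done
    echo "$octal"
}

perms_to_octal rwxr-xr-x   # prints 755
perms_to_octal rw-------   # prints 600
```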
Caution About Shebang
The shebang
must
be the first line.
Generally speaking, best approach is to use
env
:
#!/usr/bin/env bash
#!/usr/bin/env python
Generally, it is “wrong” to hard-code say
#!/bin/bash
If I have a custom installation of
bash
that I want to use, your script will ignore me and use the default system
bash
.
There
ARE
times you do this, but they are very uncommon.
Example: program that interfaces with the operating system.
Then you
do
want to hard-code paths to
/bin
or
/usr/bin
.
Not a
#
commentable language?
Official answer: just don’t use a shebang.
Unofficial answer: technically it doesn’t matter, since the shebang is a hack on the first two bytes of the file (the characters #!), but this would render the file useless except for when it is executed by a shell.
Shebang Case Study: System Tool Counterexample
Consider the tool
gnome-tweak-tool
. Its purpose is to alter system configurations of the desktop manager Gnome.
Their shebang:
#!/usr/bin/env python
This is “wrong”. My operating system uses
/usr/bin/python
behind the scenes for displaying windows etc.
I have a
custom
python installation that I use for daily hacking.
gnome-tweak-tool
uses my
custom
python, instead of using the
system
python.
Should be using
/usr/bin/python
.
Why is it “wrong”? The
gi.repository
library imported refers to my
custom
python, not the
system
python.
This “bug” has been around for years with no change. There has to be a reason?
Shebang Details
The Shebang does not need a space, but can have it if you want. The following all work:
The
#!
is the
magic
(yes, that is the technical term):
The
#!
must
be the very first two characters, and
the executable separated by whitespace
on the same line
.
Recall that a line that starts with
#
is a comment in
bash
.
Technically this line is never “executed”
by the script
.
The
shell
launching the script
reads it to determine
how
to launch it.
In general, you will see either one space or no spaces.
Best to stick with one of those for consistency ;)
Shebang Limitations
Generally, only
safe
to use
two
arguments in shebang:
The interpreter.
An optional set of arguments.
So when you do
/usr/bin/env
, technically
/usr/bin/env
is the “interpreter”
bash
is the argument.
This means that if you want to use
perl
or
awk
or something, you are limited to single letter flags. E.g. if you want
-a
,
-b
,
-c
, you would have to do
/usr/bin/perl -abc
.
/usr/bin/env
cannot be used!
There is an interesting mail thread on this, and some amusing hacks are available.
find [where to look] criteria [what to do]
Used to locate files or directories.
Search any set of directories for files that match the given criteria.
Search by name, owner, group, type, permissions, last modification date, and
more
.
Search is recursive (will search all subdirectories too).
Sometimes you may need to limit the depth.
Comprehensive & flexible. Too many options for one slide.
You will learn it steadily, over time. The sooner you start, the better off you will be in your development career.
Git is not just for CS Majors
.
It is for
anybody
working with
any
code.
Some Useful Find Options
-name
: name of file or directory to look for.
-maxdepth num
: search at most
num
levels of directories.
-mindepth num
: search at least
num
levels of directories.
-amin n
: file last access was
n
minutes ago.
-atime n
: file last access was
n
days ago.
-group name
: file belongs to group
name
.
-path pattern
: file name matches shell pattern
pattern
.
-perm mode
: file permission bits are set to
mode
.
Of course…a lot more in
man find
.
Some Details
This command is extremely powerful…but can be a little verbose (both the output, and what you type to execute it). That’s normal.
Modifiers for
find
are evaluated in conjunction (a.k.a AND).
Can condition your arguments with an OR using the
-o
flag.
Must be done
for each
modifier you want to be an OR.
Can execute command on found files / directories by using the
-exec
modifier, and
find
will execute the command for you.
The variable name is
{}
.
You have to end the command with either a
Semicolon (
;
): execute command
on each
result as you
find
them.
Plus (
+
):
find
all results first,
then
execute command.
Warning: have to escape the semicolon, e.g.
\;
The
;
is a shell control character!
Escaping the plus as
\+
is harmless but optional, since
+
is not special to the shell.
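The difference between the two terminators is easiest to see by counting how many times the command runs. This is a small sketch in a scratch directory with made-up file names; `echo found` stands in for whatever command you would really run.

```shell
scratch=$(mktemp -d)
touch "$scratch/a.log" "$scratch/b.log" "$scratch/c.log" "$scratch/notes.txt"

# With \; the command runs once per result, so echo prints three lines.
runs_per_file=$(find "$scratch" -name '*.log' -exec echo found {} \; | wc -l | tr -d ' ')

# With + all results are handed to a single command, so echo prints one line.
# (The + is written unescaped here; \+ works as well.)
runs_batched=$(find "$scratch" -name '*.log' -exec echo found {} + | wc -l | tr -d ' ')

echo "per file: $runs_per_file, batched: $runs_batched"
```

The batched form starts far fewer processes, which matters when `find` turns up thousands of files.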
Basic Examples
Find all files accessed at most 10 minutes ago
find . -amin -10
Find all files accessed at least 10 minutes ago
find . -amin +10
Comparing AND vs OR behavior
find . -type f -readable -executable
All files that are
readable
and
executable
.
find . -type f \( -readable -o -executable \)
All files that are
readable
or
executable
.
The escaped parentheses group the OR; without them, -type f would only apply to -readable.
Display all the contents of files accessed in the last 10 minutes
find . -amin -10 -exec cat {} \+
On a Mac and ended up with
.DS_Store
Everywhere?
find . -name ".DS_Store" -exec rm -f {} \;
Could be
;
or
+
since
rm
allows multiple arguments.
Solve maze in one line
Maze in 2 seconds
find / -iname victory -exec handin maze {} \+
imagine how much more complicated
maze
could get in the real world…
More Involved Example
Your boss asks you to backup all the logs and send them along.
Combining some of the things we have learned so far (also zip)
# Become `root` since `/var/log` is protected:
$ sudo su
<enter password for your user>
# Make a containment directory to copy things to
$ mkdir ~/log_bku
# `find` and copy the files over in one go
$ find /var/log -name "*.log" -exec cp {} ~/log_bku/ \;
# The `cp` executed as `root`, so we cannot read them.
$ chown -R mpm288 ~/log_bku # My netID is mpm288
# Give the folder to yourself.
$ mv ~/log_bku /home/mpm288/
# Become your user again
$ exit
# Zip it up and send to your boss
$ zip -r log_bku.zip ~/log_bku
More Involved Example: Analysis
Don’t
have
to be
root
:
sudo find
was too long for slides.
Make the directory
<dir>
as normal user.
sudo find ... -exec cp {} <dir> \;
sudo chown -R <you> <dir>
zip -r <dir>.zip <dir>
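Cannot
use
\+
instead of
\;
in this scenario as written:
find only accepts + when {} is the last argument before it, and here the destination ~/log_bku/ comes after {}.
Suppose you found
/var/log/a.log
and
/var/log/b.log
.
Executing with
\;
(
-exec
as you
find
):
cp /var/log/a.log ~/log_bku/
cp /var/log/b.log ~/log_bku/
Executing with
\+
(
find
all first, then
-exec
once) would have to produce:
cp /var/log/a.log /var/log/b.log ~/log_bku/
cp
itself is fine with this form, but find refuses to build it because {} is not directly before the +.
With GNU cp, the -t flag names the destination first so that {} can come last: -exec cp -t ~/log_bku/ {} \+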
The high-level story is: nothing special.
Just a sequence of operations being performed.
Runs from top to bottom.
Common practice:
Executable filetype.
Shebang.
Bash Scripting at a Glance
#!/bin/bash
echo "hello world!"
echo "There are two commands here!"
#!/usr/bin/python3
print('hello there friend');
The
shebang
#!/bin/bash
is the interpreter
Run a command or two!
Always test your scripts!
#!/bin/bash
#this is a comment. Maze solution script!
find / -iname victory -exec handin maze {} \+
Some execution details
Run your scripts by providing a
qualified path
to them.
path must start with a folder
Current directory? use
./scriptname
somewhere else? specify the path to your script
Scripts execute from top to bottom.
This is just like Python, for those of you who know it already.
Bad code? you may only realize it when (and if) the script reaches that line
The script starts at the top of the file.
Execution continues down until the bottom (or
exit
called).
Broken statement? It still keeps executing the subsequent lines.
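Here is a small sketch of those execution details in action. It writes a throwaway script (the name `demo.sh` and the deliberately bogus command are made up), marks it executable, and runs it by its path. It assumes your temporary directory allows executing files.

```shell
workdir=$(mktemp -d)

# Write a three-line script; the middle line is deliberately broken.
cat > "$workdir/demo.sh" <<'EOF'
#!/usr/bin/env bash
echo "first line"
no_such_command_for_demo
echo "still running"
EOF

chmod u+x "$workdir/demo.sh"      # executable filetype

# Run it by path; the "command not found" complaint goes to stderr.
output=$("$workdir/demo.sh" 2>/dev/null)
echo "$output"
```

Both `echo` lines print: the broken statement in the middle produces an error, but bash carries on with the next line.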
The 3 Main Modes of VIM
Normal Mode:
Launching pad to issue commands or go into other modes.
Can view the text, but not edit it directly (only through commands).
Return to normal mode
from other modes
: press
ESCAPE
Visual Mode:
Used to highlight text and perform block operations.
Enter visual mode
from normal mode
: press
v
Visual Line:
shift+v
Visual Block:
ctrl+v
Explanation: try them out, move your cursor around…you’ll see it.
Insert Mode:
Used to type text into the buffer (file).
Like any regular text-editor you’ve seen before.
Enter
from normal mode
: press
i
Moving Around VIM
Most of the time, you can scroll with your mouse / trackpad.
You can also use your arrow keys.
VIM shortcuts exist to avoid moving your hands at all. Use
h
to go left.
j
to go down.
k
to go up.
l
to go right.
Hardcore VIM folk usually map left caps-lock to be
ESCAPE
.
Goal: avoid moving your wrists at all costs. Arrows are so far!
I don’t do this. I also don’t use VIM.
Useful Commands
:help               help menu, e.g. specify :help v
:u                  undo
:q                  exit
:q!                 exit without saving
:e [filename]       open a different file
:syntax [on/off]    enable / disable syntax highlighting
:set number         turn line numbering on
:set nonumber       turn numbering off (e.g. to copy paste)
:set spell          turn spell checking on
:set nospell        turn spell checking off
:sp                 split screen horizontally
:vsp                split screen vertically
<ctrl+w> <w>        rotate between split regions
:w                  save file
:wq                 save file and exit
<shift>+<z><z>      alias for :wq (hold shift and hit z twice)
WOW How about no. let’s see Emacs
Basic editing works like notepad (except no mouse)
No switching between modes to edit/search/save/etc.
Emacs can also be installed on pretty much every OS.
Allows you to edit things
moderately
quickly…
…and keeps getting faster as you learn it
Emacs modes
emacs file
An editor, also from 1976.
Based on file and action type
Java file detected? IDE mode engaged!
Plain file detected? Basic edit mode engaged!
LaTeX file detected? TeXstudio mode!
Shortcuts and actions
mostly
independent of mode
But modes hide a lot of power…
Sometimes accused of being a whole OS.
Moving around and basic editing:
move by character? Use the arrow keys!
move by word? Hold control and use the left/right arrow keys!
move by paragraph? Hold control and use the up/down arrow keys!
Saving: hold CTRL, press X then S (all while holding control)
Closing: hold CTRL, press X then C (all while holding control)
Convention: C-x means “hold control, press x”
C-x C-s means “press x and s, all while holding control”
These editors predate “normal” shortcuts!
Useful Shortcuts
C-x C-f                Open a file for editing
C-x C-s                Save the current file
C-x C-c                exit
C-x b                  change to a different open file
C-space (arrow key)    Start highlighting (marking) a region
C-w                    Cut the code in the highlighted region
Alt-w                  Copy the code in the highlighted region
C-g                    Quit (cancel command, “escape”)
C-y                    paste
C-s                    search (find)
Escape-x               Enter a command by name (C-g to quit)
C-x k                  close a file (it will ask) (emacs stays open)
Escape-$               spellcheck the word under the cursor
Escape-x ispell        spellcheck the highlighted region
Escape-x help          Get just a lot of help information
Escape-x <tab>         List ALL THINGS EMACS CAN DO
git
is a
decentralized
version control system.
Like “historic versions” for DropBox/OneDrive
Except far more advanced, and more streamlined
It enables you to save changes as you go to your code.
As you make these changes, if at any point in time you discover your code is “broken”, you can
revert
back in time!
Of course, if you haven’t been “saving” frequently, you have less to work with.
Mantra:
commit
early and often.
Can also
share
your code with friends!!
Can work on same version, or…
can “go back in time” to latest working one!
You will have trouble – we all do.
The tracked folder is called a
repository
(
repo
)
You
git init .
to create repository “here”
To
track
a file in a repository, you
git add <filename>
The act of “saving” is
commit
, and needs a message
to commit all tracked files,
git commit -a -m 'your message here'
To copy a repository, you
git clone
it
To work with friends, you need to
git clone
their (or a common) repository
git pull /other/repo/path
their changes
if you edited the same file, you get a
conflict
if you have uncommitted changes, you can’t pull
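The basic init / add / commit cycle can be sketched as below, assuming `git` is installed. The repository lives in a scratch directory, the file name and commit messages are made up, and the helper `g` (also made up) passes a placeholder identity on each call so no global git configuration is needed.

```shell
repo=$(mktemp -d)

# Hypothetical helper: run git inside the scratch repo with a placeholder identity.
g() { git -C "$repo" -c user.name="Example User" -c user.email="user@example.com" "$@"; }

git init -q "$repo"                        # create the repository
echo "first draft" > "$repo/notes.txt"
g add notes.txt                            # start tracking the file
g commit -q -m "Add notes"                 # "save", with a message

echo "second draft" > "$repo/notes.txt"
g commit -q -a -m "Revise notes"           # -a commits all tracked files

commit_count=$(g rev-list --count HEAD)
first_version=$(g show HEAD~1:notes.txt)   # peek "back in time"
echo "$commit_count commits; the earlier version said: $first_version"
```

Each commit is a point you can return to later, which is the whole reason to commit early and often.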
zip <name_of_archive> <files_to_include>
E.g.
zip files.zip a.txt b.txt c.txt
Extracts to
a.txt
,
b.txt
, and
c.txt
in
current directory
.
To do folders, you need recursion.
zip -r folder.zip my_files/
Extracts to folder named
my_files
in
current directory
.
Good practice to ALWAYS zip a folder and distribute with the name it will extract as.
zip -r folder_name.zip folder_name/
Drives me
crazy
when I get a
.zip
that extracts files in the same directory… very difficult to keep track of.
unzip <archive_name>
Use
-l
to list what would extract before doing it.
gzip <files_to_compress>
Less time to compress, larger file:
--fast
More time to compress, smaller file:
--best
Read the
man
page, lots of options.
By default,
replaces
the original files!
You can use
--keep
to bypass this.
gunzip <archive_name>
Use
-l
to list what would extract before doing it.
tar -cf <tar_archive_name> <files_to_compress>
Create a tar archive.
tar -xf <tar_archive_name>
Extract all files from archive.
tar
is a stream tool. By default, it is expecting stream input.
Don’t forget the
-f
if you are working with files!
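A round trip through `tar` might look like the sketch below. Everything happens in a scratch directory with made-up names. The `-C` option is an extra that is not covered above: it tells `tar` to change into a directory first, here so that the archive stores the relative path `my_files/` and so that extraction lands in a separate folder.

```shell
scratch=$(mktemp -d)
mkdir "$scratch/my_files" "$scratch/extracted"
echo "hello" > "$scratch/my_files/a.txt"

# -c creates, -f names the archive file.
tar -cf "$scratch/my_files.tar" -C "$scratch" my_files

# -x extracts, -f again names the archive file.
tar -xf "$scratch/my_files.tar" -C "$scratch/extracted"

restored=$(cat "$scratch/extracted/my_files/a.txt")
echo "$restored"
```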
This is a non-exhaustive list. There are
many
out there.
You will
lose
all your data, you cannot read and write this way.
Piping and Redirection are quite sophisticated, please refer to the Wikipedia page in
[Wikipedia
2017
]
.
Exit Codes
When you execute commands, they have an “exit code”.
This is how you “signal” to others in the shell: through exit codes.
The exit code of the
last command executed
is stored in
$?
There are various exit codes, here are a few examples:
$ super_awesome_command
bash: super_awesome_command: command not found...
$ echo $?
127
$ echo "What is the exit code we want?"
What is the exit code we want?
$ echo $?
0
The success code we want is actually
0
. Refer to
[The Linux Documentation Project
2017
a
]
.
Remember
cat
with no args? You will have to
ctrl+c
to kill it, what would the exit code be?
Executing Multiple Commands in a Row
With exit codes, we can define some simple rules to chain commands together:
Always execute:
$ cmd1; cmd2 # exec cmd1 first, then cmd2
Execute conditioned upon exit code of
cmd1
:
$ cmd1 && cmd2 # exec cmd2 only if cmd1 returned 0
$ cmd1 || cmd2 # exec cmd2 only if cmd1 returned NOT 0
Kind of backwards in terms of which value means “continue” for
and
(0 counts as success), but that was likely easier to implement since there is only one
0
and many
not
0
’s.
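The three chaining rules can be checked with the builtins `true` (always exits 0) and `false` (always exits non-zero). In this sketch the variable `log` is just a made-up scratchpad that records which right-hand sides actually ran.

```shell
log=""
true  && log+="A"     # runs:    true returned 0
false && log+="B"     # skipped: false returned NOT 0
false || log+="C"     # runs:    false returned NOT 0
true  || log+="D"     # skipped: true returned 0
false ;  log+="E"     # always runs, whatever came before

echo "$log"
```

Only A, C, and E are recorded.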
if [ CONDITION_1 ]
then
# statements
elif [ CONDITION_2 ]
then
# statements
else
# statements
fi # fi necessary
# The `then` is necessary...
# use semicolon to shorten code
if [ CONDITION_1 ]; then
# statements
elif [ CONDITION_2 ]; then
# statements
else
# statements
fi # fi necessary
Double brackets (
bash
only!)
[[ expr ]]
allow for more features e.g., boolean operations.
both
[
and
[[
are actually commands!
if [[ CONDITION_1 ]] || [[ CONDITION_2 ]]; then
# statements
fi
elif
and
else
clauses
allowed
,
not
required
.
BE VERY CAREFUL WITH SPACES!
Spaces on both the
outside
and
the
inside
necessary
!
# bash: syntax error near unexpected token `then`
if[[ 0 -eq 0 ]]; then echo "Hiya"; fi
# bash: [[0 command not found...
if [[0 -eq 0 ]]; then echo "Hiya"; fi
# bash: syntax error in conditional expression:
# unexpected token `;'
# bash: syntax error near `;'
if [[ 0 -eq 0]]; then echo "Hiya"; fi
# This has spaces after if, and before brackets (works)!
if [[ 0 -eq 0 ]]; then echo "Hiya"; fi
Test Expressions
[
and
[[
have a special set of commands that allow checks.
Numerical comparisons (often used with variables):
$n1 -eq $n2
tests if n1 = n2.
$n1 -ne $n2
tests if n1 ≠ n2.
$n1 -lt $n2
tests if n1 < n2.
$n1 -le $n2
tests if n1 ≤ n2.
$n1 -gt $n2
tests if n1 > n2.
$n1 -ge $n2
tests if n1 ≥ n2.
If either
$n1
or
$n2
are not a number, the test
fails
.
String comparisons:
"$s1" == "$s2"
tests if
s1
and
s2
are identical.
"$s1" != "$s2"
tests if
s1
and
s2
are different.
Make sure you have spaces!
"$s1"=="$s2"
will
fail
…
For strings in particular,
use double quotes
!
If string has spaces
and
no double quotes used, it will
fail
.
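The quoting and spacing pitfalls above can be reproduced with the sketch below. The variable names and strings are made up, and the error message from the unquoted test is silenced so only the outcome is recorded.

```shell
s1="hello world"
s2="hello world"

# Quoted: each variable stays one argument, even with the space inside.
if [ "$s1" == "$s2" ]; then quoted="same"; else quoted="different"; fi

# Unquoted: the strings split into words, [ sees too many arguments,
# the test errors out, and the else branch runs.
if [ $s1 == $s2 ] 2>/dev/null; then unquoted="same"; else unquoted="test failed"; fi

# No spaces around ==: the whole thing is one non-empty word, which [
# treats as true no matter what the two strings are.
if [ "apple"=="pear" ]; then nospace="true"; else nospace="false"; fi

echo "quoted: $quoted / unquoted: $unquoted / no spaces: $nospace"
```

Note that the missing-spaces case does not produce an error at all: it quietly reports the wrong answer, which is arguably worse.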
Path Testing
Test if
/some/path
exists:
-e /some/path
Test if
/some/path
is a file:
-f /some/path
Test if
/some/path
is a directory:
-d /some/path
Test if
/some/path
can be read:
-r /some/path
Test if
/some/path
can be written to:
-w /some/path
Test if
/some/path
can be executed:
-x /some/path
Test if
/some/path
is a non-empty file (exists with size greater than zero):
-s /some/path
Many
more of these, refer to
[The Linux Documentation Project
2017
b
]
for more.
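A few of these tests combined into one helper, as a sketch. The function name `describe` and the file names are made up; note in particular that `-s` is true for a file that has content, so an empty file falls through to the final `else`.

```shell
scratch=$(mktemp -d)
: > "$scratch/empty.txt"              # create an empty file
echo "data" > "$scratch/full.txt"     # create a file with content

describe() {
    if [[ ! -e "$1" ]]; then
        echo "missing"
    elif [[ -d "$1" ]]; then
        echo "directory"
    elif [[ -s "$1" ]]; then
        echo "non-empty file"
    else
        echo "empty file"
    fi
}

describe "$scratch/empty.txt"    # empty file
describe "$scratch/full.txt"     # non-empty file
describe "$scratch"              # directory
describe "$scratch/nope"         # missing
```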
Path Testing Example
#!/usr/bin/env bash
path="/tmp"
if [[ -e "$path" ]]; then
echo "Path '$path' exists."
if [[ -f "$path" ]]; then
echo "--> Path '$path' is a file."
elif [[ -d "$path" ]]; then
echo "--> Path '$path' is a directory."
fi
else
echo "Path '$path' does not exist."
fi
Output from script:
Path '/tmp' exists.
--> Path '/tmp' is a directory.
# Delineate by spaces, loop:
# s1, then s2, then s3, then s4
for var in s1 s2 s3 s4; do
echo "Var: $var"
done
# Output:
# Var: s1
# Var: s2
# Var: s3
# Var: s4
# Brace expansion:
# 00, 01, ..., 11
for var in {00..11}; do
echo "Var: $var"
done
The
-h
flag prevents
SIGHUP
from killing
jobspec
.
Use if you forgot to launch with
nohup
, for example.
jobspec
is the job number (e.g., execute
jobs
to find it).
E.g., if
mplayer
has
job ID 1
, then
disown -h %1
source <filename> [arguments]
Executing
script
B
from script
A
runs
B
in a
subshell
.
Sourcing
script
B
from script
A
executes in
current shell
.
If script
B
exit
s, then script
A
exit
s!
Think of it like copy-pasting
B
into
A
at the line where
source B
is written in
A
.
Just like
#include <header.h>
in
C
if you know it.
Fundamental to the initial shell setup process:
All dotfiles related to your
shell
are
sourced
.
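The subshell versus current-shell difference shows up as soon as script B sets a variable. In this sketch `b.sh` and the variable `greeting` are made-up names; B is first executed (via `bash`) and then sourced.

```shell
workdir=$(mktemp -d)
printf 'greeting="set by B"\n' > "$workdir/b.sh"

greeting="original"

# Executing B: it runs in its own process, so our variable is untouched.
bash "$workdir/b.sh"
after_exec=$greeting

# Sourcing B: its lines run in the current shell, so the variable changes.
source "$workdir/b.sh"
after_source=$greeting

echo "after executing: $after_exec"
echo "after sourcing:  $after_source"
```

This is also why dotfiles are sourced: the settings they define need to land in the shell you are actually using.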
chsh -s /absolute/path/to/new/shell username
GNU and BSD
chsh
are slightly different,
read the
man
page
!
Example usage to change
$SHELL
for
username
:
$ sudo chsh -s /bin/zsh username
Warning
: do
not
change the
$SHELL
of the
root
user!
Typically,
chsh
will modify
/etc/passwd
grep
your
username
and read last field.
Kill signals can be used by number or name.
TERM
or
15
: terminates execution (default signal sent with
kill
and
killall
).
HUP
or
1
: hang-up (restarts the program).
KILL
or
9
: like bleach, can kill anything.
Some examples:
# Terminates process with PID 9009.
$ kill 9009
# REALLY kills the process with PID 3223.
$ kill -9 3223
# Restarts the process with PID 12221.
# Particularly useful for servers / daemon processes.
$ kill -HUP 12221
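You can watch the signals arrive by killing a harmless background `sleep` and checking the exit code that `wait` reports. One detail that is not stated above: by convention the shell reports a process killed by signal N as exit code 128 + N.

```shell
# Default signal: TERM (15).
sleep 60 &
pid=$!
kill "$pid"
term_status=0
wait "$pid" 2>/dev/null || term_status=$?

# Signal 9: KILL.
sleep 60 &
pid=$!
kill -9 "$pid"
kill_status=0
wait "$pid" 2>/dev/null || kill_status=$?

echo "TERM -> $term_status, KILL -> $kill_status"
```

The two codes come out as 143 (128 + 15) and 137 (128 + 9).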
Remember
top
and
htop
? They can both
renice
and
kill
“Dotfiles” change, add, or enhance existing functionality.
Use
ctrl+D
to close current
in-focus
pane / window.
If you close the last pane of a session, that session ends.
On
ugclinux
(CS Undergraduate servers) I am
mpm288
:
v1:
ssh mpm288@ugclinux.cs.cornell.edu
v2:
ssh -l mpm288 ugclinux.cs.cornell.edu
Sweet!
ugclinux
has Matlab, can I use it?
$ /usr/local/MATLAB/R2012a/bin/matlab
Warning: No display specified. You will not be able to
display graphics on the screen.
>> exit()
# exit() left Matlab
$ exit # close the ssh connection
Now do:
ssh -X mpm288@ugclinux.cs.cornell.edu
$ /usr/local/MATLAB/R2012a/bin/matlab
# Matlab displays on my screen now!
Or it can be much more, using regular expressions.
Common use:
<command> | grep <thing you need to find>
You have some
command
or sequence of commands producing a large amount of output.
The output is longer than you want, so filter through
grep
.
Reduces the output to only what you really care about!
Understanding how to use
grep
is
really
going to save you a lot of time in the future!
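As a tiny sketch of the pattern, here `printf` plays the role of the command producing too much output (the four names are made up), and `grep` cuts it down to the lines containing `Lecture`.

```shell
# Keep only the lines that contain "Lecture".
matches=$(printf '%s\n' Lecture1.tex notes.md Lecture2.txt README | grep Lecture)

# -c counts matching lines instead of printing them.
match_count=$(printf '%s\n' Lecture1.tex notes.md Lecture2.txt README | grep -c Lecture)

echo "$matches"
echo "$match_count matching lines"
```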
The
*
matches any
string
, including the null
string
.
It is a “greedy” operator: it expands as far as it can.
Is
related
to the
Kleene Star
, matching
0 or more
occurrences.
For shell,
*
is a
glob
. See
[The Linux Documentation Project
2017
c
]
for more.
# Does not match: AlecBaldwin
$ echo Lec*
Lec.log Lecture1.tex Lecture1.txt Lecture2.txt Lectures
# Does not match: sure.txt
$ echo L*ure*
Lecture1.tex Lecture1.txt Lecture2.txt Lectures
This is the greedy part:
L*
⇒
Lect
# Does not match: tex/ directory
$ echo *.tex
Lecture1.tex Presentation.tex
Matches
existing files/dirs
, does
not
define sequence
The
?
matches a
single
character.
# Does not match: Lec11.txt
$ echo Lec?.txt
Lec1.txt Lec2.txt Lec3.txt
Lec
11
not matched because it would have to
consume
two characters, the
?
is
exactly one
character
Which character, though, doesn’t matter.
# Does not match: ca cake
$ echo ca?
can cap cat
Again matches existing files/dirs!
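A minimal sketch of both wildcards, assuming a scratch directory filled with made-up file names (none of these come from the course servers). It also shows what happens when nothing matches: by default bash hands the pattern through untouched.

```shell
#!/usr/bin/env bash
# Scratch directory with hypothetical files, so nothing real is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch Lec1.txt Lec2.txt Lec11.txt Lecture1.tex sure.txt

star_matches=( Lec* )        # every existing name starting with Lec
single_matches=( Lec?.txt )  # exactly one character between Lec and .txt
no_match=$(echo Nothing*)    # no such file: the pattern is left as typed

echo "Lec* matched ${#star_matches[@]} names: ${star_matches[*]}"
echo "Lec?.txt matched: ${single_matches[*]}"
echo "Nothing* stayed as: $no_match"
```

Storing the matches in arrays keeps each file name as its own element, which is handy if names contain spaces.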
[brackets]
are used to define
sets
.
Use a dash to indicate a range of characters.
Put characters / ranges side by side (
[a-zA-Z]
).
Means
either
one lower case
or
one upper case letter.
A comma inside the brackets is just another member of the set:
[a-z,A-Z]
also matches a literal comma.
[a-z]
only matches
one
character.
[a-z][0-9]
: “find exactly
one
character in
a..z
,
immediately
followed by
one
character in
0..9
”
Input             Matched            Not Matched
[SL]ec*           Lecture Section    Vector.tex
Day[1-3]          Day1 Day2 Day3     Day5
[a-z][0-9].mp3    a9.mp3 z4.mp3      az2.mp3 9a.mp3
Inverting Sets
The
^
character represents
not
.
[abc]
means
either
a
,
b
,
or
c
So
[^abc]
means
any
character that is
not
a
,
b
, or
c
.
Input         Matched        Not Matched
[^A-P]ec*     Section.pdf    Lecture.pdf
[^A-Za-z]*    9Days.avi      vacation.jpg
sets, inverted or not, again match existing files/dirs
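Here is a small sketch of sets and inverted sets, again using a scratch directory with invented file names. Setting LC_ALL=C is an assumption made here so that ranges such as [a-z] follow plain ASCII order regardless of the machine's locale.

```shell
#!/usr/bin/env bash
export LC_ALL=C   # assumption: plain ASCII ordering for ranges such as [a-z]
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch a9.mp3 z4.mp3 az2.mp3 9a.mp3 Day1 Day2 Day5

set_matches=( [a-z][0-9].mp3 )   # one letter, immediately followed by one digit
range_matches=( Day[1-3] )       # Day5 exists but is outside the range
inverted=( [^0-9]* )             # names that do NOT start with a digit

echo "Set:      ${set_matches[*]}"
echo "Range:    ${range_matches[*]}"
echo "Inverted: ${inverted[*]}"
```

Only 9a.mp3 is left out of the inverted match, because it is the only name that starts with a digit.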
Brace Expansion
Brace Expansion
:
{...,...}
expands to each of the comma-separated patterns inside the braces.
Supports ranges such as
11..22
or
t..z
as well!
Brace expansion needs at least two options to choose from.
Input                     Output
{Hello,Goodbye}\ World    Hello World Goodbye World
{Hi,Bye,Cruel}\ World     Hi World Bye World Cruel World
{a..t}                    Expands to the range a … t
{1..99}                   Expands to the range 1 … 99
Note
: NO SPACES before / after the commas!
Mapped onto following expression where applicable:
Following expression must be
continuous
(whitespace escaped)
See next slide.
Braces
define a sequence
, unlike previous!
Brace Expansion in Action
# Extremely convenient for loops:
# prints 1 2 3 ... 99
$ for x in {1..99}; do echo $x; done
# bash 4+: prints 01 02 03 .. 99
$ for x in {01..99}; do echo $x; done
# Expansion changes depending on what is after closing brace:
# Automatic: puts the space between each
$ echo {Hello,Goodbye}
Hello Goodbye
# Still the space, then *one* 'World'
$ echo {Hello,Goodbye} World
Hello Goodbye World
# Continuous expression: escaped the spaces
$ echo {Hello,Goodbye}\ Milky\ Way
Hello Milky Way Goodbye Milky Way
# Yes, we can do it on both sides. \\n: lose a \ in expansion
$ echo -e {Hello,Goodbye}\ Milky\ Way\ {Galaxy,Chocolate\ Bar\\n}
Hello Milky Way Galaxy Hello Milky Way Chocolate Bar
Goodbye Milky Way Galaxy Goodbye Milky Way Chocolate Bar
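Because braces generate names rather than match existing ones, they are handy for creating things that do not exist yet. A minimal sketch, assuming a scratch directory and a made-up project layout:

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# None of these directories exist yet; the braces generate the names anyway.
mkdir -p project/{src,docs,tests}
created=( project/* )              # now a glob can find them

# Ranges and lists combine with the text around them.
generated=$(echo file{1..3}.txt)

echo "Created: ${created[*]}"
echo "Generated: $generated"
```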
Symbols    Meaning
*          Multiple character wildcard: 0 or more of any character.
?          Single character wildcard: exactly one, don’t care which.
[]         Create a set, e.g. [abc] for either a, or b, or c.
^          Invert sets: [^abc] for anything except a, b, or c.
{}         Used to create enumerations: {hello,world} or {1..11}
$          Read value: echo $PWD reads PWD variable, then echo
<          Redirection: create stream out of file, e.g. tr -dc '0-9' < file.txt
>          Redirection: direct output to a file, e.g. echo "hiya" > hiya.txt
&          Job control.
!          Contextual. In Shell history, otherwise usually negate.
#          Comment: anything after until end of line not executed.
Non-exhaustive list: see
[The Linux Documentation Project
2017
d
]
for the full listing.
Special characters inside
double
quotes “prefer” not to expand,
but some (such as $, ` and \) still do and need escaping.
Special characters in
single
quotes are
never
expanded.
# prints the letters as expected
$ for letter in {a..e}; do echo "$letter"; done
# escaping the money sign means give literal $ character
$ for letter in {a..e}; do echo "\$letter"; done
# $ is literal now, so doesn't read variable
$ for letter in {a..e}; do echo '$letter'; done
Pay attention to your text editor when writing scripts.
Like the slides, there is syntax highlighting.
It
usually
changes if you alter the meaning of special characters.
If you remember anything about shell expansions, remember the difference between single and double quotes.
Some Useful Grep Options
-i
: ignores case.
-A 20 -B 10
: print 10 lines
B
efore, 20 lines
A
fter each match.
-v
: inverts the match.
-o
: shows only the matched substring.
-w
: “word-regexp” – exclusive matching,
read the man page
.
-n
: displays the line number.
-H
: print the filename.
--exclude <glob>
: ignore
glob
e.g.
--exclude '*.o'
-r
: recursive, search subdirectories too.
Note:
your Unix version may differentiate between
-r
and
-R
, check the
man
page.
grep -r [other flags] <pattern> <directory>
That is, you specify the
pattern
first, and where to search after (just like how the
file
in non-recursive
grep
is specified last).
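A short sketch of several of these options together. The file log.txt and its contents are invented for the example.

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf '%s\n' "Error: disk full" "warning: low memory" \
              "error: network down" "all good" > log.txt

ignore_case=$(grep -i "error" log.txt)       # matches Error and error
with_numbers=$(grep -n "warning" log.txt)    # prefixes the line number
inverted=$(grep -v -i "error" log.txt)       # lines WITHOUT error
only_match=$(grep -o "low [a-z]*" log.txt)   # just the matched substring
recursive=$(grep -r "network" .)             # pattern first, directory last

echo "$ignore_case"
echo "$with_numbers"
echo "$inverted"
echo "$only_match"
echo "$recursive"
```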
Regular Expressions
grep
, like many programs, takes in a
regular expression
as its
input
. Pattern matching with regular expressions is more sophisticated than shell expansions, and also uses different syntax.
More precisely, a regular expression
defines
a set of strings – if any part of a line of text is
in the set
,
grep
returns a
match
.
When we use regular expressions, it is (usually) best to enclose them in quotes to stop the shell from expanding it before passing it to
grep
/ other tools.
WARNING
When using a tool like
grep
, the shell expansions we have learned
can
and do still occur! I
strongly
advise using
double quotes
to circumvent this. Or if you want the literal character (e.g. the
*
), use
single quotes
to disable all expansions entirely.
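To see the warning in action, here is a sketch in a scratch directory that contains a single made-up file, notes.txt. The unquoted pattern is rewritten by the shell before grep ever sees it.

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf '%s\n' "nothing to see" "see notes.txt" > notes.txt

# Unquoted: the shell expands n* to the file name notes.txt,
# so grep actually searches for the pattern "notes.txt".
unquoted=$(grep n* notes.txt)

# Quoted: grep receives n* itself (zero or more n), which matches every line.
quoted=$(grep 'n*' notes.txt)

echo "Unquoted result: $unquoted"
echo "Quoted result:"
echo "$quoted"
```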
Regular Expression Similarities
Some
regex
patterns are similar / the same.
Single Characters are Different
Shell Expansion:
?
Regular Expressions:
.
?
means something different in regex (Differences slide).
Example:
grep "t.a"
⇒
lines with
tea
,
taa
, and
steap
Sets are almost the Same
Shell Expansion:
[a-z]
Regular Expressions:
[a-z]
Matches one of the indicated characters.
Don’t separate multiple characters with commas: a comma inside the brackets is matched literally (e.g. write
[abq-v]
, not
[a,b,q-v]
).
A Note on Ranges in Sets
Like shell wildcards, regex is case-sensitive.
How would you match any letter, regardless of case?
If you take a look at the ASCII codes (
[ASCII Table
2010
]
), you will see that the lower case letters come
after
the upper case letters.
You should be careful about trying to do something like
[a-Z]
.
Instead, just do
[a-zA-Z]
.
Or use the POSIX set
[[:alpha:]]
.
Note:
some programs
may
accept the range
[a-Z]
.
But it may not actually be the range you think. It depends.
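A quick sketch comparing the explicit range with the POSIX class. The word list is made up for the example.

```shell
#!/usr/bin/env bash
words=$(printf '%s\n' "apple" "Zebra" "42" "_tmp")

# Explicit ranges for both cases.
explicit=$(printf '%s\n' "$words" | grep '^[a-zA-Z]')

# POSIX character class: note the double brackets.
posix_class=$(printf '%s\n' "$words" | grep '^[[:alpha:]]')

echo "$explicit"
echo "$posix_class"
```

Both searches keep apple and Zebra and drop the lines that start with a digit or an underscore.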
Regular Expression Differences
Some of the shell expansion tools are
completely
different.
Modifiers Apply to the Expression
Before
Them
?
is
0 or 1
occurrences:
a?
⇒
0 or 1
a
*
is
0 or more
occurrences:
a*
⇒
0, 1, …
n
a
’s
+
is
1 or more
occurrences:
a+
⇒
1, 2, …
n
a
’s
Note
:
+
and
?
are
extended
regular expression characters.
Must escape (
\+
and
\?
) or use
-E
or
egrep
.
# Nothing happens, they weren't escaped
$ grep "f?o+" combined/*.*
# f\? can be 0, so h{e,3}llo are found
$ grep "f\?o\+" combined/*.*
combined/foo.tex:1:foo
combined/foo.text:1:foo
combined/foo.txt:1:foo
combined/h3llo.txt:1:h3llo
combined/hello.txt:1:hello
# Second expansion: treated as file input to grep
# You can only supply *ONE* pattern!
$ grep h{e,3}llo combined/*.*
grep: h3llo: No such file or directory
combined/hello.txt:1:hello
# Double quotes won't save you: that's the literal
# string 'h{e,3}llo' at this point (so no match).
$ grep "h{e,3}llo" combined/*.*
AKA you cannot
easily
do these expansions when using
grep
.
{}
are
fundamentally different
from the other expansions:
they define a sequence and do not match existing targets.
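When you want the effect of {e,3} inside a pattern, the regex itself can do the choosing. A minimal sketch with a made-up words.txt, showing a set and an extended-regex alternation:

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf '%s\n' "hello" "h3llo" "hallo" > words.txt

# A set: exactly one of e or 3 in the second position.
with_set=$(grep 'h[e3]llo' words.txt)

# Alternation needs extended regular expressions (-E).
with_alternation=$(grep -E 'h(e|3)llo' words.txt)

echo "$with_set"
echo "$with_alternation"
```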
Final Thoughts and Additional Resources
grep uses POSIX basic regular expressions by default and extended ones with -E; the “Perl Regular Expressions” are a separate, richer flavor (available in GNU grep through -P).
Use
-d
to specify the delimiter (
TAB
by default).
Use
-s
to concatenate serially instead of side-by-side.
No
options
and one
file
specified: same as
cat
.
Use with
-s
to join all lines of a file.
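Assuming the options above belong to paste (the command name is missing from this excerpt), they can be sketched with two made-up files, names.txt and ages.txt:

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf '%s\n' "Alice" "Bob" > names.txt
printf '%s\n' "44" "30" > ages.txt

# Side by side, with a comma instead of the default TAB.
side_by_side=$(paste -d , names.txt ages.txt)

# Serial: all lines of one file joined onto a single line.
serial=$(paste -s -d , names.txt)

echo "$side_by_side"
echo "$serial"
```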
split [options] [file [prefix]]
Use
-l
to specify how many lines in each file
Default:
1000
Use
-b
to specify how many
bytes
in each file.
The
prefix
is prepended to
each file
produced.
If no
file
provided (or if
file
is
-
),
stdin
is used.
Use
-d
to produce numeric suffixes instead of lexicographic.
Not available on BSD / macOS.
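A minimal sketch of splitting by line count. The file numbers.txt and the prefix part_ are names chosen for the example.

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf '%s\n' 1 2 3 4 5 > numbers.txt

# Two lines per output file; every output name starts with the prefix part_.
split -l 2 numbers.txt part_

pieces=( part_* )              # part_aa part_ab part_ac
last_piece=$(cat part_ac)      # the leftover fifth line

echo "Pieces: ${pieces[*]}"
echo "Last piece holds: $last_piece"
```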
join [options] file1 file2
Join two files at a time, no more, no less.
Default: files are assumed to be delimited by
whitespace
.
Use
-t <char>
to specify alternative
single-character
delimiter.
Use
-1 n
to join by the
n
th
field
of
file1
.
Use
-2 n
to join by the
n
th
field
of
file2
.
Field numbers start at
1
, like
cut
and
paste
.
Use
-a f_num
to display unpaired lines of file
f_num
.
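The options above can be sketched with small, made-up CSV files (ages.csv, salaries.csv, and salaries_by_amount.csv are hypothetical names). Note that join expects both inputs to be sorted on the join field, which these are.

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf '%s\n' "alice,44" "bob,30" "candy,12" > ages.csv
printf '%s\n' "bob,300000" "candy,120000" > salaries.csv
printf '%s\n' "300000,bob" "120000,candy" > salaries_by_amount.csv

# Comma-delimited, joined on the first field of each file (the default).
paired=$(join -t , ages.csv salaries.csv)

# Also show lines of file 1 that found no partner.
with_unpaired=$(join -t , -a 1 ages.csv salaries.csv)

# Join field 1 of file 1 with field 2 of file 2.
by_second_field=$(join -t , -1 1 -2 2 ages.csv salaries_by_amount.csv)

echo "$paired"
echo "$with_unpaired"
echo "$by_second_field"
```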
employees.csv
Alice,female,607-123-4567,11 Sunny Place,Ithaca,NY,14850
Bob,male,607-765-4321,1892 Rim Trail,Ithaca,NY,14850
Andy,n/a,607-706-6007,1 To Rule Them All,Ithaca,NY,14850
Bad employee data without proper delimiter
/course/cs2043/demos/10-demos/employees.csv
Get names, ignore improper lines:
$ cut -d , -f 1 -s employees.csv
Get names and phone numbers, ignore improper lines:
$ cut -d , -f 1,3 -s employees.csv
Get address (
4
th
col and after), ignore improper lines:
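The command for this last case is not included in this excerpt. One way it could be written, as a sketch that recreates the sample data locally, is with an open-ended field range (4-):

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
# Local copy of the sample data shown above.
cat > employees.csv <<'EOF'
Alice,female,607-123-4567,11 Sunny Place,Ithaca,NY,14850
Bob,male,607-765-4321,1892 Rim Trail,Ithaca,NY,14850
Andy,n/a,607-706-6007,1 To Rule Them All,Ithaca,NY,14850
Bad employee data without proper delimiter
EOF

# -s skips lines that contain no delimiter at all.
names=$(cut -d , -f 1 -s employees.csv)

# 4- means "field 4 through the end of the line".
addresses=$(cut -d , -f 4- -s employees.csv)

echo "$names"
echo "$addresses"
```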
$ head -1 no_spoon.txt
There is no spoon. There is no spoon. There is no spoon. There is no spoon.
$ sed 's/no spoon/a fork/g' no_spoon.txt
There is a fork. There is a fork. There is a fork. There is a fork.
...
There is a fork. There is a fork. There is a fork. There is a fork.
Replaces
no spoon
with
a fork
for every line.
No ending
/g
? Only one substitution per line:
$ sed 's/no spoon/a fork/' no_spoon.txt
There is a fork. There is no spoon. There is no spoon. There is no spoon.
...
There is a fork. There is no spoon. There is no spoon. There is no spoon.
Caution
: get in the habit of using
single-quotes
with
sed
.
Otherwise special shell characters (like
$
) may expand in
double-quotes
causing you sadness and pain.
Deletion
Delete all
lines
that contain
regex
:
sed '/regex/d'
david.txt
Hi, my name is david.
Hi, my name is DAVID.
Hi, my name is David.
Hi, my name is dAVID.
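Using the david.txt lines above, deletion can be sketched as follows. The pattern [Dd]avid is just an illustration and is not taken from the slides.

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf '%s\n' "Hi, my name is david." "Hi, my name is DAVID." \
              "Hi, my name is David." "Hi, my name is dAVID." > david.txt

# Delete every line containing david or David; print what is left.
# sed writes to stdout, so david.txt itself is not modified.
kept=$(sed '/[Dd]avid/d' david.txt)

echo "$kept"
```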
Join
split_join/ages.txt
and
split_join/salaries.txt
files into
results.txt
:
$ join -a1 ages.txt salaries.txt > results.txt
$ cat results.txt
Alice 44
Bob 30 300,000
Candy 12 120,000
function <name> {
body...
}
line breaks are essential!
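A minimal sketch of defining and calling a function. The name greet is invented for the example.

```shell
#!/usr/bin/env bash
function greet {
    # $1 is the first argument passed to the function, not to the script.
    echo "Hello, $1!"
}

# Call it like any other command; capture what it prints.
greeting=$(greet "world")
echo "$greeting"
```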
Just like a switch statement in other languages, only better.
Does not carry on to all cases if you forget that
break
keyword.
case "$var" in
"A" )
cmds to execute for case "A"
;;
"B" )
cmds to execute for case "B"
;;
* )
cmds for DEFAULT (not matched) case
;;
esac
Sort of like shorthand for
if
-
elif
-
else
statements…
…only not quite the same!
Simple If and Case Examples
Make a simple program to print between 0 and 2
blargh
s
Input is
$1
, explicit check not necessary (
else
or
*)
case)
#! /usr/bin/env bash
if [[ "$1" == "0" ]]; then
echo "0 blargh echoes..."
elif [[ "$1" == "1" ]]; then
echo "1 blargh echoes..."
echo " [1] blargh"
# number or string
elif [[ "$1" -eq 2 ]]; then
echo "2 blargh echoes..."
echo " [1] blargh"
echo " [2] blargh"
else
echo "Blarghs come in [0-2]."
exit 1
fi
#!/usr/bin/env bash
case "$1" in
[[:digit:]] )
echo "$1 blargh echoes..."
for (( i = 1; i <= $1; i++ )); do
echo " [$i] blargh"
done
;;
* )
echo "Blarghs only come in [0-9]."
exit 1
;;
esac
Works on inputs
0-9
, as well as exit for everything else.
Will
not
match
11
(sets only match one character, see
[Bash Reference Manual
2017
a
]
).
So
*)
being
last
is
equivalent
to
default
in other languages
#!/usr/bin/env bash
if [[ "$1" =~ [[:digit:]] ]]; then
echo "$1 blargh echoes..."
for (( i = 1; i <= $1; i++ )); do
echo " [$i] blargh"
done
else
echo "Blarghs only come in [0-9]."
exit 1
fi
Works on
[0-9]
.
Cool! Works on
99
.
Whoops! Works on
208a
– the
for
loop crashes!
Using Sets with If Part 2
Option 1: negate a negation (read:
if not “not a number”
):
# +-----------+ +-----------------+
# | Negate if | | Negate (invert) |
# | match | | set |
# +-----------+ +-----------------+
# | |
if [[ ! "$1" =~ [^[:digit:]] ]]; then
Option 2: use a complete
extended regular expression pattern
:
# +----------------------+
# | ^: beginning of line |
# +----------------------+
# |
if [[ "$1" =~ ^[[:digit:]]+$ ]]; then
# +--------------------+ || +-----------------------+
# | +: 1 or more digit |--++--| $ matches end of line |
# +--------------------+ +-----------------------+
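The anchored pattern from Option 2 can be wrapped in a small helper and tried on a few inputs. This is a sketch, and is_number is a made-up name.

```shell
#!/usr/bin/env bash
# Succeeds only if the whole argument is one or more digits.
is_number() {
    [[ "$1" =~ ^[[:digit:]]+$ ]]
}

for candidate in 7 99 208a ""; do
    if is_number "$candidate"; then
        echo "'$candidate' is a number"
    else
        echo "'$candidate' is not a number"
    fi
done
```

The anchors are what reject 208a: without ^ and $, a single digit anywhere in the string would be enough for a match.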
Using Sets with If Part 3 (We’re Finished, Right?!)
The last example felt pretty bullet-proof, what can go wrong?
some_array=( zero one two ) # Indices: 0, 1, 2
some_array[11]=11 # Indices: 0, 1, 2, 11
some_array["hi"]="there" # Indexed array: "hi" is evaluated as a number (0), so this overwrites index 0. String keys need declare -A.
You
cannot
have an
array
of
array
s.
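A sketch of the two kinds of array side by side. Sparse numeric indices work on ordinary arrays, while string keys need an associative array declared with -A (bash 4 or newer). The name phone is made up for the example.

```shell
#!/usr/bin/env bash
# Indexed arrays may be sparse: indices need not be contiguous.
some_array=( zero one two )
some_array[11]=eleven
indexed_keys="${!some_array[*]}"    # the indices currently in use

# String keys: declare the array as associative first.
declare -A phone
phone["alice"]="607-123-4567"
phone["bob"]="607-765-4321"
alice_number=${phone["alice"]}
key_count=${#phone[@]}

echo "Indexed keys: $indexed_keys"
echo "Alice: $alice_number ($key_count entries total)"
```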
Array Functions
You perform an
array
operation with
${expr}
Works on non-arrays too; mandatory for arrays
You use the name of the variable followed by the operation:
echo "Index 11: ${arr[11]}" # prints: Index 11: 11
echo "Index 51: ${arr[51]}" # prints: Index 51: a string value
echo "Index 0: ${arr[0]}" # DOES NOT EXIST! (aka nothing)
Like loops,
@
and
*
expand differently:
echo "Individual: ${arr[@]}"
# Individual: 11 22 33 a string value different string value
echo "Joined::::: ${arr[*]}"
# Joined::::: 11 22 33 a string value different string value
Differently how?
echo "Length of Individual: ${#arr[@]}"
# Length of Individual: 5
echo "Length of Joined::::: ${#arr[*]}"
# Length of Joined::::: 5
Differently HOW?!!!
Easier to compare with loops
Remember that
;
allows you to continue on the same line.
Individual expansion (
@
):
for x in "${arr[@]}"; do echo "$x"; done
# 11
# 22
# 33
# a string value
# different string value
Joined expansion (
*
):
for x in "${arr[*]}"; do echo "$x"; done
# 11 22 33 a string value different string value
The
*
loop only executes once (everything is
globbed
together).
The
@
loop iterates over each element in the array.
new_arr=([17]="seventeen" [24]="twenty-four")
new_arr[99]="ninety nine" # may as well, not new
for x in "${new_arr[@]}"; do echo "$x"; done
# seventeen
# twenty-four
# ninety nine
Get the list of indices:
for idx in "${!new_arr[@]}"; do echo "$idx"; done
# 17
# 24
# 99
Array Slicing
You can just as easily
slice
your arrays.
Use
@
to get whole array, then specify indices to
slice
Syntax:
${array_var[@]:start_index:slice_size}
If
slice_size
is not specified, takes until last index
zed=( zero one two three four )
echo "From start: ${zed[@]:0}"
# From start: zero one two three four
echo "From 2: ${zed[@]:2}"
# From 2: two three four
echo "Indices [2-4]: ${zed[@]:2:3}"
# Indices [2-4]: two three four
for x in "${zed[@]:2:3}"; do echo "$x"; done
# two
# three
# four
for x in "${zed[*]:2:3}"; do echo "$x"; done
# two three four
More…
This was a
small subset
of what can be done with
bash
arrays.
I highly suggest you go through the examples listed in
[Bash Reference Manual
2017
b
]
.
Search for
Substring Removal
for some insanely cool tricks!
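As a taste of substring removal, here is a sketch that takes a path apart. The path itself is made up. # strips from the front, % strips from the back, and doubling the symbol makes the match as long as possible.

```shell
#!/usr/bin/env bash
path="/home/user/archive.tar.gz"

filename=${path##*/}         # drop the longest prefix ending in /
directory=${path%/*}         # drop the shortest suffix starting with /
extension=${filename##*.}    # drop everything up to the last dot
stem=${filename%%.*}         # drop everything from the first dot on
without_gz=${filename%.gz}   # drop a literal .gz suffix

echo "$filename $directory $extension $stem $without_gz"
```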
ln [flags] <source> <target>
works like
cp
; from
src
to
dst
creates a
peer
link; no notion of “original”
only works on files
ln -s [flags] <source> <target>
technically the same command as
ln
, but used very differently with the -s flag!
creates a
subordinate
link; refers to the
<source>
path.
doesn’t check to see if the source path was sensible first!
works on files or directories.
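The difference between the two kinds of link shows up once the original name is removed. A minimal sketch in a scratch directory, where all file names are invented:

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
echo "original contents" > data.txt

ln data.txt hard.txt              # hard link: a peer name for the same file
ln -s data.txt soft.txt           # symbolic link: stores the path "data.txt"
ln -s missing.txt dangling.txt    # works even though missing.txt does not exist

rm data.txt                       # remove the name we started with

hard_contents=$(cat hard.txt)     # the data is still reachable via the peer
if [ -e soft.txt ]; then soft_state="ok"; else soft_state="broken"; fi
if [ -L dangling.txt ]; then dangling_state="is a link"; else dangling_state="not a link"; fi

echo "hard.txt: $hard_contents"
echo "soft.txt: $soft_state"
echo "dangling.txt: $dangling_state"
```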
awk
is a programming language designed for processing text-based data.
Allows easy operation on fields rather than full lines.
Works in a
pattern-action
manner, like
sed
.
Supports numerical types (and operations).
Supports control-flow (e.g.,
if
-
else
statements).
Created at Bell Labs in the 1970s.
Alfred
A
ho, Peter
W
einberger, and Brian
K
ernighan
An ancestor of
perl
, a
cousin
of
sed
.
K
ernighan and
R
itchie also wrote The C Programming Language (Ritchie created C itself)
Very
powerful.
It’s
Turing Complete
!
… a lot of things are.
gawk
gawk
is the GNU implementation of the
awk
programming language.
On BSD/OSX, it is just called
awk
.
On GNU, it is technically
gawk
, but should reliably be
symlinked
as
awk
.
There are many different implementations of the AWK programming language.
If you use C or C++, this is similar to how there are different compilers. The compiler is an “implementation” of the language (big quotes on that…).
If you use Python, it’s like the difference between CPython, PyPy, Jython, etc.
Different implementations of the same programming language.
Proceeds line by line, checking each pattern one by one.
If the pattern is found, the
{ commands }
are executed.
So for the above:
First line of input grabbed.
pattern1
checked, if match
{ commands1 }
executed.
pattern2
checked, if match
{ commands2 }
executed.
Next line of input grabbed.
Check
pattern1
, then
pattern2
, so on and so forth…
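The line-by-line, pattern-by-pattern order can be observed with a two-rule program. This is a sketch, and story.txt and its contents are made up. A line that matches both patterns triggers both actions before the next line is read.

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf '%s\n' "the monster spoke" "a quiet night" "monster and night" > story.txt

result=$(awk '
    /monster/ { print "monster: " $0 }
    /night/   { print "night: " $0 }
' story.txt)

echo "$result"
```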
Print all lines containing Monster or monster.
awk '/[Mm]onster/ {print}' frankenstein.txt
If no action specified, default is to print the whole line.
awk '/[Mm]onster/' frankenstein.txt
The
$0
variable in
awk
refers to the whole line.
awk '/[Mm]onster/ {print $0}' frankenstein.txt
First field (delimited by whitespace, or change
field separator
).
awk '/[Mm]onster/ {print $1}' frankenstein.txt
awk
understands extended regular expressions by default :)
We don’t need to escape
+
,
?
, etc!
awk
allows blocks of code to be executed only once, at the beginning / end.
With demo file
monstrosity.awk
and data file
frankenstein.txt
in current directory:
#!/usr/bin/awk -f
BEGIN { print "Starting search for monster..." }
/[Mm]onster/{ count++ } # Increment if [Mm]onster found
END { print "Found " count " monsters in the book." }
Use the
-f
in the shebang to tell
awk
it expects a script.
$ ./monstrosity.awk # hangs... no input file
$ ./monstrosity.awk frankenstein.txt # yay!
# shebang '#!/usr/bin/awk -f' makes same as ...
$ awk -f monstrosity.awk frankenstein.txt
NF
: the number of fields in the current line.
NR
: the number of lines read so far.
awk maintains these for you; avoid assigning to
NF
or
NR
FILENAME
: the name of the input file.
FS
: the
field separator
.
Example: change
FS=","
for processing a comma-separated-value sheet.
Can also specify
-F
flag (capital!) to set the
FS
.
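A short sketch of these built-in variables on a made-up people.csv, setting the separator once with -F and once with FS in a BEGIN block:

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf '%s\n' "Alice,44,Ithaca" "Bob,30" > people.csv

# -F sets the field separator from the command line.
summary=$(awk -F , '{ print FILENAME ": line " NR " has " NF " fields" }' people.csv)

# Same separator, set inside the program before any input is read.
second_field=$(awk 'BEGIN { FS = "," } { print $2 }' people.csv)

echo "$summary"
echo "$second_field"
```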
awk
can match any of the following pattern types:
/regular expression/
relational expression
pattern1 && pattern2
pattern1 || pattern2
pattern1 ? pattern2: pattern3
If
pattern1
, then match
pattern2
. Otherwise, match
pattern3
(pattern)
: parentheses to group / change order of operations.
! pattern
to invert
pattern
pattern1, pattern2
: match
pattern1
, work on every line until matches
pattern2
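A few of these pattern types in action. This is a sketch on a made-up steps.txt, relying on the default action of printing the whole line.

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf '%s\n' "1 start" "2 middle" "3 stop" "4 after" > steps.txt

relational=$(awk '$1 > 2' steps.txt)           # compare the first field
combined=$(awk '/s/ && $1 < 3' steps.txt)      # regex AND relational
negated=$(awk '! /st/' steps.txt)              # lines that do not contain st
range=$(awk '/start/, /stop/' steps.txt)       # from a start line through a stop line

echo "$relational"
echo "$combined"
echo "$negated"
echo "$range"
```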
Install a
formula
:
brew install <fmla1> <fmla2> ... <fmlaN>
Remove a formula:
brew uninstall <fmla1> <fmla2> ... <fmlaN>
Only one
fmla
required, but can specify many.
“Group” packages have no meaning in
brew
.
Updating components:
Update
brew
, all
taps
, and installed formulae listings. This does not update the actual software you have installed with
brew
, just the definitions:
brew update
.
Update just installed formulae:
brew upgrade
.
Specify a
formula
name to only upgrade that formula.
Searching for packages:
Same command:
brew search <formula>
There are so many package managers out there for different things, too many to list them all!