CS 3410 Fall 2018
Due: Sunday, October 14th at 11:59 PM. Submit all required files on CMS.
For this lab, we will cover virtual machines (VMs), SSH, Linux commands, and the GCC compiler. Before we go into the core material of this lab, we will give brief descriptions of various terms to make sure everyone is on the same page with certain concepts.
Motivation. Computers come in many shapes and forms. As a
result, compiling and running your code on different machines may produce
wildly different results. In this course, we want to ensure that everyone
is working in the same environment so that we can guarantee that if it works
for you, it will also work for us (and our autograder). We therefore require
that you do all of your project work on the Ubuntu environment that we set up
for you, because some of the projects we will be doing later on may work
differently in different environments. We want to avoid anyone submitting
anything that works on their computer but then doesn't work on our machines
because the environment is different.
The environment that we will be testing your code on is the Cornell CS
Undergraduate Linux Cluster. This means that your code must run correctly
on these machines. It won’t help you if it works on your
computer, but not on the cluster. This cluster can be accessed remotely using
SSH, which will be explained in this lab. We also cover how to set up a similar
environment locally in the form of a virtual machine.
In this class, we will develop our code on a distribution of Linux called Ubuntu.
What is Linux?
Linux is an operating system that is free for the public to download, since it was created as an open-source project. It was first released by Linus Torvalds in 1991. Since then, Linux has evolved through many improvements made by the computing community. It has long had a reputation for security, in part because a large community of programmers can find and fix bugs quickly. However, malicious hackers have been targeting Linux more in recent years, possibly because of its increasing popularity as server software.
The system is modeled on an older operating system called UNIX. To navigate through a Linux or UNIX system, you type instructions to the computer at what is called a command-line prompt (nowadays, we often use graphical user interfaces (GUIs) to click on folders instead of typing commands to find those folders). You can even download software using the command-line prompt. Later in this lab, you will learn some of the most common Linux commands.
What is SSH?
SSH (Secure SHell) is a method of remotely connecting to and running commands on another computer ("remote" meaning something far away, so we are connecting to a computer far away). It can be accessed using the ssh command in a shell (like the Mac Terminal). It lets you access the resources of another computer that you may not have on your own computer. However, you cannot use SSH without an Internet connection. Cornell makes Ubuntu computers available to computer science students, which you can access remotely with SSH.
Windows does not come with an SSH client, which means you will need to install one on your computer if you are a Windows user. The setup for this is described later.
What is a Virtual Machine (VM)?
A virtual machine is an application that imitates hardware so that you can run more than one operating system on your own computer. For example, I can run Windows on a Mac computer using a virtual machine. The VM runs alongside my usual operating system without requiring a reboot. It works like any other application, sitting in a window you can minimize when you want to go on Facebook and maximize when you want to work on CS 3410 again. In case you hear this terminology, the Host OS refers to the operating system of your own computer, while the Guest OS is the operating system you are running on the VM.
For this class, you will have the option of either using the VirtualBox VM to run Ubuntu or using SSH to remotely connect to a Cornell Ubuntu machine. Both methods ensure that you have access to an Ubuntu machine. Using SSH may be easier to set up for some students (despite the length of the write-up), but as mentioned above, you must make sure to have an Internet connection.
Why are we going through all the trouble of setting this up?
Scroll up for the Motivation.
Important Note for SSH
On its own, SSH access to these machines works only while you are connected to the Internet on the Cornell campus, because Cornell's SSH setup requires you to be on the same network as the machine you’re connecting to. Never fear though: the solution to this is called a VPN, a Virtual Private Network. This is a service that routes your traffic through a server (in this case on the Cornell campus) so that you can connect to the cluster even when you are not on Cornell Wi-Fi.
You can find the installation and connection instructions here. After installation, follow the connection tutorial on the same page to set up the connection.
Important: If you use Windows, your computer does not come with an SSH client. Skip down to "For Windows users only", then come back here to complete the process.
For everyone:
ssh netid@ugclinux.cs.cornell.edu, where netid is your NetID
exit or hit Ctrl-D
For Windows users only:
Choose one of the two options:
C:\Program Files\Git\usr\bin or C:\Program Files\Git\bin. Find which of those directories contains a file called ssh.exe, then copy that filepath
It is also possible to use PuTTY given how simple it is to set up. You can download PuTTY here: https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html.
As for using PuTTY, open up the installed program and, on the default "Session" page, input netid@ugclinux.cs.cornell.edu for Host Name and leave the rest of the settings at their default. Then click "Open" to start your SSH session, where it will ask you for your password. You have the option to save the session in the PuTTY configuration window to make future connections faster, but this is not necessary. Once you are connected through PuTTY, SSH works the same way as described in the instructions above.
For everyone:
Share files from computer with VM
sudo adduser vm vboxsf
sudo reboot
/media/
* You can learn more about VirtualBox at https://www.virtualbox.org/manual/ch01.html
Now that we have a Linux environment to use, let’s learn some commands to make your Linux machine do your bidding! But first, we have a few more bits of knowledge for you. Note that there are some necessary tasks to complete along with the reading!
Which type of shell are we using?
We are using the bash shell, which is a typical default for Linux systems. Type the following command into your shell to check!
echo $0
Can we use editors within the shell?
Yep. You have several at your disposal: nano, emacs, and vim. Each has its own set of commands for various behaviors like saving text, exiting, etc. If you have never used any of these editors before, we suggest getting started with nano to get accustomed to using an editor within a shell, and because the commands are written at the bottom of the nano screen, as shown in the screenshot below:
The ^ stands for the Control key on your keyboard (this is true even for Mac users - do not use the Command key!).
Complete the following:
Type nano into the shell.
Save your file as science.txt (to be consistent with the naming in a tutorial we will use later).
nano science.txt
How do I get files from my computer to the remote machine and back?
For this class, the best option is to use your git repository. Complete the following to get the repository on your remote machine:
git clone your_repository_link, where your_repository_link is what you found in step 3.
You should now be able to navigate through the downloaded directory. Use the git commands to add, commit, push, pull, etc. to update your files. If you need a refresher on how to use git from the command-line prompt, see the tutorials in the Git, etc. section of the course resources page.
What are man pages?
These are manual pages, not pages for men. If you cannot remember what parameters you need for a command, you can type in man command_name to obtain a description. For example, try man mv. Use the up and down arrows to scroll, and use q to leave the screen. You can even do man man to get information about man itself.
Read this whole part carefully before starting!
Complete Tutorials 1, 2, and 3 from the UNIX Tutorial for Beginners.
For the second command under Section 2.1 Copying Files, instead of copying the file given on the website, copy your previously made science.txt file from the parent directory of unixstuff using
cp ../science.txt .
(note the dots - there are four in this line!)
Save information about your Linux distribution to a file called linux_info.txt. This is to verify that you genuinely have the correct environment set up. Try running the command lsb_release -a and looking at the output. The lsb_release command prints out information about your Linux distribution (Ubuntu). LSB stands for Linux Standard Base, which is a project to help standardize the various distributions of Linux. It should look something like this:
No LSB modules are available.
Distributor ID: LinuxMint
Description: Linux Mint 17.3 Rosa
Release: 17.3
Codename: rosa
The above is an example of lsb_release -a output created by running the command on a different distribution of Linux. Yours should look very similar, but with a different Linux distribution and version (i.e. Ubuntu).
Yes, you will see the message No LSB modules are available. From what we can figure, Ubuntu is not totally LSB compliant since it is "a considerable amount of work for little measurable benefit" according to those working on the Debian project.
Put the output of this command in a file by redirecting the output of the command, which you should have learned in Tutorial 3 of the UNIX Tutorial for Beginners. The redirection you learned redirects only stdout (standard output), which means the No LSB modules are available. line will still be output to the terminal and not to the file. This is because that line is written to stderr (standard error), and so isn't redirected with the rest of the text. This is fine: we're only looking for your file to contain the lines written to stdout, which means the contents of your file should look similar to the block above but without the first line.
Some useful tips:
Welcome to C! It is a programming language that was first introduced almost half a century ago but is still one of the most commonly used programming languages due to its speed, efficiency, and ability to closely interface with hardware. Let’s finish discussing a few more terms:
What is a high-level programming language?
Thus far, you have been learning a machine language (the binary instructions) and an assembly language. These are what we call low-level programming languages, languages that are "close to the hardware". High-level programming languages, on the other hand, are more similar to human languages and thus are easier for programmers to read and write. C is an example of a high-level programming language.
How do we get high-level programming languages to be read by the hardware?
Basically, we have to do a few conversions, where software like compilers and assemblers convert programs to other types of programs. Here is one possible chain:
High-level program -> Compiler -> Assembly language program -> Assembler -> Machine language program -> Linker (and machine code from libraries) -> Complete machine code -> Loader -> Machine code loaded into memory -> Hardware
These pieces will be explained more in-depth during the Linkers & Loaders lecture.
What is a statically-typed language?
This is a language in which the type of every variable is known at compile time. In C, that means you have to give the type of each variable (e.g. int, char) in the source code. The compiler therefore has a chance to check whether you stored the right types of values in each variable before you get to run the program, which helps to catch bugs early in the coding process. C is a statically-typed language. (An example of a dynamically-typed language, where type errors are not caught until runtime, is Python 🐍.)
Now that you know how all the pieces fit together, let's get back to coding in C! On your Ubuntu machine, open up a text editor and create a new file called hello.c. Type in the following C program:
#include <stdio.h>

int main() {
    printf("Hello world! I am netid.\n");
    return 0;
}
But replace netid with your NetID. When you are done typing, save the file and exit the editor. Now you are ready to compile and run the program you just created! In your terminal, navigate to where you saved hello.c. Now run the command below (if your terminal shows a prompt such as ~$, it is part of the prompt and is not to be typed):
gcc -o sayhello hello.c
The C compiler (GCC) has compiled your source code hello.c into an executable named sayhello. The -o option allows us to name the resulting executable file anything we want. Without the flag, the default name given to the executable is a.out.
If this gives you any errors, make sure you are in the right working directory (use ls to confirm that hello.c is in the current directory). If you are in the right location, then you did not enter the program correctly; go back to your editor and fix the program. Otherwise, you have just compiled a C program! You can run your program by running the command:
./sayhello
And your program should run! It should print Hello world! I am netid. (with your NetID in place of netid) and do nothing else. For example, if your NetID is "abc123", then compilation and execution would look something like this:
gcc -o sayhello hello.c
./sayhello
Hello world! I am abc123.
What does ./sayhello mean? The command ./sayhello means, "Run the executable sayhello in the current directory."
Now that you have completed this lab, we would like to know the steps that you took to get here, and how you will be writing and testing C code for future assignments!
Think about:
Submit this as a text file called procedure.txt. Feel free to talk this through with a TA during lab or office hours!
Congrats, you're done! It's time to submit some of the various files you've made to CMS. In order to get these files from either the machine you are SSH'ed into or the VM onto your host machine so you can submit them, you'll likely have to push them to your GitHub repositories. Feel free to make a new folder to do so. Please include:
the science.txt file you made during the Linux Command Overview portion
the linux_info.txt file you made in Part 2
the hello.c file you made in Part 3, with your NetID filled in
the procedure.txt file that you outlined in Part 4
There's a lot more that we would love to show you, but unfortunately cannot due to the time constraints of the lab.
But we encourage you to try some of the things shown below during your spare time! :)
Below, we have a sample C program that implements Fibonacci. This may be familiar from Project 2. Let's open a text editor once again and copy the program into a new file fibonacci.c.
#include <stdio.h>  // Headers for standard input/output
#include <stdlib.h> // C standard library

// From the P2 writeup.
int i_Fibonacci(int n) {
    int f1 = 0;
    int f2 = 1;
    int fi = 0; // initialized so the return value is defined even if the loop never runs
    if (n == 0) {
        return 0;
    }
    if (n == 1) {
        return 1;
    }
    for (int i = 2; i <= n; i++) { // declaration of i in the loop guard is only valid in C99 or higher
        fi = f1 + f2;
        f1 = f2;
        f2 = fi;
    }
    return fi;
}

// An unoptimized recursive implementation
int r_Fibonacci(int n) {
    if (n == 0 || n == 1) { // logical or
        return n;
    } else {
        return r_Fibonacci(n-1) + r_Fibonacci(n-2);
    }
}

int main() {
    printf("Fibonacci(12) as computed iteratively: %d\n", i_Fibonacci(12));
    printf("Fibonacci(12) as computed recursively: %d\n", r_Fibonacci(12));
    return 0;
}
Now, compile the program with the command:
gcc -o fibonacci.out -Wall -Werror -Wextra -std=c99 fibonacci.c
The flag -std=c99 tells GCC to compile using the C99 language standard rather than its default dialect, which depends on the GCC version (a GNU variant of C89 in older releases, a GNU variant of C11 or later in newer ones). C99 is more convenient to write than C89, and we’ll be grading based on it. The -Wall and -Wextra flags enable additional compile-time warnings, which will help catch bugs in your code. The -Werror flag tells the compiler to treat warnings as errors, causing compilation to fail until all warnings are resolved.
Now, let's run our fibonacci program!
./fibonacci.out
Fibonacci(12) as computed iteratively: 144
Fibonacci(12) as computed recursively: 144
tmux is a program that allows you to control multiple terminal sessions at the same time. This can be helpful when you want to run commands simultaneously without having to open multiple terminal windows. It also has more powerful features such as sending commands to multiple terminal sessions at the same time, saving sessions across logins, etc.
You can set it up by following the tutorial here: https://hackernoon.com/a-gentle-introduction-to-tmux-8d784c404340.