Lab 7: Introduction to UNIX and C

CS 3410 Fall 2018


Due: Sunday, October 14th at 11:59 PM. Submit all required files on CMS.


For this lab, we will cover virtual machines (VMs), SSH, Linux commands, and the GCC compiler. Before we go into the core material of this lab, we will give brief descriptions of various terms to make sure everyone is on the same page with certain concepts.

Motivation. Computers come in many shapes and forms. As a result, compiling and running your code on different machines may produce wildly different results. In this course, we want everyone to work in the same environment so that if your code works for you, it will also work for us (and our autograder). We therefore require that you do all of your project work in the Ubuntu environment that we set up for you, because some of the later projects may behave differently in different environments. We want to avoid anyone submitting code that works on their computer but not on our machines.
The environment that we will be testing your code on is the Cornell CS Undergraduate Linux Cluster, so your code must run correctly on these machines. It won’t help you if it works on your computer but not on the cluster. This cluster can be accessed remotely using SSH, which will be explained in this lab. We also cover how to set up a similar environment locally in the form of a virtual machine.

In this class, the environment that we will be developing our code in is a distribution of Linux, called Ubuntu.

What is Linux?

Linux is an operating system that is free to the public to download, since it was created as an open-source program. It was first released by Linus Torvalds in 1991. Since then, Linux has evolved through many improvements made by the computing community. It used to be considered the most secure OS because of the vast number of programmers who can fix bugs quickly. However, malicious hackers have been targeting Linux more in recent years, possibly because of its increasing popularity as server software.

Linux is modeled on an older OS called UNIX. To navigate through a Linux or UNIX system, you type instructions to the computer at what is called a command-line prompt (nowadays, we often use graphical user interfaces (GUIs) to click on folders instead of typing commands to find those folders). You can even download software using the command-line prompt. Later in this lab, you will learn some of the most common Linux commands.

What is SSH?

SSH (Secure SHell) is a method of remotely connecting to and running commands on another computer ("remote" meaning something far away, so we are connecting to a computer far away). It can be accessed using the ssh command in a shell (like the Mac Terminal). It lets you access the resources of another computer that you may not have on your own computer. However, you cannot use SSH without an Internet connection. Cornell makes Ubuntu computers available to computer science students, which you can access remotely with SSH.

Windows does not come with an SSH client, which means you will need to install one on your computer if you are a Windows user. The setup for this is described later.

What is a Virtual Machine (VM)?

A virtual machine is an application that imitates hardware so that you can run more than one operating system on your own computer. For example, I can run Windows on a Mac computer using a virtual machine. The VM runs alongside my usual operating system without requiring a reboot. It works like any other application, sitting in a window you can minimize when you want to go on Facebook and maximize when you want to work on CS 3410 again. In case you hear this terminology, the Host OS refers to the operating system your computer normally runs, while the Guest OS is the operating system you are running on the VM.

Part 1: SSH and VM

For this class, you will have the option of either using the VirtualBox VM to run Ubuntu or using SSH to remotely connect to a Cornell Ubuntu machine. Both methods ensure that you have access to an Ubuntu machine. Using SSH may be easier to set up for some students (despite the length of the write-up), but as mentioned above, you must have an Internet connection.

Why are we going through all the trouble of setting this up?

Scroll up for the Motivation.

Important Note for SSH

This will work any time you’re connected to the internet on the Cornell campus: Cornell's SSH setup requires you to be on the same network as the machine you’re connecting to. Never fear, though: the solution is a VPN (Virtual Private Network). This is a service that routes your traffic through a server (in this case on the Cornell campus) so that you can reach the cluster even when you are not on Cornell Wifi.

You can find the installation and connection instructions here. After installation, follow the connection tutorial on the same page to set up the connection.

Important: If you use Windows, your computer does not come with an SSH client. Skip down to "For Windows users only", then come back here to complete the process.

For everyone:

  1. Make sure you are connected to the VPN or Cornell's Wifi
  2. Open a terminal (Windows users: remember to use Git Bash or the Cygwin terminal) and type ssh netid@ugclinux.cs.cornell.edu, where netid is your NetID
  3. Type yes to accept the new SSH target
  4. Now type the password associated with your netid
  5. You’re in! If you don’t already know how the Unix command line works, you’ll learn it later in this lab - that will be how you interact with the files on this machine.
  6. To exit the shell, type exit or hit Ctrl-D

For Windows users only:

Choose one of the two options:

Git
  1. You should already have Git installed from earlier in the semester. If you do not, install it from the official website.
  2. You’ll now need to find where Git installed its ssh client. It will either be in C:\Program Files\Git\usr\bin or C:\Program Files\Git\bin. Find which of those directories contains a file called ssh.exe, then copy that filepath
  3. Open Control Panel -> System and Security -> System, then click "Change Settings". Go to "Advanced", then click "Environment Variables"
  4. Under "User Variables" click "Path", then click "Edit"
  5. Click "New", then paste in the filepath from above (You may see just a string to edit instead - in this case simply type a semicolon separator then paste the path)
  6. Done! Now when you open the Windows command line, you will be able to use ssh.
Cygwin
  1. Visit https://www.cygwin.com/ and download the setup script appropriate for your computer (almost certainly the 64 bit one)
  2. Run the setup script. Select the option to download cygwin, then it will prompt you for a download site - I recommend simply selecting the top one.
  3. When it asks you to select packages, scroll down to "net" and click on "default" until it changes to "install". This will ensure that the ssh client is installed with Cygwin
  4. Hit next and allow Cygwin to complete the installation process. Hit OK to install dependencies when it asks. The installation process may take some time (~5 to 10 minutes)
  5. Now when you need to ssh, you’ll open the program "cygwin" and use the terminal it provides

PuTTY is another option, and it is simple to set up. You can download PuTTY here: https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html.
To use PuTTY, open the installed program and, on the default "Session" page, enter netid@ugclinux.cs.cornell.edu for Host Name and leave the rest of the settings at their defaults. Then click "Open" to start your SSH session; it will ask you for your password. You can save the session in the PuTTY configuration window to make future connections faster, but this is not necessary. Once you are connected through PuTTY, SSH works the same way as described in the instructions above.

VirtualBox VM setup (for everyone):

  1. Go to the course resources page
  2. Follow the instructions under the Computing Environment section.
    Note: On the VirtualBox website, download the VirtualBox platform packages associated with your current Operating System.
  3. Make sure that you start up the OS. Note: It may take some time.

Share files from computer with VM

  1. In VirtualBox, click Devices Menu->Shared Folders->Shared Folder Settings
  2. In Shared Folder window, Click Folder+ icon on right
  3. In Add Folder window:
    • Folder path: browse to folder on your computer you want in the VM
    • Folder name (for example): HostDownloads
    • Check Auto-mount, Make Permanent.
    • Click OK to add, then click OK to close Shared Folder window
  4. Back in the VM, open terminal and run:
    • sudo adduser vm vboxsf
    • sudo reboot
  5. After reboot, folders will show under /media/

* You can learn more about VirtualBox at https://www.virtualbox.org/manual/ch01.html

Linux Command Overview

Now that we have a Linux environment to use, let’s learn some commands to make your Linux machine do your bidding! But first, we have a few more bits of knowledge for you. Note that there are some necessary tasks to complete along with the reading!

Which type of shell are we using?

We are using the bash shell, which is a typical default for Linux systems. Type in the following command into your shell to check!

echo $0

Can we use editors within the shell?

Yep. You have several at your disposal: nano, emacs, and vim. Each has its own set of commands for things like saving text, exiting, etc. If you have never used any of these editors before, we suggest starting with nano to get accustomed to using an editor within a shell, because its commands are listed at the bottom of the nano screen, as shown in the screenshot below:

Nano Screenshot

The ^ stands for the Control key on your keyboard (this is true even for Mac users - do not use the Command key!).

Complete the following:

  1. To start nano, type nano into the shell.
  2. Type "I love math, science, and computer science!" into the nano window.
  3. To save your file, use Ctrl-O. It will prompt you for a file name. Call it science.txt (to be consistent with the naming in a tutorial we will use later).
  4. To search for the word "science", type Ctrl-W, type "science" when prompted, hit Enter, and watch the cursor jump to the start of the first occurrence of "science". To find the next occurrence of the same string, type Alt-W. Hit Alt-W one more time to return to the first occurrence.
  5. To cancel out of a prompt, use Ctrl-C.
  6. You should now know enough to be able to experiment with the other commands on your own. But for now, to exit, type Ctrl-X. If you did not save some changes, it will prompt you to save the file - just follow the prompts (e.g. hit Y to continue saving, and Enter if the filename shown is the one you want).
  7. To see your file again, you can type nano science.txt
  8. Keep reading to learn more commands!

How do I get files from my computer to the remote machine and back?

For this class, the best option is to use your git repository. Complete the following to get the repository on your remote machine:

  1. Use a web browser and log in to your git account at https://github.coecis.cornell.edu/
  2. Navigate to your repository
  3. Click on Clone or download. You will need the link you see there (you may not be able to use copy and paste).
  4. Return to your shell. Type in git clone your_repository_link, where your_repository_link is what you found in step 3.

You should now be able to navigate through the downloaded directory. Use the git commands to add, commit, push, pull, etc. to update your files. If you need a refresher on how to use git for the command-line prompt, see the tutorials in the Git, etc. section of the course resources page.

What are man pages?

These are manual pages, not pages for men. If you cannot remember what parameters you need for a command, you can type in man command_name to obtain a description. For example, try man mv. Use the up and down arrows to scroll, and use q to leave the screen. You can even do man man to get information about man itself.

Part 2: UNIX Commands

Read this whole part carefully before starting!

Complete Tutorials 1, 2, and 3 from the UNIX Tutorial for Beginners.

For the second command under Section 2.1 Copying Files, instead of copying the file given on the website, copy your previously made science.txt file from the unixstuff parent directory using

cp ../science.txt . (note the dots - there are four in this line!)

Save information about your Linux distribution to a file called linux_info.txt. This is to verify that you genuinely have the correct environment set up. Try running the command lsb_release -a and looking at the output. The lsb_release command prints out information about your Linux distribution (Ubuntu). LSB stands for Linux Standard Base, which is a project to help standardize the various distributions of Linux. It should look something like this:

No LSB modules are available.
Distributor ID: LinuxMint
Description:    Linux Mint 17.3 Rosa
Release:        17.3
Codename:       rosa

The above is an example of lsb_release -a output created by running the command on a different distribution of Linux. Yours should look very similar, but the distribution and version will be different (i.e. Ubuntu).

Yes, you will see the message No LSB modules are available. From what we can figure, Ubuntu is not totally LSB compliant since it is "a considerable amount of work for little measurable benefit" according to those working on the Debian project.

Put the output of this command in a file by redirecting it, which you should have learned in Tutorial 3 of the UNIX tutorial. The redirection you learned redirects only stdout (standard output), which means the No LSB modules are available. line will still be printed to the terminal and not to the file. This is because that line is written to stderr (standard error), and so isn't redirected with the rest of the text. This is fine: we're only looking for your file to contain the lines written to stdout, which means the contents of your file should look similar to the block above but without the first line.

Some useful tips:

Intro to C Overview

Welcome to C! It is a programming language that was first introduced almost half a century ago but is still one of the most commonly used programming languages due to its speed, efficiency, and ability to closely interface with hardware. Let’s finish discussing a few more terms:

What is a high-level programming language?

Thus far, you have been learning a machine language (the binary instructions) and an assembly language. These are what we call low-level programming languages: languages that are "close to the hardware". High-level programming languages, on the other hand, are more similar to human languages and thus are easier for programmers to read and write. C is an example of a high-level programming language.

How do we get high-level programming languages to be read by the hardware?

Basically, we have to do a few conversions, where software like compilers and assemblers convert programs to other types of programs. Here is one possible chain:

High-level program -> Compiler -> Assembly language program -> Assembler -> Machine language program -> Linker (and machine code from libraries) -> Complete machine code -> Loader -> Machine code loaded into memory -> Hardware

These pieces will be explained more in-depth during the Linkers & Loaders lecture.

What is a statically-typed language?

This is a language in which the type of every variable is known at compile time. In C, that means you have to give the type of each variable (e.g. int, char) in the source code. The compiler therefore has a chance to check whether you stored the right types of values in each variable before you get to run the program, which helps to catch bugs early in the coding process. C is a statically-typed language. (An example of a dynamically-typed language, where type errors are not caught until runtime, is Python 🐍.)

Part 3: Intro to C

Now that you know how all the pieces fit together, let's get back to coding in C! On your Ubuntu machine, open up a text editor and create a new file called hello.c. Type in the following C program:

#include <stdio.h>

int main() {
    printf("Hello world! I am netid.\n");
    return 0;
}

Replace netid with your NetID. When you are done typing, save the file and exit the editor. Now you are ready to compile and run the program you just created! In your terminal, navigate to where you saved hello.c. Now run the following command (if your terminal shows a prompt such as ~$ before the cursor, it is part of the prompt and is not to be typed):

gcc -o sayhello hello.c

The C compiler (GCC) has compiled your source code hello.c into an executable named sayhello. The -o option allows us to name the resulting executable file anything we want. Without the flag, the default name given to the executable is a.out.

If this gives you any errors, make sure you are in the right working directory (use ls to confirm that hello.c is in the same directory). If you are in the right location, then you did not enter the program correctly — go back to your editor and fix the program. Otherwise, you have just compiled a C program! You can run your program by running the command:

./sayhello

And your program should run! It should print Hello world! I am netid. and do nothing else. For example, if your NetID is "abc123", then compilation and execution would look something like this:

gcc -o sayhello hello.c
./sayhello
Hello world! I am abc123.

What does ./sayhello mean? The command ./sayhello means, "Run the executable sayhello in the current directory."

Part 4: What is your workflow?

Now that you have completed this lab, we would like to know the steps that you took to get here, and how you will be writing and testing C code for future assignments!

Think about:

Submit this as a text file called procedure.txt. Feel free to talk this through with a TA during lab or office hours!

What to Submit

Congrats, you're done! It's time to submit some of the files you've made to CMS. To get these files from the machine you are ssh'ed into (or from the VM) onto your host machine so you can submit them, you'll likely have to push them to your GitHub repository. Feel free to make a new folder to do so. Please include:

  1. The science.txt file you made during the Linux Command Overview portion
  2. The linux_info.txt file you made in Part 2
  3. The hello.c file you made in Part 3, with your netid filled in
  4. The procedure.txt file that you outlined in Part 4

Fun stuff + More

There's a lot more that we would love to show you, but unfortunately we cannot due to the time constraints of the lab.
We encourage you to try some of the things shown below in your spare time! :)

Below, we have a sample C program that implements Fibonacci. This may be familiar from Project 2. Let's open a text editor once again and copy the program into a new file fibonacci.c.

#include <stdio.h> // Headers for standard input/output
#include <stdlib.h> // C standard library

// From the P2 writeup.
int i_Fibonacci(int n) {
    int f1 = 0;
    int f2 = 1;
    int fi = 0; // initialized so that fi is never returned uninitialized (e.g. for a negative n)

    if (n == 0) {
        return 0;
    }

    if (n == 1) {
        return 1;
    }

    for (int i = 2; i <= n; i++) { // declaration of i in the loop guard is only valid in C99 or higher
        fi = f1 + f2;
        f1 = f2;
        f2 = fi;
    }
    return fi;
}

// An unoptimized recursive implementation
int r_Fibonacci(int n) {
    if (n == 0 || n == 1) { // logical or
        return n;
    }
    else {
        return r_Fibonacci(n-1) + r_Fibonacci(n-2);
    }
}

int main() {
    printf("Fibonacci(12) as computed iteratively: %d\n", i_Fibonacci(12));
    printf("Fibonacci(12) as computed recursively: %d\n", r_Fibonacci(12));
}

Now, compile the program with the command:

gcc -o fibonacci.out -Wall -Werror -Wextra -std=c99 fibonacci.c

The flag -std=c99 tells GCC to compile using the C99 language standard rather than whatever standard your GCC version defaults to (older versions default to a C89-based dialect). C99 is more convenient to write, and we’ll be grading based on it. The -Wall and -Wextra flags enable additional compile-time warnings, which will help catch bugs in your code. The -Werror flag tells the compiler to treat warnings as errors, causing compilation to fail until all warnings are resolved.

Now, let's run our fibonacci program!

./fibonacci.out
Fibonacci(12) as computed iteratively: 144
Fibonacci(12) as computed recursively: 144

tmux is a program that allows you to control multiple terminal sessions at the same time. This can be helpful when you want to run commands simultaneously without having to open multiple terminal windows. It also has more powerful features, such as sending commands to multiple terminal sessions at the same time and saving sessions across logins.

You can set it up by following the tutorial here: https://hackernoon.com/a-gentle-introduction-to-tmux-8d784c404340.