Switches and Numbers

Course Overview

CS 3410 is about how computers actually work. That puts it in contrast to other kinds of courses that sit at other “levels” of the computer science stack:

  • Classes like CS 1110, CS 2110, and CS 3110 are all about how to make computers do things. You used programming languages (Python, Java, and OCaml) to write programs without worrying too much about how those languages actually do what they do.
  • Classes on application topics like robotics, machine learning, and graphics are all about things computers can do. These are important, of course, because they are the reason we study computing in the first place.
  • Outside of CS, and below the 3410 “level,” there are many classes at Cornell on topics like electronics, chemistry, and physics that can tell you physical details of how computers work. That’s not what 3410 is about either: we will build abstractions over those physical phenomena to understand how computers work in the realm of logic.

Switches

The fundamental computational building block in the physical world is a switch. What we mean by a “switch” is: something that controls a physical phenomenon that you can abstractly think of as being in an “on” or “off” state. Some examples of switches include:

  • A valve controls hydraulic states, i.e., whether water is flowing or not.
  • A vacuum tube controls an electronic signal.
  • The game Turing Tumble controls signals in the form of marbles. Yes, you can build real computers out of little plastic levers.

What you think of as a “real” computer controls electronic signals. Aside from vacuum tubes, a particularly easy-to-understand type of electronic switch is a relay. To make a relay, you need:

  • An electromagnet (i.e., a magnet controlled by an electronic signal).
  • A bendy piece of metal that can be attracted or repelled by that magnet.
  • Another piece of metal next to that one. You position it carefully so there’s a tiny gap between the two pieces of metal. When the electromagnet is on, it either closes or opens that gap (depending on whether it attracts or repels the bendy piece of metal).
  • Wires hooked up to the two pieces of metal. This way, you can think of the relay as a wire that is either connected or disconnected, depending on whether the electromagnet is charged.

The point is that a relay is a switch that both controls an electronic signal and is controlled by an electronic signal. That’s a really powerful idea, because it means you can wire up a whole bunch of relays to make them control each other! And that is basically what you need to build a computer.

Transistors

Computers today are universally built out of transistors. Transistors work like relays, in the sense that they let one electronic signal control another one. The difference is that they are solid-state devices, relying on the chemistry of the materials inside of them to do the current control instead of a physically moving bendy piece of metal. But abstractly, they do exactly the same thing.

The first transistor was built at Bell Labs in 1947. These days, you can buy them on Amazon for a few pennies apiece. You can build computers “from scratch” by buying a bunch of transistors on Amazon and wiring them up carefully.

Modern computers consist of billions of transistors, manufactured together in an integrated circuit. For example, Apple’s M4 is made up of 28 billion transistors. There is an entire industry of silicon manufacturing dedicated to building chunks of silicon with many, many tiny transistors and wires on them.

Abstractly speaking, however, these integrated circuits are no different from a bunch of transistors you can buy on Amazon, wired up very carefully. Which are in turn (abstractly!) the same as relays, or valves, or Turing Tumble marble levers: they are all just a bunch of switches that control each other in careful ways.

Bits

Because computers are made of switches, data is made of bits. A bit is an abstraction of a physical phenomenon that can either be “on” or “off.” The mapping between the physical phenomenon and the 0 or 1 digit is arbitrary; this is just something that humans have to make up. For example:

  • In a hydraulic computer, maybe 0 is “no water” and 1 is “water is flowing.”
  • In Turing Tumble, perhaps 0 is “marble goes left” and 1 is “marble goes right.”
  • In an electronic computer, let’s use 0 to mean “low voltage” and 1 to mean “high voltage.”

Binary Numbers

Armed with switches and a logical mapping, computers have a way to represent numbers! Just really small numbers: a bit suffices to represent all the integers in the interval [0, 1]. It would be nice to be able to represent numbers bigger than 1.

We do that by combining multiple bits together and counting in binary, a.k.a. “base 2.”

In elementary school math class, you probably learned about “place values.” The rightmost digit in a decimal number is for the ones, the next one is for tens, and the next one is for hundreds. In other words, if you want to know what the string of decimal digits “631” means, you can multiply each digit by its place value and add the results together:

\[ 631_{10} = 1 \times 10^0 + 3 \times 10^1 + 6 \times 10^2 \]

We’ll sometimes use subscripts, like \( n_{b} \), to be explicit when we are writing a number in base \( b \).

That’s the decimal, a.k.a. “base 10,” system for numerical notation. Base 2 works the same way, except all the place values are powers of 2 instead of powers of 10. So if you want to know what the string of binary digits “101” represents, we can do the same multiply-and-add dance:

\[ 101_2 = 1 \times 2^0 + 0 \times 2^1 + 1 \times 2^2 \]

That’s five, so we might write \( 101_2 = 5_{10} \).
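If it helps to see that dance as code, here is a small sketch in C (the language we’ll use in 3410). The helper name binary_to_value is made up for this example:

    #include <stdio.h>
    #include <string.h>

    // Interpret a string of binary digits by multiplying each digit by its
    // place value (a power of 2) and adding up the results.
    unsigned binary_to_value(const char *digits) {
        unsigned value = 0;
        unsigned place = 1;  // place values: 1, 2, 4, 8, ...
        for (int i = (int)strlen(digits) - 1; i >= 0; i--) {
            if (digits[i] == '1') {
                value += place;
            }
            place *= 2;
        }
        return value;
    }

    int main(void) {
        printf("%u\n", binary_to_value("101"));  // prints 5
        return 0;
    }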

Some Important Bases

We won’t be dealing with too many different bases in this class. In computer systems, only three bases are really important:

  • Binary (base 2).
  • Octal (base 8).
  • Hexadecimal (base 16), affectionately known as hex for short.

Octal works exactly as you might expect, i.e., we use the digits 0 through 7. For hexadecimal, we run out of normal human digits at 9 and need to invent 6 more digits. The universal convention is to use letters: A has value 10 (in decimal), B has value 11, and so on up to F, which has value 15.
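For example, the hex numeral 2A uses both kinds of digits. The same multiply-and-add dance from before, with place values that are powers of 16, gives its decimal value:

\[ \mathrm{2A}_{16} = 10 \times 16^0 + 2 \times 16^1 = 42_{10} \]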

Converting Between Bases

Here are two strategies for converting numbers between different bases. In both algorithms, it can be helpful to write out the place values for the base you’re converting to. We’ll convert the decimal number 637 to octal as an example. In octal, the first few place values are 1, 8, 64, and 512.

Left to Right

First, compute the first digit (the most significant digit) by finding the biggest place value that is less than or equal to your number. Then, find the largest digit you can multiply by that place value without exceeding your number. That’s your converted digit. Take that product (the place value times that digit) and subtract it from your value. Now you have a residual value; start from the beginning of these instructions and repeat to get the rest of the digits.

Let’s try it by converting 637 to octal.

  • The biggest place value under 637 is 512. \( 512 \times 2 \) doesn’t stay “under the limit,” so we have to settle for \( 512 \times 1 \). That means the first digit of the converted number is 1. The residual value is \( 637 - 512 \times 1 = 125 \).
  • The largest multiple of 64 that “fits under” 125 is \( 64 \times 1 \). So the second digit is also 1. The residual value is \( 125 - 64 \times 1 = 61 \).
  • We’re now at the second-to-least-significant digit, with place value 8. The largest multiple that “fits under” 61 is \( 8 \times 7 \), so the next digit is 7 and the residual value is \( 61 - 8 \times 7 = 5 \).
  • This is the ones place, so the final digit is 5.

So the converted value is \( 1175_8 \).
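Here is roughly what the left-to-right strategy looks like as a C function. This is just a sketch with made-up names, and it prints each digit as a decimal numeral, so it only makes sense for bases up to 10:

    #include <stdio.h>

    // Print n in base b, most significant digit first: find the biggest
    // place value that fits, pull out as many copies of it as possible,
    // and repeat on the residual value.
    void print_in_base(unsigned n, unsigned b) {
        // Find the biggest place value (power of b) that fits under n.
        unsigned place = 1;
        while (place * b <= n) {
            place *= b;
        }

        // Peel off one digit per place value, from left to right.
        while (place > 0) {
            unsigned digit = n / place;  // largest multiple of place that fits
            printf("%u", digit);
            n -= digit * place;          // the residual value
            place /= b;
        }
        printf("\n");
    }

    int main(void) {
        print_in_base(637, 8);  // prints 1175
        return 0;
    }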

Right to Left

First, compute the least significant digit by dividing the number by the base, \(b\). Get both the quotient and remainder. The remainder is the number of ones you have, so that’s your least significant digit. The quotient is the number of \(b\)s you have, so that’s the residual value that we will continue with.

Next, repeat with that residual value. Remember, you can think of that as the number of \(b\)s that remain. So when we divide by \(b\), the remainder is the number of \(b\)s and the quotient is the number of \(b^2\)s. So the remainder is the second-to-least-significant digit, and we can continue around the loop with the quotient. Stop the loop when the residual value becomes zero.

Let’s try it again with 637.

  • \( 637 \div 8 = 79 \) with remainder 5. So the least significant digit is 5.
  • \( 79 \div 8 = 9 \) with remainder 7. So the next-rightmost digit is 7.
  • \( 9 \div 8 = 1 \) with remainder 1. The next digit is 1.
  • \( 1 \div 8 = 0 \) with remainder 1. So the final, most significant digit is 1.

Fortunately, this method gave the same answer: \( 1175_8 \).
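The right-to-left strategy turns into an especially tidy loop in C. Again, this is a sketch with made-up names; it stores digits as characters, so it also assumes the base is at most 10:

    #include <stdio.h>

    // Print n in base b by repeatedly dividing by b. Each remainder is the
    // next digit, starting from the least significant one.
    void print_in_base(unsigned n, unsigned b) {
        char digits[64];  // plenty of room for the digits of a 32-bit number
        int count = 0;
        do {
            digits[count++] = '0' + n % b;  // remainder: the next digit
            n /= b;                         // quotient: the residual value
        } while (n > 0);

        // The digits came out least significant first, so print them in reverse.
        for (int i = count - 1; i >= 0; i--) {
            putchar(digits[i]);
        }
        putchar('\n');
    }

    int main(void) {
        print_in_base(637, 8);  // prints 1175
        return 0;
    }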

Programming Language Notation

When writing, we often use the notation \( 1175_8 \) to be explicit that we’re writing a number in base 8 (octal). Subscripts are hard to type in programming languages, so they use a different convention.

In many popular programming languages (at least Java, Python, and the language we will use in 3410: C), you can write:

  • 0b10110 to use binary notation.
  • 0x123abc to use hexadecimal notation.

Octal literals are a little less standardized, but in Python, you can use 0o123 (with a little letter “o”).
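For example, here’s how those literals look in a C program. One caveat: 0b binary literals only became official in C23, but GCC and Clang have long accepted them as an extension; 0x hex literals, and octal literals written with a plain leading 0, are long-standing standard C.

    #include <stdio.h>

    int main(void) {
        int a = 0x2a;      // hexadecimal: 2 * 16 + 10 = 42
        int b = 052;       // a leading 0 means octal in C: 5 * 8 + 2 = 42
        int c = 0b101010;  // binary (C23, or a common compiler extension): also 42
        printf("%d %d %d\n", a, b, c);  // prints 42 42 42
        return 0;
    }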

Addition

To add binary numbers, you can use the elementary-school algorithm for “long addition,” with carrying the one and all that. Just remember that, in binary, 1+1 = 10 and 1+1+1 (i.e., with a carried one) is 11.
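For example, adding \( 1011_2 \) (eleven) to \( 0110_2 \) (six):

\[
\begin{array}{r}
1011 \\
+\ 0110 \\
\hline
10001
\end{array}
\]

Working right to left: \( 1 + 0 = 1 \); then \( 1 + 1 = 10 \), so write 0 and carry 1; then \( 0 + 1 + 1 = 10 \), write 0 and carry 1 again; then \( 1 + 0 + 1 = 10 \), write 0 and carry 1 one last time; the final carry becomes the leading 1. Sure enough, \( 10001_2 = 17_{10} \), which is \( 11 + 6 \).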

Signed Numbers

This is all well and good for representing nonnegative numbers, but what if you want to represent \( -10110 \)? Remember, everything must be a bit, so we can’t use the \( - \) sign in our digital representation of negative numbers.

There is an “obvious” way that turns out to be problematic, and a less intuitive way that works out better from a mathematical and hardware perspective. The latter is what modern computers actually use.

Sign–Magnitude

The “obvious” way is sign–magnitude notation. The idea is to reserve the leftmost (most significant) bit for the sign: 0 means positive, 1 means negative.

For example, recall that \( 7_{10} = 111_{2} \). In a 4-bit sign–magnitude representation, we would represent positive \(7\) as 0111 and \(-7\) as 1111.

Sign–magnitude was used in some of the earliest electronic computers. However, it has some downsides that mean that it is no longer a common way to represent integers:

  • It leads to more complicated circuits to implement fundamental operations like addition and subtraction. (We won’t go into why—you’ll have to trust us on this.)
  • Annoyingly, it has two different zeros! There is a “positive zero” (0000 in 4 bits) and a “negative zero” (1000). That just kinda feels bad; there should only be one zero, and it should be neither positive nor negative.

Two’s Complement

The modern way is two’s complement notation. In two’s complement, there is still a sign bit, and it is still the leftmost (most significant) bit in the representation. 1 in the sign bit still means negative, and 0 means positive or zero.

For the positive numbers, things work like normal. In a 4-bit representation, 0001 means 1, 0010 means 2, 0011 means 3, and so on up to 0111, which means positive 7.

The key difference is that, in two’s complement, the negative numbers grow “up from the bottom.” (In the same sense that they grow “down from zero” in sign–magnitude.) That means that 1000 (and in general, “one followed by all zeroes”) is the most negative number: with 4 bits, that’s \(-8\). Then count upward from there: so 1001 is \(-7\), 1010 is \(-6\), and so on up to 1111, which is \(-1\).

Here’s another way to think about two’s complement: start with a normal, unsigned representation and negate the place value of the most significant bit. In other words: in an unsigned representation, the MSB has place value \(2^{n-1}\). In a two’s complement representation, all the other place values remain the same, but the MSB has place value \(-2^{n-1}\) instead.
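For example, read the 4-bit pattern 1010 that way: the MSB contributes \( -2^3 \), and the other bits keep their usual place values:

\[ 0 \times 2^0 + 1 \times 2^1 + 0 \times 2^2 + (-1) \times 2^3 = 2 - 8 = -6 \]

That matches counting up from the bottom above, where 1010 was \( -6 \).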

Here are some cool facts about two’s complement numbers, when using \(n\) bits:

  • The all-zeroes bit string always represents 0.
  • The all-ones bit string always represents \(-1\).
  • The biggest positive value, sometimes known as INT_MAX, is 0 followed by all ones. Its value is \(2^{n-1}-1\).
  • The most negative value, sometimes known as INT_MIN, is 1 followed by all zeroes. Its value is \(-2^{n-1}\).
  • Addition works the same as for normal, unsigned binary numbers. You can just ignore the fact that one of the bits is a sign bit, add the two numbers as if they were plain binary values, and you get the right answer in a two’s complement representation!
  • To negate a number i, you can compute ~i + 1, where ~ means “flip all the bits, so every zero becomes one and every one becomes zero.”
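Here’s a little C program that checks a few of those facts with 8-bit values. It’s a sketch: it leans on the fact that C’s fixed-width integer types use a two’s complement representation, which is true of every machine you’ll meet in this class.

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        // With 8 bits: all ones is -1, "1 followed by all zeroes" is the most
        // negative value, and "0 followed by all ones" is the most positive.
        int8_t all_ones = (int8_t)0xff;  // 11111111
        int8_t int_min  = (int8_t)0x80;  // 10000000
        int8_t int_max  = (int8_t)0x7f;  // 01111111
        printf("%d %d %d\n", all_ones, int_min, int_max);  // -1 -128 127

        // Negate by flipping all the bits and adding one.
        int8_t i = 6;
        int8_t neg = ~i + 1;
        printf("%d\n", neg);  // -6

        // Addition ignores the sign bit: adding the bit patterns for -6 and
        // 10 as plain binary (and dropping the carry out of the top bit)
        // gives the bit pattern for 4.
        printf("%d\n", (int8_t)(neg + 10));  // 4
        return 0;
    }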