Abstract data types

Abstract data types (ADTs) describe a set of well-defined operations that together provide a useful problem-solving tool.

ADTs do not dictate how these operations must be implemented or how their state must be represented; they are only concerned with the specifications of operations. For this reason, it makes sense to associate ADTs with Java interfaces, since an interface declares public methods with their specifications but does not define any fields or field-dependent implementations.

Many ADTs in this course are collections, meaning they keep track of multiple values, called “elements,” of some other type. Collections tend to be agnostic to the type of their elements and are therefore declared as generic types in Java.

In order to implement an ADT for use in computation, the implementer must decide on a suitable representation for its state and implement the ADT’s behavior in terms of that representation. These representations, and the algorithms used to perform operations with them, are known as data structures. A single kind of data structure can serve as the basis for multiple different ADTs, and a single ADT can be implemented in terms of multiple data structures with different tradeoffs related to time and space efficiency.

If we want a Java object to provide the capabilities of an ADT, we ultimately need to define a class for it, typically implementing an interface associated with that ADT. A class must choose which fields (and invariants) to use to concretely represent the state of the chosen data structure, and then it must implement methods for the ADT’s operations, leveraging the algorithms associated with that data structure. Different classes may implement the same ADT’s interface using different data structures, and thanks to subtyping, they can be exchanged for each other in a program to realize their respective tradeoffs. For this reason, it is often appropriate to document time and space efficiency guarantees as part of a class’s specification.

Common collection operations

These operations are shared by many collection ADTs. In some cases, the detailed semantics of the operation are what differentiates ADTs from one another. In other cases, they are common “utility” operations that make sense for collections in general and aren’t really an ADT’s defining feature.

Add(element)

This operation adds the provided element to the collection, ensuring that the collection now contains it.

Such an operation makes sense for Bag, Queue, Stack, PriorityQueue, and Set.

In the Java Collections API, this is called add(T), which throws an exception if a bounded collection is full. The offer() method is similar but returns a boolean to indicate whether the operation succeeded on a bounded collection. (A bounded collection is one with a maximum size.)

Remove()

This operation removes an element from the collection and returns it to the client. The precise semantics of this operation (specifying which element is removed) differentiate several common ADTs from one another.

If the collection is empty, this operation will need to indicate an exceptional situation (e.g., by throwing an exception).

Such an operation makes sense for Bag, Queue, Stack, and PriorityQueue.

In the Java Collections API, this is called remove() for Queues and pop() for Stacks, which throw a NoSuchElementException if the collection is empty. The poll() method is similar, except that it returns null if the collection is empty.

Peek()

Return the next element that would be removed by Remove() without actually removing it.

If the collection is empty, this operation will need to indicate an exceptional situation (e.g., by throwing an exception).

In the Java Collections API, this is called element(). The peek() method is similar, except that it returns null if the collection is empty.

Delete(element)

Remove one occurrence of the provided element from the collection, if that element was contained in the collection.

Such an operation makes most sense for Bag and Set.

In the Java Collections API, this is called remove(Object) (which is an unfortunate overload with remove(), as the semantics are rather different).

Size()

Return the number of elements contained in the collection. Note that some collections may contain multiple copies of the same element value, which are each counted separately.

Such an operation makes sense for Bag, Queue, Stack, PriorityQueue, List, Set, and Map.

In the Java Collections API, this is called size().

IsEmpty()

Return whether the collection contains no elements.

In the Java Collections API, this is called isEmpty().

Contains(element)

Returns whether the collection contains an element equal to the provided element.

Such an operation makes most sense for Bag, List, and Set.

In the Java Collections API, this is called contains(Object).

Iterator()

Return an Iterator for enumerating all of the elements contained in the collection. Some iterators may support mutation (i.e., deleting or inserting values). It is generally unsafe to mutate a collection (except through a single iterator itself) while any iterators over it exist.

In the Java Collections API, collections implement the Iterable<T> interface, and this operation corresponds to iterator(). This means that an enhanced for-loop can be used to iterate over the elements in a collection.

Common abstract data types

Bag

A Bag keeps track of elements that are added to it and supports iteration and removal. Duplicate elements are allowed (though an implementation might not preserve distinct references for elements that compare equal to one another).

In the Java Collections API, the Collection<T> interface roughly corresponds to Bag, but there is no class specialized to Bag, since other collection types can satisfy the specifications for Bag while providing additional functionality or guarantees.

Defining operations

  • Add(element): Add an occurrence of the provided element to the collection.
  • Remove(): Remove and return an arbitrary element from the collection.
  • Iterator(): Support iteration over the elements of the collection.

Other notable operations

  • Count(element): Return the number of occurrences of the provided element in the collection.

Relevant common operations

  • Peek(), Delete(element), Size(), IsEmpty(), Contains(element)

Relevant data structures

Since a Bag’s operations are so minimally specified, almost any collection data structure can be used to implement one, with different efficiency tradeoffs for its operations.

  • Dynamic array
  • Linked chain
  • Hash table (associating a count of occurrences with each element)
  • Binary search tree (may store duplicates, or may associate a count of occurrences with each element)

Queue

A Queue keeps track of the order in which elements are added to it and removes them in the same order (FIFO: first in, first out).

In the Java Collections API, the Queue<T> interface corresponds to Queue, except that it does not require FIFO ordering.

Defining operations

  • Add(element): Add an occurrence of the provided element to the collection.
  • Remove(): Remove and return the element least recently added to the collection (FIFO).

Relevant common operations

  • Peek(), Size(), IsEmpty()

Relevant data structures

  • Linked chain (with fast tail access)
  • Ring buffer

Stack

A Stack keeps track of the order in which elements are added to it and removes them in the opposite order (LIFO: last in, first out).

In the Java Collections API, the Deque<T> interface for a double-ended queue provides Stack operations. (There is also an old Stack<T> class, but Deque is preferred.)

Defining operations

  • Add(element), aka Push(element): Add an occurrence of the provided element to the collection. Typically referred to as Push(element) when a Stack is involved (Java: Deque<T>::push(T)).
  • Remove(), aka Pop(): Remove and return the element most recently added to the collection (FIFO). Typically referred to as Pop() when a Stack is involved (Java: Deq<T>::pop()).

Relevant common operations

  • Peek(), Size(), IsEmpty()

Relevant data structures

  • Linked chain (push and pop from head)
  • Dynamic array (push and pop from end)

PriorityQueue

A PriorityQueue imposes an ordering on its elements, either by comparing them intrinsically or by associating each with a “priority” value, which can then be compared. Elements are removed in priority order: a MinQueue would remove them in ascending order, while a MaxQueue would remove them in descending order.

In the Java Collections API, the PriorityQueue<T> class corresponds to a MinQueue. It does not provide an efficient Replace(oldElement, newElement) operation.

Defining operations

  • Add(element): Add an occurrence of the provided element to the collection.
  • Remove(): Remove and return the “highest priority” element in the collection (i.e., the smallest element for a MinQueue or the largest element for a MaxQueue).

Other notable operations

  • Replace(oldElement, newElement), aka ChangePriority(element, priority): Change the priority of an element already contained in the collection (“replacement” is necessary if elements are compared intrinsically, rather than via separate associated priorities).

Relevant data structures

  • Binary heap (augmented with a Map for efficient Replace(oldElement, newElement))

Set

Cannot contain multiple occurrences of the same element. Typically optimized for a fast Contains(element) operation.

In the Java Collections API, the Set<T> interface corresponds to a Set. There is also a SortedSet<T> interface and a LinkedHashSet<T> class that constrain the iteration order. Implementing classes include HashSet<T>, LinkedHashSet<T>, and TreeSet<T>. Oracle’s implementation of HashSet has traditionally used chaining for collision resolution (but a binary search tree, rather than a linked list, is used for the chains). Oracle’s implementation of TreeSet has traditionally used a balanced red–black tree.

Defining operations

  • Add(element): Ensure that the provided element is contained in the collection.
  • Delete(element): Ensure that the provided element is not contained in the collection.
  • Contains(element): Return whether the provided element is contained in the collection. This is expected to be faster for a Set than for other collections.

Other notable operations

  • Union
  • Difference

Relevant common operations

  • Size(), IsEmpty(), Iterator()

Relevant data structures

  • Hash table
  • Binary search tree (requires comparable elements)
  • Sorted dynamic array (requires comparable elements)

List

A sequence of elements whose order is maintained. Each element is associated with an integer index, typically between 0 and one less than the list’s size (though in some conventions, the index starts at 1 instead). Elements can be queried, replaced, removed, or inserted by index; insertion and removal affects the indices of all subsequent elements.

In the Java Collections API, the List<T> interface corresponds to a List. Implementing classes include LinkedList<T> and ArrayList<T>.

Defining operations

  • At(index): Return the element at the specified index.
  • InsertAt(index, element): Insert the provided element at the specified index, pushing any elements previously at or after that index to a higher index.
  • RemoveAt(index): Remove the element at the specified index, pulling any successor elements to a lower index.
  • Iterator(): Return a ListIterator supporting iteration over the collection as well as insertion and removal from its current position. May support bidirectional iteration.

Other notable operations

  • Append(element): Add the provided element to the end of the list.
  • Prepend(element): Insert the provided element at the start of the list.
  • RemoveFirst(): Remove and return the first element in the sequence.
  • RemoveLast(): Remove and return the last element in the sequence.
  • Find(element): Return an index at which the provided element occurs.
    • Variations: FindFirst(element), FindLast(element)

Relevant common operations

  • Peek(), Delete(element), Size(), IsEmpty(), Contains(element)

Relevant data structures

  • Linked chain (poor efficiency for At(index), SetAt(index, element))
    • A doubly-linked chain supports reverse iteration and provides good efficiency when inserting or deleting from an Iterator
  • Dynamic array (poor efficiency for Prepend(element), Concatenate(List), inserting, or deleting)

Map (aka Dictionary)

Associates elements, known as “keys”, with other objects, known as “values”. Optimized for fast lookup of the value associated with a given key. Can be treated as a Set of Entries, where an Entry pairs a key with a value, but comparisons only look at its key.

In the Java Collections API, the Map<T> interface corresponds to a Map. There is also a SortedMap<T> interface and a LinkedHashMap<T> class that constrain the iteration order. Implementing classes include HashMap<T>, LinkedHashMap<T>, and TreeMap<T>. Oracle’s implementation of HashMap has traditionally used chaining for collision resolution (but a binary search tree, rather than a linked list, is used for the chains). Oracle’s implementation of TreeMap has traditionally used a balanced red–black tree.

Relevant data structures

  • Hash table
  • Binary search tree (requires comparable keys)
  • Sorted dynamic array (requires comparable keys)

Graph

Graphs are an important family of ADTs, but they tend to exhibit more variation than the above collections, so they will be detailed elsewhere. The Java Collections API does not have any types specifically related to general graphs.

Modifiers

There are several common variations of collections that can be applied to multiple ADTs. Examples include: