Discussion 7 handout

Group members (names & NetIDs)

Objectives

Implement an abstraction barrier between ADT operations and data structure functions
Explore how to preserve invariants of a binary search tree (BST)
Build intuition for tree traversals and Comparators

Context

These tasks make use of the following BinaryNode class. For simplicity, this class is specialized for chars, which lets us compare node data using <, >, and ==.

class BinaryNode {
    /** Invariant: All data in left subtree are less than this.data, and
      * all data in right subtree are greater than this.data.
      * A null subtree is empty. */
    char data;
    BinaryNode left;
    BinaryNode right;
}

Task 1: Node search

A common task when implementing an abstract data type (ADT) using a binary tree is to find the Node object in the tree containing some target value. Here is a potential specification and implementation:

/** Return the node in the BST rooted at root whose data is x, or null if x
  * is not in the tree. */
static BinaryNode findNode(BinaryNode root, char x) {
    if (root == null || root.data == x) {
        return root;
    } else if (root.data < x)   {

        return _______________________;
    } else  /* root.data > x */ {

        return _______________________;
    }
}

Fill in the blanks with recursive calls to findNode() in order to fulfill the specification, taking advantage of the BST invariant.

Check your implementation by executing it by hand on a sample tree (such as the tree depicted for the next task).

Clients of the ADT, on the other hand, should probably not have access to Node objects (that would be a recipe for “rep exposure”). But trees may be used to implement the Set ADT, and clients of Sets often want to know if a particular value is alrady contained in the Set. Consider this partial class for implementing a Set using an (encapsulated) BST:

public class BstSet {
    /** Root of the binary search tree containing all values in this set. */
    private BinaryNode root;

    /** Return whether x is an element of this set. */
    public boolean contains(char x) { ... }
}

Implement its contains() method by calling findNode(). (Yes, this implementation is so short it might seem silly, but this method represents the “abstraction barrier” between the ADT operations and its underlying data structure.)

Comparators

Imagine that, instead of storing char data, we store Song objects, and variable cmp is a Comparator<Song> that orders Songs based on year of release. The search algorithm contains the expressions root.data == x and root.data < x; what would you replace these with to handle Songs?

Task 2: Insertion

Consider the following binary search tree containing letters (comparisons follow alphabetical order):

We want to explore different ways of preserving the invariant while adding more letters to the tree. Draw a binary search tree containing these letters as well as the letters 'C', 'L', and 'E'. Feel free to move nodes around as long as the end result is a valid BST—you do not need to follow any particular algorithm.

How many nodes have a different parent than in the original tree? What is the height of your new tree? Compare with other groups in the class—who moved the fewest nodes? who had the shortest tree?

In-order traversal

The following recursive procedure implements an in-order traversal of the tree, printing each node’s value (note the “left, then self, then right” ordering that distinguishes in-order traversal from pre-order and post-order):

static void printInOrder(BinaryNode n) {
    if (n == null) { return; }
    printInOrder(n.left);
    System.out.print(n.data);
    printInOrder(n.right);
}

Execute this algorithm by hand on the tree you drew above and write what would be printed:

Is the output in alphabetical order? Should it be? If you think it should be and it’s not, that would imply an invariant violation in your BST; find it and repeat the task.

Submission

Open the assignment page for “Discussion activity 7” in CMSX
[Recorder] Find the “Group Management” section and invite each group member
[Others] Refresh the page and accept your invitation
[Recorder] Take a picture of your work and save as either a JPEG or a PDF file named “discussion_responses” (you do not need to submit your test code). After all invitations have been accepted, upload your picture along with your code as your group’s submission.
- Recommended scanning apps: Microsoft Office Lens, Adobe Scan, Genius Scan, Evernote Scannable

Ensure that your group is formed and your work submitted before the Friday evening deadline.

Tips and reminders

Discussion is not a race. Focus on the current activity being facilitated by your TA and engage your whole group to propose and explain ideas.
Elect a recorder to maintain the “master” copy of your work (others are still encouraged to jot down ideas on scratch paper). Rotate this position each week.
It is a violation of academic integrity to credit someone for work if they were not present when the work was done, and the whole group is accountable. Your CMS group must only include classmates who attended section with you on that day. Remember that our participation policies accommodate occasional absences without penalty.
It is your individual responsibility to ensure that you are grouped on CMS and that your group’s work is submitted before the deadline. Log into CMS the day after section to ensure that everything is in order, and contact your groupmates if not. It would be prudent for multiple members to photograph the group’s work.
Only one group member (typically the recorder) needs to upload a submission. But their submission must not be uploaded until after all group members have confirmed their membership in CMS (contact your TA if you have trouble grouping in CMS).

Challenge problem

If your entire group finishes early and are looking for more practice with this material, try completing this additional task. To be clear, we are not expecting anyone to submit work on challenge problems to CMSX, nor will your TA have time to discuss them in section. But you are welcome to discuss them on Ed or to bring questions and ideas to office hours.

Random replacement

Consider the following binary search tree containing letters (comparisons follow alphabetical order):

We want to explore what’s required to remove the value 'D'. Choose any value from D’s subtree (any letter in A..H other than D itself) and place it in D’s old spot. Now rearrange the remaining nodes in D’s subtree so that the BST invariant is satisfied again (there is no algorithm to follow when doing this—there are multiple right answers). Draw your resulting tree:

How many nodes are in a different location from where they were in the original tree?

Deletion algorithm

The following code implements the BST deletion algorithm from the lecture notes (for homework, compare this code to the textbook or to the supplemental reading in Myers; how well does it match up?):

/** Remove node `n` with parent node `p` from a binary search tree.
  * Requires `n` is not global root (`p != null`). */
void remove(BinaryNode n, BinaryNode p) {
    if (n.right == null && n.left == null) {
        // Prune leaf by removing parent pointer
        if (n.data < p.data) { p.left  = null; }
        else                 { p.right = null; }
    } else if (n.right == null) {
        // Splice left child onto parent
        if (n.data < p.data) { p.left  = n.left; }
        else                 { p.right = n.left; }
    } else if (n.left == null) {
        // Splice right child onto parent
        if (n.data < p.data) { p.left  = n.right; }
        else                 { p.right = n.right; }
    } else {
        // Find immediate next element (and its parent)
        BinaryNode pNext = n;
        BinaryNode nNext = n.right;
        while (nNext.left != null) {
            pNext = nNext;
            nNext = nNext.left;
        }
        // Replace data
        n.data = nNext.data;
        // Prune next element or splice its right child (will not recurse further)
        remove(nNext, pNext);
    }
}

Execute this algorithm by hand to remove 'D' from the original tree in task 2 (the arguments would be the node containing 'D' and its parent, the node containing 'I').

How many pointers were reassigned during your execution? How many data fields were reassigned?

Which tree required moving fewer nodes—the one formed with “random replacement and fixup,” or the one resulting from this algorithm?