Homework 2

Due Monday, February 12 by 11:59 PM via a direct message to the TA on Slack.

This homework will give you experience using Python notebooks, scikit-learn, the Digits dataset, and Perceptrons. You'll also think about what learned weights tell you about the underlying data.

Here's what to do:

Task 0: Learn about Python notebooks

If you're new to notebooks, there are lots of good resources on the web. Here's one that will get you started:

https://realpython.com/jupyter-notebook-introduction/

Task 1: Read about scikit's train/test splitter

https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
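As a quick check of your reading, here is a minimal sketch of how train_test_split is typically used with the Digits dataset (the test_size and random_state values here are just illustrative choices, not required settings):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

# Load the 8x8 Digits dataset (1797 samples, 64 pixel features each).
X, y = load_digits(return_X_y=True)

# Hold out 25% of the data for testing; fixing random_state makes the
# split reproducible across runs.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

print(X_train.shape, X_test.shape)
```

Note that the split is random by default, so different random_state values give different (but equally valid) train/test partitions.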

Task 2: Run the perceptron implementation from scikit

Read the manual page for scikit's Perceptron class here:

https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Perceptron.html

Below, write code to load the Digits dataset, split it into training and test sets, fit a Perceptron on the training set, and report the accuracy on both the training and test sets.

Briefly explain what the accuracies tell you about the difficulty of this problem and why the two accuracies are different.
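For reference, a workflow along these lines would produce the two accuracies in question. This is only a sketch, assuming a 25% test split and default Perceptron hyperparameters; your notebook may use different settings:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split

# Load the data and split it as in Task 1.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Fit a Perceptron on the training set only.
clf = Perceptron(random_state=0)
clf.fit(X_train, y_train)

# score() reports mean accuracy; compare performance on data the model
# has seen (train) versus data it has not (test).
train_acc = clf.score(X_train, y_train)
test_acc = clf.score(X_test, y_test)
print(f"train accuracy: {train_acc:.3f}")
print(f"test accuracy:  {test_acc:.3f}")
```

Training accuracy is typically the higher of the two, since the model's weights were fit to exactly those examples.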

Task 3: Use the code below to show a visualization of the learned weights.

The magnitude of each weight is conveyed by its grayscale intensity: black means a large negative weight and white means a large positive weight. Note that all of the features (pixel values) are nonnegative. The weights are arranged in the image so that each weight lines up with the pixel it multiplies in the digit image.
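If you want to see how such a visualization can be produced, here is one possible sketch (not necessarily the provided code): train a zero-vs-rest Perceptron, reshape its 64 weights back into the 8x8 pixel grid, and render them in grayscale. The filename and random_state are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line inside a notebook
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.linear_model import Perceptron

# Binary task: digit 0 is the positive class, all other digits negative.
X, y = load_digits(return_X_y=True)
y_binary = (y == 0).astype(int)

clf = Perceptron(random_state=0)
clf.fit(X, y_binary)

# Reshape the 64 learned weights back into the 8x8 image layout, so each
# weight sits at the location of the pixel it multiplies.
w = clf.coef_.reshape(8, 8)

plt.imshow(w, cmap="gray")  # black = most negative, white = most positive
plt.title("Perceptron weights (0 vs. rest)")
plt.colorbar()
plt.savefig("weights.png")
```

Because imshow maps the smallest value to black and the largest to white, the rendered image directly shows which pixel locations push the decision toward "zero" (bright) and which push it away (dark).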

Briefly explain why the learned weights make sense for the task of classifying zero digits as the positive class and all other digits as the negative class.