Due Monday, February 12 by 11:59 PM via a direct message to the TA on Slack.
This homework will give you experience using Python notebooks, scikit-learn, the Digits dataset, and Perceptrons. You'll also think about what learned weights tell you about the underlying data.
Here's what to do:
If you're new to notebooks, there are lots of good resources on the web. Here's one that will get you started:
# Imports
from sklearn.linear_model import Perceptron
from sklearn.preprocessing import MinMaxScaler
from sklearn.datasets import load_digits
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
# Load the digits dataset and look at a few examples
digits = load_digits()
plt.gray()
for i in range(10):
    print('Digit = %d' % i)
    plt.matshow(digits.images[i])
    plt.show()
Digit = 0
Digit = 1
Digit = 2
Digit = 3
Digit = 4
Digit = 5
Digit = 6
Digit = 7
Digit = 8
Digit = 9
# Get the attribute vectors and the class labels
X = digits.data
y = digits.target
# Convert the 10-digit problem into one in which the goal is to tell whether
# a digit is 0 or not 0
y = [1 if a == 0 else 0 for a in y]
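The relabeling above can also be written as a vectorized comparison, which makes it easy to check how balanced the two classes are. This is a minimal sketch, assuming NumPy is available (scikit-learn depends on it); the names `y_binary`, `n_pos`, and `n_neg` are introduced here only for illustration.

```python
import numpy as np
from sklearn.datasets import load_digits

digits = load_digits()

# True where the original label is 0, False elsewhere; cast to 1/0 integers
y_binary = (digits.target == 0).astype(int)

# Count positives (zeros) versus negatives (all other digits)
n_pos = int(y_binary.sum())
n_neg = int(len(y_binary) - n_pos)
print(n_pos, n_neg)
```

Roughly one example in ten is a zero, so a classifier that always answers "not 0" would already be right about 90% of the time. That baseline is worth keeping in mind when you interpret accuracies later.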
https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
# Split the data into a training set and a test set
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.5,
                                                    random_state=42)
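To see what `test_size` and `random_state` do, it can help to inspect the result of the split. The sketch below is an illustration only: it repeats the split with the same arguments and checks that both calls produce the same partition. The names `split_a`, `split_b`, and `same_split` are made up for this example.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

digits = load_digits()
X = digits.data
y = [1 if a == 0 else 0 for a in digits.target]

# Two calls with identical arguments, including the same random_state
split_a = train_test_split(X, y, test_size=0.5, random_state=42)
split_b = train_test_split(X, y, test_size=0.5, random_state=42)

X_train_a, X_test_a = split_a[0], split_a[1]

# A fixed random_state makes the shuffle repeatable
same_split = np.array_equal(split_a[0], split_b[0])

print(X_train_a.shape, X_test_a.shape, same_split)
```

With `test_size=0.5`, about half of the examples land in each set, and every row keeps all 64 pixel features. Fixing `random_state` means everyone who runs the notebook gets the same split.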
Read the manual page for scikit's Perceptron class here:
https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Perceptron.html
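As a warm-up for reading that page, here is a small sketch of the `fit` / `score` pattern on a toy problem. It is not the homework solution: the data (`X_toy`, `y_toy`) and the name `toy_clf` are hypothetical and chosen only to keep the example tiny.

```python
import numpy as np
from sklearn.linear_model import Perceptron

# Toy data (hypothetical): three points near the origin are class 0,
# three points farther away are class 1
X_toy = np.array([[0, 0], [0, 1], [1, 0], [3, 3], [3, 4], [4, 3]], dtype=float)
y_toy = np.array([0, 0, 0, 1, 1, 1])

toy_clf = Perceptron(random_state=0)

# Learned attributes such as coef_ do not exist until fit() has been called
has_weights_before = hasattr(toy_clf, "coef_")

toy_clf.fit(X_toy, y_toy)
has_weights_after = hasattr(toy_clf, "coef_")

# score() returns the mean accuracy on whatever data it is given
toy_accuracy = toy_clf.score(X_toy, y_toy)
print(has_weights_before, has_weights_after, toy_clf.coef_.shape, toy_accuracy)
```

For a two-class problem, `coef_` has one row with one weight per input feature. Here that is shape `(1, 2)`; for the digits data it would hold 64 weights, one per pixel.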
Below, write code to:
Briefly explain what the accuracies tell you about the difficulty of this problem and why the two accuracies are different.
# YOUR CODE HERE
# If you name your classifier object clf as below, then the code for the next
# task will not need to be modified.
clf = Perceptron()
The magnitude of each weight is conveyed by the grayscale intensity. Black means large negative weights and white means large positive weights. Note that all of the features (the pixels) are non-negative. Also, the weights are arranged in the image so that each weight is multiplied by the pixel at the same location in the digit image.
Briefly explain why the learned weights make sense for the task of classifying zero digits as the positive class and all other digits as the negative class.
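The correspondence between weights and pixels can be made concrete with a small numeric sketch. Everything here is a stand-in: `image` is a random 8x8 array with values in the 0 to 16 range that the Digits images use, and `weights` and `bias` are hypothetical values, not learned ones. The sketch only shows that summing the element-wise product of two 8x8 images gives the same score as the dot product of the flattened 64-element vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for one digit image: non-negative pixel intensities from 0 to 16
image = rng.integers(0, 17, size=(8, 8)).astype(float)

# Hypothetical weight image and intercept (not learned from data)
weights = rng.normal(size=(8, 8))
bias = 0.5

# Score computed location by location on the 8x8 grids
score_image = float((weights * image).sum() + bias)

# Same score computed on the flattened 64-element vectors
score_flat = float(weights.reshape(64) @ image.reshape(64) + bias)

# A linear classifier predicts the positive class when the score is above zero
predicted = 1 if score_flat > 0 else 0
print(score_image, score_flat, predicted)
```

Because pixel values are never negative, a bright pixel under a positive weight can only raise the score, and a bright pixel under a negative weight can only lower it. A dark pixel (value 0) contributes nothing either way.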
# This cell will throw an error until you write the code above to train the perceptron
plt.matshow(clf.coef_.reshape(8,8))
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-113-f1279de214be> in <module>
      1 # This cell will throw an error until you write the code above to train the perceptron
----> 2 plt.matshow(clf.coef_.reshape(8,8))

AttributeError: 'Perceptron' object has no attribute 'coef_'