CMSC 471

Artificial Intelligence -- Spring 2014

HOMEWORK FOUR

Out 3/27/14, due 4/8/14




The group homework policy from the syllabus applies to this assignment.

I. Learning in the wild (15 pts.)

Russell & Norvig, Exercise 18.1 (page 763). Consider the problem faced by an infant learning to speak and understand a language. Explain how this problem fits into the general learning model. Describe the percepts nad actions of the infant, and the types of learning the infant must do. Describe the subfunctions the infant is trying to learn in terms of inputs and outputs, and available example data.

II. Information gain (15 pts.)

Russell & Norvig, Exercise 18.5 (page 764). Suppose that an attribute splits the set of examples E into subsets E_k and that each subset has p_k positive examples and n_k negative examples. Show that the attribute has strictly positive information gain unless the ratio p_k / (p_k + n_k) is the same for all k.

III. Decision tree learning (50 pts.)

The following table gives a data set for deciding whether to play or cancel a ball game, depending on the weather conditions.
 
Outlook Temp (F) Humidity (%) Windy? Class
sunny 75 70 true Play
sunny 80 90 true Don't Play
sunny 85 85 false Don't Play
sunny 72 95 false Don't Play
sunny 69 70 false Play
overcast 72 90 true Play
overcast 83 78 false Play
overcast 64 65 true Play
overcast 81 75 false Play
rain 71 80 true Don't Play
rain 65 70 true Don't Play
rain 75 80 false Play
rain 68 80 false Play
rain 70 96 false Play
  1. Information Gain (20 pts.) At the root node for a decision tree in this domain, what would the information gain associated with a split on the Outlook attribute? What would it be for a split at the root on the Humidity attribute? (Use a threshold of 75 for humidity (i.e., assume a binary split: humidity <= 75 / humidity > 75.)

  2. Gain Ratios (20 pts.) Again at the root node, what is the gain ratio associated with the Outlook attribute? What is the gain ratio for the Humidity attribute at the root (using the same threshold as in (a))?

  3. Decision Tree (10 pts.) Suppose you build a decision tree that splits on the Outlook attribute at the root node. How many children nodes are there are at the first level of the decision tree? Which branches require a further split in order to create leaf nodes with instances belonging to a single class? For each of these branches, which attribute can you split on to complete the decision tree building process at the next level (i.e., so that at level 2, there are only leaf nodes)? Draw the resulting decision tree, showing the decisions (class predictions) at the leaves.
     

III. Learning Bayes nets (20 pts.)

Consider an arbitrary Bayesian network, a complete data set for that network, and the likelihood for the data set according to the network. Give a clear explanation (i.e., an informal proof, in words) of why the likelihood of the data cannot decrease if we add a new link to the network and recompute the maximum-likelihood parameter values.