1 Exercises: Decision Trees and Ensembles [5pts]
To get warmed up and reinforce what we’ve learned, we’ll do some light exercises with decision trees – how to interpret
a decision tree and how to learn one from data.
I Q1 Drawing Decision Tree Predictions [2pts]. Consider the following decision tree:
a) Draw the decision boundaries defined by this tree over the intervals x1 ∈ [0, 30] and x2 ∈ [0, 30]. Each leaf of the tree is labeled with a letter. Write this letter in the corresponding region of input space.
b) Give another decision tree that is syntactically different (i.e., has a different structure) but defines the same decision boundaries.
c) This demonstrates that the space of decision trees is syntactically redundant. How does this redundancy influence learning – i.e., does it make it easier or harder to find an accurate tree?
I Q2 Manually Learning A Decision Tree [2pts]. Consider the following training set and learn a decision tree to predict Y. Use information gain to select attributes for splits.
A B C Y
0 1 1 0
1 1 1 0
0 0 0 0
1 1 0 1
0 1 0 1
1 0 1 1
For each candidate split, include the information gain in your report. Also include the final tree and your work.
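The information-gain calculations above can be checked programmatically. The sketch below (not part of the provided starter code) computes the entropy of the labels and the gain for splitting on each attribute of the six-row training set:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr, target="Y"):
    """Information gain of splitting `rows` on attribute `attr`."""
    n = len(rows)
    gain = entropy([r[target] for r in rows])
    for v in set(r[attr] for r in rows):
        subset = [r[target] for r in rows if r[attr] == v]
        gain -= (len(subset) / n) * entropy(subset)
    return gain

# The training set from the table above.
data = [
    {"A": 0, "B": 1, "C": 1, "Y": 0},
    {"A": 1, "B": 1, "C": 1, "Y": 0},
    {"A": 0, "B": 0, "C": 0, "Y": 0},
    {"A": 1, "B": 1, "C": 0, "Y": 1},
    {"A": 0, "B": 1, "C": 0, "Y": 1},
    {"A": 1, "B": 0, "C": 1, "Y": 1},
]

for attr in ["A", "B", "C"]:
    print(attr, round(info_gain(data, attr), 4))
```

This only verifies the arithmetic for the root split; the recursion on each subset is left as part of the exercise.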
Now let’s consider building an ensemble of decision trees (also known as a random forest). We’ll specifically look at how decreasing correlation can lead to further improvements in ensembling.
I Q3 Measuring Correlation in Random Forests [1pt]. We’ve provided a Python script decision.py that trains an ensemble of 15 decision trees on the Breast Cancer classification dataset we used in HW1. We are using the sklearn package for the decision tree implementation, as the point of this exercise is to consider ensembling, not to implement decision trees. When run, the file displays the following plot:
The non-empty cells in the upper triangle of the figure show the correlation between predictions on the test
set for each of 15 decision tree models trained on the same training set. Variations in the correlation are due
to randomly breaking ties when selecting split attributes. The plot also reports the average correlation (a
very high 0.984 for this ensemble) and accuracy for the ensemble (majority vote) and a separately-trained
single model. Even with the high correlation, the ensemble managed to improve performance marginally.
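The pairwise correlations in the plot can be computed directly from the stacked prediction vectors. A minimal sketch (the actual implementation in decision.py may differ; `avg_pairwise_correlation` is a name chosen here for illustration):

```python
import numpy as np

def avg_pairwise_correlation(preds):
    """Average Pearson correlation over all pairs of ensemble members.

    preds: array of shape (n_models, n_test) holding each model's
    0/1 predictions on the same test set.
    """
    corr = np.corrcoef(preds)               # (n_models, n_models) matrix
    iu = np.triu_indices(len(preds), k=1)   # strict upper triangle
    return corr[iu].mean()

# Toy example: three noisy copies of the same prediction vector,
# standing in for three highly correlated trees.
rng = np.random.default_rng(0)
base = rng.integers(0, 2, size=100)
preds = np.array(
    [np.where(rng.random(100) < 0.1, 1 - base, base) for _ in range(3)]
)
print(avg_pairwise_correlation(preds))
```

Two models that always agree give a correlation of 1; the flips in the toy example pull the average below that, which is exactly the effect bagging and feature subsampling are meant to amplify.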
As discussed in class, uncorrelated errors result in better ensembles. Modify the code to train the following
ensembles (each separately). Provide the resulting plots for each and describe what you observe.
a) Apply bagging by uniformly sampling train datapoints with replacement to train each ensemble member.
b) The sklearn API for the DecisionTreeClassifier provides many options to modify how decision trees are learned, including some of the techniques we discussed to increase randomness. When set to a value less than the number of features in the dataset, the max_features argument will cause each split to consider only a random subset of the features. Modify line 44 to include this option at a value you decide.
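Since decision.py is not reproduced here, the following standalone sketch shows both modifications together – bagging via sampling with replacement and feature subsampling via max_features. The ensemble size (15) matches the assignment; the max_features value of 5 is an arbitrary choice you should tune yourself:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
preds = []
for _ in range(15):
    # (a) bagging: draw a bootstrap sample of the training set,
    # uniformly with replacement
    idx = rng.integers(0, len(X_tr), size=len(X_tr))
    # (b) max_features < n_features: each split considers only a
    # random subset of 5 features
    tree = DecisionTreeClassifier(max_features=5)
    tree.fit(X_tr[idx], y_tr[idx])
    preds.append(tree.predict(X_te))

# Majority vote across the 15 members.
ensemble_pred = (np.mean(preds, axis=0) > 0.5).astype(int)
print("ensemble accuracy:", (ensemble_pred == y_te).mean())
```

Either source of randomness alone should already lower the average pairwise correlation relative to the 0.984 baseline; combining both typically lowers it further, which is what your plots should illustrate.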