1. The table below provides a training data set containing 6 observations, 3 predictors, and 1
qualitative response variable.
Suppose we wish to use this data set to make a prediction for Y when X1 = X2 = X3 = 0
using K-nearest neighbors.
(a) Compute the Euclidean distance between each observation and the test point, X1 = X2 =
X3 = 0.
(b) What is our prediction with K = 1? Why?
(c) What is our prediction with K = 3? Why?
(d) If the Bayes decision boundary in this problem is highly nonlinear, then would we expect
the best value for K to be large or small? Why?
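The mechanics of parts (a) to (c) can be sketched in Python as follows. This is only an illustrative sketch: the training rows and the labels "A" and "B" are hypothetical placeholders, not the values from the table, and `knn_predict` is a helper name introduced here.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Return the majority label among the k training points closest to query."""
    # Sort the (features, label) pairs by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda row: math.dist(row[0], query))
    # Count the labels of the k nearest points and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical placeholder data (NOT the table from the problem).
train = [
    ((1.0, 0.0, 0.0), "A"),
    ((0.0, 2.0, 0.0), "B"),
    ((0.0, 0.0, 3.0), "B"),
    ((4.0, 4.0, 4.0), "A"),
]
query = (0.0, 0.0, 0.0)

pred_k1 = knn_predict(train, query, 1)
pred_k3 = knn_predict(train, query, 3)
```

With these placeholder rows the single nearest neighbor and the three-neighbor majority disagree, which is the kind of contrast parts (b) and (c) ask about.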
2. This question should be answered using the Carseats data set.
(a) Fit a multiple regression model to predict Sales using Population, Urban, and US.
(b) Provide an interpretation of each coefficient in the model. Be careful: some of the
variables in the model are qualitative!
(c) Write out the model in equation form, being careful to handle the qualitative variables.
(d) For which of the predictors can you reject the null hypothesis $H_0 : \beta_j = 0$?
(e) On the basis of your response to the previous question, fit a smaller model that only uses
the predictors for which there is evidence of association with the outcome.
(f) How well do the models in (a) and (e) fit the data?
(g) Using the model from (e), obtain 95% confidence intervals for the coefficient(s).
(h) Is there evidence of outliers or high leverage observations in the model from (e)?
3. Suppose we have features $x \in \mathbb{R}^p$, a two-class response with class sizes $N_1$, $N_2$, and the
target coded as $-N/N_1$, $N/N_2$, where $N = N_1 + N_2$.
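Show that the LDA rule classifies to class 2 if
$$x^T \hat{\Sigma}^{-1} (\hat{\mu}_2 - \hat{\mu}_1) > \frac{1}{2} (\hat{\mu}_2 + \hat{\mu}_1)^T \hat{\Sigma}^{-1} (\hat{\mu}_2 - \hat{\mu}_1) - \log(N_2/N_1),$$
and class 1 otherwise. Here $\hat{\mu}_1$, $\hat{\mu}_2$ are the class sample means and $\hat{\Sigma}$ is the pooled
within-class covariance estimate.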
4. Use the WineQt data to build a logistic regression model (the response is quality). Use
different regularization techniques: none, L1, and L2. Report the accuracy and recall on the
training and test data. Does regularization improve your model's performance?
5. Compare the classification performance of LDA and a support vector machine on the MNIST
data. In particular, consider only the 2’s and 3’s. Report both the training and test accuracy.
6. Show for the polynomial kernel function
7. Suppose each of K classes has an associated target $t_k$, which is a vector of all zeros, except
a one in the kth position. Show that classifying to the largest element of $\hat{y}$ amounts to
choosing the closest target, $\min_k \|t_k - \hat{y}\|$, if the elements of $\hat{y}$ sum to one.
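A quick numerical sanity check of the claim in problem 7 can be sketched in Python. It is not a proof, only a spot check on randomly generated vectors that sum to one; the helper names `argmax_index` and `closest_target_index` are introduced here for illustration.

```python
import random

def argmax_index(y_hat):
    """Index of the largest element of y_hat."""
    return max(range(len(y_hat)), key=lambda k: y_hat[k])

def closest_target_index(y_hat):
    """Index k whose indicator target t_k is closest to y_hat in Euclidean norm."""
    K = len(y_hat)
    def sq_dist(k):
        # t_k has a one in position k and zeros elsewhere.
        return sum((float(j == k) - y_hat[j]) ** 2 for j in range(K))
    return min(range(K), key=sq_dist)

random.seed(0)
trials = 1000
agreements = 0
for _ in range(trials):
    raw = [random.random() for _ in range(5)]
    total = sum(raw)
    y_hat = [v / total for v in raw]  # normalize so the elements sum to one
    agreements += argmax_index(y_hat) == closest_target_index(y_hat)
```

The check hints at the algebra behind the result: expanding $\|t_k - \hat{y}\|^2$ leaves only the term $-2\hat{y}_k$ depending on k.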
8. Show how to solve the generalized eigenvalue problem $\max_a a^T B a$ subject to $a^T W a = 1$ by
transforming to a standard eigenvalue problem. (Assume B and W are symmetric.)
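One possible route for problem 8 can be checked numerically with NumPy. The sketch below assumes, beyond what the problem states, that W is positive definite so that a Cholesky factor exists; the matrices are random synthetic examples, and the substitution $b = L^T a$ is one choice among several (a symmetric square root of W would also work).

```python
import numpy as np

rng = np.random.default_rng(0)
p = 4

# Synthetic symmetric B and symmetric positive definite W (assumed for illustration).
M = rng.standard_normal((p, p))
B = M @ M.T
N = rng.standard_normal((p, p))
W = N @ N.T + p * np.eye(p)

# Factor W = L L^T and substitute b = L^T a, so the constraint becomes b^T b = 1.
L = np.linalg.cholesky(W)
L_inv = np.linalg.inv(L)

# The objective becomes b^T C b with C = L^{-1} B L^{-T}, a standard symmetric problem.
C = L_inv @ B @ L_inv.T
eigvals, eigvecs = np.linalg.eigh(C)  # eigenvalues in ascending order

b = eigvecs[:, -1]   # unit eigenvector for the largest eigenvalue
a = L_inv.T @ b      # map back: a = L^{-T} b

constraint_value = a @ W @ a   # should be close to 1
objective_value = a @ B @ a    # should be close to the largest eigenvalue of C
```

The recovered `a` also satisfies the generalized equation $B a = \lambda W a$ with $\lambda$ equal to the largest eigenvalue of `C`.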
9. Show that the ridge regression estimates can be obtained by ordinary least squares
regression on an augmented data set. We augment the centered matrix X with p additional rows
$\sqrt{\lambda} I$, and augment y with p zeros. By introducing artificial data having response value
zero, the fitting procedure is forced to shrink the coefficients toward zero.
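The equivalence in problem 9 can be spot-checked numerically with NumPy before proving it. The sketch below uses random synthetic data and an arbitrary penalty value `lam`, both chosen here only for illustration, and appends the rows $\sqrt{\lambda} I$ to the centered X.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, lam = 50, 3, 2.5  # synthetic sizes and penalty, chosen for illustration

# Synthetic centered predictors and response.
X = rng.standard_normal((n, p))
X = X - X.mean(axis=0)
y = rng.standard_normal(n)
y = y - y.mean()

# Ridge estimate from the closed form (X^T X + lam I)^{-1} X^T y.
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Augmented data: p extra rows sqrt(lam) * I in X, p extra zeros in y.
X_aug = np.vstack([X, np.sqrt(lam) * np.eye(p)])
y_aug = np.concatenate([y, np.zeros(p)])

# Ordinary least squares on the augmented data.
beta_aug, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
```

The two coefficient vectors agree because the augmented residual sum of squares equals $\|y - X\beta\|^2 + \lambda \|\beta\|^2$.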