
**Instructions**: Please answer the questions below, attach your code in the document, and insert figures to create a single PDF file. You may search for information online, but you must write the code and find the solutions yourself.

Grade: out of 100 points

## 1 (10 points) Classification vs. Clustering

In this question, you are given several scenarios. For each one, identify whether it is better formulated as a classification task or a clustering task, and give the reason that supports your choice.

1. Scenario 1: Assume there are 100 graded answer sheets for a homework assignment (scores range from 0 to 100). We would like to split them into several groups where each group has similar scores.

Choice: ______ task

Reason:

2. Scenario 2: Assume there are 100 graded answer sheets for a homework assignment (scores range from 0 to 100). We would like to split them into several groups where each group represents a letter grade (A, B, C, D) following the criteria: A (90-100), B (75-90), C (60-75), D (0-60).

Choice: ______ task

Reason:
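To make the two scenarios concrete, here is a minimal Python sketch of both groupings. Everything in it is an assumption for illustration: the scores are randomly generated rather than taken from real answer sheets, `letter_grade` is a hypothetical helper, the number of similarity groups (4) is arbitrary, and scores on a boundary (60, 75, 90) are assigned to the higher grade because the criteria above do not say which side they belong to.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical scores for 100 answer sheets (integers in 0..100).
rng = np.random.default_rng(0)
scores = rng.integers(0, 101, size=100)

def letter_grade(values):
    """Map scores to letters using fixed, known boundaries.

    Assumption: a boundary score (60, 75, 90) gets the higher grade.
    """
    letters = np.array(["D", "C", "B", "A"])
    return letters[np.digitize(values, bins=[60, 75, 90])]

# Grouping with predefined labels: every score gets one of A, B, C, D.
grades = letter_grade(scores)

# Grouping by similarity only: no labels are given in advance,
# the groups are discovered from the scores themselves.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(scores.reshape(-1, 1).astype(float))

print(grades[:10])
print(cluster_ids[:10])
```

The first grouping depends only on the stated criteria, while the second depends on how the scores happen to be distributed, so its group boundaries change if the scores change.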

## 2 (40 points) Basic Calculus

### 2.1 (20 points) Derivatives with Scalars

### 2.2 (20 points) Derivatives with Vectors

Several particular vector derivatives are useful for this course. For a matrix $A \in \mathbb{R}^{M \times M}$ and column vectors $x \in \mathbb{R}^{M}$ and $a \in \mathbb{R}^{M}$, we have

The above rules adopt a *denominator-layout* notation. For more rules, you can refer to this Wikipedia page.
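A derivative worked out by hand can be sanity-checked numerically. The sketch below is one way to do that, under stated assumptions: it uses two standard denominator-layout identities, $\partial (a^\top x)/\partial x = a$ and $\partial (x^\top A x)/\partial x = (A + A^\top)x$, which may or may not be the exact rules this section refers to, and it compares them against central finite differences. The helper `numerical_gradient`, the dimension `M = 4`, and the random test values are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 4                                # hypothetical dimension
A = rng.normal(size=(M, M))
a = rng.normal(size=M)
x = rng.normal(size=M)

def numerical_gradient(f, point, eps=1e-6):
    """Central-difference estimate of the gradient of a scalar function f."""
    grad = np.zeros_like(point)
    for i in range(point.size):
        step = np.zeros_like(point)
        step[i] = eps
        grad[i] = (f(point + step) - f(point - step)) / (2 * eps)
    return grad

# Gradient of the linear form a^T x, compared with the closed form a.
grad_linear = numerical_gradient(lambda v: a @ v, x)
linear_ok = np.allclose(grad_linear, a, atol=1e-5)

# Gradient of the quadratic form x^T A x, compared with (A + A^T) x.
grad_quadratic = numerical_gradient(lambda v: v @ A @ v, x)
quadratic_ok = np.allclose(grad_quadratic, (A + A.T) @ x, atol=1e-5)

print(linear_ok, quadratic_ok)
```

In denominator layout the gradient of a scalar with respect to a column vector is itself a column vector of the same size, which is why each numerical gradient here has length `M`.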

Please apply the above rules and calculate the following derivatives:

## 3 (20 points) Metrics

In machine learning, we have many metrics to evaluate the performance of our model. For example, in a binary classification task, there is a dataset $S = \{(x_i, y_i)\},\ i = 1, \dots, N$, where each data point $(x, y)$ contains a feature vector $x \in \mathbb{R}^{M}$ and a ground-truth label $y \in \{0, 1\}$. We have obtained a classifier $f: \mathbb{R}^{M} \to \{0, 1\}$ to predict the label $\hat{y}$ of feature vector $x$: $\hat{y} = f(x)$. Assume $N = 200$ and we have the following *confusion matrix* to represent the result of classifier $f$ on dataset $S$:

Please follow the lecture notes to compute the metrics below:

- Please compute the *accuracy* of the classifier *f* on dataset *S*.
- Please compute the *precision* of the classifier *f* on dataset *S*.
- Please compute the *F1 score* of the classifier *f* on dataset *S*.
- You may find the accuracy of the current model very high. Does it mean the performance of this model is always very good? Why?

**Hint**: You may refer to other metrics you have computed.
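As a rough illustration of how these metrics follow from the four cells of a confusion matrix, the sketch below uses the commonly used definitions; the lecture notes may phrase them differently. The helper `binary_metrics` is hypothetical, and the counts passed to it are made-up numbers that sum to 200. They are not the values from the assignment's confusion matrix.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    Assumes tp + fp > 0 and tp + fn > 0; no zero-division handling is done.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total          # fraction of all predictions that are correct
    precision = tp / (tp + fp)            # fraction of predicted positives that are correct
    recall = tp / (tp + fn)               # fraction of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Made-up counts for illustration only (they sum to N = 200).
metrics = binary_metrics(tp=5, fp=5, fn=15, tn=175)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these illustrative counts the accuracy is 0.9 while precision is 0.5 and recall is 0.25, which shows how a single metric can hide weak performance on the positive class when the classes are imbalanced.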

## 4 (10 points) Data Visualization

We will be using the UCI Wine dataset for this problem and Question 5. The description of the dataset can be found at https://archive.ics.uci.edu/ml/datasets/wine. You can load the dataset using the code below (recommended), or you can download the dataset here and load it yourself. You may refer to the Jupyter notebook HW1-Q4-Q5.ipynb for some skeleton code.

- Show a scatter plot for the first 2 feature dimensions in 2-D space. Some useful instructions are shown below:

- Import several useful packages into Python:

import matplotlib.pyplot as plt

from sklearn import datasets

- Load Wine dataset into Python:

wine = datasets.load_wine()

X = wine.data

Y = wine.target
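Putting the steps above together, a scatter plot of the first two features could be sketched as follows. This is one possible approach rather than the skeleton from HW1-Q4-Q5.ipynb: the colour map, marker size, title, and output file name `wine_scatter.png` are arbitrary choices, and the `Agg` backend line is only there so the script also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line inside a notebook
import matplotlib.pyplot as plt
from sklearn import datasets

# Load the Wine dataset as in the instructions above.
wine = datasets.load_wine()
X = wine.data      # feature matrix, shape (178, 13)
Y = wine.target    # class labels 0, 1, 2

# Plot feature 0 against feature 1, coloured by class label.
fig, ax = plt.subplots()
points = ax.scatter(X[:, 0], X[:, 1], c=Y, cmap="viridis", s=20)
ax.set_xlabel(wine.feature_names[0])
ax.set_ylabel(wine.feature_names[1])
ax.set_title("Wine dataset: first two features")
fig.colorbar(points, ax=ax, label="class")

# Save the figure so it can be inserted into the PDF (file name is arbitrary).
fig.savefig("wine_scatter.png", dpi=150)
```

Colouring the points by `Y` is optional for this question, but it makes it easier to see how well the first two features separate the three classes.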

Report **your code** and the *scatter plot* **in Gradescope submission.**

## 5 (20 points) Data Manipulation

We have already had a glimpse of the Wine dataset in Question 4, and we will continue to use it in this question. Running X.shape shows that the shape of array X is (178, 13), which means it contains 178 data points with 13 features each. You may refer to the Jupyter notebook HW1-Q4-Q5.ipynb for some skeleton code. Here, we will calculate some measures of the array X and perform some basic data manipulation:

- Show the first 2 features of the first 3 data points (i.e. first 2 columns and first 3 rows) of array X. (You can print the 3 × 2 array.)

- Calculate the mean and the variance of the 1st feature (the 1st column) of array X.
- Randomly sample 3 data points (rows) of array X by randomly choosing the row indices. Show the indices and the sampled data points.

**Hint**: You may use np.random.randint().

- Add one more feature (one more column) to the array X after the last feature. The values of the added feature for all data points are constant 1. Show the first data point (first row) of the new array. **Hint**: You may use np.ones() and np.hstack().

- Get a row or a column of the array X:

print(X[0])  # Print the first row of array X.

print(X[:, 0])  # Print the first column of array X. ':' here means all rows and '0' means column 0.

- Get part of the array:

print(X[3:5, 1:3])  # Print 4th and 5th rows, 2nd and 3rd columns.

print(X[:3, :2])  # Print first 3 rows, first 2 columns.

- You may refer to a quick tutorial using NumPy here:

http://cs231n.github.io/python-numpy-tutorial/
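The indexing patterns and hints above can be combined into a short sketch of the four manipulations. It is one possible way to do them, not the notebook's skeleton code: the seed is a hypothetical choice made only for reproducibility, and all variable names are illustrative. Note that `np.random.randint()` draws indices with replacement, so the same row can appear twice.

```python
import numpy as np
from sklearn import datasets

X = datasets.load_wine().data                 # shape (178, 13)

# First 3 rows, first 2 columns.
first_block = X[:3, :2]

# Mean and variance of the first feature.
# np.var defaults to the population variance (ddof=0); use ddof=1 for the sample variance.
mean_first = X[:, 0].mean()
var_first = X[:, 0].var()

# Randomly choose 3 row indices (with replacement) and select those rows.
np.random.seed(0)                             # hypothetical seed, for reproducibility only
indices = np.random.randint(0, X.shape[0], size=3)
sampled = X[indices]

# Append a constant-1 column after the last feature.
ones_column = np.ones((X.shape[0], 1))
X_augmented = np.hstack([X, ones_column])     # shape (178, 14)

print(first_block)
print(mean_first, var_first)
print(indices)
print(sampled)
print(X_augmented[0])
```

`np.hstack()` needs both arrays to have the same number of rows, which is why the column of ones is created with shape `(178, 1)` rather than as a flat vector.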

Report **your code** and the *results of data manipulation* **in Gradescope submission.**
