Instructions: Please answer the questions below, attach your code in the document, and insert figures to create a single PDF file. You may search information online but you will need to write code/find solutions to answer the questions yourself.
Grade: out of 100 points
1 (10 points) Classification vs. Clustering
In this question, you are provided with several scenarios. You need to identify if the given scenario is better formulated as a classification task or a clustering task. You should also provide the reason that supports your choice.
1. Scenario 1: Assume there are 100 graded answer sheets for a homework assignment (scores range from 0 to 100). We would like to split them into several groups where each group has similar scores.
Choice: task
Reason:
2. Scenario 2: Assume there are 100 graded answer sheets for a homework assignment (scores range from 0 to 100). We would like to split them into several groups where each group represents a letter grade (A, B, C, D) following the criteria: A (90-100), B (75-90), C (60-75), D (0-60).
Choice: task
Reason:
2 (40 points) Basic Calculus
2.1 (20 points) Derivatives with Scalars
2.2 (20 points) Derivatives with Vectors
Several particular vector derivatives are useful for this course. For a matrix A ∈ R^{M×M} and column vectors x ∈ R^M and a ∈ R^M, we have
The above rules adopt a denominator-layout notation. For more rules, you can refer to this Wikipedia page.
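The rules themselves are not reproduced in this text. As an illustration only, the sketch below numerically checks two standard denominator-layout identities, ∂(aᵀx)/∂x = a and ∂(xᵀAx)/∂x = (A + Aᵀ)x, using central finite differences. These two identities are assumed here for the example and may not match the assignment's own list. The helper name `numerical_gradient`, the dimension M = 4, and the random seed are all hypothetical choices.

```python
import numpy as np


def numerical_gradient(f, x, eps=1e-6):
    """Approximate the gradient of a scalar function f at x by central differences."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad


# Hypothetical test data: M = 4 and the seed are arbitrary choices.
rng = np.random.default_rng(0)
M = 4
A = rng.normal(size=(M, M))
a = rng.normal(size=M)
x = rng.normal(size=M)

# Gradient of the linear form a^T x; the identity predicts a.
grad_linear = numerical_gradient(lambda v: a @ v, x)

# Gradient of the quadratic form x^T A x; the identity predicts (A + A^T) x.
grad_quadratic = numerical_gradient(lambda v: v @ A @ v, x)

linear_ok = np.allclose(grad_linear, a, atol=1e-5)
quadratic_ok = np.allclose(grad_quadratic, (A + A.T) @ x, atol=1e-5)
print(linear_ok, quadratic_ok)
```

A check like this is a useful way to validate a hand-derived gradient before relying on it. In denominator layout the gradient of a scalar with respect to a column vector is itself a column vector of the same shape, which is why `grad_linear` has the shape of `x`.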
Please apply the above rules and calculate the following derivatives:
3 (20 points) Metrics
In machine learning, we have many metrics to evaluate the performance of a model. For example, in a binary classification task, there is a dataset S = {(x_i, y_i)}, i = 1, ..., N, where each data point (x, y) contains a feature vector x ∈ R^M and a ground-truth label y ∈ {0, 1}. We have obtained a classifier f : R^M → {0, 1} to predict the label ŷ of a feature vector x: ŷ = f(x). Assume N = 200 and we have the following confusion matrix to represent the result of classifier f on dataset S:
Please follow the lecture notes to compute the metrics below:
- Please compute the accuracy of the classifier f on dataset S.
- Please compute the precision of the classifier f on dataset S.
- Please compute the F1 score of the classifier f on dataset S.
- You may find that the accuracy of the current model is very high. Does this mean that the model always performs well? Why?
Hint: You may refer to other metrics you have computed.
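As a rough sketch of how these metrics relate to the four cells of a confusion matrix, the snippet below uses the standard textbook definitions, which may be written differently in the lecture notes. The function name `binary_metrics` and the counts passed to it are hypothetical placeholders that sum to N = 200. They are not the confusion matrix from this assignment.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts (NOT the assignment's matrix): 15 positives, 185 negatives.
metrics = binary_metrics(tp=5, fp=5, fn=10, tn=180)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these placeholder counts the accuracy is 0.925 while the F1 score is only 0.4. On an imbalanced dataset a high accuracy can be driven almost entirely by the majority class.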
4 (10 points) Data Visualization
We will be using the UCI Wine dataset for this problem and Question 5. The description of the dataset can be found at https://archive.ics.uci.edu/ml/datasets/wine. You can load the dataset using the code below (recommended), or you can download the dataset here and load it yourself. You may refer to the Jupyter notebook HW1-Q4-Q5.ipynb for some skeleton code.
- Show a scatter plot for the first 2 feature dimensions in 2-D space. Some useful instructions are shown below:
- Import several useful packages into Python:
import matplotlib.pyplot as plt
from sklearn import datasets
- Load Wine dataset into Python:
wine = datasets.load_wine()
X = wine.data
Y = wine.target
Report your code and the scatter plot in your Gradescope submission.
5 (20 points) Data Manipulation
We have already had a glimpse of the Wine dataset in Question 4. In this question, we will still use the Wine dataset. In fact, you can see that the shape of array X is (178, 13) by running X.shape, which means it contains 178 data points and 13 features per data point. You may refer to the Jupyter notebook HW1-Q4-Q5.ipynb for some skeleton code. Here, we will calculate some measures of the array X and perform some basic data manipulation:
- Show the first 2 features of the first 3 data points (i.e. first 2 columns and first 3 rows) of array X. (You can print the 3 × 2 array.)
- Calculate the mean and the variance of the 1st feature (the 1st column) of array X.
- Randomly sample 3 data points (rows) of array X by randomly choosing the row indices. Show the indices and the sampled data points.
Hint: You may use np.random.randint().
- Add one more feature (one more column) to the array X after the last feature. The values of the added feature for all data points are constant 1. Show the first data point (first row) of the new array.
Hint: You may use np.ones() and np.hstack().
- Get a row or a column of the array X:
print(X[0])     # Print the first row of array X.
print(X[:, 0])  # Print the first column of array X.
                # ':' here means all rows and '0' means column 0.
- Get part of the array:
print(X[3:5, 1:3])  # Print 4th and 5th rows, 2nd and 3rd columns.
print(X[:3, :2])    # Print first 3 rows, first 2 columns.
- You may refer to a quick tutorial using NumPy here:
http://cs231n.github.io/python-numpy-tutorial/
Report your code and the results of data manipulation in your Gradescope submission.
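The indexing, sampling, and stacking calls mentioned in the hints can be sketched on a small toy array. The array `X_toy` (5 rows, 4 columns) and the seed are hypothetical stand-ins chosen so the results are easy to verify by hand. The same calls should carry over to the Wine array X, but this is an illustration and not a reference solution.

```python
import numpy as np

# Hypothetical toy data: 5 "data points" with 4 "features" each.
X_toy = np.arange(20, dtype=float).reshape(5, 4)

# First 3 rows and first 2 columns (a 3 x 2 block).
first_block = X_toy[:3, :2]

# Mean and variance of the first column.
# Note: ndarray.var() defaults to the population variance (ddof=0).
col_mean = X_toy[:, 0].mean()
col_var = X_toy[:, 0].var()

# Randomly choose 3 row indices in [0, number of rows).
# np.random.randint samples with replacement, so indices may repeat.
np.random.seed(0)  # arbitrary seed, only for reproducibility
idx = np.random.randint(0, X_toy.shape[0], size=3)
sampled = X_toy[idx]

# Append a constant-1 column after the last feature.
ones_column = np.ones((X_toy.shape[0], 1))
X_aug = np.hstack([X_toy, ones_column])

print(first_block)
print(col_mean, col_var)
print(idx, sampled, sep="\n")
print(X_aug[0])
```

If the sample variance is wanted, pass `ddof=1` to `var()`. If repeated rows are not acceptable, `np.random.choice` with `replace=False` is one alternative to `np.random.randint`.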