1. In this problem, you will code up a linear regressor based on the MSE criterion function.
You will also investigate different learning rate parameter schedules and their effect on
convergence in iterative gradient descent (GD) optimization.
Coding yourself. As in previous homework assignments, you are required to write the code
yourself for this problem. You may use only Python built-in functions, NumPy, and
matplotlib. You may also use the “random” module, because it is part of the Python standard
library, and pandas, but only for reading and/or writing csv files.
Dataset. Throughout this problem, you will use a real dataset based on the “Combined
Cycle Power Plant Data Set” from the UCI Machine Learning Repository.
We have subsampled the data (to save on computation time) and preprocessed it, so be sure
to use the dataset files h5w7_pr1_power_train.csv and h5w7_pr1_power_test.csv
provided with this homework assignment. There are a total of 500 data points, with a
split of 75% of points in the training set, and 25% of points in the test set.
The dataset has 4 real-valued features: temperature (T), ambient pressure (AP), relative
humidity (RH), and exhaust vacuum (V). The output value 𝑦 is the net hourly electrical
energy output (EP). Note that the 5th column in the dataset files is the output value. We
have already normalized the input features, so that each input feature value 𝑥 is in the
range 0 ≤ 𝑥 ≤ 1.
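The file layout described above can be read with a sketch like the following. It assumes the csv files have no header row, which the assignment does not state; pass header=0 if they do. The function name load_power_csv and the two-row sample are made up for illustration and are not real data. pandas is used only for reading the csv, as permitted.

```python
import io

import numpy as np
import pandas as pd


def load_power_csv(path_or_buffer, header=None):
    """Read one dataset file and return (X, y) as NumPy arrays.

    Assumes 5 columns: the 4 input features followed by the output value.
    header=None assumes there is no header row (an assumption).
    """
    data = pd.read_csv(path_or_buffer, header=header).to_numpy(dtype=float)
    X = data[:, :4]   # the 4 normalized input features
    y = data[:, 4]    # 5th column: output value (EP)
    return X, y


# Tiny made-up sample in the assumed layout (not taken from the real files).
sample = io.StringIO("0.1,0.2,0.3,0.4,450.0\n0.5,0.6,0.7,0.8,470.0\n")
X, y = load_power_csv(sample)

# Sanity check that every feature value lies in [0, 1], as stated.
in_range = bool(((X >= 0) & (X <= 1)).all())
```

For the actual files, the same call would take a file name, for example load_power_csv("h5w7_pr1_power_train.csv").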
(a) This part does not require any coding. Do the convergence criteria given in Lecture
10, page 12, for MSE classification also apply to MSE regression? Justify your answer.
Hint: compare the weight update formulas.
Answer the parts below as instructed, regardless of your answer to (a).
Please code the regressor as follows:
(i) Code a linear MSE regressor that uses iterative GD for optimization (LMS
algorithm); use basic sequential GD. The schedule to use for η(i) will be given
below. Hints on how to perform the random shuffle are given in Homework 4.
(ii) For the initial weight vector, let each weight component be a random number drawn
independently and identically distributed (i.i.d.) according to the uniform density
function 𝑈[−0.1, +0.1]. Hint: use numpy.random.uniform().
(iii) Before the first epoch (at 𝑖 = 0, call it epoch 𝑚 = 0), and at the end of each
epoch, compute and store the RMS error for epoch m as:
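Items (i) and (ii) can be sketched as follows; the RMS error of item (iii) is left out. This is a rough illustration under several assumptions, not the required solution. The names init_weights, augment, and run_epoch are hypothetical. The data points are augmented with a leading 1 so that the first weight acts as the bias. The learning rate is passed in as a callable eta(i), and the constant 0.1 used here is only a placeholder, not the schedule from the assignment. Constant factors in the gradient are folded into the learning rate, which may differ from the form used in lecture. The check data are synthetic and noise-free.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only to make the sketch repeatable


def init_weights(n_weights):
    """Item (ii): each weight drawn i.i.d. from U[-0.1, +0.1]."""
    return np.random.uniform(-0.1, 0.1, size=n_weights)


def augment(X):
    """Prepend a constant 1 to each point so w[0] acts as the bias (assumption)."""
    return np.hstack([np.ones((X.shape[0], 1)), X])


def run_epoch(w, Xa, y, eta, i):
    """Item (i): one epoch of sequential GD over a fresh random shuffle.

    eta is a callable giving the learning rate at iteration i.
    Returns the updated weights and the updated iteration counter.
    """
    for n in np.random.permutation(Xa.shape[0]):
        err = Xa[n] @ w - y[n]          # prediction error on one data point
        w = w - eta(i) * err * Xa[n]    # LMS step (constant factors folded into eta)
        i += 1
    return w, i


# Synthetic, noise-free check data (made up; not the power plant data).
X_demo = np.random.uniform(0.0, 1.0, size=(50, 4))
w_true = np.array([1.0, 2.0, -1.0, 0.5, 3.0])
Xa_demo = augment(X_demo)
y_demo = Xa_demo @ w_true

w0 = init_weights(Xa_demo.shape[1])
w, i = w0.copy(), 0
for epoch in range(300):
    # Placeholder constant schedule; substitute the schedule given in the assignment.
    w, i = run_epoch(w, Xa_demo, y_demo, eta=lambda it: 0.1, i=i)

max_weight_error = float(np.max(np.abs(w - w_true)))
```

Because the synthetic targets are exactly linear in the inputs, the learned weights should approach w_true, which gives a quick way to check the update loop before running it on the real training set.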