# ECE 520.438 & 520.638: Deep Learning HW3

ECE 520.438 & 520.638: Deep Learning

Homework 3: Perceptron, Logistic Regression

Spring 2021

1 Part 1

1. What kind of decision boundary is learned by perceptrons?
2. How can we use perceptrons to do C-class classification?
3. Briefly explain stochastic gradient descent.
4. Let σ(a) = 1/(1 + e^(−a)). Show that σ′(a) = σ(a)[1 − σ(a)].
5. What is an epoch?
6. If the dataset size is 2000 samples and the batch size is 50, how many iterations will an epoch consist of?
7. Briefly describe Dropout.
8. What is early stopping?
9. Why is regularization used during training of a neural network?
10. What are different ways in which we can reduce over-fitting in a neural network?
11. Why do we use batch normalization?

2 Part 2: Perceptron


In this problem, you will classify handwritten digits using the MNIST Database.
Please download the four files found there; they will be used for this homework. To reduce computation time, use only the first 20,000 training images/labels and only the first 2,000 testing images/labels.

Read in the data from the files. Each image is 28 × 28 pixels, so an image can be viewed as a 784-dimensional vector of pixel intensities. For each image, append a ‘1’ to the beginning of its x-vector; this will act as an intercept term (for bias). Use the gradient descent algorithm for the perceptron derived in class to classify, given x ∈ R^785, whether a digit’s label is k or some “other” digit, for each k ∈ {0, · · · , 9}. For instance, if you are classifying “2”, you would designate y = 2 as the positive class and all other digits as the negative class. For each image, do this two-way classification for all 10 digits. You will train 10 perceptrons that will collectively learn to classify the handwritten digits in the MNIST dataset. Each perceptron will have 785 inputs and one output.
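The one-vs-rest training loop described above can be sketched as follows. This is a minimal sketch, assuming a NumPy array `X` of bias-augmented image vectors and ±1 labels `y` for one digit-vs-rest split; the classic misclassification update is used as the perceptron learning rule, and the synthetic data at the end merely stands in for the MNIST vectors.

```python
import numpy as np

def train_perceptron(X, y, lr=0.01, epochs=10):
    """Train one binary perceptron.

    X: (N, 785) array with a leading 1 in each row (bias term).
    y: (N,) array of +1 / -1 labels (+1 for digit k, -1 otherwise).
    Returns the learned weight vector w of shape (785,).
    """
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            # Perceptron rule: update only on misclassified samples.
            if yi * np.dot(w, xi) <= 0:
                w += lr * yi * xi
    return w

# Toy demonstration on linearly separable synthetic data:
rng = np.random.default_rng(0)
X = np.hstack([np.ones((100, 1)), rng.normal(size=(100, 2))])  # bias + 2 features
y = np.where(X[:, 1] + X[:, 2] > 0, 1, -1)
w = train_perceptron(X, y)
acc = np.mean(np.sign(X @ w) == y)
```

For the actual assignment, this function would be called once per digit k, each time relabeling the 20,000 training vectors as ±1 according to whether their label equals k.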

1. Report the test accuracy for each of the 10 2-way classifications on the test set.
2. Report the overall test accuracy over the test set. An example is considered labeled correctly if the perceptron for its true label produces the highest output of the 10 classifiers. So, for instance, if the true label of an image is {2}, you would count it as correctly classified if the perceptron test of {2} vs. {0,1,3,4,5,6,7,8,9} had the highest output of all the 10 2-way classifications.
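The overall decision rule can be sketched as below: run all 10 perceptrons on each test image and pick the digit whose classifier responds most strongly. Since a plain perceptron has no probabilistic output, the raw score w_kᵀx is used here as the quantity compared across the 10 classifiers (an interpretation on our part); shapes assume the 785-dimensional setup above.

```python
import numpy as np

def predict_digits(W, X):
    """W: (10, 785) stacked perceptron weights; X: (N, 785) test images.
    Returns, per image, the digit whose classifier gives the largest score."""
    scores = X @ W.T              # (N, 10): one score per digit per image
    return np.argmax(scores, axis=1)

def overall_accuracy(W, X, labels):
    return np.mean(predict_digits(W, X) == labels)

# Tiny 2-class illustration (stand-in for the 10 MNIST perceptrons):
W = np.array([[1.0, 0.0], [0.0, 1.0]])
X = np.array([[2.0, 1.0], [0.0, 3.0]])
preds = predict_digits(W, X)      # scores [[2, 1], [0, 3]]
```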

3 Part 3: Logistic Regression

Logistic regression is a binary classification method that can be modeled as a single neuron reading an input vector x ∈ R^d and parameterized by a weight vector w ∈ R^d, where the neuron outputs the probability of the class being y = 1 given x:

P(y = 1|x) = g_w(x) = 1 / (1 + exp(−w^T x)) = σ(w^T x),

P(y = 0|x) = 1 − P(y = 1|x) = 1 − g_w(x).

Given a training set {(x^(i), y^(i))}, i = 1, …, N, the Cross Entropy Loss function is defined as follows:

J(w) = − Σ_{i=1}^{N} [ y^(i) log(g_w(x^(i))) + (1 − y^(i)) log(1 − g_w(x^(i))) ],

where N denotes the total number of training samples. We will optimize this cost function.

1. Show that the gradient of the cost function with respect to the parameter w is:

∂J(w)/∂w_j = Σ_{i=1}^{N} x_j^(i) (g_w(x^(i)) − y^(i)).
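The per-coordinate gradient above can be written in vectorized form as Xᵀ(g_w(X) − y), where row i of X is x^(i). A small sketch with a central finite-difference sanity check (the data here is random and purely illustrative):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def cross_entropy_grad(w, X, y):
    """Vectorized gradient of J(w): X^T (sigmoid(Xw) - y), labels y in {0, 1}."""
    return X.T @ (sigmoid(X @ w) - y)

def J(w, X, y):
    """The cross entropy loss itself, used for the numerical check below."""
    p = sigmoid(X @ w)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Compare the analytic gradient against central finite differences:
rng = np.random.default_rng(1)
X = rng.normal(size=(5, 3))
y = np.array([0.0, 1.0, 1.0, 0.0, 1.0])
w = rng.normal(size=3)
eps = 1e-6
numeric = np.array([(J(w + eps * e, X, y) - J(w - eps * e, X, y)) / (2 * eps)
                    for e in np.eye(3)])
analytic = cross_entropy_grad(w, X, y)
```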

2. Using the gradient derived for the logistic regression cross entropy loss, use gradient descent to classify, given x ∈ R^785, whether a digit’s label is k or some “other” digit, for each k ∈ {0, · · · , 9}. Report the test accuracy for each of the 10 2-way classifications on the test set.
3. Report the overall test accuracy over the test set. An example is considered labeled correctly if the classifier for its true label outputs the highest probability. So, for instance, if the true label of an image is {2}, you would count it as correctly classified if the {2} vs. {0,1,3,4,5,6,7,8,9} classifier had the highest probability of all the 10 2-way classifications.
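A minimal sketch of training one of the ten 2-way logistic regression classifiers with batch gradient descent, assuming bias-augmented inputs as in Part 2. The learning rate and iteration count are placeholder choices, not values prescribed by the assignment, and the toy data stands in for the digit-k vs. rest split of MNIST.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_logreg(X, y, lr=0.1, iters=500):
    """Batch gradient descent on the cross entropy loss.
    X: (N, d) with a leading 1 column (bias); y: (N,) labels in {0, 1}."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        # Gradient step: w <- w - lr * (1/N) * X^T (sigmoid(Xw) - y)
        w -= (lr / len(y)) * (X.T @ (sigmoid(X @ w) - y))
    return w

# Toy usage on linearly separable synthetic data:
rng = np.random.default_rng(0)
X = np.hstack([np.ones((200, 1)), rng.normal(size=(200, 2))])
y = (X[:, 1] + X[:, 2] > 0).astype(float)
w = train_logreg(X, y)
acc = np.mean((sigmoid(X @ w) > 0.5) == (y == 1))
```

For the overall accuracy in step 3, the ten trained weight vectors would be stacked and each test image assigned to the digit whose classifier outputs the largest σ(w_kᵀx).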