In this assignment you are asked to explore the use of neural networks for classification
and numeric prediction (you may choose to use 'Javanns' or 'MultilayerPerceptron' in
Weka). You are also asked to carry out a data mining investigation on a real-world
data file. You are required to write a report on your findings. Your assignment will be
assessed on demonstrated understanding of concepts, algorithms, methodology, analysis
of results and conclusions. Please make sure your answers are labelled correctly with the
corresponding part and sub-question numbers, to make it easier for the marker to follow.
Please stick to the required page limits (a penalty will apply if they are exceeded).
2 Learning Outcomes
This assessment relates to the following learning outcomes of the course.
CLO 1: Demonstrate advanced knowledge of data mining concepts and techniques.
CLO 2: Apply the techniques of clustering, classification, association finding,
feature selection and visualisation on real-world data.
CLO 3: Determine whether a real-world problem has a data mining solution.
CLO 4: Apply data mining software and toolkits in a range of applications.
CLO 5: Set up a data mining process for an application, including data preparation,
modelling and evaluation.
3 Assignment Details
3.1 Part 1: Classification with Neural Networks (12 marks)
This part involves predicting the Class attribute in the following file: heart-v.arff
in the directory:
The main goal is to achieve the lowest classification error with the lowest amount of
For the neural network training runs build a table with the following headings:
1. Describe the data preprocessing tasks (including data encoding) that are required.
How many outputs and how many inputs will there be? How do you handle
numeric and nominal attributes? What normalizations are required? How do you
deal with missing values (if present)? Include your data preprocessing scripts (if
necessary) as an appendix (not part of the page count).
2. Develop a script (or elaborate a pre-processing procedure in Weka) to generate the
necessary training, validation and test data files. How do you determine when to
stop training a neural network? Include your data preparation script (if necessary)
as an appendix (not part of the page count).
3. Describe how a trained neural network determines the class label of an unseen
test instance (e.g., the "analyze" strategy in Javanns).
4. Assuming that no hidden layer is used, carry out 5 train and test runs for a network.
Comment on the limitations of this single-layer "perceptron" network, as opposed
to a network where one or more hidden layers are employed.
5. Assuming that one hidden layer is used, use Javanns (or Weka) to carry out 5 train
and test runs for a network with 5 hidden nodes. Comment on the variation in the
training runs and the degree of overfitting. Comment on the differences (if any)
you observe in results on the networks with or without the hidden layer.
6. Experiment with different numbers of hidden nodes. What seems to be the right
number of hidden nodes for this problem?
7. For a network with 5 hidden nodes, explore different combinations of learning rate
and momentum. What do you conclude?
8. Compare the classification accuracy of Javanns (or Weka MultilayerPerceptron)
with the classification accuracy of Weka J48. Comment on the pros and cons of
employing these two classifiers for classification tasks.
9. [Optional for COSC2110] Having experimented with both Javanns and Weka
MultilayerPerceptron, what are the pros and cons of these two software programs for
neural network training? What would make you choose either Javanns or Weka?
Provide your reasoning.
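
As a rough illustration of the ideas behind questions 1 and 3, the sketch below shows one common way to encode attributes for a neural network and to read a class label off its outputs. It is only a sketch under stated assumptions: the attribute names, value ranges and class labels are hypothetical and are not taken from heart-v.arff, and winner-takes-all is just one possible decision strategy, not necessarily the one your chosen tool applies.

```python
# Illustrative sketch only: attribute names, ranges and labels are hypothetical,
# not taken from heart-v.arff.

def one_hot(value, categories):
    """Encode a nominal value as 0/1 inputs, one input per category."""
    return [1.0 if value == c else 0.0 for c in categories]

def min_max(value, lo, hi):
    """Scale a numeric value into [0, 1]; a constant column maps to 0.0."""
    return 0.0 if hi == lo else (value - lo) / (hi - lo)

def winner_takes_all(outputs, labels):
    """Return the label whose output unit has the highest activation."""
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return labels[best]

# Hypothetical instance with one numeric and one nominal attribute.
age_range = (20.0, 80.0)          # assumed min and max seen in the training data
sex_values = ["male", "female"]   # assumed nominal categories
inputs = [min_max(50.0, *age_range)] + one_hot("female", sex_values)
print(inputs)  # [0.5, 0.0, 1.0]

# Hypothetical activations of two output units, one per class.
predicted = winner_takes_all([0.2, 0.7], ["absent", "present"])
print(predicted)  # present
```

With this encoding, the number of network inputs is the number of numeric attributes plus the total number of categories across the nominal attributes, which is one way to reason about the input and output counts asked for in question 1.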