# Data Analytics Assignment Help | ITEC3040 Introduction to Data Analytics

Submission Instructions:

• This is an individual assignment.
• Use eClass to submit your work.
• At the top of each file, state your name and student number.
• You can do the assignment by hand or use software (for example, R, SAS, MATLAB, or Python).

• Evaluation is based on the work you submit.
1. Textbook, page 387, 8.7
   1. (a)  How would you modify the basic decision tree algorithm to take into consideration the count of each generalized data tuple (i.e., of each row entry)?
   2. (b)  Use your algorithm to construct a decision tree from the given data.
   3. (c)  Given a data tuple having the values “systems”, “26...30”, and “46–50K” for the attributes department, age, and salary, respectively, what would the decision tree classification of the status for the tuple be?
   4. (d)  Construct the Naïve Bayesian classifier and redo part (c).
2. The data set ‘SampledSeeds.csv’ is sampled from the ‘seeds Data Set’; the attribute information can be found here.

   1. (a)  Use KNN with Euclidean distance to classify the tuple with the values “16.17”, “15.38”, “0.8588”, “5.762”, “3.387”, “4.286”, and “5.703” for the attributes ‘area A’, ‘perimeter P’, ‘compactness C’, ‘length of kernel’, ‘width of kernel’, ‘asymmetry coefficient’, and ‘length of kernel groove’, respectively. What would the class attribute value for this tuple be when K = 3, 5, and 7?
   2. (b)  Use min-max normalization to normalize each attribute value into [0,1], then redo part (a). Is there a difference in classification? Why?
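
For orientation on question 2, here is a minimal Python sketch of KNN with Euclidean distance, run once on raw values and once after min-max normalization. It is not a solution and does not read ‘SampledSeeds.csv’: the six two-attribute rows, the labels "A" and "B", the query point, and the function names `min_max_normalize` and `knn_predict` are all hypothetical, chosen only so that the attribute with the larger numeric range dominates the raw distance.

```python
import math
from collections import Counter


def min_max_normalize(rows):
    """Rescale every column of `rows` into [0, 1].

    Returns the scaled rows and a `scale` function that applies the same
    training minima and maxima to a new tuple (e.g. the query).
    """
    columns = list(zip(*rows))
    mins = [min(col) for col in columns]
    maxs = [max(col) for col in columns]

    def scale(row):
        # A constant column has zero range; map it to 0.0 to avoid dividing by zero.
        return [
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, mins, maxs)
        ]

    return [scale(r) for r in rows], scale


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training tuples nearest to `query` (Euclidean)."""
    by_distance = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: the first attribute separates the classes,
# but the second attribute has a much larger numeric range.
train_x = [
    [0.10, 45.0], [0.20, 50.0], [0.15, 55.0],   # class "A"
    [0.90, 10.0], [0.80, 30.0], [0.85, 90.0],   # class "B"
]
train_y = ["A", "A", "A", "B", "B", "B"]
query = [0.82, 48.0]

raw_label = knn_predict(train_x, train_y, query, k=3)

scaled_x, scale = min_max_normalize(train_x)
scaled_label = knn_predict(scaled_x, train_y, scale(query), k=3)

print("raw:", raw_label)            # raw: A
print("normalized:", scaled_label)  # normalized: B
```

On this toy data the two runs disagree because the raw distance is driven almost entirely by the wide-range attribute, while normalization gives both attributes equal weight. Whether the same happens on the seeds data depends on the actual attribute ranges in the file.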