Here, you will apply your linear regression skills and ideas. The suggested approach is to start with an exploratory data analysis (EDA), which may suggest potentially useful predictors, transformations, and interactions. Then try other approaches to further improve your model, such as cross-validation and variable selection. Any skills and techniques may be used as long as the final model is a linear model. Be sure to set a seed for any method that relies on randomness.
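As an illustration of seeded cross-validation for a linear model, here is a minimal sketch that uses NumPy only. The function name `kfold_rmse`, the choice of five folds, and the synthetic data are assumptions made for this example; they are not part of the assignment, and the real loan data will need its own predictors and transformations.

```python
import numpy as np

def kfold_rmse(X, y, k=5, seed=0):
    """Average out-of-fold RMSE of an ordinary least squares fit."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)            # fixed seed: reproducible folds
    folds = np.array_split(rng.permutation(len(y)), k)
    Xd = np.column_stack([np.ones(len(y)), X])   # prepend an intercept column
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # Fit on the training folds only, then score on the held-out fold.
        beta, *_ = np.linalg.lstsq(Xd[train], y[train], rcond=None)
        resid = y[test] - Xd[test] @ beta
        scores.append(np.sqrt(np.mean(resid ** 2)))
    return float(np.mean(scores))

# Synthetic data for illustration only (this is NOT the loan data).
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.normal(size=(200, 3))
y_demo = 2.0 + X_demo @ np.array([1.0, -0.5, 0.25]) + demo_rng.normal(scale=0.1, size=200)

cv_a = kfold_rmse(X_demo, y_demo, k=5, seed=42)
cv_b = kfold_rmse(X_demo, y_demo, k=5, seed=42)  # same seed, same folds
```

Calling the function twice with the same seed gives the same fold assignment and therefore the same score, which is the point of setting seeds.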
The data consists of loan applications, including information on applicants and the amount of money made from each application. Your job is to predict the amount of money made by the bank for an application they are considering.
Evaluation Metric: root mean squared error (RMSE)
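For reference, RMSE is the square root of the mean of the squared differences between actual and predicted values. A minimal sketch using only the standard library follows; the function name `rmse` and the example numbers are illustrative assumptions.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up numbers: the errors are -10, 10 and 0.
example = rmse([100.0, 200.0, 300.0], [110.0, 190.0, 300.0])
```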
Discover the INSIGHT such that:
- RMSE of only 5 on more than half of the training set
- RMSE < 650 on entire training set
- RMSE < 650 on entire testing set
You need to discover the insight that makes highly accurate predictions possible on more than half of the training and test data. While it is not easy to get an RMSE below 650 in this problem, once you discover that insight, you can get an RMSE of only 5 on more than half of the training data (while also being confident that you are not overfitting on that part of the training data). Note: you cannot compute the RMSE on a specific part of the test data, as you do not have the test response.
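One way to check these targets on the training data is to report the RMSE on a candidate subset alongside the RMSE on all rows. The sketch below assumes you already have some rule that flags rows as belonging to the subset; the mask `in_subset`, the function name `subset_report`, and the numbers are hypothetical and say nothing about what the insight actually is.

```python
import math

def subset_report(actual, predicted, in_subset):
    """Fraction of rows in the subset, plus RMSE on the subset and on all rows."""
    def _rmse(pairs):
        if not pairs:
            return float("nan")  # undefined for an empty selection
        return math.sqrt(sum((a - p) ** 2 for a, p in pairs) / len(pairs))

    all_pairs = list(zip(actual, predicted))
    sub_pairs = [pair for pair, keep in zip(all_pairs, in_subset) if keep]
    return {
        "fraction": len(sub_pairs) / len(all_pairs),
        "rmse_subset": _rmse(sub_pairs),
        "rmse_all": _rmse(all_pairs),
    }

# Made-up training values: three rows are flagged, one is not.
report = subset_report(
    actual=[10.0, 20.0, 30.0, 40.0],
    predicted=[11.0, 19.0, 30.0, 100.0],
    in_subset=[True, True, True, False],
)
```

This check only works on the training set, since the test response is not available.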
- Predicted Result file with two columns: Id and Predicted. The file should be comma-separated and contain a header (Id, Predicted). Note: the predicted values should be the predicted money made by the bank on each loan. See the sample submission file for an example.
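A minimal sketch of writing such a file with Python's standard `csv` module is shown below. The function name `write_submission` and the example Ids and values are assumptions; the exact header spelling and Id values should be checked against the sample submission file.

```python
import csv
import io

def write_submission(ids, predictions, fileobj):
    """Write a comma-separated file with the header row Id,Predicted."""
    writer = csv.writer(fileobj, lineterminator="\n")
    writer.writerow(["Id", "Predicted"])
    for row_id, value in zip(ids, predictions):
        writer.writerow([row_id, value])

# Demo with an in-memory buffer and made-up values.
buf = io.StringIO()
write_submission([1, 2, 3], [125.5, -40.0, 980.25], buf)
submission_text = buf.getvalue()
```

To write to disk instead, pass a file opened with `open(path, "w", newline="")`, where `path` is whatever file name the submission requires.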