Assigned: Tue, 2 Nov (Groups of 2)
Due: 4p, Tue, 9 Nov
Group Members:
Please use the Code cells in this Jupyter notebook to answer each of the following questions. You can add additional cells for each question if that helps in organizing your solution. Please run all of the cells in your notebook and then submit it via Moodle. (There is a Run All Cells command under the Run menu.)
(1) Determine the first six warehouse design parameters listed in Ex 2 of ML-1 ("Total Lines" to "Cube Movement") using the files HW7-Data-itemmaster.csv and HW7-Data-orderset1.csv. In addition, determine a new parameter, "Weight per Order," which is the average total weight of the units in an order. The length, width, and depth are in inches; cube is in cubic inches; and weight is in pounds.
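For the new "Weight per Order" parameter, one possible approach is to join the order lines to the item master on the SKU, total the weight within each order, and then average over orders. The sketch below uses small made-up tables in place of the two CSV files, and the column names (`SKU`, `Weight`, `Order`, `Qty`) are assumptions that may not match the real headers.

```python
import pandas as pd

# Toy stand-ins for the item master and order set (values are invented).
# Column names are assumptions; rename them to match the real CSV headers.
items = pd.DataFrame({
    "SKU": [1, 2, 3],
    "Weight": [2.0, 0.5, 4.0],   # pounds per unit
})
orders = pd.DataFrame({
    "Order": [1, 1, 2, 3, 3],
    "SKU":   [1, 2, 3, 1, 3],
    "Qty":   [2, 4, 1, 1, 2],    # units on each order line
})

# Attach each line's unit weight, then total the weight within each order.
lines = orders.merge(items, on="SKU", how="left")
lines["LineWeight"] = lines["Qty"] * lines["Weight"]
order_weight = lines.groupby("Order")["LineWeight"].sum()

# Weight per Order is the mean of the per-order totals.
weight_per_order = order_weight.mean()
print(order_weight)
print(f"Weight per Order: {weight_per_order:.2f} lb")
```

With the real data, the two toy tables would be replaced by `pd.read_csv(...)` calls on the item master and order set files.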
(2) Using the orders in the file HW7-Data-orderset1.csv along with unit prices of \$18.00, 15.75, 16.20, 12.24, 25.43, 36.24, 18.75, and 8.45 for SKUs 1 to 8, respectively, determine the total price for each individual order.
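One way to approach this, sketched under assumptions, is to look up each order line's unit price by SKU, multiply by the quantity, and sum within each order. The prices come from the question text; the order lines and the column names (`Order`, `SKU`, `Qty`) are hypothetical stand-ins for the real file.

```python
import pandas as pd

# Unit prices for SKUs 1 to 8, taken from the question text.
prices = pd.Series(
    [18.00, 15.75, 16.20, 12.24, 25.43, 36.24, 18.75, 8.45],
    index=range(1, 9),
    name="Price",
)

# Toy order lines (invented); column names are assumptions.
orders = pd.DataFrame({
    "Order": [1, 1, 2, 2, 3],
    "SKU":   [1, 8, 2, 5, 6],
    "Qty":   [2, 1, 4, 1, 3],
})

# Price of each line = unit price looked up by SKU times quantity ordered.
orders["LinePrice"] = orders["SKU"].map(prices) * orders["Qty"]

# Total price of each order = sum of its line prices.
order_total = orders.groupby("Order")["LinePrice"].sum().round(2)
print(order_total)
```

Using `map` on a price `Series` keeps the lookup explicit; a `merge` against a two-column price table would work equally well.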
(3) Recommend the best model to use to predict demand using Ex 2 from ML-2. In addition to `Orders + Visitors` used in the example, also consider `Orders + Visitors + Orders*Visitors` and `Orders + Visitors + Orders*Visitors + Orders^2 + Visitors^2` as possible models. Justify your recommendation by explaining the criterion or criteria and the method you used to make your recommendation:
Recommendation:
Justification:
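As an illustration of one possible criterion for question (3), the sketch below compares the three candidate models by leave-one-out cross-validated RMSE, using ordinary least squares from NumPy. This is an assumption about method, not necessarily the one used in ML-2, and the data are synthetic values generated only so the code runs; the helper names `design` and `loocv_rmse` are hypothetical.

```python
import numpy as np

# Synthetic data, invented for this sketch; replace with the ML-2 data.
rng = np.random.default_rng(0)
n = 40
orders = rng.uniform(10, 50, n)
visitors = rng.uniform(100, 500, n)
demand = 5 + 2.0 * orders + 0.1 * visitors + rng.normal(0, 5, n)

def design(o, v, model):
    """Build the design matrix for candidate model 1, 2, or 3."""
    cols = [np.ones_like(o), o, v]          # intercept + Orders + Visitors
    if model >= 2:
        cols.append(o * v)                  # + Orders*Visitors
    if model >= 3:
        cols += [o**2, v**2]                # + Orders^2 + Visitors^2
    return np.column_stack(cols)

def loocv_rmse(X, y):
    """Leave-one-out RMSE: refit without row i, then predict row i."""
    errs = []
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        errs.append(y[i] - X[i] @ beta)
    return float(np.sqrt(np.mean(np.square(errs))))

names = {
    1: "Orders + Visitors",
    2: "Orders + Visitors + Orders*Visitors",
    3: "Orders + Visitors + Orders*Visitors + Orders^2 + Visitors^2",
}
scores = {names[m]: loocv_rmse(design(orders, visitors, m), demand) for m in names}
for name, rmse in scores.items():
    print(f"{rmse:8.3f}  {name}")

best = min(scores, key=scores.get)
print("Lowest out-of-sample RMSE:", best)
```

Scoring on held-out rows rather than on the fitting data matters here because the three models are nested: the largest one can never have a worse in-sample fit, so an in-sample criterion alone would always favor it.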
(4) The Kaygle ML Competition: Kaggle has an ML competition from which the Kaygle competition is derived. The file HW7-Data-test.csv has the factors that you can use to make a survival prediction for 21 test passengers who were not in ML-3-Titanic-train.csv. Your prediction for each passenger should be added to the Prediction column of the table below, where 1 indicates that you are predicting that the passenger survived and 0 that they did not. There is an additional column, ID, in HW7-Data-test.csv that can be used in place of Name to identify each passenger. Your predictions will be evaluated based on their accuracy with respect to the known survival of the 21 test passengers. You can use any variations with respect to the parameter settings of DecisionTreeClassifier used in ML-3, but no other ML methods can be used. You can also consider using new features if they are derived from the original features. All of your analysis should be derived from the data in ML-3-Titanic-train.csv and HW7-Data-test.csv and shown in the code cells below:
Record | ID | Prediction |
---|---|---|
1 | 100 | |
2 | 101 | |
3 | 102 | |
4 | 103 | |
5 | 104 | |
6 | 105 | |
7 | 106 | |
8 | 107 | |
9 | 108 | |
10 | 109 | |
11 | 110 | |
12 | 111 | |
13 | 112 | |
14 | 113 | |
15 | 114 | |
16 | 115 | |
17 | 116 | |
18 | 117 | |
19 | 118 | |
20 | 119 | |
21 | 120 | |
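For question (4), a minimal sketch of how `DecisionTreeClassifier` settings might be compared and a derived feature added is shown below. It runs on synthetic passengers generated only for illustration; the column names (`Pclass`, `Sex`, `Age`, `SibSp`, `Parch`, `Survived`) follow the Kaggle Titanic dataset and are assumed, not confirmed, to match ML-3-Titanic-train.csv. `FamilySize` and the helper `add_features` are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic passengers (invented); replace with pd.read_csv on the real files.
rng = np.random.default_rng(1)

def fake_passengers(n):
    return pd.DataFrame({
        "Pclass": rng.integers(1, 4, n),
        "Sex": rng.choice(["male", "female"], n),
        "Age": rng.uniform(1, 70, n).round(),
        "SibSp": rng.integers(0, 3, n),
        "Parch": rng.integers(0, 3, n),
    })

train = fake_passengers(120)
# Invented label rule with 10% noise, so the tree has something to learn.
rule = (train["Sex"] == "female") | (train["Age"] < 10)
train["Survived"] = (rule ^ (rng.random(len(train)) < 0.1)).astype(int)
test = fake_passengers(21)

def add_features(df):
    """Numeric feature table, including one feature derived from the originals."""
    return pd.DataFrame({
        "Pclass": df["Pclass"],
        "IsFemale": (df["Sex"] == "female").astype(int),
        "Age": df["Age"],
        "FamilySize": df["SibSp"] + df["Parch"] + 1,   # derived feature
    })

X, y = add_features(train), train["Survived"]

# Compare tree depths by 5-fold cross-validated accuracy on the training data.
depths = [2, 3, 4, None]
cv_acc = {
    d: cross_val_score(
        DecisionTreeClassifier(max_depth=d, random_state=0), X, y, cv=5
    ).mean()
    for d in depths
}
best_depth = max(cv_acc, key=cv_acc.get)
print(cv_acc, "-> chosen max_depth:", best_depth)

# Refit on all training rows with the chosen depth and predict the test rows.
model = DecisionTreeClassifier(max_depth=best_depth, random_state=0).fit(X, y)
pred = model.predict(add_features(test))
print(pred)
```

Choosing the depth by cross-validation rather than by training accuracy guards against an overly deep tree that memorizes the training passengers. If `Age` has missing values, as it does in the Kaggle version of the dataset, they would need to be filled in before fitting.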