ML 2: Regression

Packages Used: Functions from the following packages are used in this notebook for the first time:

Regression in relation to other aspects of machine learning:

[Figure: image.png]

Ex 1: 13-Day Lee Pharm's Bakery Data (Ex 2 in NL Opt 3)

When this problem was solved in NL Opt 3, the objective was to maximize total profit. Regression packages rarely support loss functions other than L2 or L1, so L2 (least squares) is used as the loss function here. Solving the regression using nonlinear optimization results in the following:
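As a minimal sketch of the idea (not the notebook's own solution), the L2 loss of a straight-line model can be minimized numerically and compared against the closed-form least-squares fit. Everything here is an assumption for illustration: the use of Python with NumPy, the made-up (x, y) values, the helper names l2_loss and l2_grad, and plain gradient descent standing in for a general nonlinear optimizer.

```python
import numpy as np

# Hypothetical (x, y) pairs for illustration only; NOT the bakery data.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

def l2_loss(theta):
    """Mean squared error of the line y = theta[0] + theta[1] * x."""
    resid = theta[0] + theta[1] * x - y
    return np.mean(resid**2)

def l2_grad(theta):
    """Gradient of l2_loss with respect to (intercept, slope)."""
    resid = theta[0] + theta[1] * x - y
    return np.array([2.0 * np.mean(resid), 2.0 * np.mean(resid * x)])

# Plain gradient descent stands in for a general nonlinear optimizer.
# The step size 0.04 is an assumption chosen to be stable for this data.
theta = np.zeros(2)
for _ in range(5000):
    theta = theta - 0.04 * l2_grad(theta)

# Closed-form least squares for comparison (polyfit returns highest degree first).
slope, intercept = np.polyfit(x, y, 1)

print("optimizer  :", theta, "loss:", l2_loss(theta))
print("closed form:", (intercept, slope))
```

Because the L2 loss is a convex quadratic in the two coefficients, both routes should agree to within numerical tolerance. The optimization route becomes worthwhile once the loss is changed to something a regression package does not offer.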

Ex 2: 7-Week Bakery Data

Add Order-Squared Term to Model (Quadratic Regression)
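Adding a squared term keeps the model linear in its coefficients, so ordinary least squares still applies: only the design matrix gains a column. The sketch below assumes Python with NumPy and uses synthetic values generated from a known quadratic (not the 7-week bakery data); the names orders and response are hypothetical.

```python
import numpy as np

# Synthetic data, NOT the notebook's bakery data: generated from a known
# quadratic (1 + 2x - 0.5x^2) so the recovered coefficients can be checked.
orders = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
response = 1.0 + 2.0 * orders - 0.5 * orders**2

# Design matrix with columns [1, x, x^2]; the squared column is the only
# change relative to simple linear regression.
X = np.column_stack([np.ones_like(orders), orders, orders**2])

# Least-squares solution of X @ coef ~= response.
coef, *_ = np.linalg.lstsq(X, response, rcond=None)
print("intercept, linear, quadratic:", coef)
```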

Add Visitor Predictor to Model (Multiple Regression)
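A second predictor enters the model the same way, as one more column of the design matrix. The following sketch assumes Python with NumPy and uses synthetic values built from known coefficients (not the 7-week bakery data); the names orders, visitors, and response are hypothetical.

```python
import numpy as np

# Synthetic data, NOT the notebook's bakery data: the response is built from
# known coefficients (3, 1.5, 0.25) so the fit can be verified.
orders = np.array([10.0, 12.0, 9.0, 15.0, 11.0, 14.0, 13.0])
visitors = np.array([100.0, 80.0, 120.0, 90.0, 110.0, 95.0, 105.0])
response = 3.0 + 1.5 * orders + 0.25 * visitors

# Design matrix with columns [1, orders, visitors].
X = np.column_stack([np.ones_like(orders), orders, visitors])
coef, *_ = np.linalg.lstsq(X, response, rcond=None)

# Predict the response for a new, hypothetical observation.
new_obs = np.array([1.0, 12.0, 100.0])  # [constant, orders, visitors]
pred = new_obs @ coef
print("coefficients:", coef)
print("prediction  :", pred)
```

With real data the predictors are usually correlated, so each coefficient should be read as the effect of that predictor with the other held fixed.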

Ex 3: Boston Housing Data

This dataset contains housing data for 506 census tracts of Boston from the 1970 census. The target variable is medv. (More detailed description.)

Variable  Description
--------  -----------
crim      per capita crime rate by town
zn        proportion of residential land zoned for lots over 25,000 sq. ft.
indus     proportion of non-retail business acres per town
chas      Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
nox       nitric oxides concentration (parts per 10 million)
rm        average number of rooms per dwelling
age       proportion of owner-occupied units built prior to 1940
dis       weighted distances to five Boston employment centres
rad       index of accessibility to radial highways
tax       full-value property-tax rate per USD 10,000
ptratio   pupil-teacher ratio by town
b         1000(B - 0.63)^2, where B is the proportion of blacks by town
lstat     percentage of lower status of the population
medv      median value of owner-occupied homes in USD 1,000s