Syllabus Point
- Design, develop and apply ML regression models using an OOP to predict numeric values
Including:
- linear regression
- polynomial regression
- logistic regression
Linear Regression
Overfitting is where the model learns the training data too well, so noise and outliers distort its predictions
Underfitting is where the model is too simplistic and fails to learn the relevant patterns in the data
A good fit is where the model captures the underlying trend without memorising the noise
y = w*x + b
- For a given input (x), the model multiplies it by a weight (w) and adds a bias (b) to make a prediction
- Cost function: takes the w and b parameters, calculates the loss for each data point, and returns the mean squared error (MSE) cost
- Outputs a continuous numeric value (the predicted value)
- Gradient descent optimises the w, b parameters for the model, returning the w, b pair with the lowest cost
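The prediction step above can be sketched in an OOP style, as the syllabus point suggests. This is a minimal illustration; the class name `LinearModel` and the parameter values are hypothetical, not from the source.

```python
class LinearModel:
    """Single-feature linear regression model: y = w*x + b."""

    def __init__(self, w=0.0, b=0.0):
        self.w = w  # weight
        self.b = b  # bias

    def predict(self, x):
        # For a given input x, multiply by the weight and add the bias
        return self.w * x + self.b


model = LinearModel(w=2.0, b=1.0)
print(model.predict(3.0))  # 2*3 + 1 = 7.0
```

In practice the w and b values would come from training (gradient descent) rather than being set by hand.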
Loss
- Loss is the inaccuracy of the model in predicting a value
- It is the difference (error) between the actual value and the predicted value for each training example
Cost
- Cost is the loss aggregated over all data points, expressed as the Mean Squared Error (MSE)
- The cost function returns the MSE for a given set of w, b parameters. MSE is used because squaring the errors produces a smooth, bowl-shaped curve that gradient descent can follow
Cost Function
- Measures how well a regression model’s predictions match the actual target values
- Uses MSE
- Calculates the error for each data point (Error = Prediction - Actual)
- Aggregates errors across all points into a single scalar value
- Goal of training is to find model parameters that minimise the cost
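The steps above can be sketched as a plain function. This is a minimal, hand-rolled version for illustration; the function name `mse_cost` and the example data are hypothetical.

```python
def mse_cost(w, b, xs, ys):
    """Mean squared error cost for parameters w, b over a dataset."""
    n = len(xs)
    # Loss per point: (prediction - actual), squared; then averaged
    total = sum(((w * x + b) - y) ** 2 for x, y in zip(xs, ys))
    return total / n


xs = [1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0]   # generated by y = 2x + 1
print(mse_cost(2.0, 1.0, xs, ys))  # 0.0 - a perfect fit has zero cost
print(mse_cost(1.0, 0.0, xs, ys))  # positive cost for a worse fit
```

Note that some texts divide by 2n instead of n to simplify the derivative; the minimising parameters are the same either way.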
Gradient Descent
- Gradient Descent automates the optimisation of the w, b parameters by repeatedly calling the cost function and adjusting the parameters until the gradient (derivative) between calls is close to zero
- A near-zero gradient indicates that the w, b parameters can’t be optimised further
- This is the process that actually finds the line of best fit - the parameters with the lowest cost
- The size of the steps is the learning rate
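The update loop described above can be sketched as follows, using the standard partial derivatives of the MSE cost. The function name, data, learning rate, and iteration count are illustrative choices, not from the source.

```python
def gradient_step(w, b, xs, ys, lr):
    """One gradient-descent update for MSE cost on y = w*x + b."""
    n = len(xs)
    # Partial derivatives of MSE with respect to w and b
    dw = (2 / n) * sum(((w * x + b) - y) * x for x, y in zip(xs, ys))
    db = (2 / n) * sum(((w * x + b) - y) for x, y in zip(xs, ys))
    # The learning rate lr controls the size of each step
    return w - lr * dw, b - lr * db


xs = [1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0]          # true relationship: y = 2x + 1
w, b = 0.0, 0.0
for _ in range(5000):
    w, b = gradient_step(w, b, xs, ys, lr=0.05)
print(round(w, 2), round(b, 2))  # converges towards 2.0 and 1.0
```

If the learning rate is too large the steps overshoot and the cost diverges; too small and convergence is needlessly slow.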
Multi feature linear regression
Considering multiple input variables:
y = β₀ + β₁x₁ + β₂x₂ + … + βₙxₙ
- β₀ is the y intercept (bias)
- β₁ … βₙ are the coefficients (weights) for each feature
- n is the number of features
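A minimal sketch of the multi-feature prediction: β₀ plus a weighted sum over the features. The function name and the example coefficients are hypothetical.

```python
def predict_multi(betas, features):
    """Multi-feature linear prediction: β₀ + β₁x₁ + ... + βₙxₙ."""
    beta0, weights = betas[0], betas[1:]
    # Bias term plus one weighted contribution per feature
    return beta0 + sum(beta * x for beta, x in zip(weights, features))


# Illustrative model: bias 1.0 and two feature weights
print(predict_multi([1.0, 2.0, 3.0], [4.0, 5.0]))  # 1 + 2*4 + 3*5 = 24.0
```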
Polynomial regression
Extension of linear regression that can fit curved relationships by adding higher-degree polynomial terms.
- Useful when a simple straight line doesn’t fit the data well
- Assumes a curvilinear relationship
- Outputs a numerical prediction based on one or more input features
y = β₀ + β₁ x¹ + β₂ x² + … + βₙ xⁿ
- Cost function takes β₀ … βₙ (the coefficient parameters), calculates the loss for each data point, and returns the MSE cost
- Gradient descent optimises the β₀ … βₙ parameters for the model, returning the coefficients with the lowest cost
- xⁿ (power of n) terms allow for non-linear relationships to be modelled
- If there are too many degrees (n is too big) it can cause overfitting
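The polynomial equation above can be sketched as a sum of coefficient-times-power terms. The function name and the quadratic example are illustrative.

```python
def predict_poly(betas, x):
    """Polynomial prediction: β₀ + β₁x + β₂x² + ... + βₙxⁿ."""
    # The index of each coefficient is the power its term raises x to
    return sum(beta * x ** power for power, beta in enumerate(betas))


# Illustrative quadratic: y = 1 + 0*x + 2*x²
print(predict_poly([1.0, 0.0, 2.0], 3.0))  # 1 + 0 + 2*9 = 19.0
```

The degree of the polynomial is simply the number of coefficients minus one, which is why adding too many coefficients makes overfitting easy.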
Logistic regression
Used for classification problems - it predicts categories instead of numbers.
- Uses s-shape curve to map values between 0 and 1 (closer to 1 indicates higher confidence in belonging to the positive class)
- Assumes classification problem
f₍w, b₎(x⁽ⁱ⁾) = g(wᵀx⁽ⁱ⁾ + b), where g(z) = 1 / (1 + e⁻ᶻ)
- The logistic cost function calculates the logistic loss (log loss) for each data point and returns the model’s overall logistic cost
- Gradient descent optimises w (weight) and b (bias) parameters for the model, returning w, b with the lowest logistic cost.
- Outputs a numeric value between 0 and 1: the predicted probability that the input belongs to the positive class (class 1) rather than class 0
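The sigmoid mapping above can be sketched directly from the formula g(z) = 1 / (1 + e⁻ᶻ). The function names and the single-feature example parameters are illustrative.

```python
import math


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^-z): maps any real z into (0, 1)."""
    return 1 / (1 + math.exp(-z))


def predict_logistic(w, b, x):
    """Predicted probability that input x belongs to the positive class."""
    return sigmoid(w * x + b)


print(sigmoid(0.0))                            # 0.5 - the decision boundary
print(predict_logistic(2.0, -4.0, 3.0) > 0.5)  # True: z = 2*3 - 4 = 2, above the boundary
```

A common convention is to classify an input as class 1 when the output exceeds 0.5, though the threshold can be moved to trade off false positives against false negatives.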