Syllabus Point
- Design, develop and apply ML regression models using an OOP to predict numeric values
Including:
- linear regression
- polynomial regression
- logistic regression
Linear Regression
Overfitting is where the model learns the training data too well, so noise and outliers distort its predictions
Underfitting is where the model is too simplistic and fails to learn the relevant patterns in the data
A good fit is where the model captures the underlying trend without memorising the noise
y = w*x + b
- For a given input (x), the model multiplies it by a weight (w) and adds a bias (b) to make a prediction
- Cost function: takes the w and b parameters, calculates the loss for each data point, and returns the mean squared error (MSE) cost
- Outputs a continuous numeric value (the predicted value)
- Gradient descent optimises the w, b parameters for the model, returning the w, b pair with the lowest cost
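The prediction step above can be sketched in an OOP style, as the syllabus point suggests. This is a minimal illustration; the class name `LinearModel` and the parameter values are hypothetical, not from the source.

```python
class LinearModel:
    """Single-feature linear regression model: y = w*x + b."""

    def __init__(self, w=0.0, b=0.0):
        self.w = w  # weight
        self.b = b  # bias

    def predict(self, x):
        # For a given input x, multiply by the weight and add the bias
        return self.w * x + self.b


model = LinearModel(w=2.0, b=1.0)
print(model.predict(3.0))  # 2*3 + 1 = 7.0
```

In practice the w and b values would come from training (gradient descent) rather than being set by hand.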
Loss
- Loss is the inaccuracy of the model in predicting a value
- It is the difference (error) between the actual value and the predicted value for each training example
Cost
- Cost is the loss aggregated over all data points, expressed as the Mean Squared Error (MSE)
- The cost function returns the MSE for a given set of w, b parameters. MSE is used because squaring the errors produces a smooth, bowl-shaped curve that gradient descent can follow
Cost Function
- Measures how well a regression model’s predictions match the actual target values
- Uses MSE
- Calculates the error for each data point (Error = Prediction - Actual)
- Aggregates errors across all points into a single scalar value
- Goal of training is to find model parameters that minimise the cost
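The steps above can be sketched as a plain function. This is a minimal, hand-rolled version for illustration; the function name `mse_cost` and the example data are hypothetical.

```python
def mse_cost(w, b, xs, ys):
    """Mean squared error cost for parameters w, b over a dataset."""
    n = len(xs)
    # Loss per point: (prediction - actual), squared; then averaged
    total = sum(((w * x + b) - y) ** 2 for x, y in zip(xs, ys))
    return total / n


xs = [1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0]   # generated by y = 2x + 1
print(mse_cost(2.0, 1.0, xs, ys))  # 0.0 - a perfect fit has zero cost
print(mse_cost(1.0, 0.0, xs, ys))  # positive cost for a worse fit
```

Note that some texts divide by 2n instead of n to simplify the derivative; the minimising parameters are the same either way.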
Gradient Descent
- Gradient Descent automates the optimisation of the w, b parameters by repeatedly calling the cost function and adjusting the parameters until the gradient (derivative) between calls is close to zero
- A near-zero gradient indicates that the w, b parameters can’t be optimised further
- This is the process that actually finds the line of best fit - the parameters with the lowest cost
- The size of the steps is the learning rate
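The update loop described above can be sketched as follows, using the standard partial derivatives of the MSE cost. The function name, data, learning rate, and iteration count are illustrative choices, not from the source.

```python
def gradient_step(w, b, xs, ys, lr):
    """One gradient-descent update for MSE cost on y = w*x + b."""
    n = len(xs)
    # Partial derivatives of MSE with respect to w and b
    dw = (2 / n) * sum(((w * x + b) - y) * x for x, y in zip(xs, ys))
    db = (2 / n) * sum(((w * x + b) - y) for x, y in zip(xs, ys))
    # The learning rate lr controls the size of each step
    return w - lr * dw, b - lr * db


xs = [1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0]          # true relationship: y = 2x + 1
w, b = 0.0, 0.0
for _ in range(5000):
    w, b = gradient_step(w, b, xs, ys, lr=0.05)
print(round(w, 2), round(b, 2))  # converges towards 2.0 and 1.0
```

If the learning rate is too large the steps overshoot and the cost diverges; too small and convergence is needlessly slow.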
Multi feature linear regression
Considering multiple input variables:
y = β₀ + β₁x₁ + β₂x₂ + … + βₙxₙ
- β₀ is the y intercept (bias)
- β₁ … βₙ are the coefficients (weights) for each feature
- n is the number of features
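A minimal sketch of the multi-feature prediction: β₀ plus a weighted sum over the features. The function name and the example coefficients are hypothetical.

```python
def predict_multi(betas, features):
    """Multi-feature linear prediction: β₀ + β₁x₁ + ... + βₙxₙ."""
    beta0, weights = betas[0], betas[1:]
    # Bias term plus one weighted contribution per feature
    return beta0 + sum(beta * x for beta, x in zip(weights, features))


# Illustrative model: bias 1.0 and two feature weights
print(predict_multi([1.0, 2.0, 3.0], [4.0, 5.0]))  # 1 + 2*4 + 3*5 = 24.0
```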
Polynomial regression
Extension of linear regression that can fit curved relationships by adding higher-degree polynomial terms.
- Useful when a simple straight line doesn’t fit the data well
- Assumes a curvilinear relationship
- Outputs a numerical prediction based on one or more input features
y = β₀ + β₁ x¹ + β₂ x² + … + βₙ xⁿ
- Cost function takes β₀ … βₙ (the coefficient parameters), calculates the loss for each data point, and returns the MSE cost
- Gradient descent optimises the β₀ … βₙ parameters for the model, returning the coefficients with the lowest cost
- xⁿ (power of n) terms allow for non-linear relationships to be modelled
- If there are too many degrees (n is too big) it can cause overfitting
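The polynomial equation above can be sketched as a sum of coefficient-times-power terms. The function name and the quadratic example are illustrative.

```python
def predict_poly(betas, x):
    """Polynomial prediction: β₀ + β₁x + β₂x² + ... + βₙxⁿ."""
    # The index of each coefficient is the power its term raises x to
    return sum(beta * x ** power for power, beta in enumerate(betas))


# Illustrative quadratic: y = 1 + 0*x + 2*x²
print(predict_poly([1.0, 0.0, 2.0], 3.0))  # 1 + 0 + 2*9 = 19.0
```

The degree of the polynomial is simply the number of coefficients minus one, which is why adding too many coefficients makes overfitting easy.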
Logistic regression
Used for classification problems - it predicts categories instead of numbers.
- Uses s-shape curve to map values between 0 and 1 (closer to 1 indicates higher confidence in belonging to the positive class)
- Assumes classification problem
f₍w, b₎(x⁽ⁱ⁾) = g(wᵀx⁽ⁱ⁾ + b), where g(z) = 1 / (1 + e⁻ᶻ)
- The logistic cost function calculates the logistic loss (log loss) for each data point and returns the model’s overall logistic cost
- Gradient descent optimises w (weight) and b (bias) parameters for the model, returning w, b with the lowest logistic cost.
- Outputs a numeric value between 0 and 1: the predicted probability that the input belongs to the positive class (class 1) rather than class 0
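The sigmoid mapping above can be sketched directly from the formula g(z) = 1 / (1 + e⁻ᶻ). The function names and the single-feature example parameters are illustrative.

```python
import math


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^-z): maps any real z into (0, 1)."""
    return 1 / (1 + math.exp(-z))


def predict_logistic(w, b, x):
    """Predicted probability that input x belongs to the positive class."""
    return sigmoid(w * x + b)


print(sigmoid(0.0))                            # 0.5 - the decision boundary
print(predict_logistic(2.0, -4.0, 3.0) > 0.5)  # True: z = 2*3 - 4 = 2, above the boundary
```

A common convention is to classify an input as class 1 when the output exceeds 0.5, though the threshold can be moved to trade off false positives against false negatives.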