Syllabus Point
- Explore models of training ML
Including:
- supervised learning
- unsupervised learning
- semi-supervised learning
- reinforcement learning
Explore models of training ML
Supervised Learning
The algorithm is trained on a labelled dataset, where each input is paired with the correct output. The model learns the relationship between inputs and outputs so it can make predictions on unseen data.
Overview
- The model learns the relationships to predict for unseen data
- Example: image classification - training a model to recognise pictures of cats and dogs, where each training image is labelled with the correct animal
- Example: spam detection
Classification
- Output variable is a category or label
- Model learns to assign input data to one of a fixed number of predefined classes
- Binary and multiclass classification
- Question: What kind of thing is this?
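As a sketch of what classification looks like in practice, here is a minimal 1-nearest-neighbour classifier: it assigns a query the label of the closest labelled training point. The feature values and labels are made up for illustration.

```python
# Minimal classification sketch: 1-nearest-neighbour on one feature.
# Training data and labels below are made up for illustration.

def nearest_neighbour(train, query):
    """Return the label of the training point closest to `query`."""
    return min(train, key=lambda pair: abs(pair[0] - query))[1]

# Labelled training data: (feature, class) pairs.
train = [(1.0, "cat"), (1.5, "cat"), (8.0, "dog"), (9.0, "dog")]

print(nearest_neighbour(train, 1.2))  # a point near the "cat" examples -> cat
print(nearest_neighbour(train, 8.5))  # a point near the "dog" examples -> dog
```

The model never outputs anything outside the fixed set of classes seen in the labelled data, which is the defining property of classification.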
Regression
- Output variable is a continuous value
- Model tries to predict a number not a category
- Relationship between input features and target output is expressed as mathematical function
- Question: What value or how much?
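Regression's "mathematical function" can be made concrete with the simplest case: fitting a line y = a*x + b by ordinary least squares. The data points are made up so the fit is exact.

```python
# Minimal regression sketch: fit y = a*x + b by ordinary least squares
# (closed form). Data is made up for illustration.

def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by variance of x.
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - a * mean_x
    return a, b

xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]          # exactly y = 2x + 1
a, b = fit_line(xs, ys)
print(a, b)                # slope 2.0, intercept 1.0
```

Unlike classification, the output is a continuous number, so the model can predict values (e.g. y at x = 2.5) that never appeared in the training data.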
Advantages
- Accurate results
- Clear objective
- Easy to evaluate
- Versatile (classification + regression)
Disadvantages
- Requires labelled data (costly, time consuming)
- Scales poorly to complex problems, since every training example must be labelled
- Risk of overfitting if data is too specific or noisy
- Can't adapt to new, unseen patterns
Unsupervised Learning
A framework/type of ML where algorithms learn patterns from exclusively unlabelled data. The model's goal is to identify hidden patterns or groupings in the data without knowing the correct outputs beforehand.
Overview
- Example: clustering customers based on purchasing behaviour - model tries to group similar customers without predefined categories
Common Tasks
- Clustering: Grouping similar data points based on characteristics; exclusive and overlapping clustering, hierarchical clustering, probabilistic clustering
- Association rule learning: Discovering relationships between different variables within the data
- Dimensionality reduction: Reducing the number of variables, while retaining important information
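To make the clustering task concrete, here is a minimal sketch of one-dimensional k-means with k = 2: points are assigned to their nearest centroid, then each centroid moves to the mean of its cluster, and the two steps repeat. The points and starting centroids are made up for illustration.

```python
# Minimal clustering sketch: one-dimensional k-means with k = 2.
# Points and starting centroids are made up for illustration.

def kmeans_1d(points, centroids, iters=10):
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[idx].append(p)
        # Update step: move each centroid to its cluster's mean
        # (assumes no cluster ends up empty, which holds for this data).
        centroids = [sum(c) / len(c) for c in clusters]
    return centroids, clusters

points = [1.0, 1.2, 0.8, 8.0, 8.5, 9.0]
centroids, clusters = kmeans_1d(points, [0.0, 10.0])
print(centroids)  # converges to [1.0, 8.5]
```

Note that no labels were involved: the two groups emerge purely from the structure of the data, which is exactly the unsupervised setting described above.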
Used For
- Exploratory data analysis
- Image recognition
- News categorisation
- Medical imaging
- Anomaly detection
Advantages
- Handling raw data
- Various applications
- Exploring hidden patterns
- Scalability/efficiency with large unstructured datasets
Disadvantages
- Computationally complex, and requires a high volume of training data
- Longer training times
- Results can be inaccurate, since there are no correct labels to check against
- Human intervention is required to interpret and validate the discovered groupings
- Not versatile
- Relies on possibly inaccurate assumptions about the structure of the data
Semi-Supervised Learning
Combines small amounts of labelled data with large amounts of unlabelled data. Useful when labelling data is expensive or time consuming.
Overview
- Improve model performance while reducing need for extensive labelled datasets
- Example: a few labelled images of cats/dogs and a large collection of unlabelled images
Self Training
- Train on labelled data, then use predictions on unlabelled data to create new labelled points
- These are added to the training data, and the model is retrained iteratively
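The self-training loop above can be sketched with a deliberately simple model: a one-feature threshold classifier whose confidence is a point's distance from the decision threshold. All data here is made up for illustration.

```python
# Minimal self-training sketch: a threshold classifier pseudo-labels its
# most confident unlabelled point each round, then is refitted.
# All data is made up for illustration.

def fit_threshold(labelled):
    """Threshold = midpoint of the two class means (classes 0 and 1)."""
    mean0 = sum(x for x, y in labelled if y == 0) / len([1 for _, y in labelled if y == 0])
    mean1 = sum(x for x, y in labelled if y == 1) / len([1 for _, y in labelled if y == 1])
    return (mean0 + mean1) / 2

labelled = [(1.0, 0), (9.0, 1)]        # tiny labelled set
unlabelled = [1.5, 2.0, 8.0, 8.5]      # larger unlabelled pool

while unlabelled:
    t = fit_threshold(labelled)
    # Pseudo-label the point we are most confident about
    # (the one furthest from the decision threshold).
    x = max(unlabelled, key=lambda p: abs(p - t))
    unlabelled.remove(x)
    labelled.append((x, 1 if x > t else 0))

print(sorted(labelled))  # all points below the threshold end up class 0
```

Labelling most-confident-first matters: an early mistake would be folded into the training set and reinforced on later iterations, which is the label-noise sensitivity listed under the disadvantages below.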
Co-Training
- Use two different learning algorithms with complementary views of the data
- Each algorithm uses its predictions on unlabelled data to help the other improve
- Builds on self-training, but uses multiple views of the data (each view is a different set of features)
- A model is trained for each view on the small labelled set
- Pseudo-labelling is applied to the unlabelled data
- Each model teaches the other using its highest-confidence pseudo-labels
- The models' predictions are combined into a single result
- The process repeats over multiple iterations
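The steps above can be sketched with two threshold classifiers, each seeing a different feature ("view") of the same examples and taking turns to contribute its most confident pseudo-label to the shared labelled pool. The examples and views are made up for illustration.

```python
# Minimal co-training sketch: two threshold classifiers on complementary
# views exchange their most confident pseudo-labels.
# Data and views are made up for illustration.

def fit_view(labelled, view):
    """Threshold on one feature: midpoint of the class-0 and class-1 means."""
    m0 = [x[view] for x, y in labelled if y == 0]
    m1 = [x[view] for x, y in labelled if y == 1]
    return (sum(m0) / len(m0) + sum(m1) / len(m1)) / 2

# Each example is (feature_A, feature_B); views 0 and 1 are complementary.
labelled = [((1.0, 10.0), 0), ((9.0, 90.0), 1)]
unlabelled = [(2.0, 15.0), (8.0, 85.0), (1.5, 12.0), (8.5, 88.0)]

while unlabelled:
    for view in (0, 1):                # each model takes a turn teaching
        if not unlabelled:
            break
        t = fit_view(labelled, view)
        # This view's most confident point: furthest from its threshold.
        x = max(unlabelled, key=lambda p: abs(p[view] - t))
        unlabelled.remove(x)
        labelled.append((x, 1 if x[view] > t else 0))

print(sorted(labelled))
```

The point of the two views is that each model's mistakes are (ideally) uncorrelated with the other's, so a point that one view finds ambiguous may be confidently labelled by the other.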
Advantages
- Cost effective
- Improved model with less labelled data
- Faster to prepare than fully supervised learning (far less manual labelling)
- More accurate than unsupervised
Disadvantages
- Choosing the right algorithm for the data can be difficult
- Sensitivity to label noise/errors
- Computational complexity
- Limited theoretical guarantees
Reinforcement Learning
The model learns to make decisions by interacting with an environment, and receiving feedback in the form of rewards or penalties. Learns to maximise the cumulative reward over time.
Key Principles
- Learns to maximise the cumulative reward over time
- Example: training an AI to play a game - learns which moves are rewarding and which are penalising based on outcomes
Components
- Agent: The learner or decision maker
- Environment: The external system the agent interacts with
- Actions: The choices available to the agent
- Rewards: Feedback from the environment based on the agent's actions
- Policy: The strategy the agent employs to determine its actions
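The five components can all be seen in a minimal tabular Q-learning sketch on a made-up five-state corridor: the agent chooses left/right actions, the environment pays reward 1 for reaching the last state, and the greedy policy is read off the learned Q-table.

```python
import random

# Minimal reinforcement learning sketch: tabular Q-learning on a
# five-state corridor (a made-up environment). Reaching state 4 ends
# the episode with reward 1; all other steps pay 0.

random.seed(0)
N_STATES, ACTIONS = 5, [-1, +1]              # environment states; actions: left, right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.2        # learning rate, discount, exploration

for _ in range(500):                         # episodes of interaction
    s = 0
    while s != N_STATES - 1:
        # Epsilon-greedy policy: mostly exploit, occasionally explore.
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0   # reward from the environment
        # Q-learning update: move toward reward plus discounted future value.
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)]
print(policy)  # the learned policy should step right (+1) in every state
```

Note there are no labelled examples anywhere: the reward signal alone, accumulated over many episodes, shapes the policy, which is why the discounted update makes the agent value actions with delayed payoffs.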
Why Use It
- Useful for tasks where it is difficult to define a specific goal or provide labelled data
- Allows machines to learn complex behaviours through interaction and feedback
- Focuses on long-term reward maximisation → appropriate for scenarios where actions have prolonged consequences
Advantages
- Solving complex problems
- Finding the best sequence of known actions to achieve a goal
- Automating complex tasks
- Reduced need for labelled data
- Flexibility and combination with other techniques
- Real-time learning
Disadvantages
- Not suitable for simple problems, where it is unnecessary overhead
- Requires significant amounts of data and computational power
- Effectiveness relies on the quality of the reward function
- Complexity in debugging and interpretation
- Sample inefficient: requires a large amount of interaction with the environment to learn effectively