#10 SVM


Support Vector Machine (SVM) is a supervised machine learning algorithm used for both classification and regression problems.
It works by finding the best boundary (hyperplane) that separates different classes.

SVM helps the model:

  • classify data using decision boundaries

  • maximize the margin between classes

  • handle high-dimensional data

  • work well for complex datasets

What is a Hyperplane?

A hyperplane is the boundary that separates classes.

For 2D data:

  • hyperplane → line

  • margin → distance between the boundary and the nearest points of each class

The optimal hyperplane has the maximum margin.

Margin in SVM

SVM tries to maximize the distance between:

  • nearest data points

  • and the decision boundary

These nearest points are called Support Vectors.
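As a rough illustration of the margin idea, the sketch below measures the perpendicular distance from each point to a given hyperplane and reports the smallest one. The 2D points, the line `x1 + x2 - 6 = 0`, and the helper name `distance_to_hyperplane` are made up for this example; it only evaluates a margin for a fixed boundary and does not search for the best one.

```python
import math

def distance_to_hyperplane(w, b, x):
    """Perpendicular distance from point x to the hyperplane w.x + b = 0."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(dot + b) / norm

# Hypothetical boundary and points (not from the article)
w, b = [1.0, 1.0], -6.0
points = [[1.0, 2.0], [2.0, 2.0], [4.0, 4.0], [5.0, 4.0]]

distances = [distance_to_hyperplane(w, b, p) for p in points]

# The margin is the distance to the nearest point(s)
margin = min(distances)
closest = [p for p, d in zip(points, distances) if math.isclose(d, margin)]

print(margin)   # about 1.414
print(closest)  # the points that sit on the margin
```

Here the nearest points are `[2, 2]` and `[4, 4]`, one from each side of the line; these play the role of support vectors for this boundary.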

SVM Equation

The hyperplane equation:

$$\mathbf{w^Tx + b = 0}$$

where

| Symbol | Meaning |
|---|---|
| w | weight vector |
| x | input features |
| b | bias |

Classification Rule

$$f(x)=w^Tx+b$$

prediction

$$\mathbf{f(x)\ge0 \rightarrow Class\ 1}$$

$$\mathbf{f(x)<0 \rightarrow Class\ 0}$$
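The rule above can be sketched in a few lines of plain Python. The weight vector and bias used here are arbitrary illustrative values, not learned from data, and the function names are only for this example.

```python
def decision_function(w, b, x):
    """Compute f(x) = w^T x + b for one input vector x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    """Return Class 1 if f(x) >= 0, otherwise Class 0."""
    return 1 if decision_function(w, b, x) >= 0 else 0

# Hypothetical weights and bias, chosen only to illustrate the rule
w, b = [2.0, -1.0], -1.0

print(classify(w, b, [3.0, 1.0]))  # f = 6 - 1 - 1 = 4  -> 1
print(classify(w, b, [0.0, 2.0]))  # f = 0 - 2 - 1 = -3 -> 0
```

A point that falls exactly on the hyperplane, where f(x) = 0, is assigned to Class 1 under this rule.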

Support Vectors

Support vectors are the closest points to the hyperplane.

They are important because:

  • they define the boundary

  • removing them changes the hyperplane
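A small 1D sketch can make this concrete. It assumes two separable classes on a number line, with every Class 0 point below every Class 1 point, so the maximum-margin boundary is simply the midpoint between the two closest points of opposite classes. The data values and the helper `max_margin_boundary` are assumptions for illustration.

```python
def max_margin_boundary(points, labels):
    """Midpoint between the closest points of two separable 1D classes.

    Assumes every class-0 point lies below every class-1 point.
    """
    highest_class0 = max(x for x, y in zip(points, labels) if y == 0)
    lowest_class1 = min(x for x, y in zip(points, labels) if y == 1)
    return (highest_class0 + lowest_class1) / 2

# Hypothetical data: support vectors are 3 (class 0) and 7 (class 1)
full = max_margin_boundary([2, 3, 7, 8], [0, 0, 1, 1])

# Removing a point that is not a support vector leaves the boundary alone
without_other = max_margin_boundary([3, 7, 8], [0, 1, 1])

# Removing a support vector moves the boundary
without_sv = max_margin_boundary([2, 7, 8], [0, 1, 1])

print(full, without_other, without_sv)  # 5.0 5.0 4.5
```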

Types of SVM

1. Linear SVM

Used when data is linearly separable.

Example:

  • spam vs not spam

  • pass vs fail

2. Non-Linear SVM

Used when data cannot be separated using a straight line.

Non-linear SVM uses the kernel trick.

Common kernel functions:

Linear Kernel -

$$\mathbf{K(x_{i},x_{j})=x_{i}^{T}x_{j}}$$

Polynomial Kernel -

$$\mathbf{K(x_{i},x_{j}) = (x_{i}^{T}x_{j} + c)^{d}}$$

c = constant, d = degree of polynomial

RBF (Gaussian Kernel) -

$$\mathbf{K(x_{i},x_{j}) = e^{-\gamma ||x_{i}-x_{j}||^{2}}}$$

γ controls the spread of the kernel: a larger γ makes each point's influence narrower and more local.
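The three kernels above translate directly into code. This is a minimal sketch using plain Python lists; the default values for `c`, `d`, and `gamma` are arbitrary choices for the example, not recommended settings.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def linear_kernel(xi, xj):
    return dot(xi, xj)

def polynomial_kernel(xi, xj, c=1.0, d=2):
    # c and d defaults are illustrative assumptions
    return (dot(xi, xj) + c) ** d

def rbf_kernel(xi, xj, gamma=0.5):
    # gamma default is an illustrative assumption
    squared_distance = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * squared_distance)

xi, xj = [1.0, 2.0], [3.0, 4.0]

print(linear_kernel(xi, xj))      # 1*3 + 2*4 = 11.0
print(polynomial_kernel(xi, xj))  # (11 + 1)^2 = 144.0
print(rbf_kernel(xi, xj))         # exp(-0.5 * 8) = exp(-4)
```

Note that the RBF kernel of a point with itself is always 1, and it decays toward 0 as the two points move apart.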

| Linear SVM | Non-Linear SVM |
|---|---|
| Used when data is linearly separable. | Used when data is not linearly separable. |
| Decision boundary is a straight line (2D) or hyperplane. | Decision boundary is curved or complex. |
| Does not require kernel functions. | Uses kernel functions (RBF, Polynomial, Sigmoid). |
| Faster and computationally simpler. | More computationally expensive. |
| Works well for simple datasets. | Works well for complex datasets. |

How Does SVM Handle Non-Linear Classification Problems?

When the data cannot be separated by a straight line, SVM uses the Kernel Trick.

The kernel function implicitly maps the data from a lower-dimensional space to a higher-dimensional space where the classes can become linearly separable. It does this without computing the new coordinates: it only evaluates inner products between pairs of points in that space. SVM then finds an optimal hyperplane in that higher-dimensional space.
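To see why a kernel can stand in for a higher-dimensional mapping, the sketch below compares two ways of getting the same number for 2D inputs: mapping both points explicitly into 3D and taking a dot product there, versus evaluating a degree-2 polynomial kernel (with c = 0) on the original points. The names `feature_map` and `poly2_kernel` and the sample points are assumptions for this illustration.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def feature_map(x):
    """Explicit degree-2 map for 2D input: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2]

def poly2_kernel(x, z):
    """Polynomial kernel with c = 0 and d = 2, computed in the original space."""
    return dot(x, z) ** 2

x, z = [1.0, 2.0], [3.0, 4.0]

explicit = dot(feature_map(x), feature_map(z))  # dot product in 3D
implicit = poly2_kernel(x, z)                   # same value, no mapping needed

print(explicit, implicit)  # both are 121 up to floating-point rounding
```

The kernel gives the inner product in the mapped space while only ever touching the original 2D vectors, which is what keeps the approach practical when the mapped space is very large.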

Steps:

  1. Non-linear data is given.

  2. Apply a kernel function (RBF, Polynomial, Sigmoid, etc.).

  3. Transform data into a higher dimension.

  4. Find the maximum-margin hyperplane.

  5. Use this hyperplane for classification.
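The steps above can be sketched with an explicit mapping on a toy problem. The 1D data below is hypothetical: Class 1 sits between two groups of Class 0, so no single threshold on x separates them. Lifting each point to (x, x²) makes the classes separable by a horizontal line. A real kernel SVM would not build these coordinates explicitly; the explicit `lift` function is used here only to make the transformation visible.

```python
def lift(x):
    """Map a 1D point into 2D: x -> (x, x^2)."""
    return (x, x * x)

# Step 1: non-linear data (hypothetical), class 1 lies in the middle
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [0, 0, 1, 1, 1, 0, 0]

# Steps 2-3: transform the data into a higher dimension
lifted = [lift(x) for x in xs]

# Step 4: in the lifted space the second coordinate separates the classes;
# place the boundary midway between the closest values of each class
inner_max = max(p[1] for p, y in zip(lifted, ys) if y == 1)  # 1
outer_min = min(p[1] for p, y in zip(lifted, ys) if y == 0)  # 4
threshold = (inner_max + outer_min) / 2                      # 2.5

# Step 5: classify using the boundary found in the lifted space
def predict(x):
    return 1 if lift(x)[1] < threshold else 0

predictions = [predict(x) for x in xs]
print(predictions)  # [0, 0, 1, 1, 1, 0, 0]
```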

Diagram

Refer to the diagram in the Non-Linear SVM section above.

Example - SVM Classification Step by Step

Suppose we have this dataset:

| Student | Study Hours | Result |
|---|---|---|
| 1 | 2 | Fail |
| 2 | 3 | Fail |
| 3 | 7 | Pass |
| 4 | 8 | Pass |

We want to classify a new student who studies: 6 hours

Step 1 — Plot the Data

Classes:

Fail → Class 0 || Pass → Class 1

Step 2 — Identify Support Vectors and Find the Hyperplane

SVM tries to find the best boundary between the classes.

The support vectors are the closest points from opposite classes:

| Point | Class |
|---|---|
| 3 | Fail |
| 7 | Pass |

The decision boundary lies halfway between the support vectors.

$$\mathbf{Midpoint = \frac{3+7}{2} = \frac{10}{2} = 5}$$

So the separating boundary is

$$\mathbf{x=5}$$

Step 3 — Write Hyperplane Equation

General SVM equation:

$$f(x)=w^Tx+b$$

Since this is 1D data:

$$\mathbf{wx+b=0}$$

Step 4 — Find w and b

Boundary:

$$\mathbf{x=5}$$

Rewrite:

$$\mathbf{x-5=0}$$

Compare with:

$$\mathbf{wx+b=0}$$

we get:

$$\mathbf{w=1, b = -5}$$
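This choice of w and b is one of many equivalent ways to write the same boundary: multiplying both by any positive constant still gives x = 5. As a supplementary note, the standard SVM formulation removes this freedom by labelling the classes -1 and +1 (an assumption that differs from the 0/1 labels used here) and requiring f(x) = -1 and f(x) = +1 at the two support vectors:

$$3w + b = -1, \qquad 7w + b = +1$$

Subtracting the first equation from the second gives

$$4w = 2 \;\Rightarrow\; w = \tfrac{1}{2}, \qquad b = -1 - 3w = -\tfrac{5}{2}$$

Both forms place the boundary at x = 5 and classify every point identically. With this scaling the margin equals 1/|w| = 2, which matches the distance from the boundary to each support vector.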

Step 5 — Final Decision Function

$$\mathbf{f(x)=x-5}$$

Step 6 — Classification

Rule

$$\mathbf{f(x)\ge0 \rightarrow Pass}$$

$$\mathbf{f(x)<0 \rightarrow Fail}$$

Step 7 — Predict New Student

Suppose:

$$\mathbf{x=6}$$

$$\mathbf{f(6)=6-5} =1$$

Since f(6) = 1 ≥ 0, the rule gives:

Prediction: Pass
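The whole worked example fits in a few lines of plain Python. This sketch follows the steps above by hand rather than training a model; the variable and function names are chosen for this illustration.

```python
# Dataset from the example: study hours and result (0 = Fail, 1 = Pass)
hours = [2, 3, 7, 8]
result = [0, 0, 1, 1]

# Support vectors: the closest points from opposite classes
sv_fail = max(h for h, r in zip(hours, result) if r == 0)  # 3
sv_pass = min(h for h, r in zip(hours, result) if r == 1)  # 7

# Boundary halfway between the support vectors, written as w*x + b = 0
boundary = (sv_fail + sv_pass) / 2  # 5.0
w, b = 1.0, -boundary               # f(x) = x - 5

def f(x):
    """Decision function for the 1D example."""
    return w * x + b

def predict(x):
    """Apply the classification rule: f(x) >= 0 means Pass."""
    return "Pass" if f(x) >= 0 else "Fail"

print(f(6), predict(6))  # 1.0 Pass
```

Calling `predict(4)` returns "Fail", since f(4) = -1 falls on the other side of the boundary.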

Advantages of SVM

  • effective in high-dimensional data

  • works well with small datasets

  • powerful for classification

  • relatively robust against overfitting, because maximizing the margin acts as a form of regularization

Disadvantages

  • slow for very large datasets

  • difficult to tune parameters

  • harder to interpret

  • sensitive to noisy data

Applications of SVM

  • Face detection

  • Image classification

  • Spam filtering

  • Bioinformatics

  • Handwriting recognition

Working Principle of SVM

The working principle of SVM is to find an optimal hyperplane that separates different classes with the maximum possible margin.

  • Plot the training data points in a feature space.

  • Consider the hyperplanes that can separate the classes.

  • Measure the margin of each one: its distance to the nearest training point.

  • Select the hyperplane with the largest margin.

  • The data points closest to the hyperplane are called support vectors.

  • New data points are classified based on which side of the hyperplane they fall.
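For 1D data, these steps can be imitated with a brute-force search over candidate boundaries. This is only a teaching sketch: a real SVM solves an optimization problem instead of enumerating candidates, and the candidate grid (0 to 10 in steps of 0.5) is an arbitrary assumption.

```python
# Training data: study hours with labels (0 = Fail, 1 = Pass)
hours = [2, 3, 7, 8]
result = [0, 0, 1, 1]

def separates(boundary):
    """True if every class-1 point is at or above the boundary and every class-0 point is below it."""
    return all((h >= boundary) == (r == 1) for h, r in zip(hours, result))

def margin(boundary):
    """Distance from the boundary to the nearest training point."""
    return min(abs(h - boundary) for h in hours)

# Candidate boundaries: 0.0, 0.5, ..., 10.0 (illustrative grid)
candidates = [0.5 * i for i in range(21)]

# Keep only boundaries that separate the classes, then pick the widest margin
valid = [c for c in candidates if separates(c)]
best = max(valid, key=margin)

# Support vectors are the training points closest to the chosen boundary
support_vectors = [h for h in hours if abs(h - best) == margin(best)]

print(best, margin(best), support_vectors)  # 5.0 2.0 [3, 7]
```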

Python Example

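from sklearn.svm import SVC

# Features: study hours for each student
X = [[2], [3], [7], [8]]

# Labels: 0 = Fail, 1 = Pass
y = [0, 0, 1, 1]

# Create a linear SVM classifier
model = SVC(kernel="linear")

# Train on the four students
model.fit(X, y)

# Predict the result for a student who studies 6 hours
prediction = model.predict([[6]])

print(prediction)  # [1], i.e. Pass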

Conclusion

Support Vector Machine (SVM) is a powerful supervised machine learning algorithm mainly used for classification problems. It works by finding the optimal hyperplane that separates different classes with the maximum possible margin.

SVM performs well on high-dimensional datasets and is highly effective for tasks like image classification, spam detection, handwriting recognition, and text classification. With the help of kernel functions, SVM can also solve complex non-linear problems.

Although SVM can be slow on very large datasets, it remains a widely used classification algorithm that is often highly accurate on small and medium-sized datasets.
