Support Vector Machines

Jyothi Panuganti
6 min read · Feb 7, 2020

An explanation of the SVM algorithm and its hyperplane.

The original SVM algorithm was invented by

Vladimir Vapnik

Vladimir Vapnik (Russia), born 6 December 1936 (age 83), is one of the main developers of the Vapnik–Chervonenkis theory of statistical learning. The theory is a form of computational learning theory, which attempts to explain the learning process from a statistical point of view. The invention of SVMs happened when Bernhard Boser decided to implement Vladimir’s algorithm within three months. After some initial success of the linear algorithm, Vladimir suggested introducing products of features; instead, the “kernel trick” of the ‘potential functions’ algorithm was used. Vladimir initially resisted the idea because the inventors of the ‘potential functions’ algorithm (Aizerman, Braverman, and Rozonoer) were from a competing team at his institute back in the 1960s in Russia!

Corinna Cortes

The current standard version (with a soft margin) was proposed by Corinna Cortes (Denmark), born 31 March 1961, a Danish computer scientist known for her contributions to machine learning. She is currently the Head of Google Research, New York. Cortes is a recipient of the Paris Kanellakis Theory and Practice Award for her work on the theoretical foundations of support vector machines.

Other inventors also made their contributions to the invention of SVMs…

Coming to SVM: the name Support Vector Machines suggests vectors that support the hyperplane, almost in the physical sense of the term.

At this point you will probably ask the question: “What is a hyperplane?” (We will get to know it in the study below.)

What is a Hyperplane? “A hyperplane is a subspace whose dimension is one less than that of its ambient space.” For example: if a space is 3-dimensional, then its hyperplanes are the 2-dimensional planes, while if a space is a 2-dimensional plane, its hyperplanes are the 1-dimensional lines.

The above definition of a hyperplane brings in two words which we loosely know, but I want to give an accurate meaning for them, so here is my attempt:

Dimension: The dimension is the number of independent parameters or coordinates that are needed to define the position of a point that is constrained to be on an object.

Source: Wikipedia

Ambient Space: An ambient space, or ambient configuration space, is the space surrounding an object.

Now, without further delay, we are ready with our minimal definitions; let’s get into Support Vector Machines.

What is SVM?

SVMs are a type of supervised machine learning algorithm that analyzes data for classification and regression. Support Vector Machines are usually meant for classification, regression, and outlier detection. In the graph below, we can see the hyperplane that differentiates between the two classes; it is the frontier that segregates them.
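As a quick illustration, here is a minimal sketch of those three uses. It assumes scikit-learn (my own choice of library for these examples, not something prescribed by the algorithm itself):

    # Classification, regression and outlier detection with SVMs in scikit-learn.
    from sklearn.svm import SVC, SVR, OneClassSVM
    from sklearn.datasets import make_classification

    # A small toy dataset: 100 samples, 4 features, 2 classes.
    X, y = make_classification(n_samples=100, n_features=4, random_state=0)

    clf = SVC().fit(X, y)       # classification
    reg = SVR().fit(X, y)       # regression (the 0/1 labels double as a toy target)
    det = OneClassSVM().fit(X)  # outlier detection (unsupervised: no labels)

    print(clf.predict(X[:3]))   # predicted classes
    print(reg.predict(X[:3]))   # predicted continuous values
    print(det.predict(X[:3]))   # +1 for inliers, -1 for outliers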

How does SVM work?

The basic principle behind the working of SVM is simple: create a hyperplane that separates the dataset into classes. Suppose that, for a given dataset, you have to classify stars and triangles. Your main goal is to create a line that classifies the data into two classes, creating a distinction between stars and triangles. In the figure below, one can hypothesize a line that separates the two classes, but there can be many lines that do this job. Therefore, there is no single line that everyone can agree on to perform this task.

According to SVM, we have to find the points that lie closest to both the classes. These points are known as support vectors. In the next step, we find the proximity between our dividing plane and the support vectors. The distance between the points and the dividing line is known as margin. The aim of an SVM algorithm is to maximize this very margin. When the margin reaches its maximum, the hyperplane becomes the optimal one.

The SVM model tries to enlarge the distance between the two classes by creating a well-defined decision boundary.

If the data is linearly separable, we can select two parallel hyperplanes that separate the two classes of data, so that the distance between them is as large as possible. The region bounded by these hyperplanes is called the “margin”, and the maximum-margin hyperplane is the hyperplane that lies halfway between them.
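To make this concrete, here is a minimal sketch that fits a maximum-margin linear SVM and reads off the support vectors and the margin width. It assumes scikit-learn; the toy blobs and the large C value (which approximates a hard margin) are my own choices:

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.datasets import make_blobs

    # Two well-separated clusters of points.
    X, y = make_blobs(n_samples=40, centers=2, random_state=6)

    # A large C makes the soft-margin SVM behave like a hard-margin one.
    model = SVC(kernel="linear", C=1000.0).fit(X, y)

    # The points that lie closest to the dividing hyperplane:
    print("support vectors:\n", model.support_vectors_)

    # For a linear kernel, the width of the margin is 2 / ||w||.
    w = model.coef_[0]
    print("margin width:", 2.0 / np.linalg.norm(w))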

Dealing with non-linear and inseparable planes:

Some problems can’t be solved using a linear hyperplane, as shown in the figure below.

In such a situation, SVM uses the kernel trick to transform the input space into a higher-dimensional space, as shown in the figure. The data points are then plotted against the x-axis and the new z-axis, and you can now easily segregate these points using linear separation.
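Here is a small sketch of that idea. It assumes scikit-learn and NumPy, and the concentric-circles data is my own toy example: two rings that are not linearly separable in 2-D become separable by a flat plane once we add a third feature z = x^2 + y^2:

    import numpy as np
    from sklearn.datasets import make_circles
    from sklearn.svm import SVC

    # Two concentric rings: not linearly separable in the original 2-D space.
    X, y = make_circles(n_samples=100, factor=0.3, noise=0.05, random_state=0)

    # Lift the data into 3-D by adding z = x^2 + y^2 (squared distance from origin).
    z = (X ** 2).sum(axis=1)
    X3 = np.column_stack([X, z])

    # A plain linear SVM now separates the two rings almost perfectly.
    model = SVC(kernel="linear").fit(X3, y)
    print("accuracy in the lifted space:", model.score(X3, y))

The kernel trick achieves the same effect without ever computing the lifted coordinates explicitly: it only needs dot products in the higher-dimensional space.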

SVM Kernels

The SVM algorithm is implemented in practice using kernel methods. A kernel transforms the input data space into the required form; SVM uses a technique called the kernel trick.

Here, the kernel takes a low-dimensional input space and transforms it into a higher-dimensional space. In other words, it turns non-separable problems into separable problems by adding more dimensions.

  • Linear Kernel: A linear kernel can be used as a normal dot product between any two given observations.

Example: k(x, xi) = sum(x * xi)

  • Polynomial Kernel: A polynomial kernel is a more generalized form of the linear kernel. The polynomial kernel can distinguish curved or nonlinear input spaces.

Example: k(x, xi) = (1 + sum(x * xi))^d, where ‘d’ is the degree of the polynomial.

  • Radial Basis Function Kernel: The radial basis function (RBF) kernel is a popular kernel function commonly used in support vector machine classification. RBF can map an input space into an infinite-dimensional space.

Example: k(x, xi) = exp(-gamma * sum((x - xi)^2)), where gamma is a parameter greater than 0, commonly taken between 0 and 1.
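For reference, the three kernel formulas above can be written as plain NumPy functions. The default values of d and gamma below are my own illustrative choices:

    import numpy as np

    def linear_kernel(x, xi):
        # k(x, xi) = sum(x * xi), i.e. an ordinary dot product
        return np.sum(x * xi)

    def polynomial_kernel(x, xi, d=3):
        # k(x, xi) = (1 + sum(x * xi))^d
        return (1 + np.sum(x * xi)) ** d

    def rbf_kernel(x, xi, gamma=0.1):
        # k(x, xi) = exp(-gamma * sum((x - xi)^2))
        return np.exp(-gamma * np.sum((x - xi) ** 2))

    x, xi = np.array([1.0, 2.0]), np.array([0.5, 1.5])
    print(linear_kernel(x, xi), polynomial_kernel(x, xi), rbf_kernel(x, xi))

In scikit-learn, the corresponding options are SVC(kernel="linear"), SVC(kernel="poly", degree=d) and SVC(kernel="rbf", gamma=gamma).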

Advantages of SVM:

  • They use less memory because only a subset of the training points is used in the decision phase.
  • The SVM classifier offers good accuracy and performs faster prediction compared to the Naive Bayes algorithm.
  • SVM works well with a clear margin of separation and in high-dimensional spaces.

Disadvantages of SVM:

  • SVM is not a good fit for large datasets because of its high training time; it also takes more time to train than Naive Bayes.
  • It works poorly with overlapping classes and is also sensitive to the type of kernel used.

Usage of SVM in the Real World

  • Face detection: SVMs classify parts of an image as face and non-face and create a square boundary around each face.

  • Text and hypertext categorization: SVMs allow text and hypertext categorization for both inductive and transductive models. They categorize on the basis of a generated score, which is then compared with a threshold value.

  • Classification of images: the use of SVMs provides better search accuracy for image classification. They provide better accuracy in comparison to traditional query-based searching techniques.

  • Bioinformatics: this includes protein classification and cancer classification. We use SVMs to classify genes, to classify patients on the basis of their genes, and for other biological problems.

  • Handwriting recognition: SVMs are widely used to recognize handwritten characters.


Have a wonderful read!

