Emotion Detection Using a CNN, a Deep Learning Model

Deep learning is a type of machine learning in which a model learns to perform classification tasks directly from images, text, or sound.
Deep learning is usually implemented using a neural network.
The term “deep” refers to the number of layers in the network—the more layers, the deeper the network.

Link to download the dataset: https://www.kaggle.com/shawon10/ckplus

How CNNs Work

A convolutional neural network can have hundreds of layers, and each layer learns to detect different features of an image.
Filters are applied to each training image at different resolutions, and the output of each convolved image is used as the input to the next layer.
The early filters detect very simple features, such as brightness and edges, and deeper layers extract increasingly complex features.
Like other neural networks, a CNN is composed of an input layer, an output layer, and many hidden layers in between.
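
To make the idea of a filter concrete, here is a minimal sketch that applies one hand-written edge filter to a toy image using base MATLAB. The image and the Sobel kernel are assumptions chosen for illustration; a CNN learns its filter weights during training rather than using fixed ones.

```matlab
% Toy 6-by-6 grayscale image: dark left half (0), bright right half (1)
img = [zeros(6,3), ones(6,3)];

% Hand-written vertical-edge filter (Sobel kernel), used here only as an example
sobelX = [-1 0 1; -2 0 2; -1 0 1];

% conv2 flips the kernel, so rotate it by 180 degrees to get the
% sliding-window (cross-correlation) behavior of a CNN layer.
% 'valid' keeps only positions where the filter fits inside the image.
featureMap = conv2(img, rot90(sobelX, 2), 'valid');

% The response is nonzero only in the columns that straddle the edge
disp(featureMap)
```

The resulting 4-by-4 feature map is zero in the flat regions and positive where the window covers the dark-to-bright transition, which is the sense in which a filter "detects" an edge.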

Image Source https://in.mathworks.com/solutions/deep-learning/convolutional-neural-network.html

Workflow

There are 7 steps, as shown in the figure below. For an explanation, click here.

Create Image Datastore

Save all images in a single folder, with one subfolder per class. The subfolder names are used as the labels.

imds = imageDatastore('gesture', ...
        'IncludeSubfolders',true,'LabelSource','foldernames');

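Split Data for Training and Validation

% Use 80% of the images of each label for training and the rest for validation
[imdstrain, imdsvalid] = splitEachLabel(imds, 0.8, 'randomized');

% Show the number of images per label
CountLabel = countEachLabel(imds)

% Read one image and check its size; it must match the network input size
aa = read(imds);
size(aa)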

Define the Network Layers

Image Input Layer: An ImageInputLayer is where you specify the image size (height, width, and number of channels).
Convolutional Layer: Applies the CNN filters; its inputs are the filter size and the number of filters.
Batch Normalization Layer: Batch normalization layers normalize the activations and gradients propagating through a network.
ReLU Layer: A rectified linear unit; it sets negative feature values to 0.
Max-Pooling Layer: Used for downsampling and to reduce redundant features.
Fully Connected Layer: Connects every neuron to all outputs of the previous layer; its output size is the number of classes.
Softmax Layer: Converts the outputs of the fully connected layer into a probability for each class.
Classification Layer: Uses the softmax probabilities to assign the image to a class.
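
The ReLU, softmax, and classification steps described above can be sketched by hand in base MATLAB. The numbers below are made-up values for illustration and do not come from the trained network.

```matlab
% Hypothetical feature values coming out of a convolutional layer
features = [2.0 -1.0 0.5 -3.0];

% ReLU: negative values become 0, positive values pass through unchanged
reluOut = max(features, 0);

% Hypothetical scores from a fully connected layer with three classes
scores = [2.0; 1.0; 0.1];

% Softmax: exponentiate and normalize so the outputs sum to 1
% (subtracting the maximum first avoids overflow for large scores)
expScores = exp(scores - max(scores));
probs = expScores / sum(expScores);

% Classification: pick the class with the highest probability
[~, predictedIdx] = max(probs);

disp(reluOut)
disp(probs')
disp(predictedIdx)
```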

Define the convolutional neural network architecture.

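% Number of output classes, taken from the folder names
numClasses = numel(categories(imds.Labels));

layers = [
    imageInputLayer([48 48 1])      % 48-by-48 grayscale, the size used in the webcam loop
    convolution2dLayer(3,16,'Padding',1)
    batchNormalizationLayer
    reluLayer
    maxPooling2dLayer(2,'Stride',2)
    convolution2dLayer(3,32,'Padding',1)
    batchNormalizationLayer
    reluLayer
    maxPooling2dLayer(2,'Stride',2)
    convolution2dLayer(3,64,'Padding',1)
    batchNormalizationLayer
    reluLayer
    fullyConnectedLayer(numClasses) % one output per class
    softmaxLayer
    classificationLayer];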
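
As a rough check on this architecture, the sketch below traces the spatial size of the feature maps through the convolution and pooling layers. It assumes a 48-by-48 input, which is the size the webcam loop resizes frames to; change `sz` if your images differ. The helper `outSize` is a name introduced here for illustration.

```matlab
% Assumed input height/width (the webcam loop resizes frames to 48-by-48)
sz = 48;

% Output size of a convolution or pooling layer:
% floor((in + 2*pad - filter) / stride) + 1
outSize = @(in, f, pad, stride) floor((in + 2*pad - f) / stride) + 1;

sz = outSize(sz, 3, 1, 1);   % 3x3 convolution, padding 1: size unchanged
sz = outSize(sz, 2, 0, 2);   % 2x2 max pooling, stride 2: size halved
sz = outSize(sz, 3, 1, 1);   % 3x3 convolution, padding 1
sz = outSize(sz, 2, 0, 2);   % 2x2 max pooling, stride 2
sz = outSize(sz, 3, 1, 1);   % 3x3 convolution, padding 1

% The last convolution has 64 filters, so the fully connected layer
% receives sz*sz*64 values per image
numFeatures = sz * sz * 64;
fprintf('Final feature map: %d-by-%d-by-64 (%d values)\n', sz, sz, numFeatures);
```

With Deep Learning Toolbox, `analyzeNetwork(layers)` reports the same per-layer sizes interactively.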

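Define Options for Training

The initial learning rate controls the size of each weight update. If it is too low, training is slow and may not reach good accuracy within the given number of epochs. If it is too high, training can become unstable or diverge. Tune it while watching the validation accuracy, and keep an eye on overfitting (training accuracy much higher than validation accuracy).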
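
The effect of the learning rate can be seen on a toy problem. The sketch below runs plain gradient descent (not the 'sgdm' solver, and not the CNN) on f(w) = w^2 with three illustrative learning rates; the values are assumptions chosen to show slow, fast, and divergent behavior.

```matlab
% Minimize f(w) = w^2 (gradient 2*w), starting from w = 1
learnRates = [0.01, 0.4, 1.1];   % illustrative values only
numSteps = 20;
finalW = zeros(size(learnRates));

for k = 1:numel(learnRates)
    w = 1;
    for step = 1:numSteps
        w = w - learnRates(k) * 2 * w;   % gradient descent update
    end
    finalW(k) = w;
end

% Distance from the minimum (w = 0) after 20 steps for each learning rate
disp(abs(finalW))
```

The smallest rate has barely moved toward the minimum, the middle rate has converged, and the largest rate has moved further away than where it started.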

options = trainingOptions('sgdm', ...
    'InitialLearnRate',0.01, ...
    'MaxEpochs',10, ...
    'Shuffle','every-epoch', ...
    'ValidationData',imdsvalid, ...   % needed for validation loss and accuracy
    'ValidationFrequency',10, ...
    'Verbose',false, ...
    'Plots','training-progress');

Train Network Using Training Data

The trained CNN is stored in the variable convnet.

convnet = trainNetwork(imdstrain,layers,options);

The training progress plot shows the mini-batch loss and accuracy and the validation loss and accuracy.

Calculate Accuracy Using the Validation Dataset

YPred = classify(convnet,imdsvalid);
YValidation = imdsvalid.Labels;

accuracy = sum(YPred == YValidation)/numel(YValidation)

Plot Confusion Matrix

The first input is the true labels and the second input is the predicted labels.

The confusion matrix shows how many samples are correctly classified and how many are misclassified. Green cells show correctly classified samples and pink cells show misclassified samples.
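
The plotting call itself is not shown in this section. A minimal sketch, assuming `plotconfusion` from Deep Learning Toolbox is the intended function, is given below on toy labels so that it runs without the dataset; `trueLabels` and `predLabels` are stand-ins for `YValidation` and `YPred`.

```matlab
% Toy labels standing in for YValidation (true) and YPred (predicted)
trueLabels = categorical({'happy'; 'happy'; 'sad'; 'sad'; 'sad'});
predLabels = categorical({'happy'; 'sad'; 'sad'; 'sad'; 'happy'});

% Count matrix: rows are true classes, columns are predicted classes
numClasses = numel(categories(trueLabels));
counts = accumarray([double(trueLabels), double(predLabels)], 1, ...
    [numClasses, numClasses]);

% Correct predictions sit on the diagonal
accuracy = trace(counts) / sum(counts(:));
disp(counts)
disp(accuracy)

% Plot the confusion matrix: true labels first, predicted labels second
plotconfusion(trueLabels, predLabels);
```

For the trained network, the corresponding call would be `plotconfusion(YValidation, YPred)`.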

Read an image from datastore and predict the class

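a = read(imdsvalid);                        % Read the next image from the validation datastore
predictedClass = classify(convnet, a)       % Avoid naming a variable "class"; it shadows a built-in function
imshow(a)
title(string(predictedClass))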

Interface with Webcam or IP Camera

% camera = webcam(1);   % webcam
% camera = ipcam('http://192.168.225.88:8080/video');   % paste the same URL as shown in the IP Webcam app
clear camera
camera = webcam;

while true
    im = camera.snapshot;                    % Take a picture
    picture = rgb2gray(im);                  % Convert to grayscale
    picture = imresize(picture, [48,48]);    % Resize the picture

    label = classify(convnet, picture);      % Classify the picture

    image(im);                               % Show the picture
    title(char(label));                      % Show the label
    drawnow;
end
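
The preprocessing inside the loop can be tried without a camera. The sketch below uses a random synthetic frame in place of `camera.snapshot` (the 480-by-640 frame size is an assumption) and guards the grayscale conversion so that a camera that already returns single-channel frames does not cause an error.

```matlab
% Synthetic RGB frame standing in for camera.snapshot (assumed 480-by-640)
frame = uint8(randi([0 255], 480, 640, 3));

% Convert to grayscale only if the frame has three color channels
if size(frame, 3) == 3
    gray = rgb2gray(frame);
else
    gray = frame;
end

% Resize to 48-by-48, the size used in the loop above
picture = imresize(gray, [48 48]);

% The result is a single-channel 48-by-48 image
disp(size(picture))
```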

To understand deep learning and emotion detection using a CNN, watch the recorded webinar (click here).

For data science with MATLAB click the playlist link.