Face Parsing for Mobile AR Applications


CVPR 2017.


Project n°4: Person detection

 

 

 

The goal of the project is to detect moving persons in a video using deep learning methods. We teach a neural network to detect persons in images by learning from an annotated database. More specifically, we will use the SSD method with the TensorFlow module in Python.

 

 

 

I Convolutional networks

 

 

This method uses convolutional neural networks, that is to say neural networks that have convolutional layers:

 

A convolutional layer is composed of groups of neurons, each group processing only a certain window of the input image. A backpropagation algorithm is used to adjust the activation filters that each neuron applies to the window it is assigned to.

 

 

A pooling (aggregation) layer is often applied after a convolutional layer to reduce the dimension of its output.

 

 

The outputs are then connected to a fully connected layer (a layer in which each neuron is connected to every output of the previous layer) to allow a global processing of the information.
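
As an illustration, here is a minimal sketch of these three layer types using tf.keras (an illustrative toy model, not the project's code; the 64x64 RGB input and the layer sizes are arbitrary assumptions):

import tensorflow as tf

inputs = tf.keras.Input(shape=(64, 64, 3))      # a 64x64 RGB image
# Convolutional layer: 32 filters, each processing a 3x3 window.
x = tf.keras.layers.Conv2D(32, (3, 3), activation='relu')(inputs)
# Pooling layer: keeps the maximum over each 2x2 window, halving the size.
x = tf.keras.layers.MaxPooling2D((2, 2))(x)
# Flatten the feature maps into a single vector.
x = tf.keras.layers.Flatten()(x)
# Fully connected layer: each neuron sees every previous output.
outputs = tf.keras.layers.Dense(10, activation='softmax')(x)
model = tf.keras.Model(inputs, outputs)
model.summary()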

 

 

 

 

 

Scheme of a convolutional layer (in blue) applied on an image (in pink)

 

 

 

 

This type of network is particularly well suited to image recognition, as it processes information at a local scale (ideal for edge detection, for example), which makes the processing both faster and better adapted to images.

 

II VGG network

 

 

 

The VGG network is a multi-layer convolutional network that aims to predict the probability of presence of object classes in an image. Convolutional layers are applied to the input image, followed by a pooling layer, then convolutional layers are applied again, and so on.

 

After several such stages, each reducing the dimension of the output, fully connected layers are applied and finally a classification layer gives the output probability for each class of object.
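
For reference, the VGG-16 layer pattern can be sketched as follows with tf.keras (an illustrative reconstruction of the architecture; in practice the network is loaded with pre-trained weights):

import tensorflow as tf

def vgg_block(x, filters, n_convs):
    # One VGG stage: n_convs 3x3 convolutions, then a 2x2 pooling.
    for _ in range(n_convs):
        x = tf.keras.layers.Conv2D(filters, (3, 3), padding='same',
                                   activation='relu')(x)
    return tf.keras.layers.MaxPooling2D((2, 2))(x)

inputs = tf.keras.Input(shape=(224, 224, 3))
x = inputs
for filters, n_convs in [(64, 2), (128, 2), (256, 3), (512, 3), (512, 3)]:
    x = vgg_block(x, filters, n_convs)
x = tf.keras.layers.Flatten()(x)
x = tf.keras.layers.Dense(4096, activation='relu')(x)
x = tf.keras.layers.Dense(4096, activation='relu')(x)
# Classification layer: one probability per class.
outputs = tf.keras.layers.Dense(1000, activation='softmax')(x)
model = tf.keras.Model(inputs, outputs)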

 

 

 

Scheme of a VGG network

 

 

 

This network model is one of the most effective for image recognition: it achieved a recognition rate of more than 92% on the ImageNet database.

 

 

 

III SSD network

 

 

The SSD network, standing for Single Shot MultiBox Detector, is a method for detecting objects in an image using a single deep neural network. It belongs to the family of networks that predict the bounding boxes of objects in a given image.

 

It is a simple, end-to-end single network, removing many of the steps involved in other networks, such as Faster R-CNN, that try to achieve the same task.

 

 

The SSD network uses the VGG architecture as a base. But instead of classifying the image after it has gone through the VGG, the output is passed through several additional convolutional layers, and the output of each layer is connected to the final fully connected layer.

 

 

Scheme of an SSD network

 

 

 

As the pooling layers reduce the dimension of the image at each step, the image is processed at several different sizes, allowing classification at many scales at the same time.
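
A minimal sketch of this multi-scale idea (illustrative only: the filter counts and the number of scales are arbitrary assumptions, not the real SSD configuration):

import tensorflow as tf

num_classes = 21        # e.g. 20 VOC classes plus background (assumption)
anchors_per_cell = 4    # number of default boxes per feature-map cell

inputs = tf.keras.Input(shape=(300, 300, 3))
x = inputs
predictions = []
for filters in [64, 128, 256, 512]:
    x = tf.keras.layers.Conv2D(filters, (3, 3), padding='same',
                               activation='relu')(x)
    x = tf.keras.layers.MaxPooling2D((2, 2))(x)   # resolution halves here
    # Detection head on this scale: class scores plus 4 box offsets
    # for each anchor of each feature-map cell.
    head = tf.keras.layers.Conv2D(
        anchors_per_cell * (num_classes + 4), (3, 3), padding='same')(x)
    predictions.append(head)

model = tf.keras.Model(inputs, predictions)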

 

 

Unlike Faster R-CNN, the SSD network does not separate the localisation process from the classification process, which allows much faster processing.

 

 

 

IV Code

 

 

In the code we import ssd_vgg_300, which contains the implementation of a VGG-based deep neural network.

 

 

It consists of a convolutional network whose feature maps decrease in size; here we have six feature layers of sizes 38x38, 19x19, 10x10, 5x5, 3x3 and 1x1.

 

Then we set the default parameters of SSD:

- img_shape, which defines the size of the input image;

- num_classes, the number of different classes used to classify the elements found in the image;

- the no-annotation label;

- the feature layers and their shapes;

- the anchors, which are reference points in the image used to create the default boxes around which objects are detected; they act as priors for the bounding boxes;

- the dropout keep probability, a value that indicates the fraction of neurons kept active after each step (see the sketch below).
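
In the SSD-Tensorflow code these defaults are grouped in a namedtuple; below is a simplified sketch of such a structure (the field names and values mirror our reading of ssd_vgg_300 but are reproduced from memory, so treat them as assumptions rather than the exact API):

from collections import namedtuple

# Simplified parameter structure (the real SSDParams has more fields).
SSDParams = namedtuple('SSDParams', ['img_shape', 'num_classes',
                                     'no_annotation_label', 'feat_layers',
                                     'feat_shapes', 'anchor_sizes',
                                     'dropout_keep_prob'])

default_params = SSDParams(
    img_shape=(300, 300),            # input image size for SSD300
    num_classes=21,                  # 20 VOC classes plus background
    no_annotation_label=21,          # label used when no annotation exists
    feat_layers=['block4', 'block7', 'block8',
                 'block9', 'block10', 'block11'],
    feat_shapes=[(38, 38), (19, 19), (10, 10), (5, 5), (3, 3), (1, 1)],
    anchor_sizes=[(21., 45.), (45., 99.), (99., 153.),
                  (153., 207.), (207., 261.), (261., 315.)],
    dropout_keep_prob=0.5,           # fraction of neurons kept by dropout
)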

 

 

 

Now that we have the default parameters, we define the specific parameters of the network: the layers and the bounding boxes (built from the anchors).

 

We need to input:

- an image;

- the number of different classes;

- the layers defined before;

- the anchor_sizes and ratios defined before in ssd_anchors_all_layers;

- the normalisation defined with the boxes, and a flag indicating whether the net is training or not;

- the dropout;

- the prediction function, which produces the class scores and the candidate boxes for the objects in the image (these inputs come together as sketched below).
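
A TF 1.x style sketch of how the network is built (based on the repository's demo notebook; apart from ssd_vgg_300 and the nets folder, which the text mentions, the exact calls are assumptions):

import tensorflow as tf
from nets import ssd_vgg_300

# Placeholder for an arbitrary RGB image.
img_input = tf.placeholder(tf.uint8, shape=(None, None, 3))
# Resize to the 300x300 input expected by SSD300, add a batch dimension.
image = tf.image.resize_images(tf.to_float(img_input), (300, 300))
image_4d = tf.expand_dims(image, 0)

# Build the network: for every feature layer it returns the class
# predictions and the box offsets relative to the anchors.
ssd_net = ssd_vgg_300.SSDNet()
predictions, localisations, _, _ = ssd_net.net(image_4d, is_training=False)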

 

 

 

 

V Test of the network

The demo folder contains a set of images for testing the SSD algorithm in the main file. The notebook folder contains a minimal example of the SSD TensorFlow pipeline. Basically, the detection process is composed of two main steps:

1) running the SSD network on the image;

2) post-processing the output (drawing a rectangle around each detected object, with a number corresponding to the class the object belongs to), as sketched below.
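
A sketch of step 2 (a hypothetical helper: we assume the post-processed boxes come as (ymin, xmin, ymax, xmax) in coordinates relative to the image size, which is the convention used by SSD-Tensorflow as far as we know):

import matplotlib.pyplot as plt
import matplotlib.patches as patches

def draw_detections(img, classes, scores, boxes):
    # Draw a rectangle and a class id for every detected object.
    height, width = img.shape[:2]
    fig, ax = plt.subplots(1)
    ax.imshow(img)
    for cls, score, box in zip(classes, scores, boxes):
        ymin, xmin, ymax, xmax = box
        ax.add_patch(patches.Rectangle(
            (xmin * width, ymin * height),          # top-left corner
            (xmax - xmin) * width,                  # box width in pixels
            (ymax - ymin) * height,                 # box height in pixels
            fill=False, edgecolor='red'))
        ax.text(xmin * width, ymin * height,
                '%d: %.2f' % (cls, score), color='red')
    plt.show()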

 

 

The training data used in this SSD model are the VOC datasets (2007 and 2012).
We test the algorithm with the my_test.py file, by importing the VGG neural network from the nets folder and providing the path of an image in the demo folder using the commands below:

import os
import matplotlib.image as mpimg

path = './demo/'
image_names = sorted(os.listdir(path))
img = mpimg.imread(path + image_names[-1])  # load the last demo image
Example: (detection result on one of the demo images)
Of the seven people present in the photograph, five are correctly detected and two are missed.

 

It seems that the code can be improved to obtain better results.

 

 

 

Conclusion:

 

 

For this project we will use the SSD architecture, which is faster than R-CNN.

 

The final objective of this project is to improve the SSD code in order to classify elements in a video, which is nothing but a sequence of images.
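
Since a video is just a sequence of images, this extension can be sketched as a frame-by-frame loop (illustrative OpenCV code; detect is a hypothetical wrapper around the SSD network returning classes, scores and boxes for one image):

import cv2

cap = cv2.VideoCapture('people.mp4')       # hypothetical input video
while True:
    ok, frame = cap.read()
    if not ok:
        break                              # end of the video
    # OpenCV loads frames as BGR; convert to RGB for the network.
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    classes, scores, boxes = detect(rgb)   # run SSD on this frame
    # ... draw the boxes on the frame, as in the post-processing step ...
cap.release()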

 

 

 

 

 

 

This page gathers the material used in the machine learning course of the Master Automatique Robotique at Université Clermont Auvergne.

For the material related to the object tracking course, see the Computer Vision courses page.

 

Material for TD1: Machine Learning

Material for TD3: Interest point tracking (solution, if provided, in the attached files below)

Polycompétence imagerie

This page presents the Polycompétence Imagerie module. It consists of a lecture module (shared with the Master 2 Automatique Robotique and the Master 1 in Signal and Image Processing), TD/TP sessions on computers, a series of invited talks, and a synthesis project.

Provisional schedule of lectures, tutorials and talks for 2019/2020

Imagerie 1/7, M1 TSI, M2 Mc, PC Imag: 26 Sept. 2019, 07:45 - 09:30, Room 127

Imagerie 2/7, M1 TSI, M2 Mc, PC Imag: 3 Oct. 2019, 07:45 - 09:30, Room 127

Imagerie 3/7, M1 TSI, M2 Mc, PC Imag: 10 Oct. 2019, 07:45 - 09:30, Room 127

Imagerie 4/7, M1 TSI, M2 Mc, PC Imag: 17 Oct. 2019, 07:45 - 09:30, Room 127

Imagerie 5/7, M1 TSI, M2 Mc, PC Imag: 24 Oct. 2019, 07:45 - 09:30, Room 127

Imagerie 6/7, M1 TSI, M2 Mc, PC Imag: 7 Nov. 2019, 07:45 - 09:30, Room 127

Imagerie 7/7, M1 TSI, M2 Mc, PC Imag: 14 Nov. 2019, 07:45 - 09:30, Room 127

Imagerie exam, M1 TSI, M2 Mc, PC Imag: 21 Nov. 2019, 07:45 - 09:30, Lecture hall 107

 

Imagerie TD 1/5, PC Imag: 3 Oct. 2019, 13:30 - 15:00

Imagerie TD 2/5, PC Imag: 17 Oct. 2019, 13:30 - 15:00

Imagerie TD 3/5, PC Imag: 24 Oct. 2019, 9:30 - 11:00

Imagerie TD 4/5, PC Imag: 7 Nov. 2019, 11:00 - 12:30

Imagerie TD 5/5, PC Imag: 14 Nov. 2019, 13:00 - 14:30

 

PC Imag project 1/8: 24 Oct. 2019, 13:30 - 17:30

PC Imag project 2/8: 7 Nov. 2019, 14:30 - 17:30

PC Imag project 3/8: 14 Nov. 2019, 14:30 - 17:30

PC Imag project 4/8: 5 Dec. 2019, 13:30 - 17:30

PC Imag project 5/8: 12 Dec. 2019, 13:30 - 17:30

PC Imag project 6/8: 19 Dec. 2019, 13:30 - 17:30

PC Imag project 7/8: 9 Jan. 2020, 08:00 - 12:00

PC Imag project 8/8: 9 Jan. 2020, 13:30 - 17:30

Schedule of the series of talks:

PC Imag talk, C. Tilmant (IP/TGI): 3 Oct. 2019, 15:00 - 17:00

PC Imag talk, S. Caux (Uniswarm): 10 Oct. 2019, 09:30 - 11:30

PC Imag talk, V. Arvis (Michelin): 17 Oct. 2019, 09:30 - 12:30

PC Imag talk, T. Feraud (Wissen Sensing Tech.): 9 Jan. 2020 (initially 7 Nov. 2019), 09:30 - 11:30

PC Imag talk, R. Tomczak (Optomachines): 12 Dec. 2019, 09:30 - 11:30

 

Synthesis project:

As part of the Polycompétence Imagerie, you must complete a 32-hour synthesis project. The project gives rise to a report handed in in two stages. The first part, describing the principle of the method and the bibliography, must be delivered by 20 December 2019. This report will then be completed with your implementations and their evaluation; the whole must be handed in by 20 January 2020. The report and the corresponding code must be submitted as a zip archive by email to thierry.chateau_at_uca.fr (replace _at_ with @). For large files, use the transfer system of the ENT UCA.

The 2019-2020 project revolves around the design of the perception module for a robotic object-manipulation application. It feeds into a PhD thesis carried out at Institut Pascal. The description of the requirements is available here.

Two subjects are proposed:

  1. Detection of game pieces for a robotic application (Download the subject)
  2. Pose estimation for a robotic application (Download the subject)

The choice of groups and subjects must be made by 7 November. You must form groups of at most 4 people and fill in this form to register.

 

I joined Polytech Clermont-Ferrand (a graduate school of engineering) for teaching and Lasmea (UMR 6602, Blaise Pascal University, CNRS) for research in 2001, after obtaining a PhD (which you can download here) from Blaise Pascal University, where I worked with Pierre Bonton, Francois Collange, Joseph Alizon and Laurent Trassoudaine, in collaboration with Cemagref, where I worked with Michel Berducat and Christophe Debain. Before that, I also obtained degrees from the same institutions (Polytech Clermont-Ferrand and Blaise Pascal University). I obtained the HDR (Habilitation à Diriger des Recherches) in September 2010, which you can also download here. Lasmea became Institut Pascal in 2012. I joined the UFR Sciences department of Blaise Pascal University as a full professor in September 2012, where I teach Automatic Control and Computer Vision. I also teach Computer Vision at Polytech CF and ISIMA (two engineering schools) and in the Robotics Research Master of Blaise Pascal University.

 My wife Sandrine and I are raising two wonderful kids, Gabin (born in 2002) and Lena (born in 2005).