Artificial Intelligence: Celebrity Recognition Bot with Computer Vision

Computer vision is a field of computer science that works on enabling computers to see, identify and process images in the same way that human vision does, and then provide appropriate output. It is like imparting human intelligence and instincts to a computer. In reality though, it is a difficult task to enable computers to recognize images of different objects.

Computer vision is an important subject in the field of artificial intelligence, as the computer must interpret what it sees, and then perform appropriate analysis or act accordingly.

The Microsoft Azure Computer Vision API allows us to process an image and retrieve information about it. It relies on advanced algorithms to analyze the content of the image in different ways, based on our needs. In this article I will walk you through the ways we humans look at an image and how accurately the computer matches what we see. I hope you enjoy the journey!

Please note that I have purposely left out some complex areas and have not gone deep into every detail, in order to keep this article an easy read and a quick guide on where to start in the Microsoft Azure Cognitive Services space.

Prerequisites:

  • Microsoft Visual Studio 2017 (VS2015 may also work, but this demo uses VS2017)
  • Azure subscription (the free 30-day subscription is enough)
  • Basic C# programming knowledge

Setting up the Vision API in Azure

  • Log in to Azure (create a subscription first if you do not already have one)
  • Click the “+ Create a resource” option
  • Search for “Vision” in the search box and select “Computer Vision”
  • Click “Create”
  • Fill in the necessary details and click Create. You can choose the Name, Resource Group and Location as you prefer.
  • Wait a few seconds for Azure to create the service. Once it is created, Azure takes you to its landing page (Quick start)
  • Now select the “Keys” property and copy Key 1. You can copy Key 2 instead if you wish, as either one will serve the purpose. Keep it in Notepad; we will need it later.
  • Select the Overview tab and copy the Endpoint. Keep it in Notepad as well; we will need it later.
  • We are done setting up the Vision API in Azure
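
To see what the key and the endpoint are actually for, here is a minimal sketch that builds (but does not send) the HTTP request behind an image analysis call. It is not what the rest of this article uses, since the NuGet library wraps this for us. The class and method names (VisionRequestSketch, BuildRootUri, BuildAnalyzeRequest) are made up for illustration, and the query string values are my assumption of what a celebrity lookup against the v1.0 analyze operation looks like, so please check them against the API reference for your resource.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical helper, for illustration only.
public static class VisionRequestSketch
{
    // Joins the endpoint copied from the Overview tab with the API version path,
    // whether or not the endpoint ends with a slash.
    public static string BuildRootUri(string endpoint)
    {
        return endpoint.TrimEnd('/') + "/vision/v1.0";
    }

    // Builds a POST request for the analyze operation without sending it.
    // Assumption: categories plus the celebrity detail are what we want back.
    public static HttpRequestMessage BuildAnalyzeRequest(string endpoint, string key, Stream image)
    {
        string uri = BuildRootUri(endpoint) + "/analyze?visualFeatures=Categories&details=Celebrities";
        var request = new HttpRequestMessage(HttpMethod.Post, uri);

        // The subscription key (Key 1 or Key 2) travels in this header.
        request.Headers.Add("Ocp-Apim-Subscription-Key", key);

        // The image itself is sent as raw bytes in the request body.
        request.Content = new StreamContent(image);
        request.Content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
        return request;
    }
}
```

Passing the returned request to HttpClient.SendAsync would perform the call. In a .NET Framework project this sketch needs a reference to System.Net.Http.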

Visual Studio Application – To see the action

  • Open Visual Studio 2017 and select File >> New >> Project. Select “Windows Forms App (.NET Framework)” as your project type. Please note that Visual C# is my default language selection. Name the project as you wish (I have called mine VisionAnalysis_AI)
  • To use the Vision API, we need to add a reference to the Microsoft.ProjectOxford.Vision NuGet library. Right click on the project >> Manage NuGet Packages, browse for the library and install it.
  • Add the Newtonsoft.Json NuGet library the same way. We will use it to parse the JSON result later in the code
  • Create a form layout with one button (Name: btnBrowseAnalyse), a file dialog (Name: openFileDialog), a picture box (Name: PictureBox) and a read-only text box (Name: txtImageAnalysis). The button is for browsing to a celebrity image file using the file dialog, the picture box shows the image, and the text box shows the analysis result. Simple!
  • Open the form’s code-behind and add the following using statements at the top
using System.Configuration; //for ConfigurationManager
using System.IO; //for File and FileStream
using Microsoft.ProjectOxford.Vision;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
  • Open the App.config file (add it to the project if it does not already exist) and add the following settings in the <appSettings> section
<appSettings>
  <add key="VISION_API_KEY" value="Subscription key copied to Notepad" />
  <add key="VISION_ROOT_URI" value="Endpoint URL copied to Notepad" />
</appSettings>
  • Declare the following variables at the top of the class
IVisionServiceClient _visionClient;
string VISION_API_KEY = string.Empty;
string ROOT_URI = string.Empty;
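  • Append the following code to the constructor of the form class. It simply reads the settings from the config file when the program starts. ConfigurationManager lives in the System.Configuration assembly, so add a reference to it if the project does not already have one.
VISION_API_KEY = ConfigurationManager.AppSettings["VISION_API_KEY"];
//Appending the API version. This assumes the endpoint copied from Azure ends with a "/"
ROOT_URI = ConfigurationManager.AppSettings["VISION_ROOT_URI"] + "vision/v1.0";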
  • Double click the button “btnBrowseAnalyse” to land directly in its click event. Add the following lines of code inside the event.
//Setting up the file dialog for image files. You can configure it to accept more file types like png, etc.
//No spaces around the "|", so that the pattern is exactly "*.jpg"
openFileDialog.Filter = "JPEG Image (*.jpg)|*.jpg";
if (openFileDialog.ShowDialog() == DialogResult.OK)
{
    //Clearing the text box before the analysis
    txtImageAnalysis.Text = string.Empty;
    //Storing the local path of the selected image in a variable
    string filePath = openFileDialog.FileName;
    //Loading the image as a bitmap and showing it in the picture box
    Bitmap image = new Bitmap(filePath);
    PictureBox.Image = image;

    //Starting the image analysis for the selected file. The call is not awaited here;
    //the method writes its result to the text box when it completes
    doImageAnalysisAsync(filePath);
}
  • The code for the image analysis lives in the doImageAnalysisAsync method. It analyzes the image on various parameters, much as we humans do. The method may look a bit intimidating, but the code is very simple.
private async Task doImageAnalysisAsync(string filePath)
{
    //ROOT_URI is the field populated in the constructor
    _visionClient = new VisionServiceClient(VISION_API_KEY, ROOT_URI);
    string[] SelectedFeatures = new string[] { "categories", "description" };

    try
    {
        using (FileStream fileStream = File.OpenRead(filePath))
        {
            //Invoking the Vision API for analyzing the image
            var result = await _visionClient.AnalyzeImageAsync(fileStream, SelectedFeatures);
            StringBuilder analysisText = new StringBuilder();

            if (result != null && result.Categories != null && result.Categories.Length > 0)
            {
                analysisText.AppendLine().AppendLine("Categories:");
                analysisText.AppendLine("------------------------------------------------------------------------------------------------------------------------");
                analysisText.AppendLine(result.Categories[0].Name + " (Score : " + result.Categories[0].Score.ToString() + ")");

                //Assumption: the listing as published does not show where "obj" comes from.
                //Here it is taken to be the JSON detail of the first category, which is
                //where the celebrity matches below are read from.
                object detail = result.Categories[0].Detail;
                JObject obj = detail != null ? JObject.Parse(detail.ToString()) : null;

                if (obj != null && obj["celebrities"] != null && obj["celebrities"].Count() > 0)
                {
                    analysisText.AppendLine().AppendLine("Celebrity (Count : " + obj["celebrities"].Count().ToString() + ")");
                    for (int i = 0; i < obj["celebrities"].Count(); i++)
                    {
                        analysisText.AppendLine("Name : " + ((JValue)obj["celebrities"][i]["name"]).Value + " (Confidence : " +
                            ((JValue)obj["celebrities"][i]["confidence"]).Value + ")");
                    }
                }
                else
                    analysisText.AppendLine().AppendLine("Celebrity : No");
            }

            //Display the celebrity analysis result
            txtImageAnalysis.Text = analysisText.ToString();
        }
    }
    catch (Exception ex) { MessageBox.Show(ex.Message); }
}
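
If you would like to try the celebrity parsing on its own, without calling Azure, the logic can be pulled out into a small helper along these lines. This is only a sketch: the class name CelebrityDetailSketch and the method Describe are made up for illustration, and it assumes the detail JSON contains a "celebrities" array whose items each carry a "name" and a "confidence", which is the shape the method above reads.

```csharp
using System;
using System.Globalization;
using System.Text;
using Newtonsoft.Json.Linq;

// Hypothetical helper, for illustration only.
public static class CelebrityDetailSketch
{
    // Turns the JSON detail of a category into the lines shown in the text box.
    public static string Describe(string detailJson)
    {
        JObject detail = string.IsNullOrWhiteSpace(detailJson) ? null : JObject.Parse(detailJson);
        JArray celebrities = detail?["celebrities"] as JArray;

        // No detail, no "celebrities" property, or an empty array: nothing recognized.
        if (celebrities == null || celebrities.Count == 0)
            return "Celebrity : No";

        var text = new StringBuilder();
        text.AppendLine("Celebrity (Count : " + celebrities.Count + ")");
        foreach (JToken celebrity in celebrities)
        {
            // Assumption: every entry has both properties.
            string name = (string)celebrity["name"];
            double confidence = (double)celebrity["confidence"];
            text.AppendLine("Name : " + name + " (Confidence : " +
                confidence.ToString("0.000", CultureInfo.InvariantCulture) + ")");
        }
        return text.ToString().TrimEnd();
    }
}
```

For example, calling Describe with the hand-written sample {"celebrities":[{"name":"Bill Gates","confidence":0.99}]} (an illustrative value, not real API output) gives a count line followed by “Name : Bill Gates (Confidence : 0.990)”, while null or an empty array gives “Celebrity : No”.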

Congratulations on coming this far! Your intelligent celebrity recognition bot is now ready. Build the project and press F5 to run the program.

Hit the browse button and select a celebrity image from your local machine. The image will show up in the picture box and the system will immediately start analyzing it to find the celebrity in it. If one is found, the details will be shown; otherwise it will show “Celebrity : No”.

Example: a picture of “Bill Gates” provided for checking. Voila! He is identified with a confidence close to 100%. We can easily recognize him with the naked eye, but here your small piece of AI code is doing the trick for you.


Take another example. Let’s give it an image of a group of people with a celebrity among them. Can it identify the celebrity in a group as well? Yes, it does, with a confidence close to 100%.


Now let’s try a bunch of celebrities in a single image. It worked. Amazing! It managed to pick out 7 celebrities, giving the name of each along with its confidence. Quite impressive, isn’t it?


Well done! Your celebrity recognition bot is up and running.

Do share your experience with me, along with what you have built on this foundation. You can take it to any level and integrate it wherever you like. I would love to hear from you.

