Inject AI Fun into a Computing Project
Image Source: Vitaly Art/Shutterstock.com
By Michael Parks, P.E., for Mouser Electronics
Edited May 4, 2020 (Originally published Feb 24, 2020)
For the past 100,000 years, the human brain has been the most powerful computer on the planet. Our brain is quite
extraordinary, capable of logic and reasoning, but also creativity and emotion. Today, scientists continue to
study our biological brains while engineers have been attempting to replicate their functionality in silicon and
software.
Deep Learning (DL), a subset of Machine Learning (ML), is an emerging artificial intelligence (AI) technology at the heart of the second-generation Intel® Neural Compute Stick (NCS2). Deep neural networks rely on ML algorithms and sample training data to generate mathematical models. Training those models is computationally intensive and must be done on heavy-duty hardware. However, the trained models can in turn run on relatively low-cost hardware such as a Raspberry Pi paired with the NCS2. Through inferencing, these models can make speedy predictions when presented with new, real-world inputs.
For example, a vision model can be tuned to identify red and green apples by being exposed to thousands of images of both colors of apples. The trained model can then run on an NCS2 and be fed new images via a webcam, and the neural network identifies and classifies any apples placed in the camera's field of view.
Physical Computing: Making Technology Relatable
Not long ago, the notion of creating a synthetic brain was purely in the realm of science fiction. While true
general AI is still far away, it is undeniable that technology continues to advance at breakneck speed. Although
the pursuit of technical achievement is worthwhile in and of itself, there is something to be said about not
forgetting the artistic and creative aspects of life. Physical computing can be a bridge between intangible
digital technology and the more intuitive nature of tangible objects. Adding a bit of whimsy to projects can
help knock down the barriers to acceptance of new technology by society. Certainly, artificial intelligence is
an innovation that could benefit from such relatability.
This project combines the latest in cutting-edge, AI-powered machine vision technology with a bit of the fun and quirkiness that physical computing brings to the human-technology interface. We will utilize two NCS2s, a Raspberry Pi, a webcam, and a few servos to build a device that observes a person's face, determines which emotion they are expressing, and then causes something physical to occur in the real world based on that specific emotion.
Background
If you have never tinkered with neural networks or programmed in Python, then this project will serve as a great
introduction to both. Incorporating Intel's second-generation Neural Compute Stick (NCS2) into a project can be
quite the learning experience. Terminology such as machine learning, deep learning, and neural networks seems to be ripped from the pages of a science fiction novel instead of a datasheet or application note.
Figure 1: Artificial Intelligence is a vast and deep area of research. This project implements a Convolutional Neural Network (CNN). (Source: Mouser Electronics)
The brain of the project (pun intended) is the NCS2. It is a USB 3.0 device that brings nearly plug-and-play AI inferencing to embedded system developers looking to add vision-based intelligence to their products. It is built on the Movidius™ Myriad™ X Vision Processing Unit (VPU), an AI-optimized chip for accelerating vision computing based on Convolutional Neural Networks (CNNs). Object identification and classification are possible using this technology (e.g., identifying that a person's face is in the frame of a camera [identification] and whether that face is smiling or not [classification]).
If this is all brand-new to you, let's take a moment to understand the workflow involved in going from an idea to
a working system that uses the NCS2 to identify and classify objects with a camera.
- Train a Neural Network (NN): Many machine learning frameworks are available to train a neural network, including Caffe, TensorFlow, Kaldi, MXNet, and Open Neural Network Exchange (ONNX). Some of these frameworks can be run on a desktop computer, while others can be run as containers that leverage cloud services such as AWS or Google Compute Engine. The bottom line is that the more processing power (CPUs and GPUs) you throw at training a NN, the faster or more robust the final model will be. This is because, for a NN to be able to detect an object, it must first be trained by being exposed to images of what you wish to detect (not just a few images, but minimally thousands for simple objects). That is a lot of data to run through the algorithms, hence the need for hefty processing power to train the NN. There is also the potential for a significant time investment from a human trainer, as the dataset of training images must be labeled when using a supervised learning model.
- Download the Model: Training the model is only the first, albeit most time-consuming, part of getting our device to detect the new objects that it will be exposed to via the webcam. For our project, the model will detect human faces and categorize the emotion being shown. Each framework has its own file format. Some of the more popular formats that work with the Intel NCS2 include .caffemodel (Caffe), .pb (TensorFlow), .params (MXNet), .onnx (ONNX), and .nnet (Kaldi).
Now for some great news: It is possible to download pre-trained models that others have previously generated. As with leveraging any pre-built piece of software, your mileage may vary, and a pre-trained model may not perform as well as you would expect. In many cases, however, a pre-trained model will work just fine. This project utilizes pre-trained models that detect human faces and classify the emotion being expressed.
- Prepare the Model for Inferencing Hardware: The next step is to take the trained model and prepare it to run on the chosen endpoint device. Intel® provides the OpenVINO™ Toolkit to do just that. The Model Optimizer takes the output of the neural network frameworks (Caffe, TensorFlow, etc.) and produces an Intermediate Representation (IR). The IR consists of two files: an .xml file and a .bin file. The .xml file contains the code that describes the network topology. The .bin file contains the weights and biases as binary data. To put it another way, the .xml file describes how the network is interconnected, while the .bin file tells how much value (weight) each path is assigned.
- Inferencing: Up until this point, we have not yet used the Intel NCS2. As mentioned, training of the model is done on hefty computing hardware. Inferencing, a fancy way of saying exposing a neural network to new inputs for it to recognize, classify, and process, is where the NCS2 comes into action.
OpenVINO reads the IR, prepares and loads the network onto the selected end device (the NCS2 in this case), and sets any necessary configuration parameters. The Inference Engine (IE) then runs the deep learning model on the NCS2, inferring from input data as it is provided. OpenVINO also provides a set of handy libraries to integrate the outputs of the IE into custom applications.
- Develop Value-Added Functionality with an API: Firmware development for embedded systems has been dominated by the C programming language for decades. More recently, Python has been gaining traction for application development and is beginning to make inroads in embedded development with the creation of the MicroPython and CircuitPython offshoots. For this project, we will leverage Intel's Python API to interact with the Inference Engine; Intel also provides a C++ library for those inclined to stick with C/C++. Using the Python API makes interacting with the Inference Engine a matter of simple function calls (a minimal sketch follows this list).
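To make the last three steps more concrete, here is a minimal, hypothetical sketch of loading an Intermediate Representation and running a single inference on the NCS2 through the OpenVINO Python API. The model file names come from this project's face-detection network, but the image file name is a placeholder, and the exact class and method names vary between OpenVINO releases, so treat this as an illustration rather than the project's actual code (which is covered in the Software section).

import cv2
import numpy as np
from openvino.inference_engine import IENetwork, IECore

# IR files produced earlier by the Model Optimizer on a desktop machine
model_xml = "face-detection-retail-0004.xml"
model_bin = "face-detection-retail-0004.bin"

ie = IECore()
net = IENetwork(model=model_xml, weights=model_bin)

# Load the network onto the Neural Compute Stick 2 (the "MYRIAD" device)
exec_net = ie.load_network(network=net, device_name="MYRIAD")

# Determine the input layout the network expects (N, C, H, W)
input_blob = next(iter(net.inputs))
n, c, h, w = net.inputs[input_blob].shape

# Prepare a single test frame (placeholder file): resize, then reorder from HWC to NCHW
frame = cv2.imread("test.jpg")
blob = cv2.resize(frame, (w, h)).transpose((2, 0, 1)).reshape((n, c, h, w))

# Run a synchronous inference and inspect the raw detection output
results = exec_net.infer(inputs={input_blob: blob})
print(results)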
Materials
This project will utilize the 4GB Raspberry Pi 4 single-board computer as the heart of the system. Other key
components will include:
Figure 2: The Raspberry Pi keeps the same form factor with significantly
upgraded hardware.
- Intel NCS 2 (x2)
- OV5647 Camera (or USB webcam)
- Servo motors (x3)
Bill of Materials (BOM)
The BOM is listed in Table 1 below. Alternatively, you can visit mouser.com and grab all the parts you need from a pre-built shopping cart by clicking here. As of this writing, the BOM totals about $370 (USD) before shipping and taxes.
Table 1: AI + Physical Computing Project Bill of Materials
| Order Qty. | Mouser Part # | Description |
| --- | --- | --- |
| 1 | 374-T7715DV | Wall Mount AC Adapters Interchangeable Wall Plug Power Supply 5.1VDC 15.3 WATT, Raspberry Pi 4 Compatible |
| 1 | 358-RPI4MODBP4GBBULK | Single Board Computers RASPBERRY PI 4 MODEL B, 4GB, BULK |
| 1 | 467-SDSDQAB-016G-J | 16GB MicroSD Card |
| 1 | 538-68786-0001 | HDMI Cables HDMI Cable Assbly 1M Micro to STD |
| 2 | 607-NCSM2485.DK | Development Boards & Kits - Other Processors Movidius Neural Compute Stk 2 MX VPU |
| 3 | 619-900-00005 | AC, DC & Servo Motors SERVO ASSEMBLY |
| 1 | 485-3099 | Video IC Development Tools Raspberry Pi Camera Board v2 - 8 MP |
| 1 | 510-GS-400 | PCBs & Breadboards 3.3X1.4 400 Tie Points 80 Term Clips |
| 1 | 854-ZW-MF-10 | Jumper Wires ZIPWIRE 10cm MALE TO FEMALE |
| 3 | 855-M22-2010346 | Headers & Wire Housings 3 SIL VERT PIN HDR TIN |
Tools and Other Resources
Here is a list of recommended tools to have on hand in order to complete this project:
- Windows-based computer running the OpenVINO Toolkit
- Computer monitor or television with an HDMI port
- A wireless or wired Internet connection
- USB 3.0 Hub
- USB Keyboard
- USB Mouse
- USB Webcam (optional, needed if not using OV5647 camera)
- Wire strippers
- Digital multimeter
- Needle nose pliers
System Overview
Figure 3: Sketching is a great way to
flesh out ideas.
The system has five major components.
- Raspberry Pi: The latest Raspberry Pi is quite a powerhouse for a Single Board Computer
(SBC). It packs a 1.5GHz Broadcom BCM2711 Quad-core Cortex-A72 64-bit SoC. We are using the model that comes
with 4GB of LPDDR4-2400 SDRAM. The now-standard 40-pin GPIO header will be used to interact with the servos.
The Raspberry Pi will also run the Python script, host the camera and HDMI monitor, and interface with the
two NCS2 devices.
- Neural Compute Sticks: The NCS2 devices will run two neural networks. One will analyze the video stream from the camera and detect when a human face is in the frame. The second network will infer which emotion the person's face is expressing.
- Camera: The camera will feed a stream of images to the NCS devices for them to analyze. This can be either a USB webcam or a Raspberry Pi OV5647 camera. A command line flag is used when running the Python script to set which camera is being used (a sketch of this selection appears after this list).
- Servos: Three servos will be used along with mechanical sliders to interact with the real world. Specifically, different colored flowers will be presented to the user based on the emotion they present to the camera: yellow flowers for a happy expression, blue for a sad expression, and red for an angry expression.
- HDMI Monitor: The HDMI monitor will be used to display the output of the terminal and the
camera to the user.
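As a rough illustration (not the repository's exact code), the -cm flag might select between the two camera sources along the following lines. The open_camera() helper name is hypothetical, and the pivideostream import path is assumed to match the project repository:

import cv2
from pivideostream import PiVideoStream  # import path assumed; see the project repository

def open_camera(camera_mode, width, height):
    # camera_mode: 1 = Raspberry Pi OV5647 camera, 0 = USB webcam
    if camera_mode == 1:
        # Threaded Pi camera stream
        return PiVideoStream(resolution=(width, height)).start()
    # Standard USB webcam via OpenCV
    cap = cv2.VideoCapture(0)
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
    return cap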
Building the Electronics
Figure 4: Hooking up the hardware is very straightforward.
Assembling this project is very straightforward. The only recommendation is to not plug in the USB hub with the NCS2 sticks until after the Raspberry Pi has fully booted. We cannot confirm that the sticks were the cause, but during testing they did seem to prevent the Pi from booting initially; subsequent boot-ups did not have a problem, and the error could not be recreated.
- Insert the micro-HDMI to HDMI adapter into Raspberry Pi.
- Insert the HDMI into an HDMI port of a computer monitor or television.
- Insert the MicroSD card with the latest version of Raspbian flashed onto the memory card. For instructions
on how to get the Raspbian OS onto your MicroSD card, click here.
- Insert the USB keyboard and mouse into the USB 2.0 ports. The USB 2.0 ports will have black plastic
connectors, as opposed to blue plastic for the USB 3.0 ports.
- Install the camera.
- If using an OV5647 camera, gently open the connector, insert the ribbon cable and latch the
connector closed.
- If using a USB webcam, insert the webcam into a USB 3.0 port of the Pi.
- Plug the USB C power supply into an AC outlet.
- Insert the USB C power supply into the USB power jack of the Raspberry Pi.
- Boot the Pi and complete the initial setup. Then shut down the Pi. The initial setup is complete.
Figure 5: Pinout of the Raspberry Pi GPIO header
With the initial setup complete, it is time to attach the servos. We will attach three servos to the Pi.
- Connect a hookup wire from the GND pin on the Raspberry Pi to the GND (blue) rail on the mini-breadboard.
- Connect a hookup wire from the 5V pin on the Raspberry Pi to the power (red) rail on the mini-breadboard.
- Connect the GND wire of each servo to the GND rail of the breadboard.
- Connect the Vcc wire of each servo to the power rail of the breadboard.
- Connect the control signal of the first servo to GPIO pin 12 of the Raspberry Pi. This will be the “Happy” servo.
- Connect the control signal of the second servo to GPIO pin 13 of the Raspberry Pi. This will be the “Sad” servo.
- Connect the control signal of the third servo to GPIO pin 18 of the Raspberry Pi. This will be the “Angry” servo.
Now that the servos are wired, it is time for the final step: interfacing with the two NCS devices.
- Insert the two Intel Neural Compute Sticks into the USB 3.0 hub. Do NOT yet insert the USB hub into the
Raspberry Pi.
- Boot up the Pi again and log in.
- Install the software needed to operate the NCS devices (see the Software section of this article) and reboot the machine once again. Insert the USB 3.0 hub into a USB 3.0 port on the Raspberry Pi when directed.
- Follow the remainder of the tutorial in the Software section to get everything running.
Software
Figure 6: The default Raspbian OS provides all the tools needed to edit the Python script to suit your needs.
In this section, we will detail the software side of the project. The project has been tested with the Buster version of Raspbian, on both a Raspberry Pi 3B+ and a Raspberry Pi 4.
If you have never flashed a version of Raspbian OS onto a microSD card, we recommend using the Etcher
application, which can be found using Google, as well as this tutorial.
One last note: if your Raspberry Pi 4 fails to boot, there is a chance that the EEPROM has become corrupted. Follow these steps to correct that situation.
The rest of this guide will focus on the steps to take after you have completed the initial setup of the
Raspberry Pi and have successfully connected it to the internet.
Installing OpenVINO and the Trained NN Model
Having experience navigating Linux via a command line is very beneficial to getting this project up and running
on your own board. Here are the steps that we followed to get Intel OpenVINO software up and running. Do not
install the Neural Compute Sticks until directed to do so.
$sudo mkdir -p /opt/intel/openvino
$cd ~/Downloads/
$wget --no-check-certificate https://download.01.org/opencv/2019/openvinotoolkit/R2/l_openvino_toolkit_runtime_raspbian_p_2019.2.242.tgz
$sudo tar -xf l_openvino_toolkit_runtime_raspbian_p_2019.2.242.tgz --strip 1 -C /opt/intel/openvino
$sudo apt install cmake
$source /opt/intel/openvino/bin/setupvars.sh
$echo "source /opt/intel/openvino/bin/setupvars.sh" >> ~/.bashrc
To test whether everything is working up to this point, open a new terminal. You should see the following:
[setupvars.sh] OpenVINO environment initialized
Assuming we are successful, let’s continue in the original terminal window.
$sudo usermod -a -G users "$(whoami)"
$sh /opt/intel/openvino/install_dependencies/install_NCS_udev_rules.sh
$sudo apt-get install -y python3-picamera
$sudo -H pip3 install imutils --upgrade
$git clone https://github.com/Mouser-Electronics/Emotions_and_PhysicalComputing.git
$cd Emotions_and_PhysicalComputing
Lastly, if you are using the OV5647 camera, execute the following:
$python3 main.py -wd 320 -ht 240 -numncs 2 -cm 1
Alternatively, if you are using a USB webcam, execute the following:
$python3 main.py -wd 320 -ht 240 -numncs 2 -cm 0
Next, let’s take a deeper look at the Python file.
Project Files
The following source code files can be found in the Software folder of this project's GitHub repository.
main.py: The Python script to which we add our project-specific code, which takes the outputs of the NN and, depending on the emotion inferred, causes an effect in the real world via the servos.
face-detection-retail-0004.xml: Contains the network topology of a neural network that can
detect whether a human face is in an image presented to it.
face-detection-retail-0004.bin: Contains the weights and biases of a neural network that can
detect whether a human face is in an image presented to it.
emotions-recognition-retail-0003.xml: Contains the network topology of a neural network that
can detect various emotions in images of a human face.
emotions-recognition-retail-0003.bin: Contains the weights and biases of a neural network that
can detect various emotions in images of a human face.
Libraries
Python's import statement lets us pull libraries (modules) into our project. This promotes code reuse; unless you have very specific needs, reinventing the wheel is not necessary. This project uses the following libraries (a representative import block follows this list):
sys: Provides variables and functions to interact with the interpreter, such as passing command
line arguments to the Python script.
numpy: Also referred to as NumPy, this package provides advanced mathematical functions that can be used by the script.
os: Allows access to OS-dependent functionality such as interfacing with the file system and
I/O.
time: This library provides time-related functions, such as getting the date and time from the system or setting delays via the sleep() function.
multiprocessing: This library provides a mechanism to spawn multiple processes that can run
concurrently.
gpiozero: This library provides functions to interact with the 40-pin GPIO header with various
actuators and sensors such as servos and LEDs.
openvino.inference_engine: Allows the Python script to interact with the Inference Engine
aboard the NCS2 devices.
heapq: This library provides an implementation of the heap queue algorithm. A heap queue is a type of priority queue implemented as a binary tree in which the smallest element is always kept at the root.
threading: This library provides threads, allowing multiple threads of execution to run concurrently.
pivideostream: This library provides a mechanism to interact with the camera.
imutils: This library provides a set of convenience functions for image processing, such as rotating, translating, and resizing.
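Putting those together, the top of a script like main.py would import these libraries roughly as follows; the exact location of the pivideostream module depends on the repository layout:

import sys
import os
import time
import heapq
import threading
import multiprocessing

import numpy as np
import imutils
from gpiozero import Servo
from openvino.inference_engine import IENetwork, IECore
from pivideostream import PiVideoStream  # module path assumed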
Variables and Constants
The main.py script hosts several variables that allow us to interact with the NCS2 devices and the servos.
These are instances of the Servo class from the gpiozero library, one for each servo:
happyServo = Servo(12)
sadServo = Servo(13)
angryServo = Servo(18)
These are variables used to extract the emotion detected by the NCS2 so that they can be used by other aspects of
the Python script:
emotion = str(object_info[7])
LABELS = ["neutral", "happy", "sad", "surprise", "anger"]
The most important coding aspect of this project will be added around line 333 in the main.py source file:
…
out = self.exec_net.requests[dev].outputs["prob_emotion"].flatten()
emotion = LABELS[int(np.argmax(out))]
if emotion == "happy":
    setServosHappy()
elif emotion == "sad":
    setServosSad()
elif emotion == "anger":
    setServosAngry()
else:
    setServosNeutral()
detection_list.extend([emotion])
self.resultsEm.put([detection_list])
self.inferred_request[dev] = 0
…
Functions
setServosHappy(): When a happy face is detected, this function will set the servo on
GPIO12 to
the maximum position while setting GPIO13 and GPIO18 to the minimum position.
setServosSad(): When a sad face is detected, this function will set the servo on
GPIO13 to the
maximum position while setting GPIO12 and GPIO18 to the minimum position.
setServosAngry(): When an angry face is detected, this function will set the servo on GPIO18 to the maximum position while setting GPIO12 and GPIO13 to the minimum position.
setServosNeutral(): When no face or an emotionless (neutral) face is detected, this function sets all servos to the minimum position. A minimal sketch of these four helpers appears below.
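The actual implementations live in main.py in the project's repository, but a sketch of these four helpers, using the gpiozero Servo objects defined earlier, could look like this:

def setServosHappy():
    # Raise the "happy" (yellow) flower; park the other two
    happyServo.max()
    sadServo.min()
    angryServo.min()

def setServosSad():
    # Raise the "sad" (blue) flower; park the other two
    sadServo.max()
    happyServo.min()
    angryServo.min()

def setServosAngry():
    # Raise the "angry" (red) flower; park the other two
    angryServo.max()
    happyServo.min()
    sadServo.min()

def setServosNeutral():
    # No face, or a neutral face: park all three flowers
    happyServo.min()
    sadServo.min()
    angryServo.min()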
Project in Action
Once the project is assembled and the software is installed, it’s time to have some fun!
- From a terminal, enter the following:
$cd ~/Downloads/Emotions_and_PhysicalComputing
If using an OV5647 camera:
$python3 main.py -wd 320 -ht 240 -numncs 2 -cm 1
Alternatively, if using a USB webcam:
$python3 main.py -wd 320 -ht 240 -numncs 2 -cm 0
Figure 7: The terminal and camera output
- A small window pops up on the screen showing the video feed from the camera along with some text and bounding boxes.
- Ensuring the area is well lit, stand in front of the camera and begin to make happy, sad and angry faces.
- The type of face detected should be displayed on the screen as well as in the terminal window.
- The servos should move the flowers up and down depending on the expression you are making. AI meets physical computing!
Troubleshooting Tips
- When running the servos for the first time, run the code with the servos connected to the Raspberry Pi but not to the mechanical components (a quick test sketch follows this list). This will ensure the servos aren't damaged if the code causes them to move in a direction opposite to the way they are physically installed.
- Problems can occur if you are using the wrong flag for the webcam versus the OV5647 camera. Remember
-cm 1 for the OV5647 camera and -cm 0 for a USB webcam.
- Ensure you are using low-voltage servos and that you are running them off the Raspberry Pi’s 5V pin and not the 3.3V pin. Also, consider an external power supply just for the servos; be sure to connect the grounds together.
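For the first tip, a quick stand-alone test along these lines (run before attaching the slider hardware) confirms that each servo sweeps in the expected direction; the pin numbers match the wiring described earlier:

from time import sleep
from gpiozero import Servo

# One Servo object per GPIO pin used in this project
servos = {"happy": Servo(12), "sad": Servo(13), "angry": Servo(18)}

for name, servo in servos.items():
    print("Sweeping", name, "servo")
    servo.min()
    sleep(1)
    servo.max()
    sleep(1)
    servo.mid()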
Figure 8: Concept for the slider mechanism. 6" wide pine board cut with two
opposing 45-degree cuts makes a simple sliding mechanism.
Figure 9: Collage of finished project.
Author Bio
Michael Parks, P.E. is a contributing
writer for Mouser Electronics and the owner of Green Shoe Garage, a custom electronics design studio and
technology consultancy located in Southern Maryland. He produces the S.T.E.A.M. Power Podcast to help raise
public awareness of technical and scientific matters. Michael is also a licensed Professional Engineer in the
state of Maryland and holds a Master's degree in systems engineering from Johns Hopkins University.