Inject AI Fun into a Computing Project
Image Source: Vitaly Art/Shutterstock.com
By Michael Parks, P.E., for Mouser Electronics
Edited May 4, 2020 (Originally published Feb 24, 2020)
For the past 100,000 years, the human brain has been the most powerful computer on the planet. Our brain is quite
extraordinary, capable of logic and reasoning, but also creativity and emotion. Today, scientists continue to
study our biological brains while engineers have been attempting to replicate their functionality in silicon and
software.
Deep Learning (DL), a subset of Machine Learning (ML), is an emerging artificial intelligence (AI) technology at the heart of the second-generation Intel® Neural Compute Stick (NCS2). Deep neural networks rely on ML algorithms and sample training data to generate mathematical models. Training those models is computationally intensive and must be done on heavy-duty hardware. However, the trained models can in turn run on relatively low-cost hardware such as a Raspberry Pi paired with the NCS2. Through inferencing, these models can make speedy predictions when presented with new, real-world inputs.
For example, a vision model can be tuned to identify red and green apples by being exposed to thousands of images of both colors of apples. The trained model can then run on an NCS2 and be fed new images via a webcam, and the neural network identifies and classifies any apples placed in the camera's field of view.
Physical Computing: Making Technology Relatable
Not long ago, the notion of creating a synthetic brain was purely in the realm of science fiction. While true
general AI is still far away, it is undeniable that technology continues to advance at breakneck speed. Although
the pursuit of technical achievement is worthwhile in and of itself, there is something to be said about not
forgetting the artistic and creative aspects of life. Physical computing can be a bridge between intangible
digital technology and the more intuitive nature of tangible objects. Adding a bit of whimsy to projects can
help knock down the barriers to acceptance of new technology by society. Certainly, artificial intelligence is
an innovation that could benefit from such relatability.
This project combines the latest in cutting-edge, AI-powered machine vision technology with a bit of the fun and quirkiness that physical computing brings to the human-technology interface. We will utilize two NCS2s, a Raspberry Pi, a webcam, and a few servos to build a device that observes a person's face, determines which emotion they are expressing, and then causes something physical to occur in the real world based on that specific emotion.
Background
If you have never tinkered with neural networks or programmed in Python, then this project will serve as a great
introduction to both. Incorporating Intel's second-generation Neural Compute Stick (NCS2) into a project can be
quite the learning experience. Terminology such as machine learning, deep learning, and neural networks seems to be ripped from the pages of a science fiction novel instead of a datasheet or application note.
Figure 1: Artificial Intelligence is a vast and deep area of research. This project implements a Convolutional Neural Network (CNN). (Source: Mouser Electronics)
The brain of the project (pun intended) is the NCS2. It is a USB 3.0 device that brings nearly plug-and-play AI inferencing to embedded system developers looking to add vision-based intelligence to their products. It is built on the Movidius™ Myriad™ X Vision Processing Unit (VPU), an AI-optimized chip for accelerating vision computing based on Convolutional Neural Networks (CNNs). Object identification and classification are possible using this technology (e.g., identifying that a person's face is in the frame of a camera [identification] and whether that face is smiling or not [classification]).
If this is all brand-new to you, let's take a moment to understand the workflow involved in going from an idea to
a working system that uses the NCS2 to identify and classify objects with a camera.
- Train a Neural Network (NN): Many machine learning frameworks are available to train a neural network, including Caffe, TensorFlow, Kaldi, MXNet, and Open Neural Network Exchange (ONNX). Some of these frameworks can be run on a desktop computer, while others can be run as containers that leverage cloud services such as AWS or Google Compute Engine. The bottom line is that the more processing power (CPUs and GPUs) you throw at training a NN, the faster or more robust the final model will be. This is because, for a NN to be able to detect an object, it must first be trained by being exposed to images of what you wish to detect (not just a few images, but minimally thousands for simple objects). That is a lot of data to run through the algorithms, hence the need for hefty processing power to train the NN. There is also the potential for a significant time investment from a human trainer, as the dataset of training images must be labeled when using a supervised learning model.
- Download the Model: Training the model is only the first, albeit most time-consuming, part of getting our device to detect the new objects that it will be exposed to via the webcam. For our project, the model will detect human faces and categorize the emotion being shown. Each framework has its own file format. Some of the more popular formats that work with the Intel NCS2 include .caffemodel (Caffe), .pb (TensorFlow), .params (MXNet), .onnx (ONNX), and .nnet (Kaldi).
Now for some great news: It is possible to download pre-trained models that others have previously generated. As with leveraging any pre-built piece of software, your mileage may vary, and a pre-trained model may not perform as well as you would expect. In many cases, however, a pre-trained model will work just fine. This project utilizes pre-trained models that detect human faces and classify the emotion being expressed.
- Prepare the Model for Inferencing Hardware: The next step is to take the trained model and prepare it to run on the chosen endpoint device. Intel® provides the OpenVINO™ Toolkit to do just that. The Model Optimizer takes the output of the neural network frameworks (Caffe, TensorFlow, etc.) and produces an Intermediate Representation (IR). The IR consists of two files: an .xml file and a .bin file. The .xml file contains the code that describes the network topology. The .bin file contains the weights and biases as binary data. To put it another way, the .xml file describes how the network is interconnected, while the .bin file tells how much value (weight) each path is assigned.
- Inferencing: Up until this point, we have not yet used the Intel NCS2. As mentioned, training of the model is done on hefty computing hardware. Inferencing, a fancy way of saying exposing a neural network to new inputs for it to recognize, classify, and process, is where the NCS2 comes into action.
OpenVINO reads the IR, prepares and loads the network onto the selected end device (the NCS2 in this case), and sets any necessary configuration parameters. The Inference Engine (IE) then runs the deep learning model on the NCS2, inferring from input data as it is provided. OpenVINO also provides a set of handy libraries to integrate the outputs of the IE into custom applications.
- Develop Value-Added Functionality with an API: Firmware development for embedded systems has been dominated by the C programming language for decades. More recently, Python has been gaining traction for application development and is beginning to make inroads in embedded development with the creation of the MicroPython and CircuitPython offshoots. For this project, we will leverage Intel's Python API to interact with the Inference Engine; Intel also provides a C++ library for those inclined to stick with C/C++. Using the Python API makes interacting with the Inference Engine a matter of simple function calls (a minimal sketch follows this list).
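To make the last three steps more concrete, here is a minimal, hypothetical sketch of loading an Intermediate Representation and running a single inference on the NCS2 through the OpenVINO Python API. The model file names come from this project's face-detection network, but the image file name is a placeholder, and the exact class and method names vary between OpenVINO releases, so treat this as an illustration rather than the project's actual code (which is covered in the Software section).

import cv2
import numpy as np
from openvino.inference_engine import IENetwork, IECore

# IR files produced earlier by the Model Optimizer on a desktop machine
model_xml = "face-detection-retail-0004.xml"
model_bin = "face-detection-retail-0004.bin"

ie = IECore()
net = IENetwork(model=model_xml, weights=model_bin)

# Load the network onto the Neural Compute Stick 2 (the "MYRIAD" device)
exec_net = ie.load_network(network=net, device_name="MYRIAD")

# Determine the input layout the network expects (N, C, H, W)
input_blob = next(iter(net.inputs))
n, c, h, w = net.inputs[input_blob].shape

# Prepare a single test frame (placeholder file): resize, then reorder from HWC to NCHW
frame = cv2.imread("test.jpg")
blob = cv2.resize(frame, (w, h)).transpose((2, 0, 1)).reshape((n, c, h, w))

# Run a synchronous inference and inspect the raw detection output
results = exec_net.infer(inputs={input_blob: blob})
print(results)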
Materials
This project will utilize the 4GB Raspberry Pi 4 single-board computer as the heart of the system. Other key
components will include:
Figure 2: The Raspberry Pi keeps the same form factor with significantly
upgraded hardware.
- Intel NCS 2 (x2)
- OV5647 Camera (or USB webcam)
- Servo motors (x3)
Bill of Materials (BOM)
The BOM is listed in Table 1 below. Alternatively, you can visit mouser.com and grab all the parts you need from a pre-built shopping cart by clicking here. As of this writing, the BOM totals about $370 (USD) before shipping and taxes.
Table 1: AI + Physical Computing Project Bill of Materials
| Order Qty. | Mouser Part # | Description |
| --- | --- | --- |
| 1 | 374-T7715DV | Wall Mount AC Adapters Interchangeable Wall Plug Power Supply 5.1VDC 15.3 WATT, Raspberry Pi 4 Compatible |
| 1 | 358-RPI4MODBP4GBBULK | Single Board Computers RASPBERRY PI 4 MODEL B, 4GB, BULK |
| 1 | 467-SDSDQAB-016G-J | 16GB MicroSD Card |
| 1 | 538-68786-0001 | HDMI Cables HDMI Cable Assbly 1M Micro to STD |
| 2 | 607-NCSM2485.DK | Development Boards & Kits - Other Processors Movidius Neural Compute Stk 2 MX VPU |
| 3 | 619-900-00005 | AC, DC & Servo Motors SERVO ASSEMBLY |
| 1 | 485-3099 | Video IC Development Tools Raspberry Pi Camera Board v2 - 8 MP |
| 1 | 510-GS-400 | PCBs & Breadboards 3.3X1.4 400 Tie Points 80 Term Clips |
| 1 | 854-ZW-MF-10 | Jumper Wires ZIPWIRE 10cm MALE TO FEMALE |
| 3 | 855-M22-2010346 | Headers & Wire Housings 3 SIL VERT PIN HDR TIN |
Tools and Other Resources
Here is a list of recommended tools to have on hand in order to complete this project:
- Windows-based computer running the OpenVINO Toolkit
- Computer monitor or television with an HDMI port
- A wireless or wired Internet connection
- USB 3.0 Hub
- USB Keyboard
- USB Mouse
- USB Webcam (optional, needed if not using OV5647 camera)
- Wire strippers
- Digital multimeter
- Needle nose pliers
System Overview
Figure 3: Sketching is a great way to
flesh out ideas.
The system has five major components.
- Raspberry Pi: The latest Raspberry Pi is quite a powerhouse for a Single Board Computer
(SBC). It packs a 1.5GHz Broadcom BCM2711 Quad-core Cortex-A72 64-bit SoC. We are using the model that comes
with 4GB of LPDDR4-2400 SDRAM. The now-standard 40-pin GPIO header will be used to interact with the servos.
The Raspberry Pi will also run the Python script, host the camera and HDMI monitor, and interface with the
two NCS2 devices.
- Neural Compute Sticks: The NCS2 devices will run two neural networks. One will analyze the video stream from the camera and detect when a human face is in the frame. The second network will infer which emotion the person's face is expressing.
- Camera: The camera will feed a stream of images to the NCS devices for them to analyze. This can be either a USB webcam or a Raspberry Pi OV5647 camera. A command line flag is used when running the Python script to set which camera is being used (a sketch of this selection appears after this list).
- Servos: Three servos will be used along with mechanical sliders to interact with the real world. Specifically, different colored flowers will be presented to the user based on the emotion they present to the camera: yellow flowers for a happy expression, blue for a sad expression, and red for an angry expression.
- HDMI Monitor: The HDMI monitor will be used to display the output of the terminal and the
camera to the user.
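As a rough illustration (not the repository's exact code), the -cm flag might select between the two camera sources along the following lines. The open_camera() helper name is hypothetical, and the pivideostream import path is assumed to match the project repository:

import cv2
from pivideostream import PiVideoStream  # import path assumed; see the project repository

def open_camera(camera_mode, width, height):
    # camera_mode: 1 = Raspberry Pi OV5647 camera, 0 = USB webcam
    if camera_mode == 1:
        # Threaded Pi camera stream
        return PiVideoStream(resolution=(width, height)).start()
    # Standard USB webcam via OpenCV
    cap = cv2.VideoCapture(0)
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
    return cap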
Building the Electronics
Figure 4: Hooking up the hardware is very straightforward.
Assembling this project is very straightforward. The only recommendation is to not plug in the USB hub with the NCS2 sticks until after the Raspberry Pi has fully booted. We cannot confirm that the sticks were the cause, but during testing they did seem to prevent the Pi from booting initially; subsequent boot-ups did not have a problem, and the error could not be recreated.
- Insert the micro-HDMI to HDMI adapter into Raspberry Pi.
- Insert the HDMI into an HDMI port of a computer monitor or television.
- Insert the MicroSD card with the latest version of Raspbian flashed onto the memory card. For instructions
on how to get the Raspbian OS onto your MicroSD card, click here.
- Insert the USB keyboard and mouse into the USB 2.0 ports. The USB 2.0 ports will have black plastic
connectors, as opposed to blue plastic for the USB 3.0 ports.
- Install the camera.
- If using an OV5647 camera, gently open the connector, insert the ribbon cable and latch the
connector closed.
- If using a USB webcam, insert the webcam into a USB 3.0 port of the Pi.
- Plug the USB C power supply into an AC outlet.
- Insert the USB C power supply into the USB power jack of the Raspberry Pi.
- Boot the Pi and complete the initial setup. Then shut down the Pi. The initial setup is complete.
Figure 5: Pinout of the Raspberry Pi GPIO header
With the initial setup complete, it is time to attach the servos. We will attach three servos to the Pi.
- Connect a hookup wire from the GND pin on the Raspberry Pi to the GND (blue) rail on the mini-breadboard.
- Connect a hookup wire from the 5V pin on the Raspberry Pi to the power (red) rail on the mini-breadboard.
- Connect the GND wire of each servo to the GND rail of the breadboard.
- Connect the Vcc wire of each servo to the power rail of the breadboard.
- Connect the control signal of the first servo to GPIO pin 12 of the Raspberry Pi. This will be the “Happy” servo.
- Connect the control signal of the second servo to GPIO pin 13 of the Raspberry Pi. This will be the “Sad” servo.
- Connect the control signal of the third servo to GPIO pin 18 of the Raspberry Pi. This will be the “Angry” servo.
Now that the servos are wired, it is time for the final step: interfacing with the two NCS devices.
- Insert the two Intel Neural Compute Sticks into the USB 3.0 hub. Do NOT yet insert the USB hub into the
Raspberry Pi.
- Boot up the Pi again and log in.
- Install the software needed to operate the NCS devices (see the Software section of this article) and reboot the machine once again. Insert the USB 3.0 hub into a USB 3.0 port on the Raspberry Pi when directed.
- Follow the remainder of the tutorial in the Software section to get everything running.
Software
Figure 6: The default Raspbian OS provides all the tools needed to edit the Python script to suit your needs.
In this section, we will detail the software side of the project. The project has been tested with the Buster version of Raspbian, on both a Raspberry Pi 3B+ and a Raspberry Pi 4.
If you have never flashed a version of Raspbian OS onto a microSD card, we recommend using the Etcher
application, which can be found using Google, as well as this tutorial.
One last note: if your Raspberry Pi 4 fails to boot, there is a chance that the EEPROM has become corrupted. Follow these steps to correct that situation.
The rest of this guide will focus on the steps to take after you have completed the initial setup of the
Raspberry Pi and have successfully connected it to the internet.
Installing OpenVINO and the Trained NN Model
Having experience navigating Linux via a command line is very beneficial to getting this project up and running
on your own board. Here are the steps that we followed to get Intel OpenVINO software up and running. Do not
install the Neural Compute Sticks until directed to do so.
$sudo mkdir -p /opt/intel/openvino
$cd ~/Downloads/
$wget --no-check-certificate https://download.01.org/opencv/2019/openvinotoolkit/R2/l_openvino_toolkit_runtime_raspbian_p_2019.2.242.tgz
$sudo tar -xf l_openvino_toolkit_runtime_raspbian_p_2019.2.242.tgz --strip 1 -C /opt/intel/openvino
$sudo apt install cmake
$source /opt/intel/openvino/bin/setupvars.sh
$echo "source /opt/intel/openvino/bin/setupvars.sh" >> ~/.bashrc
To test whether everything is working up to this point, open a new terminal. You should see the following:
[setupvars.sh] OpenVINO environment initialized
Assuming we are successful, let’s continue in the original terminal window.
$sudo usermod -a -G users "$(whoami)"
$sh /opt/intel/openvino/install_dependencies/install_NCS_udev_rules.sh
$sudo apt-get install -y python3-picamera
$sudo -H pip3 install imutils --upgrade
$git clone https://github.com/Mouser-Electronics/Emotions_and_PhysicalComputing.git
$cd Emotions_and_PhysicalComputing
Lastly, if you are using the OV5647 camera, execute the following:
$python3 main.py -wd 320 -ht 240 -numncs 2 -cm 1
Alternatively, if you are using a USB webcam, execute the following:
$python3 main.py -wd 320 -ht 240 -numncs 2 -cm 0
Next, let’s take a deeper look at the Python file.
Project Files
The following source code files can be found in the Software folder of this project's GitHub repository.
main.py: The Python script to which we add our project-specific code, which takes the outputs of the NN and, depending on the emotion inferred, causes an effect in the real world via the servos.
face-detection-retail-0004.xml: Contains the network topology of a neural network that can
detect whether a human face is in an image presented to it.
face-detection-retail-0004.bin: Contains the weights and biases of a neural network that can
detect whether a human face is in an image presented to it.
emotions-recognition-retail-0003.xml: Contains the network topology of a neural network that
can detect various emotions in images of a human face.
emotions-recognition-retail-0003.bin: Contains the weights and biases of a neural network that
can detect various emotions in images of a human face.
Libraries
Python's import statement lets us pull libraries (modules) into our project. This promotes code reuse; unless you have very specific needs, reinventing the wheel is not necessary. This project uses the following libraries (a representative import block follows this list):
sys: Provides variables and functions to interact with the interpreter, such as passing command
line arguments to the Python script.
numpy: Also referred to as NumPy, this package provides advanced mathematical functions that can be used by the script.
os: Allows access to OS-dependent functionality such as interfacing with the file system and
I/O.
time: This library provides time-related functions, such as getting the date and time from the system or setting delays via the sleep() function.
multiprocessing: This library provides a mechanism to spawn multiple processes that can run
concurrently.
gpiozero: This library provides functions to interact with the 40-pin GPIO header with various
actuators and sensors such as servos and LEDs.
openvino.inference_engine: Allows the Python script to interact with the Inference Engine
aboard the NCS2 devices.
heapq: This library provides an implementation of the heap queue algorithm. A heap queue is a type of priority queue implemented as a binary tree in which the smallest element is always kept at the root.
threading: This library provides threads, allowing multiple threads of execution to run concurrently.
pivideostream: This library provides a mechanism to interact with the camera.
imutils: This library provides a set of convenience functions for image processing, such as rotating, translating, and resizing.
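Putting those together, the top of a script like main.py would import these libraries roughly as follows; the exact location of the pivideostream module depends on the repository layout:

import sys
import os
import time
import heapq
import threading
import multiprocessing

import numpy as np
import imutils
from gpiozero import Servo
from openvino.inference_engine import IENetwork, IECore
from pivideostream import PiVideoStream  # module path assumed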
Variables and Constants
The main.py script hosts several variables that allow us to interact with the NCS2 devices and the servos.
These are instances of the Servo class from the gpiozero library, one for each servo:
happyServo = Servo(12)
sadServo = Servo(13)
angryServo = Servo(18)
These are variables used to extract the emotion detected by the NCS2 so that they can be used by other aspects of
the Python script:
emotion = str(object_info[7])
LABELS = ["neutral", "happy", "sad", "surprise", "anger"]
The most important coding aspect of this project will be added around line 333 in the main.py source file:
…
out = self.exec_net.requests[dev].outputs["prob_emotion"].flatten()
emotion = LABELS[int(np.argmax(out))]
if emotion == "happy":
    setServosHappy()
elif emotion == "sad":
    setServosSad()
elif emotion == "anger":
    setServosAngry()
else:
    setServosNeutral()
detection_list.extend([emotion])
self.resultsEm.put([detection_list])
self.inferred_request[dev] = 0
…
Functions
setServosHappy(): When a happy face is detected, this function will set the servo on
GPIO12 to
the maximum position while setting GPIO13 and GPIO18 to the minimum position.
setServosSad(): When a sad face is detected, this function will set the servo on
GPIO13 to the
maximum position while setting GPIO12 and GPIO18 to the minimum position.
setServosAngry(): When an angry face is detected, this function will set the servo on GPIO18 to the maximum position while setting GPIO12 and GPIO13 to the minimum position.
setServosNeutral(): When no face or an emotionless (neutral) face is detected, this function sets all servos to the minimum position. A minimal sketch of these four helpers appears below.
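The actual implementations live in main.py in the project's repository, but a sketch of these four helpers, using the gpiozero Servo objects defined earlier, could look like this:

def setServosHappy():
    # Raise the "happy" (yellow) flower; park the other two
    happyServo.max()
    sadServo.min()
    angryServo.min()

def setServosSad():
    # Raise the "sad" (blue) flower; park the other two
    sadServo.max()
    happyServo.min()
    angryServo.min()

def setServosAngry():
    # Raise the "angry" (red) flower; park the other two
    angryServo.max()
    happyServo.min()
    sadServo.min()

def setServosNeutral():
    # No face, or a neutral face: park all three flowers
    happyServo.min()
    sadServo.min()
    angryServo.min()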
Project in Action
Once the project is assembled and the software is installed, it’s time to have some fun!
- From a terminal, enter the following:
$cd ~/Downloads/Emotions_and_PhysicalComputing
If using an OV5647 camera:
$python3 main.py -wd 320 -ht 240 -numncs 2 -cm 1
Alternatively, if using a USB webcam:
$python3 main.py -wd 320 -ht 240 -numncs 2 -cm 0
Figure 7: The terminal and camera output
- A small window pops up on the screen showing the video feed from the camera along with some text and bounding boxes.
- Ensuring the area is well lit, stand in front of the camera and begin to make happy, sad and angry faces.
- The type of face detected should be displayed on the screen as well as in the terminal window.
- The servos should move the flowers up and down depending on the expression you are making. AI meets physical computing!
Troubleshooting Tips
- When running the servos for the first time, run the code with the servos connected to the Raspberry Pi but not to the mechanical components (a quick test sketch follows this list). This will ensure the servos aren't damaged if the code causes them to move in a direction opposite to the way they are physically installed.
- Problems can occur if you are using the wrong flag for the webcam versus the OV5647 camera. Remember
-cm 1 for the OV5647 camera and -cm 0 for a USB webcam.
- Ensure you are using low-voltage servos and that you are running them off the Raspberry Pi’s 5V pin and not the 3.3V pin. Also, consider an external power supply just for the servos; be sure to connect the grounds together.
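For the first tip, a quick stand-alone test along these lines (run before attaching the slider hardware) confirms that each servo sweeps in the expected direction; the pin numbers match the wiring described earlier:

from time import sleep
from gpiozero import Servo

# One Servo object per GPIO pin used in this project
servos = {"happy": Servo(12), "sad": Servo(13), "angry": Servo(18)}

for name, servo in servos.items():
    print("Sweeping", name, "servo")
    servo.min()
    sleep(1)
    servo.max()
    sleep(1)
    servo.mid()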
Figure 8: Concept for the slider mechanism. 6" wide pine board cut with two
opposing 45-degree cuts makes a simple sliding mechanism.
Figure 9: Collage of finished project.
Author Bio
Michael Parks, P.E. is a contributing
writer for Mouser Electronics and the owner of Green Shoe Garage, a custom electronics design studio and
technology consultancy located in Southern Maryland. He produces the S.T.E.A.M. Power Podcast to help raise
public awareness of technical and scientific matters. Michael is also a licensed Professional Engineer in the
state of Maryland and holds a Master's degree in systems engineering from Johns Hopkins University.