
MCU-based AI implementation "becomes the future of the embedded industry"

Published 2021.07.26 14:16

Increasing demand for MCU-based AI capabilities
Embedded systems better suited for inference than AI training
Design cases including NPU and NN accelerators are expected to increase



According to IDC, the number of IoT devices is expected to reach 41.6 billion by 2025, and the amount of data they will generate is estimated at 79.4 zettabytes (ZB). The core of IoT technology is connectivity, which offers benefits such as automated control, easy communication between devices, and data sharing.

AI technology makes IoT technology more useful. General IoT devices only collect and share data, requiring human intervention. With the addition of AI functions, they can analyze data, learn, make decisions, and take action on their own, which can significantly reduce cloud costs.

“AI is a technology that will help unleash the full potential of the IoT,” Kavita Char, senior product marketing manager at Renesas America, wrote for Embedded.com. “AIoT devices are products that can interact with their environment.”

The combination of AI capabilities and IoT devices has opened up a new market for MCUs. There are increasing cases of MCU-based applications that include AI acceleration functions. AI-enabled MCU products are suitable for keyword spotting, sensor fusion, vibration analysis, and voice recognition. In addition, high-performance MCUs enable complex vision applications.

What AI technologies are suitable for application to IoT devices?

Char cited four technologies as examples. The first is machine learning (ML). Machine learning algorithms build models based on data and allow devices to identify patterns on their own.

Machine learning vendors provide the algorithms, APIs, and tools needed to train models that can be deployed on embedded systems. These embedded systems use pre-trained models and perform inference and prediction tasks based on new input data. Applications include sensor hubs, keyword discovery, predictive maintenance, and classification.

The second is deep learning. Deep learning is a type of ML that trains a system by extracting progressively higher-level features and insights from complex input data using multiple layers of neural networks. Deep learning works with very large, diverse, and complex input data, and the system can learn repeatedly, improving the results at each step. Examples include image processing, customer service chatbots, and facial recognition.

The third is natural language processing (NLP). NLP is a field of AI that uses natural language to process interactions between systems and people. NLP helps systems understand and interpret human language, whether written or spoken, and make decisions based on it. Examples of applications include speech recognition systems, machine translation, and predictive input.

The fourth is computer vision. This is a field of AI that trains machines to collect, interpret, and understand image data and take specific actions. Machines collect digital images and videos using cameras and other devices, and use deep learning models and image analysis tools to accurately identify, classify, and view objects, and then take action based on that. It could be used for defect detection on manufacturing lines, medical diagnosis, facial recognition in retail stores and testing of self-driving cars.

◇ MCU will become a key component for implementing AIoT

In the past, AI implementation was considered to be the exclusive domain of CPUs, GPUs, and MPUs with large memory and cloud connectivity. However, recently, as demand for “edge intelligence,” which refers to storing, analyzing, and processing data at the point where it is generated, has increased across industries, MCUs have begun to be used in embedded AIoT applications.

AI capabilities in MCU-based IoT devices enable real-time decision making and rapid response to specific situations. They also offer lower bandwidth requirements, lower power consumption, lower latency, lower cost, and higher security.

Although MCU performance has improved somewhat recently, MCUs sit at the endpoint, so AIoT technology must be implemented under tight resource constraints, such as small footprints and low power budgets. This is made possible by the availability of neural network (NN) frameworks.

An NN is a collection of nodes arranged in layers; each node takes inputs from the previous layer, computes a weighted sum plus a bias, and produces an output. The output is passed to the next layer along all outgoing connections. During training, training data is fed to the first, or input, layer of the network, and the output of each layer is passed to the next layer. The last, or output, layer produces the model's prediction. This is compared to the expected value to assess model error.
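The layer structure described above can be sketched in a few lines of Python. This is a minimal illustration, not any framework's API: each node computes a weighted sum of the previous layer's outputs, adds a bias, and applies an activation (ReLU here); all values are invented for the example.

```python
# Minimal sketch of one neural-network layer's forward pass:
# weighted sum of inputs plus a bias, followed by an activation.
# Weights, biases, and inputs here are illustrative only.

def relu(x):
    return x if x > 0.0 else 0.0

def layer_forward(inputs, weights, biases):
    """Compute the outputs of one fully connected layer.

    inputs:  activations from the previous layer
    weights: one weight list per node in this layer
    biases:  one bias per node in this layer
    """
    outputs = []
    for node_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(node_weights, inputs))
        outputs.append(relu(total + bias))
    return outputs

# Two inputs feeding a two-node layer
print(layer_forward([1.0, 2.0], [[0.5, -0.25], [1.0, 1.0]], [0.1, -0.5]))
# → [0.1, 2.5]
```

Stacking such layers, with each layer's output list becoming the next layer's input list, gives the multi-layer network the article describes.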

The training process modifies the weights and biases of each layer of the network at each iteration, using a process called backpropagation, until the network's output closely matches the expected value. As a result, neural network training requires very high computing power and large memory, so it is usually done in the cloud.
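The iterative adjustment described above can be sketched for the simplest possible case: a single node with one weight and one bias, trained by gradient descent (the core mechanism behind backpropagation). The data points and learning rate below are invented for illustration.

```python
# Hedged sketch of the training loop described above: repeatedly
# nudge a weight and bias against the error gradient until the
# output approaches the expected value. Samples are illustrative.

def train(samples, epochs=200, lr=0.05):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, target in samples:
            pred = w * x + b      # forward pass
            err = pred - target   # model error vs. expected value
            w -= lr * err * x     # gradient step on the weight
            b -= lr * err         # gradient step on the bias
    return w, b

# Learn y = 2x + 1 from three sample points
w, b = train([(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)])
print(round(w, 2), round(b, 2))
```

Real networks repeat this across millions of weights and many layers, which is why training is normally left to cloud hardware rather than an MCU.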
▲ NN training and inference process [Image = Renesas]

The pre-trained NN model is embedded in the MCU and used as an inference engine for new incoming data based on training. Inference generation is suitable for MCUs because it has lower computing performance requirements than model training. The weights of the pre-trained NN model are fixed and can be placed in flash, which reduces the amount of SRAM required, making it especially suitable for resource-constrained MCUs.
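The split between frozen weights and live inference can be made concrete with a toy sketch. The "keyword detector" below is invented for illustration; the point is only that the weights are read-only constants (on an MCU they would sit in flash) and the inference path never modifies them.

```python
# Sketch of MCU-side inference with a frozen model: weights are
# fixed constants (flash-resident on a real MCU), and inference
# only reads them. This tiny detector is invented for illustration.

FROZEN_WEIGHTS = [0.8, -0.3, 0.5]   # fixed at training time, read-only
FROZEN_BIAS = -0.2

def infer(features):
    """Run one inference pass; no weight updates happen here."""
    score = sum(w * x for w, x in zip(FROZEN_WEIGHTS, features))
    score += FROZEN_BIAS
    return 1 if score > 0.0 else 0   # 1 = keyword detected

print(infer([1.0, 0.2, 0.4]))  # → 1
```

Because only the input features and intermediate activations change at run time, SRAM usage stays small, which is the property that makes inference feasible on resource-constrained MCUs.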

◇ How to implement AIoT based on MCU?

There are several steps to implementing AIoT on an MCU. Typically, models from available NN frameworks, such as Caffe or TensorFlow Lite, are used for MCU-based IoT devices. First, NN model training for ML is performed by AI experts in the cloud using tools provided by AI vendors. NN model optimization and MCU integration are performed using tools from AI vendors and MCU manufacturers. The MCU then performs inference with the pre-trained NN model.

The first step is done entirely offline: capturing a large amount of data from the end device or application, which is then used to train the NN model. The topology of the model is defined by the AI developer to make the most of the available data and provide the output required for the application. Training at this time is performed by repeatedly passing the data set to the model with the goal of minimizing the error in the model output.

The second step is to convert the pre-trained model, optimized for a specific function such as speech recognition, into a form that fits the MCU. First, the model is converted to a flatbuffer file. The flatbuffer file is then converted to C code and built into the runtime executable for the target MCU.
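The flatbuffer-to-C conversion mentioned above is often done with `xxd -i` or a vendor tool; the sketch below mimics the idea in plain Python. The symbol name `g_model` is an arbitrary choice for illustration, and the four bytes stand in for the start of a real `.tflite` file.

```python
# Hedged sketch of the conversion step: a serialized model file
# (e.g. a TensorFlow Lite flatbuffer) becomes a C byte array that
# can be compiled into the MCU firmware image. Real toolchains use
# `xxd -i` or vendor utilities; this only mimics the idea.

def model_to_c_array(model_bytes, name="g_model"):
    hex_bytes = ", ".join(f"0x{b:02x}" for b in model_bytes)
    return (
        f"const unsigned char {name}[] = {{{hex_bytes}}};\n"
        f"const unsigned int {name}_len = {len(model_bytes)};\n"
    )

# Pretend these four bytes are the start of a .tflite flatbuffer
print(model_to_c_array(b"TFL3"))
```

The resulting array is linked into firmware as constant data, which is how the frozen weights end up in flash rather than RAM.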

The MCU, equipped with a pre-trained embedded AI model, generates inferences based on the training as new data comes in from the end device. As new data classes come in, the NN model can be sent back to the cloud for retraining, and the retrained model can be programmed into the MCU via an OTA firmware upgrade.
▲ AI implementation using offline pre-training model [Image = Renesas]

There are two ways to design an MCU-based AI solution. Let's assume that the target MCU uses an Arm Cortex-M core. The first way is to run the converted NN model on the Cortex-M CPU core and accelerate it using the CMSIS-NN library. It is simple to configure because no additional hardware acceleration is required. It is suitable for simple AI applications such as keyword spotting, vibration analysis, and sensor hubs.
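Kernels like those in the CMSIS-NN library work on 8-bit fixed-point (q7) data rather than floats. The Python sketch below emulates the general shape of such a kernel, not CMSIS-NN's actual API: multiply-accumulate in a wider integer, shift down, then saturate back into the int8 range. The values and output shift are illustrative.

```python
# Hedged emulation of a q7 fixed-point fully connected node, in the
# style of CMSIS-NN kernels (not the real API): accumulate in a wide
# integer, right-shift to rescale, then saturate to the int8 range.

def saturate_q7(v):
    """Clamp a value into the signed 8-bit (q7) range."""
    return max(-128, min(127, v))

def fully_connected_q7(inputs, weights, bias, out_shift):
    """One q7 fully connected node: int8 in, int8 out."""
    acc = bias
    for w, x in zip(weights, inputs):
        acc += w * x                  # multiply-accumulate, wide integer
    return saturate_q7(acc >> out_shift)

print(fully_connected_q7([100, -50], [64, 32], 0, 7))  # → 37
```

Working in int8 instead of float32 is what lets a plain Cortex-M core, or its SIMD extensions, run small NN models at useful speeds.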

The second way is to add a NN accelerator or a small neural processing unit (NPU). Small NPUs accelerate ML on resource-constrained IoT devices and support compression functions that can reduce model power and size. They also provide the ability to run NN networks for audio processing, speech recognition, image classification, and detection.

Operations not supported by the NPU fall back to the base CPU core, where they are accelerated by the CMSIS-NN library; the rest of the NN model runs on the NPU.

◇ AI at the edge is also the future

As MCU performance continues to improve, we will likely see full AI capabilities, including lightweight learning algorithms and inference, built directly into the MCU in the future.

“Implementing AI with resource-constrained MCUs will become more common,” said Char. “The line between MCUs and MPUs will blur, and new applications and use cases will continue to appear as lighter NNs emerge.”

This means that in addition to inference, we will see lightweight learning algorithms running directly on the MCU. Char predicted that this change will open up new markets and applications for MCU manufacturers, and that significant investment will follow.
Reporter Lee Su-min