“All services start with ‘Perception’”
In-cabin monitoring system, sensor integration potential↑
Computer vision enables better services, fills sensing blind spots
“With in-car computation, even edge-case failures are unacceptable”
[Editor's Note] Two axes are driving the development of artificial intelligence. The one currently receiving the most spotlight in the market is, without question, generative AI. The small ball that ChatGPT launched has snowballed, and big tech companies are now locked in fierce competition over large language model development and leadership. The other axis is edge AI, or on-device AI. Edge AI draws relatively less public attention than generative AI, but its practical use across industry and technology is expanding rapidly.
Accordingly, we explored computer vision, the perception solution that is the hottest area in edge AI and the starting point of all AI-based services. We met with DeltaX CEO Kim Soo-hoon, who has expertise in computer vision and edge AI.
▲DeltaX CEO Kim Soo-hoon during the interview
■ Where is computer vision technology being applied in automobiles? One example is the in-cabin monitoring system: a camera is installed inside the car to monitor the cabin, and service models can even be built on top of the monitoring results.
Some functions are implemented to meet safety requirements. Traditionally, a camera was installed in front of the driver to analyze whether the driver was drowsy or inattentive. This solution is commonly called DMS, the Driver Monitoring System, and some high-end vehicles are already equipped with it.
Beyond driver inattention, such as whether the driver's eyes are open or closed and where the driver is looking, the system can recognize the behavior of every passenger: whether seatbelts are fastened, whether someone is asleep, and what they need. It is even possible to control the entire vehicle through gestures based on specific hand positions of the driver.
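To make the eye-state idea concrete, here is a minimal sketch of the eye aspect ratio (EAR) heuristic commonly used in drowsiness detection research. The landmark ordering, threshold, and frame count are illustrative assumptions, not DeltaX's actual method.

```python
# Minimal EAR-based drowsiness sketch (illustrative assumptions throughout).
import numpy as np

def eye_aspect_ratio(eye: np.ndarray) -> float:
    """eye: (6, 2) array of landmarks ordered around the eye contour.
    EAR = (|p2-p6| + |p3-p5|) / (2 * |p1-p4|); it collapses toward 0
    as the eyelid closes."""
    v1 = np.linalg.norm(eye[1] - eye[5])
    v2 = np.linalg.norm(eye[2] - eye[4])
    h = np.linalg.norm(eye[0] - eye[3])
    return (v1 + v2) / (2.0 * h)

EAR_THRESHOLD = 0.2       # assumed: below this, the eye counts as closed
CLOSED_FRAMES_ALARM = 48  # assumed: ~1.6 s of closure at 30 fps

def update_drowsiness(ear: float, closed_frames: int) -> tuple:
    """Track consecutive closed-eye frames; flag sustained closure."""
    closed_frames = closed_frames + 1 if ear < EAR_THRESHOLD else 0
    return closed_frames, closed_frames >= CLOSED_FRAMES_ALARM
```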
■ The future of in-car services when vision technology is utilized Solutions that enable conversation, such as generative models and GPT, are emerging. But suppose you, the reporter or the producer, get in the car and it suddenly starts talking about something you have no interest in. Or suppose the passenger is a woman and the car strikes up a conversation that would interest a different audience entirely, or one that would interest children; she would quickly lose interest.
Ultimately, just as having a proper conversation with someone starts with knowing and understanding who they are and what they are interested in, all services also start with ‘perception.’
Perception makes better services possible, and from another perspective, it enables things that were impossible with past sensors.
For example, once a camera is installed in the cabin, I imagine car advertisements will look something like this.
My child is in the back seat and has fallen asleep, but the father is playing loud music. The camera watching the back seat recognizes that this is a child, sees that the child's eyes are closed rather than open, and knows, “Ah, the child is sleeping.” It then makes a judgment: the music should be turned down.
I think that's probably what advertisements for smart cars will look like.
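As a rough illustration of the perception-then-judgment flow in that scenario, here is a hedged sketch; the data types and the rule are hypothetical, not a product API.

```python
# Hypothetical judgment step: perception outputs feed a simple rule.
from dataclasses import dataclass

@dataclass
class Occupant:
    seat: str          # e.g. "rear_left" (assumed naming)
    is_child: bool     # from an occupant-classification model
    eyes_closed: bool  # from an eye-state model

def should_lower_volume(occupants: list) -> bool:
    """Lower the music if any rear-seat child appears to be asleep."""
    return any(o.is_child and o.eyes_closed and o.seat.startswith("rear")
               for o in occupants)

# Example: child asleep in the rear left seat -> True
print(should_lower_volume([Occupant("rear_left", True, True)]))
```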
■ Can computer vision solutions replace broad sensor solutions? The cameras installed for in-cabin monitoring understand the driver and all passengers from the incoming images, much as a human would.
With a solution like this, it could replace the seat-belt sensor. And in today's Level 2 or Level 3 vehicles there are smart cruise functions: if you take your hands off the steering wheel, an alarm soon sounds. The fact that the car knows your hands are off means some kind of sensor is installed there.
Likewise, if you open the door and get out, the car tells you the door is open, and if a door fails to close, a sensor reports it. In other words, a great number of analog and digital sensors are installed throughout the interior, and through them the car learns its own condition.
Replacing so many sensors with a single camera opens up enormous cost savings for automakers. So, leaving aside the question of reliability, functionally speaking there is enormous potential to integrate a large number of sensors into a single image sensor. It is a very attractive approach.
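To illustrate the consolidation idea, here is a minimal sketch mapping the outputs of a hypothetical in-cabin vision model onto the signals that discrete sensors provide today. All names are assumptions, and a real system would keep redundant physical sensors for safety-critical signals.

```python
# Hypothetical "virtual sensors" derived from one camera-based model.
from dataclasses import dataclass

@dataclass
class CabinPerception:
    """Per-frame outputs of an assumed in-cabin vision model."""
    belt_visible_on_driver: bool
    hands_on_wheel: bool
    door_gap_detected: bool

def virtual_sensors(p: CabinPerception) -> dict:
    # Map vision inferences onto signals discrete sensors used to provide.
    return {
        "seatbelt_fastened": p.belt_visible_on_driver,
        "hands_off_warning": not p.hands_on_wheel,
        "door_open": p.door_gap_detected,
    }

print(virtual_sensors(CabinPerception(True, False, False)))
```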
■ What are the considerations when developing vision technologies in the automotive field that DeltaX is focusing on? Automotive computer vision is, at its core, artificial intelligence. Much of the development is done in the Python programming language, and that code typically runs on operating systems such as Linux or Windows on ordinary computers.
However, the automotive environment is very different. First of all, a car cannot accommodate high-performance computing devices such as PCs or servers.
Because the automobile industry is extremely cost-sensitive, the first difficulty is that artificial intelligence must run on a single small chip.
Second, since an automobile is a means of transportation that carries people, something like a sensor error can have fatal consequences. Edge cases or corner cases are defined as situations where things usually go well but problems arise under very particular conditions. The automotive market does not tolerate failures even in these small edge cases.
So when solutions considered “good enough” in other industries are brought to automobiles, there are many cases where they cannot clear that bar and are never applied, or the product is never released.
Making solutions closer to perfect, so that they do not fail in any case, is a huge challenge in the automotive business.
Because the operating environment is so constrained, our modules and model algorithms must be made very small and lightweight. Since the hardware cannot run heavy models, in terms of both size and processing, the models must be optimized and slimmed down. This is often harder than the initial development and requires a great deal of experience and know-how; there are many technical difficulties in this area.
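As one concrete example of lightweighting, here is a minimal sketch using post-training dynamic quantization in PyTorch. This is a generic technique, not DeltaX's pipeline; automotive deployments typically combine it with pruning, distillation, and vendor-specific compilers.

```python
# Generic lightweighting sketch: int8 dynamic quantization in PyTorch.
import io
import torch
import torch.nn as nn

# Hypothetical stand-in for a perception model's classifier head.
model = nn.Sequential(
    nn.Linear(512, 256), nn.ReLU(),
    nn.Linear(256, 64), nn.ReLU(),
    nn.Linear(64, 2),
).eval()

# Replace fp32 Linear layers with int8 dynamically quantized versions,
# shrinking weight storage roughly 4x, usually with little accuracy loss.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

def serialized_kib(m: nn.Module) -> float:
    """Serialized state_dict size, a proxy for on-device storage cost."""
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.tell() / 1024

print(f"fp32: {serialized_kib(model):.1f} KiB")
print(f"int8: {serialized_kib(quantized):.1f} KiB")
```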
How to overcome this interests us far more than developing yet another algorithm of our own, and we are investing a lot of manpower into building up that experience and technology.
There are a lot of computer vision companies. But among those working on automobiles, only a few can develop algorithms, optimize them to run fast and reliably in an automotive environment, and make them lightweight as well.
DeltaX currently prides itself on being among those few.
■ Is the core competitiveness of perception technology in embedded environments the optimization and lightweighting of algorithms? We get asked this question a lot these days. Since so many different companies are working on AI, the natural question is, “What makes DeltaX’s solution different?”
Perhaps the question is: have we seen similar perception solutions, and what is the performance difference compared to them? To compare performance objectively, you have to compare A and B while keeping the rest of the environment the same.
Within the automotive space, the performance difference becomes clear when running in the truly harsh computing environments that manufacturers demand.
First of all, although many companies say they “develop artificial intelligence algorithms,” only a few have a solid understanding of the environment in which cars operate, and the code base required to run in that environment is different.
For example, we develop algorithms in C++ rather than Python, which is a completely different kind of development. Models have traditionally been large, but in an embedded environment the available memory is very small, so the model footprint itself must be small and the compute required to run it must be very small as well.
Developing a solution that satisfies all of these constraints is a completely different story from developing the general computer vision we know, or the general solutions you have seen in demos.
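To show what the Python-to-embedded handoff can look like in practice, here is a hedged sketch: a tiny hypothetical model is sized up and exported to ONNX, the kind of artifact a C++ inference runtime on a target chip can load. The model and numbers are purely illustrative.

```python
# Illustrative export path from Python development to a C++-loadable artifact.
import os
import torch
import torch.nn as nn

class TinyGazeNet(nn.Module):
    """Hypothetical small eye-state/gaze backbone for illustration."""
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(16, num_classes)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

model = TinyGazeNet().eval()

# Rough memory footprint: fp32 weights at 4 bytes per parameter.
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params} parameters = {n_params * 4 / 1024:.1f} KiB in fp32")

# Export to ONNX so a C++ runtime on the target chip can load it.
dummy = torch.randn(1, 1, 64, 64)
torch.onnx.export(model, dummy, "tiny_gaze.onnx", opset_version=13)
print(f"ONNX file: {os.path.getsize('tiny_gaze.onnx') / 1024:.1f} KiB")
```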
■ Lastly, a word to e4ds news readers I have been in the development field ever since I graduated from school, and I am still a developer and engineer who codes constantly, as well as the CEO of this company.
DeltaX is also internally considering a listing in 2025. That is one of our timelines, and next year I think it will be important for me to wrap up our R&D, the PoC we are running now, and the various investments related to it.
Thank you.