
Processing massive IIoT data requires a time-series database, as the cloud alone struggles to handle it.

Published 2020.01.13 15:53

It's difficult to operate a smart factory with just the cloud.
Markbase DBMS, millions of transactions per second
Real-time sensor data collection, storage, and analysis are possible.



As more businesses adopt the cloud, more smart factories are adopting it as well.

However, the data produced by the sensors and equipment installed in a smart factory can reach millions of tags per second, so the cloud alone has its limits.

This calls for a storage solution that can collect and store the data at high speed and keep it in compressed form.

Last November, Markbase DBMS, a Korean database management system, became the first in Korea to be listed as an international standard in the TPCx-IoT category and to obtain international certification.

The Transaction Processing Performance Council (TPC) is an internationally recognized organization that tests and evaluates the performance of underlying software and computing equipment, such as servers, storage, and database management systems (DBMS). TPCx-IoT is a benchmark for performance testing in IoT environments.

Markbase DBMS, which received this certification, is a time-series DBMS that can collect, store, and analyze millions of sensor data points per second in real time. It can be applied in fields such as IoT, smart cities, smart factories, and big data.

Markbase DBMS took first place with performance about 40% higher than (roughly 140% of) that of foreign DBMSs such as Couchbase and HBase. HBase, the previous leader, recorded 742,256.79 IoTps, while Markbase recorded 1,043,276.60 IoTps. IoTps is a unit that measures how much IoT data is collected and analyzed per second.
Markbase CEO Kim Seong-jin (Photo = Markbase)
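
The gap between the two benchmark results can be checked with simple arithmetic. The short Python sketch below uses only the two IoTps figures quoted in this article; the variable names are illustrative.

```python
# IoTps figures as quoted in the article.
hbase_iotps = 742_256.79
markbase_iotps = 1_043_276.60

# Ratio of the two results and the relative improvement it implies.
ratio = markbase_iotps / hbase_iotps
improvement_pct = (ratio - 1) * 100

print(f"Markbase / HBase ratio: {ratio:.2f}x")
print(f"Relative improvement:   {improvement_pct:.1f}%")
```

The ratio comes out at roughly 1.4, meaning the Markbase result is about 140% of the HBase result, or about 40% higher.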

We met with Kim Seong-jin, CEO of Markbase, Inc., the creator of Markbase DBMS, the first in Korea to receive international certification in the IoT field, and asked him about the necessity of databases in the edge computing era.


Q. Please give us a brief introduction about yourself and Markbase.
A. I was a founding member of Altibase, an in-memory database company, and have been developing traditional databases for 15 years. I served as CTO and CEO at that company for several years before founding Markbase in 2013. I anticipated that in the future, data generated by machines and sensors would exceed that generated by humans, leading to the need for a new type of database. This led me to found the company.


Q. Demand for smart factories is growing, with the government announcing a plan to deploy 30,000 smart factories by 2022. What role and limitations does the cloud have in smart factories?
A. The cloud is an IT infrastructure that many companies are adopting these days. There are many attempts to run smart factories on it as well, but there are several problems.

The first is cost. The amount of data generated in smart factories is virtually limitless, and storing all of this data in the cloud can be cost-prohibitive.

The second is slow responsiveness. Sending all data to the cloud to make specific business decisions can be a weakness in smart factories, where real-time performance is essential.

The third issue concerns high availability and security. No customer can afford to have their business go down. If a cloud provider's infrastructure fails for any reason or a network issue occurs, the business will inevitably take a significant hit. Furthermore, there is strong psychological resistance to sending critical and sensitive production data to the cloud, where its physical location may be unknown.


Q. Edge computing, which complements cloud computing, has recently emerged as a hot topic. What is edge computing, and how does it complement cloud computing?
A. Edge computing is a concept that emerged with the goal of overcoming the limitations of cloud computing mentioned above and creating a better user experience. Its overarching philosophy is the separation of user data.

In other words, the goal is to store all data that requires real-time availability at the edge and transmit only the necessary data to the cloud. Building your business around this model not only dramatically reduces the data costs associated with cloud usage, but also provides real-time access to events and data generated at the edge.

Ultimately, a great model can be created that leverages the strengths of the cloud while compensating for its shortcomings with the edge.
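
The data separation described above can be sketched in a few lines of Python. This is a minimal illustration, not Markbase's implementation: the `EdgeNode` class, the alarm threshold, and the in-memory lists standing in for an edge database and a cloud uplink are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Reading:
    sensor_id: str
    timestamp_ms: int
    value: float


@dataclass
class EdgeNode:
    """Keeps every reading locally and forwards only a reduced subset."""
    alarm_threshold: float                              # hypothetical alarm limit
    local_store: list = field(default_factory=list)     # stand-in for an edge database
    cloud_outbox: list = field(default_factory=list)    # stand-in for a cloud uplink

    def ingest(self, reading: Reading) -> None:
        # All raw data stays at the edge, where it is available in real time.
        self.local_store.append(reading)
        # Only readings above the threshold are forwarded immediately.
        if reading.value > self.alarm_threshold:
            self.cloud_outbox.append(("alarm", reading))

    def flush_summary(self) -> None:
        # Send one aggregate to the cloud instead of every raw point.
        if self.local_store:
            avg = mean(r.value for r in self.local_store)
            self.cloud_outbox.append(("summary", len(self.local_store), avg))


node = EdgeNode(alarm_threshold=80.0)
for i, v in enumerate([20.5, 21.0, 95.2, 20.8]):
    node.ingest(Reading("temp-01", 1_000 * i, v))
node.flush_summary()

print(len(node.local_store), "readings kept at the edge")
print(len(node.cloud_outbox), "messages sent to the cloud")
```

In this toy run, four readings stay at the edge while only one alarm and one summary go to the cloud, which is where the cost saving comes from.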


Q. Cloud solution providers like AWS, Oracle, and Microsoft Azure will also be offering edge computing solutions. What level of performance do you see these solutions providing? Are they suitable for use in manufacturing environments?
A. The solutions offered by existing cloud providers are not suitable for immediate use in manufacturing environments.

While simply collecting data from the manufacturing environment into the cloud may not seem like a major issue, there are still many areas that need improvement when it comes to collecting all the data to meet actual customer requirements, separating it according to requirements, processing it, and transmitting it.
Markbase Product Overview (Image = Markbase)

In particular, existing cloud service providers offer little in the way of edge computing solutions for storing and processing data on the edge devices themselves.


Q. What challenges might there be when implementing edge computing in a manufacturing environment?
A. To properly support real-world edge computing environments, several major challenges must be addressed.

First, data must be collected at high speeds from edge devices with low hardware specifications. Most edge devices have slow CPUs and limited memory, yet the actual data collected can range from hundreds to tens of thousands of records per second. A technical solution is needed to determine how to store this data.

Second, we must provide a simple and concise solution for data replication or transmission. To achieve data segregation, one of the most fundamental concepts of the edge, we must inevitably transmit or replicate data generated at the edge, particularly data from specific sensors, to another location, such as the cloud.

However, actual edge computing environments are not only characterized by unstable networks but also by multiple, discontinuous business environments. Despite these challenges, stable data transmission management is essential. Implementing this on your own would be prohibitively expensive.
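
One common way to cope with an unstable network is a store-and-forward buffer: records are held locally and removed only once a send succeeds. The Python sketch below illustrates that general pattern under stated assumptions. The `StoreAndForward` class and the simulated `fake_send` uplink are hypothetical and do not describe any specific product.

```python
from collections import deque


class StoreAndForward:
    """Buffers records locally and removes them only after a confirmed send."""

    def __init__(self, send, max_buffered=10_000):
        self.send = send                           # callable returning True on success
        self.buffer = deque(maxlen=max_buffered)   # oldest records drop first when full

    def enqueue(self, record):
        self.buffer.append(record)

    def drain(self):
        """Send buffered records in order; stop at the first failure."""
        sent = 0
        while self.buffer:
            if not self.send(self.buffer[0]):
                break                              # link is down: keep the record for later
            self.buffer.popleft()
            sent += 1
        return sent


# Simulated uplink that is down at first and then recovers.
link = {"up": False}
delivered = []


def fake_send(record):
    if link["up"]:
        delivered.append(record)
    return link["up"]


queue = StoreAndForward(fake_send)
for i in range(5):
    queue.enqueue({"sensor": "vib-07", "seq": i})

first_attempt = queue.drain()    # link down: nothing leaves the buffer
link["up"] = True
second_attempt = queue.drain()   # link restored: backlog is delivered in order
print(first_attempt, second_attempt, len(queue.buffer))
```

A production version would also persist the buffer to disk and handle duplicates on the receiving side, which is part of why building this in-house becomes expensive.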

Third, a large number of edge devices must be easy to manage. In a single manufacturing environment, dozens, even hundreds, of edge devices operate. Software installation, management, and updates for these devices pose significant challenges.

Fourth, real-time data monitoring and visualization must be supported. In a cloud environment, data visualization is easy and familiar. However, monitoring the data arriving at hardware-limited edge devices in real time and visualizing it, for example by drawing charts for specific data, presents significant challenges.


Q. With the increasing number and performance of sensors within factories, IIoT data is growing. What are the requirements for a database that can efficiently utilize this data?
A. IIoT data is time-series data, with numerous sensors generating thousands to millions of data points per second. To process this data, the following requirements must be met.

First, high-speed input. The database must be able to accept at least tens of thousands of sensor data points per second.

Second, high-speed extraction. The database will likely contain billions of sensor data points, and extracting the data for a specific sensor and time range must be possible within milliseconds.

Third, real-time statistics. To monitor trends for specific sensors over time, the database must be able to aggregate massive amounts of millisecond-resolution data.

Finally, development convenience. Organizations that have been developing with traditional databases should be able to adopt it and build on it without significant difficulty.
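
The first three requirements map onto familiar database operations. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in to show the shape of each operation. SQLite is not a time-series DBMS and will not reach the throughput discussed here, and the table and column names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor_id TEXT, ts_ms INTEGER, value REAL)")
# Composite index so lookups by sensor and time range avoid a full scan.
conn.execute("CREATE INDEX idx_sensor_ts ON readings (sensor_id, ts_ms)")

# 1) High-speed input: insert in batches inside a single transaction.
rows = [
    (f"sensor-{s}", ts, float(s * 10 + ts % 7))
    for s in range(3)
    for ts in range(0, 10_000, 10)    # one reading every 10 ms for 10 s
]
with conn:
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)

# 2) High-speed extraction: one sensor, one time range.
window = conn.execute(
    "SELECT ts_ms, value FROM readings "
    "WHERE sensor_id = ? AND ts_ms BETWEEN ? AND ? ORDER BY ts_ms",
    ("sensor-1", 2_000, 2_990),
).fetchall()

# 3) Real-time statistics: per-second rollup for the same sensor.
rollup = conn.execute(
    "SELECT ts_ms / 1000 AS sec, COUNT(*), AVG(value), MIN(value), MAX(value) "
    "FROM readings WHERE sensor_id = ? GROUP BY sec ORDER BY sec",
    ("sensor-1",),
).fetchall()

print(len(rows), "rows inserted")
print(len(window), "rows in the selected window")
print(len(rollup), "one-second buckets")
```

Using plain SQL here also hints at the fourth requirement: a team used to relational databases can read and write these queries without learning a new interface.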


Q. There's growing interest in time-series databases. Why do you think that is?
A. Time-series databases are relatively young, with a history of less than six years. Interest is growing because they offer a way to process large volumes of sensor data that existing databases could not handle.

With the explosive growth of data production in the IIoT field, many people are actively seeking and testing time-series databases to address data processing challenges.


A statistical series is generally defined as a series of observations organized according to a set of criteria. When changes in certain observations or statistics are captured over time and serialized, this statistical series is called a time series.


Q. Could you please explain Markbase's time-series database solution and its advantages when used in smart factories?
A. Markbase is a database engine that has been in development since 2013, the early days of time-series databases, and has already entered the commercialization phase. Dozens of customers in Korea have already adopted and utilized it in their manufacturing sites.

Its greatest advantage is its ability to store and process, in real time, volumes of sensor data that were previously unmanageable. Markbase overcomes the performance limitations of databases commonly used in manufacturing environments, such as MS-SQL, and is being adopted by a growing number of customers.


Q. Last November, Markbase was selected as the standard database for TPC's TPCx-IoT Benchmark. What does this mean?
A. Being a TPC standard is official proof that the product is the most general-purpose and widely recognized one for IoT data processing worldwide.
TPCx-IoT Overview (Image = TPC)

To be selected as a TPC standard database, a product must not only fully pass the relevant standard benchmark tests but also receive approval from TPC members such as IBM and Oracle.

Markbase was selected as an official TPC DBMS after going through this complex and difficult process.


Q. Can you tell me about some representative use cases of Markbase?
A. Markbase has already been deployed in the steel industry, collecting approximately 50 billion vibration sensor data points per week and providing key data needed for analysis.

Additionally, global shipbuilders use it as core data infrastructure, collecting real-time information on hundreds of vessels worldwide in order to service them.

In addition, it is being used as a core database for processing large amounts of sensor data in manufacturing environments in the cement, paper, and pharmaceutical industries.
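
To put the steel-industry figure in perspective, the weekly volume can be converted to an average per-second rate. The Python sketch below uses only the 50 billion points per week quoted above and assumes the load is spread evenly over the week, which real sensor traffic may not be.

```python
# Weekly volume quoted for the steel-industry deployment.
POINTS_PER_WEEK = 50_000_000_000
SECONDS_PER_WEEK = 7 * 24 * 60 * 60

# Average sustained ingest rate, assuming an even load over the week.
avg_points_per_second = POINTS_PER_WEEK / SECONDS_PER_WEEK

print(f"{avg_points_per_second:,.0f} points per second on average")
```

This works out to a little over 80,000 points per second sustained around the clock, in line with the "tens of thousands per second" input requirement mentioned earlier in the interview.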


Q. What are Markbase's future goals?
A. Markbase is Korea's only time-series database, establishing itself as the de facto standard DBMS in the IIoT field. Going forward, we aim to play a key role in the revitalization of the domestic manufacturing industry through continued technological development.

Furthermore, we aim to establish Markbase as a next-generation DBMS that is competitive in the global market and essential to the manufacturing industry and, starting from Korea, to grow into a leading system software company that dominates the global IIoT market.
Reporter Lee Su-min