
▲Databricks Korea CEO Jang Jeong-wook
Global revenue reached $1 billion last year, with domestic sales growing approximately 80%
Integrated data platform… Support for engineering, BI, and AI
Open Source Generative AI Model "Expected to Popularize LLMs"
The public cloud is expected to grow 15.5% annually through 2026, and the AI and big data analytics markets are also projected to expand rapidly. Against this backdrop, Databricks is accelerating its push to grow the lakehouse market in Korea.
On the 29th, Databricks held its first in-person press conference in Korea at the Grand InterContinental Seoul Parnas and outlined its strategy for the domestic market.
Databricks Korea CEO Jang Jeong-wook said, “Based on openness, we will support an integrated environment where data engineers, analysts, and scientists can manage data in an integrated governance framework, thereby contributing to improving work efficiency.” He added, “This year, we will strengthen recruitment and training to expand the competitive business partner ecosystem and transform into a trustworthy data lakehouse platform.”
Databricks officially opened its Korean branch in April of last year and recently launched its domestic business in earnest by appointing Jang Jeong-wook as its first Korea country manager. It is the only company Gartner has named a leader in both data management and ML platforms simultaneously, and it currently serves more than 9,000 customers.
As interest grows in driving innovation and efficiency through AI, establishing a data strategy has become a priority. Beyond technological innovation, organizations must identify ML models that deliver real business results, while also ensuring data quality, reliability, and the speed of their data processes.
In line with this trend, CEO Jang said, "A platform with organically integrated systems is important," adding that companies will move away from single-cloud platforms toward multi-cloud and toward openness grounded in open source and open formats.
Databricks' keyword is unmistakably 'data': as a data-centric enterprise, it applies data and AI across every aspect of the business, including customer management, product development, employee productivity, and operations.
Databricks combines a data warehouse and a data lake into an open lakehouse platform that unifies data and AI. The lakehouse is an open, integrated data platform that stores large volumes of data in the cloud and supports engineering, business intelligence (BI), and AI and machine learning (ML) across all of that data.
Databricks simplifies the complex architectures of traditional approaches to processing massive amounts of structured and unstructured data in batch or streaming formats.
Domestic companies spanning retail, e-commerce, gaming, and other industries, including E-Mart 24, Amore Pacific, Gmarket, Volvo, Hanwha Systems, Musinsa, and Devsisters, have adopted the lakehouse platform. "In 2022, we achieved revenue of $1 billion, with domestic growth of about 80% and Asia-Pacific growth of 90%," said CEO Jang.
“Databricks will become an integrated data platform that provides efficient value to more organizations this year at a critical juncture when Korean business leaders are recognizing the value of data and AI and leveraging them to drive business innovation,” said CEO Jang.
Finally, CEO Jang added, “The market change based on the public cloud is a trend that will continue and will not change,” and “I think that the aspect of being based on openness will be the most sustainable point of differentiation in the business.”
■ Open source model 'Dolly' for popularizing ChatGPT-style AI
At the event, Databricks introduced a new open source AI model, 'Dolly'.
Databricks claimed that Dolly is cost-effective: despite having only 6 billion parameters, versus GPT-3's 175 billion, it can achieve ChatGPT-like capabilities after just three hours of training on a single machine.
AI companies' datasets are sensitive intellectual property, and handing them to third parties carries risk. This underscores the importance of open source, as most machine learning (ML) users believe that owning their models outright is ideal in the long term.
Databricks said, "The key to improving the quality of state-of-the-art models like ChatGPT is the instruction-following training data, not a larger or more finely tuned base model." The company added, "We hope that by open-sourcing Dolly's code, it will become widely available as a model any business can customize and use, rather than an LLM that only a select few can build."