Databricks, a leading data and AI company, has announced an enhanced version of ‘Unity Catalog’. With this update, the catalog natively supports the Apache Iceberg REST catalog API, so Iceberg tables can be read and written not only from Databricks but also from external engines such as Trino, Snowflake, and Amazon EMR.
Databricks, a leading data and AI company, further solidifies its leadership across the data lakehouse ecosystem with differentiated innovations including Iceberg and Delta Lake support, cross-engine governance, and business metrics integration.
On the 23rd, Databricks announced an enhanced version of its ‘Unity Catalog’ feature.
With this update, the catalog now natively supports the Apache Iceberg REST catalog API, allowing Iceberg tables to be read and written not only from Databricks, but also from external engines such as Trino, Snowflake, and Amazon EMR.
By integrating Delta Lake and Iceberg into a single governance framework, Unity Catalog has evolved into an open-standards governance platform that spans multiple table formats.
Three main features are available in public preview. First, support for the Iceberg REST catalog API enables creating managed tables and reading and writing them from any Iceberg-compatible engine, while Databricks’ AI-driven predictive optimization tunes those tables for cost-efficient performance.
Second, Lakehouse Federation brings in Iceberg tables managed in external catalogs so that they can be explored and governed like native tables.
Third, Delta Sharing can be used to securely share Iceberg tables across organizations, removing silos based on data format.
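To make the external-engine access concrete, the following is a minimal sketch of how a Spark-based engine might be pointed at an Iceberg REST catalog. The property keys follow Apache Iceberg's standard Spark catalog configuration; the helper name `iceberg_rest_spark_conf`, the catalog name `uc`, the endpoint URL, the warehouse name, and the token are all hypothetical placeholders and are not taken from the announcement.

```python
from typing import Dict


def iceberg_rest_spark_conf(
    catalog_name: str,
    rest_uri: str,
    warehouse: str,
    token: str,
) -> Dict[str, str]:
    """Build Spark properties that register an Iceberg REST catalog.

    The keys are Apache Iceberg's Spark catalog settings. The values are
    supplied by the caller; this sketch does not know the real endpoint
    or credentials of any particular Unity Catalog deployment.
    """
    if not rest_uri.startswith("https://"):
        raise ValueError("rest_uri should be an https:// URL")
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",           # use the REST catalog client
        f"{prefix}.uri": rest_uri,          # REST catalog endpoint (assumed)
        f"{prefix}.warehouse": warehouse,   # catalog to expose (assumed)
        f"{prefix}.token": token,           # bearer token (assumed)
    }


# Hypothetical values for illustration only.
conf = iceberg_rest_spark_conf(
    catalog_name="uc",
    rest_uri="https://example.invalid/iceberg-rest",
    warehouse="main",
    token="<access-token>",
)
for key, value in sorted(conf.items()):
    shown = "***" if key.endswith(".token") else value
    print(f"{key}={shown}")
```

The resulting dictionary could be passed to a Spark session builder one key at a time. Other engines such as Trino use their own configuration formats, but they typically need the same three inputs: a REST endpoint, a catalog or warehouse name, and a credential.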
What's notable about this announcement is that it significantly expands capabilities for business users in addition to technical users.
Through 'Unity Catalog Metrics', KPI and metric definitions that were scattered across BI tools are consolidated in the platform and promoted to first-class data assets that can be queried directly with SQL.
This enables consistent interpretation of metrics across business domains, including sales, marketing, and finance teams, and allows data to be analyzed and decisions made based on the same criteria without the assistance of engineers.
The business exploration experience is also being offered in the form of an internal marketplace called 'Discover'.
Users can search for, and receive recommendations of, high-value assets curated by each domain, such as tables, dashboards, AI agents, and Genie spaces. They can find trustworthy data through metadata such as documentation, ownership, and usage, along with AI-based automatic recommendations. Discover is self-service, requires no approval process, and is currently available in private preview.
Unity Catalog also adds intelligence across the user experience by surfacing data quality signals, usage patterns, relationships between assets, and certification and deprecation status.
Users can ask the built-in 'Databricks Assistant' questions in natural language and receive contextual, reliable answers in real time based on policy-based metrics, which makes data exploration smarter.
“We pioneered unified governance four years ago with Unity Catalog, and this update completes the industry’s best catalog across all open table formats, including Iceberg and Delta Lake,” said Matei Zaharia, Co-Founder and CTO of Databricks.
He added, “As the only platform that lets external engines freely read and write managed tables, we will realize the democratization of data and AI for business users.”