12/03/2024 | Press release | Distributed by Public on 12/03/2024 13:13
Today, we announced the next generation of Amazon SageMaker, which is a unified platform for data, analytics, and AI, bringing together widely-adopted AWS machine learning and analytics capabilities. This announcement includes Amazon SageMaker Data and AI Governance, a set of capabilities that streamline the management of data and AI assets.
Data teams often face challenges when trying to locate, access, and collaborate on data and AI models across their organizations. The process of discovering relevant assets, understanding their context, and obtaining proper access can be time-consuming and complex, potentially hindering productivity and innovation.
SageMaker Data and AI Governance offers a comprehensive set of features by providing a unified experience for cataloging, discovering, and governing data and AI assets. It's centered around SageMaker Catalog built on Amazon DataZone, providing a centralized repository that is accessible through Amazon SageMaker Unified Studio (preview). The catalog is built directly into the SageMaker platform, offering seamless integration with existing SageMaker workflows and tools, helping engineers, data scientists, and analysts to safely find and use authorized data and models through advanced search features. With the SageMaker platform, users can safeguard and protect their AI models using guardrails and implementing responsible AI policies.
Here are some of the key Data and AI governance features of SageMaker:
For seamless integration with existing processes, SageMaker Data and AI Governance provides API support, enabling programmatic access for setup and configuration.
How to use Amazon SageMaker Data and AI Governance
For this demonstration, I use a preconfigured environment. I go to the Amazon SageMaker Unified Studio (preview) console, which provides an integrated development experience for all your data and AI use cases. This is where you can create and manage projects, which serve as shared workspaces. These projects allow team members to collaborate, work with data, and develop ML models together.
Let me start with the Govern menu in the navigation bar.
New data governance capabilities called domain units and authorization policies that help you create business unit- and team-level organization and manage policies according to your business needs. With the addition of domain units, you can organize, create, search, and find data assets and projects associated with business units or teams. With authorization policies, you can set access policies for creating projects and glossaries.
Domain units also help you with self-service governance over critical actions such as publishing data assets and utilizing compute resources within Amazon SageMaker. I choose a project and navigate to the Data sources tab in the left navigation pane. You can use this section to add new or manage existing data sources for publishing data assets to the business data catalog, making them discoverable for all users.
I return to the homepage and continue exploring by choosing Data Catalog, which serves as a centralized hub where users can explore and discover all available data assets across multiple data sources within the organization. This catalog connects to various data sources, including Amazon Simple Storage Service (Amazon S3), Amazon Redshift, and AWS Glue.
The semantic search feature helps you find relevant data assets quickly and efficiently using natural language queries, which makes data discovery more intuitive. I enter events in the Search data area.
You can apply filters based on asset type, such as AWS Glue table and Amazon Redshift.
Amazon Q Developer integration helps you interact with data using conversational language, making it easier for users to find and understand data assets. You can use example commands such as "Show me datasets that relate to events" and "Show me datasets that relate to revenue." The detailed view provides comprehensive information about each dataset, including AI-generated descriptions, data quality metrics, and data lineage, helping you understand the content and origin of the data.
The subscription process implements a controlled access mechanism where users must justify their need for data access, providing proper data governance and security. I choose Subscribe to request access.
In the pop-up window, I select a Project, provide a Reason for request such as need access, and choose Request. The request is sent to the data owner.
This final step makes sure that data access is properly governed through a structured approval workflow, maintaining data security and compliance requirements. During the owner approval process, the data owner receives a notification and can review the request details before choosing to approve or deny access, after which the requester can access the data table if approved.
Now available
Amazon SageMaker Data and AI Governance offers significant benefits for organizations looking to improve their data and AI asset management. The solution helps data scientists, engineers, and analysts overcome challenges in discovering and accessing resources by offering comprehensive features for cataloging, discovering, and governing data and AI assets, while providing security and compliance through structured approval workflows.
For pricing information, visit Amazon SageMaker pricing.
To get started with Amazon SageMaker Data and AI Governance, visit Amazon SageMaker Documentation.
- Esra