Palo Alto Networks Inc.

11/09/2024 | News release | Distributed by Public on 11/09/2024 13:20

Is Your Snowflake Data at Risk? Find and Protect Sensitive Data with DSPM

In recent months, there's been increased scrutiny of data managed in third-party applications. A spate of reported security incidents has highlighted the need for effective monitoring of sensitive data stored, accessed or processed by SaaS tools. Snowflake, a hugely popular tool that has been the target of multiple attacks recently, is at the center of many of these discussions.

Details of recent attacks against Snowflake are still somewhat murky, but several organizations seem to have been impacted, including Ticketmaster and Santander Bank. Advance Auto Parts has also revealed that 2.3 million individuals were impacted by a previous breach related to data stored in Snowflake. According to research by Mandiant, attackers stole credentials through the use of infostealer malware. Snowflake responded recently by hardening its MFA enforcement capabilities.

In this article, we will look at how organizations use Snowflake and the related risks that can arise regarding sensitive data. We will then explain how effective data security posture management (DSPM) can help organizations gain visibility into Snowflake data and apply relevant mitigations.

How Snowflake Is Used Today

Snowflake is a cloud data warehouse. The platform has become popular among business and data teams, as it's easy to deploy, easy to manage (with built-in features such as auto-scaling), and provides high performance when querying structured or semistructured data with SQL.

Snowflake can be deployed on AWS, Azure or Google Cloud. That said, it can't be deployed within the customer's cloud account on any of these platforms. Instead, Snowflake instances sit in their own cloud account, and the physical infrastructure is always managed by Snowflake. Customers often deploy Snowflake alongside other databases and data lakes and use it for OLAP workloads - analytics, dashboarding and machine learning.

Figure 1: Sample architecture and data flow

In many cases, the data in Snowflake will be a copy or subset of the data found in the organization's transactional databases, cloud storage and SaaS applications. This data can be ingested into Snowflake via batch or stream, using either Snowflake-provided tools such as Snowpipe or third-party services such as Fivetran. The external table feature also allows Snowflake to read directly from, but not write into, the customer's cloud storage (Amazon S3, Azure Blob or Google Cloud Storage).

What Are the Security Considerations When Working with Sensitive Data in Snowflake?

While Snowflake offers strong out-of-the-box security features, risks arise when sensitive data is handled by a third-party provider and continuously moved between environments and storage locations.

1. Data Stored Outside the Customer's Cloud Account

Snowflake operates as a separate SaaS platform, meaning that sensitive data is stored and processed outside the customer's public cloud deployment (e.g., Amazon Virtual Private Cloud). This can lead to security complications such as:

  • Visibility: It may be difficult to maintain an up-to-date view of where all sensitive data resides.
  • Compliance: Some regulations require data to be stored in a specific locality, or mandate control measures that may be harder to demonstrate with a third-party SaaS solution.
  • Unified security policies: Applying consistent security controls across all data assets becomes more complex when they span multiple environments.

2. Access Control

Since Snowflake is often used for analytics, the data it stores will be shared with a broad range of consumers and tools. This is a feature rather than a bug - ubiquitous access to data is part of the vision of data democratization and is generally a desired outcome for companies that adopt modern data tooling.

But overly broad permissions can lead to trouble, especially if relevant controls aren't implemented. According to reports, the attackers targeted Snowflake customers with weak multifactor authentication (MFA) policies. In practice, such gaps are common because Snowflake deployments are often managed by nontechnical or semi-technical teams, which heightens the risk of misconfiguration.
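To make the MFA point concrete, here is a minimal sketch of the kind of check a team could run against an export of its user list. The `UserRecord` fields and the sample accounts are hypothetical and chosen only for illustration; they do not mirror any specific Snowflake or Prisma Cloud schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserRecord:
    """Hypothetical, simplified view of one data warehouse account."""
    name: str
    has_password: bool
    has_mfa: bool
    disabled: bool = False


def users_missing_mfa(users):
    """Return names of active users who can log in with a password but have no MFA."""
    return sorted(
        u.name
        for u in users
        if not u.disabled and u.has_password and not u.has_mfa
    )


# Illustrative sample data (assumption, not taken from any real deployment).
SAMPLE_USERS = [
    UserRecord("analyst_a", has_password=True, has_mfa=True),
    UserRecord("etl_service", has_password=True, has_mfa=False),
    UserRecord("former_contractor", has_password=True, has_mfa=False, disabled=True),
    UserRecord("sso_only_user", has_password=False, has_mfa=False),
]

if __name__ == "__main__":
    print(users_missing_mfa(SAMPLE_USERS))
```

In this sketch, disabled accounts and accounts without a password are treated as out of scope. A real review would likely apply its own rules for service accounts and single sign-on users.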

3. Data Exfiltration

Once sensitive data is moved into Snowflake, it can be moved out of Snowflake. The risks here are similar to those of other web-accessible, highly interconnected SaaS applications, such as:

  • Bulk data exports: Users with appropriate permissions can export large datasets, potentially leading to accidental or intentional data breaches.
  • Integration with external tools: Snowflake's ability to connect with various BI and analytics tools may create additional avenues for data to leave the platform if not properly secured.
  • Lack of DLP controls: Since Snowflake is provided as managed infrastructure, organizations can't install their own DLP tools to restrict certain types of sensitive data from being queried or exported.
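The bulk export risk in the list above can be illustrated with a small sketch that totals exported rows per user and flags anyone over a threshold. The `QueryEvent` structure, the sample events and the 100,000-row threshold are all assumptions made for this example; real query logs carry different fields, and a sensible threshold depends on the organization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryEvent:
    """Hypothetical, simplified query-log entry."""
    user: str
    rows_returned: int
    is_export: bool  # True when results were unloaded or downloaded


def flag_bulk_exports(events, row_threshold=100_000):
    """Return {user: exported_rows} for users whose total exported rows exceed the threshold."""
    totals = {}
    for event in events:
        if event.is_export:
            totals[event.user] = totals.get(event.user, 0) + event.rows_returned
    return {user: rows for user, rows in totals.items() if rows > row_threshold}


# Illustrative sample data (assumption).
SAMPLE_EVENTS = [
    QueryEvent("bi_tool", 5_000, True),
    QueryEvent("analyst_a", 80_000, True),
    QueryEvent("analyst_a", 40_000, True),
    QueryEvent("analyst_b", 500_000, False),  # large query, but not an export
]

if __name__ == "__main__":
    print(flag_bulk_exports(SAMPLE_EVENTS))
```

Summing per user rather than checking each query on its own means that several medium-sized exports by the same user are still caught.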

Reduce Data and Compliance Risk with DSPM for Snowflake

Prisma Cloud DSPM lets you bring Snowflake into the fold of your broader data security strategy. Rather than treating Snowflake as an isolated silo, you can see the full picture of data, risk and compliance in every cloud environment you manage or use - including Snowflake and CSPs such as Amazon Web Services (AWS) or Azure. This allows you to eliminate blind spots and ensure that adequate security controls are in place, wherever your data is stored.

Understand Your Data Flows

Prisma Cloud DSPM helps you understand how sensitive data moves in and out of Snowflake. Because it works without agents (which can't be installed on Snowflake), Prisma Cloud can identify which sensitive data is stored in Snowflake and which cloud storage locations can be accessed from within the data warehouse via the external tables feature. You can also see the pathways - both sources and destinations - through which sensitive data is moved between Snowflake and other systems.

Understanding these flows enables you to implement appropriate security measures at each stage of the data lifecycle (such as in staging tables or ETL pipelines) and better understand risks.

Example scenario: As part of a marketing analytics project, a third-party data pipeline tool ingests customer PII into Snowflake from Azure Blob. If the data flow isn't needed for the particular use case - e.g., a marketing dashboard reading from Snowflake may not require viewable customer emails - then you can block it. If it is needed, you can verify that Snowflake data warehouses containing sensitive data have the correct security controls, such as MFA, in place.

Classify Data to Understand Risk

The "sensitivity" of data depends on several factors, including the data itself and the business context in which it is processed. For example, zip codes and credit card details might both fall under personally identifiable information (PII), but the consequences of each type of record leaking are quite different. Therefore, a full picture of data risk requires accurate, granular classification of sensitive data and mapping of the associated security and compliance risks.

Prisma Cloud DSPM provides 100+ built-in classifiers that can be applied to data stored in Snowflake. You can also easily define custom risks and classifiers, starting from existing labels.

Example scenario: As part of a GDPR compliance project, you might want to scan your environment for records that may be violating compliance requirements. For example, you can create a custom label for all "European resident PII stored outside of EU" that compliance teams can later review.
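As a rough illustration of what such a custom label expresses, the sketch below combines existing classifier labels with a storage region to decide which assets to flag. It is not Prisma Cloud's implementation or API: the `DataAsset` structure, the label names and the short list of EU regions are hypothetical.

```python
from dataclasses import dataclass

# Assumption: a small, illustrative set of EU regions; a real list would be longer.
EU_REGIONS = {"eu-west-1", "eu-central-1"}
CUSTOM_LABEL = "European resident PII stored outside of EU"


@dataclass(frozen=True)
class DataAsset:
    """Hypothetical record describing one scanned table or bucket."""
    name: str
    region: str
    labels: frozenset  # classifier labels already attached to the asset


def custom_label(asset):
    """Return the custom label if the asset holds EU-resident PII outside the EU, else None."""
    holds_eu_pii = {"PII", "EU_RESIDENT"} <= asset.labels
    outside_eu = asset.region not in EU_REGIONS
    return CUSTOM_LABEL if holds_eu_pii and outside_eu else None


def assets_to_review(assets):
    """Return the names of assets that compliance teams should review."""
    return [a.name for a in assets if custom_label(a) is not None]


# Illustrative sample data (assumption).
SAMPLE_ASSETS = [
    DataAsset("crm_customers", "us-east-1", frozenset({"PII", "EU_RESIDENT"})),
    DataAsset("eu_orders", "eu-west-1", frozenset({"PII", "EU_RESIDENT"})),
    DataAsset("web_logs", "us-east-1", frozenset({"PII"})),
]

if __name__ == "__main__":
    print(assets_to_review(SAMPLE_ASSETS))
```

The rule builds on labels that already exist, which matches the idea of defining custom classifiers starting from existing labels.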

Related Article: Use Context-Aware Data Classification for a Robust Data Security Posture

Apply the Same Security Policies to Every Environment

Within Prisma Cloud DSPM, you can apply, from a single interface, the same security and compliance policies to Snowflake as you would to any of the databases in your own cloud account.

Managing all data security posture aspects with Prisma Cloud helps prioritize risk effectively, allows security teams to support their organizations' move to multicloud and hybrid architectures, and reduces the fragmentation and context switching that come from working with point solutions.

Example scenario: Let's say you have a policy that requires encryption and access logging for all databases containing customer financial information. Prisma Cloud DSPM enables you to see whether this policy is currently violated in any database that stores customer data - from Snowflake to self-managed MySQL instances running on virtual machines.
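The policy in this scenario can be sketched as a single rule evaluated against an inventory that spans platforms. The `Database` fields and the sample inventory below are hypothetical and serve only to show the logic; they are not a Prisma Cloud data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Database:
    """Hypothetical inventory entry for one database, on any platform."""
    name: str
    platform: str
    contains_financial_data: bool
    encrypted: bool
    access_logging: bool


def policy_violations(databases):
    """Return {database_name: [missing controls]} for in-scope databases that violate the policy."""
    violations = {}
    for db in databases:
        if not db.contains_financial_data:
            continue  # the policy covers only databases with customer financial data
        missing = []
        if not db.encrypted:
            missing.append("encryption")
        if not db.access_logging:
            missing.append("access logging")
        if missing:
            violations[db.name] = missing
    return violations


# Illustrative sample inventory (assumption).
SAMPLE_INVENTORY = [
    Database("snowflake_finance", "snowflake", True, True, False),
    Database("mysql_billing", "mysql-on-vm", True, False, False),
    Database("postgres_payments", "managed-postgres", True, True, True),
    Database("marketing_events", "snowflake", False, False, False),
]

if __name__ == "__main__":
    for name, missing in policy_violations(SAMPLE_INVENTORY).items():
        print(f"{name}: missing {', '.join(missing)}")
```

Because the rule never looks at `platform`, the same check applies to Snowflake and to self-managed databases alike.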

Learn More

Download Securing the Data Landscape with DSPM and DDR for a more complete understanding of what DSPM is and how it can help you protect your sensitive data. And to learn specifically about Prisma Cloud DSPM for Snowflake, download the datasheet today.